Inflated pageview numbers from repeated (not intentionally refreshed) pageviews

Hi all,

I’ve been asked to look into an ongoing issue: occasionally some visitors record multiple (sometimes hundreds of) pageviews per visit for the same page/action. The records look suspicious because the delay between repeats is about 15-19 seconds; it looks as if the users open a tab and leave it open (you can imagine how high the pageview numbers get when a tab is left open overnight). It occurs mainly for Safari users, though I have observed some unusual cases with Chrome and Firefox (on Mac and Windows 7).

Some other details:

  • Highly probable for Safari users whose session lasts more than 10 minutes
  • Seen on mobile (Safari) versions too
  • The tracker is on a page with a number of other scripts and external trackers
  • Currently the Javascript tracker code used is the old document.write version
  • Using version 1.11 at the moment, though the problem has persisted since we started tracking (1.10, though maybe since 1.09)
  • The tracker is on a fairly busy site, with doctype XHTML 1.0 Transitional

The symptoms look the same as if you put the tracker code on a page that refreshes every 15 s. I’ve been focusing on browser-specific quirks and thought of the old auto-refresh issues with the WebKit implementation (Stop Safari Auto-Refreshing Web Pages in Mac OS X Lion), as I’ve seen it affect Chrome too, but I’m not so sure.

I would like to know:

  • Is this a common issue? Has anyone seen such a pattern? I’ve hunted around and not found any other reports like this.
  • Can anyone confirm browser quirks when document.write and DOM-append are combined?
  • I’ve tried to test with a minimal web page setup but have not been able to replicate the problem. What else can I add to the test page to emulate the conditions that cause this?

If you could also add info on the server setup and versioning, and the CMS setup of the pages created, it might help in identifying any issues. Also take a look at the web logs and see if there are any errors there that might point to something.

regards

Thanks for the swift reply!

I was afraid of that. I’m not too familiar with the live server setup and asking for web logs is rather painful (and I probably can’t reveal too much detail about the implementation). From what I have seen of the web logs, multiple requests have been fired, coinciding with the visitor logs.

Mainly, I was interested in whether anyone had found similar issues - and whether there was a workaround. As it affects mainly Safari users, it seems to be a browser issue. With a range of third-party scripts on the page, I’m starting to edge towards the hunch that some ‘in-between state’ causes the browser to refresh (for a different case of unexpected refreshes: I found that empty img tags can cause duplicate requests).

I am pushing for the use of the latest _paq method, though it would be good if it also magically solved this issue.

Edit: I’ve considered using the heartbeat, but I have been expressly instructed to avoid extra requests… bleeeh
Also, I wondered whether I could store a custom variable on user interaction (scroll, focus, mouse, etc) so that on exit it can flag that a pageview was not ‘interactive enough’… Not sure how/if it can be done
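
A rough sketch of that interaction-flag idea, assuming the asynchronous _paq tracker is in use. The helpers watchInteraction and queueInteractionFlag, the variable name "Interacted" and the slot index are made up for illustration; setCustomVariable is the tracker’s own method. One catch: a custom variable only travels with a later tracking request, so the flag never reaches the server if nothing else is tracked after the interaction.

```javascript
// Hypothetical helper: remember whether the visitor has interacted with the
// page. `target` is expected to be `window` in the browser.
function watchInteraction(target, events = ["scroll", "focus", "mousemove", "keydown", "touchstart"]) {
  const state = { interacted: false };
  const onEvent = () => {
    state.interacted = true;
    // One interaction is enough, so stop listening afterwards.
    events.forEach((name) => target.removeEventListener(name, onEvent));
  };
  events.forEach((name) => target.addEventListener(name, onEvent));
  return state;
}

// Hypothetical helper: queue the flag so it rides along with the next
// tracking request instead of triggering a request of its own.
// The name "Interacted" and the default slot index 1 are assumptions.
function queueInteractionFlag(paq, state, index = 1) {
  paq.push(["setCustomVariable", index, "Interacted", state.interacted ? "yes" : "no", "visit"]);
}

// In the browser this might be wired up as:
//   var state = watchInteraction(window);
//   ...later, just before another tracking call...
//   queueInteractionFlag(_paq, state);
```

Because the flag is only read when the next request is queued, this adds no requests, but it also cannot describe a tab that is simply left open and never tracked again.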

Have seen this problem very rarely, maybe a browser addon or similar…
Did you notice if there is always 15s between each refresh?

I have encountered ways in the past where you can unintentionally repeat a page request, but I haven’t seen evidence of that in this case. Hence my suspicions fall on WebKit (on Mac) and memory management and/or plugins, since almost all Safari visits (5.1, 6.0 on desktop and mobile devices) show the symptoms. It also doesn’t matter whether they are a returning or a new visitor.

Yes, the refreshes/reloads are fairly consistent though not constant: most of the delays I’ve seen are closer to 20 seconds (19-20 s) depending on the time of day, though I have seen anything from a minimum of 15 s through to 22 s.

Edit: A typical sequence from the live API (timestamp, then seconds until the next action):

Mon 8 Jul 22:59:39 (18 s)

Mon 8 Jul 22:59:57 (22 s)

Mon 8 Jul 23:00:19 (19 s)

This is part of a 48-action visit that lasted 13 minutes and 52 seconds, with only 2 unique pages.
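
To spot this pattern across many visits rather than by eye, something like the following could be run over exported action timestamps. It is only a sketch: the function names and the thresholds (15-22 s gaps, 5 in a row) are assumptions based on the numbers above, and the year and time zone in the sample are placeholders.

```javascript
// Hypothetical helper: gaps in whole seconds between consecutive actions
// of one visit. `timestamps` are ISO 8601 strings in chronological order.
function gapsInSeconds(timestamps) {
  const times = timestamps.map((t) => Date.parse(t));
  const gaps = [];
  for (let i = 1; i < times.length; i++) {
    gaps.push(Math.round((times[i] - times[i - 1]) / 1000));
  }
  return gaps;
}

// Flag a visit when at least `minRepeats` consecutive gaps fall inside
// [minGap, maxGap] seconds. All three thresholds are assumptions.
function looksLikeAutoReload(timestamps, { minGap = 15, maxGap = 22, minRepeats = 5 } = {}) {
  let run = 0;
  for (const gap of gapsInSeconds(timestamps)) {
    run = gap >= minGap && gap <= maxGap ? run + 1 : 0;
    if (run >= minRepeats) return true;
  }
  return false;
}

// The three actions quoted above (placeholder year and time zone).
const sample = ["2013-07-08T22:59:39Z", "2013-07-08T22:59:57Z", "2013-07-08T23:00:19Z"];
console.log(gapsInSeconds(sample)); // [ 18, 22 ]
```

A stricter version would also require the repeated actions to share the same URL, since the issue here is repeats of the same page.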

One other thing, at one point, was the “Empty tab”: the browser loaded the websites shown in the “empty tab/start tab” in the background… but Piwik should now exclude those from tracking.

“Empty tab”? You’ve lost me there - I have seen that there is code for ignoring prerendered views, sure.

I’ve not been able to replicate the issue myself yet, so I’m not completely certain it is a WebKit-exclusive issue on Mac OS (the records show rare cases of IE, Chrome and Firefox, but I’m willing to just ignore these). As there is client-specified code I have no control over (like other tracking scripts!), there are quite a few variables to consider.

Just a question: does the tracker need to send the final on-unload ‘ping’ to complete a registered action - ie: if there are multiple requests for the same page but they don’t send the final on-unload request, will they all be considered ‘one action’? Never mind, I thought there was an unload submission - I don’t use tracker plugins so I don’t do anything particular for unload.

Let me know if you find anything; anything we can do to improve tracking accuracy (and tracking only “humans”) is important.

Ok, a minor update. I’ve found another page on the same website (but under a different Piwik site id) with Piwik tracking - but it doesn’t suffer the same ‘reload’ issue. Here are the interesting differences:

  • Both share a set of the same JavaScript files, with some minor differences in the inline script
  • The non-affected pages have more (!) JavaScript files imported
  • Both run the Piwik tracking code through separate function(s), defined like the old document.write code at the end of the <body> and run on the onload event from a script defined in <head>
  • The non-affected pages are much faster to load - almost half the time it takes for the affected pages

This has been strengthening my idea that one of the scripts leaves the page’s DOM in an unstable state that causes it to reload. Given that the affected pages take twice as long, my theory is that at some point one or more of the many scripts is not downloaded completely, prompting the refresh - and as the browser caches this/these script(s), the reload occurs again and again.

Apologies that this is starting to sound like a question on how to debug complex JavaScript: but has anyone come across similar issues, and if so, is the best way around it to reduce the number of external JavaScript files?

I’ll return with updates - I hope to find breakpoints that catch whatever generates the refresh.

(Edit) Minor minor update from above: just discovered that both affected and non-affected pages do have the same number of external files.

However, the scripts are different: the affected pages contain (yet more) additional tracking, which I believe is Omniture, with some suspicious/interesting functions that use a timeout to set window.location. Still looking at how to replicate…
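
One way to catch timers like that in the act might be to wrap setTimeout before any other script runs and log whatever is scheduled in the suspicious range. This is a debugging sketch only: spyOnTimers and its options are made-up names, the 15-22 s window is an assumption taken from the delays seen in the logs, and it would not catch setInterval or a meta refresh.

```javascript
// Hypothetical debugging helper: wrap setTimeout on `globalObj` (normally
// `window`) and record every timer whose delay falls in [minMs, maxMs].
function spyOnTimers(globalObj, { minMs = 15000, maxMs = 22000, log = console.log } = {}) {
  const original = globalObj.setTimeout;
  const hits = [];
  globalObj.setTimeout = function (callback, delay, ...args) {
    if (delay >= minMs && delay <= maxMs) {
      const hit = {
        delay,
        // First part of the callback source, to help identify the script.
        source: String(callback).slice(0, 200),
        // Stack trace showing which code scheduled the timer.
        stack: new Error().stack,
      };
      hits.push(hit);
      log("suspicious timer", hit);
    }
    // Always hand over to the real setTimeout so page behaviour is unchanged.
    return original.call(globalObj, callback, delay, ...args);
  };
  // Expose the recorded hits and a way to undo the wrapping.
  return { hits, restore: () => { globalObj.setTimeout = original; } };
}

// In the browser, as the very first script on the page:
//   var spy = spyOnTimers(window);
//   ...later, in the console: spy.hits
```

If a hit shows a callback that assigns window.location, the recorded stack should point at the script responsible.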

I’m just curious: is the Piwik tracking code being inserted via any sort of include? If it is, could you put the code directly on the pages or page templates to see if that helps?

I’ve not got the exact code in front of me, but from memory the structure is roughly as follows:

  1. At the bottom of the HTML, there is a bunch of script which includes the old Piwik tracker code, but modified:
  2. There is a script included in the <head> of the page that calls doTrackingStuff() on document.onload

I can see why things were set this way - it was to be compliant with the EU cookie laws (heh, so much for that really)

I am not sure if any of that helps