Hi everybody,
I’ve been using Piwik for the last ten years (!) and I still love it. There is, however, something I never quite managed to figure out and solve: the wildly overreported (or misclassified) “Direct entries” phenomenon.
I did read FAQ entry no. 51 as well as the campaigns tracking documentation, in addition to every thread mentioning “Direct Entries” in this forum. It has been mentioned in one way or another in dozens of threads (including 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, etc.), but the discussion usually ends up nowhere close to a satisfying answer: either the poster says “oh I found the issue” or “ah I actually had redirects in place” (not my case AFAIK, see further below), or someone suggests “try a newer version” (and then the original poster disappears), or in some cases the thread received no answers at all… So I’m trying my luck here, with some thorough data to back me up (multiple websites, multiple servers, years of data).
Summary of the problem (TL;DR): on almost all the websites I maintain (except one or two), 30 to 50% of the visits are counted as “Direct Entries” by Matomo, no matter the time of year, the server, or what runs the website. On two different Matomo servers, running different Matomo versions, this strange phenomenon has always been there.
Since a picture/table is worth a thousand words, let me share a summary of statistics, comparing various websites on various servers:
This data sample covers the entire period between 2016-08-09 and 2019-02-07, so about 2.5 years, which I trust is enough for the numbers to be meaningful.
What you can notice there:
- for almost all my websites, there is an insane “direct entry” rate (usually between 40 and 50%, sometimes as crazy as 80%), whereas on Google Analytics you get maybe 7-10% counted as direct entry (unfortunately for this comparison, I don’t run GA on most websites because, well, it’s evil)
- my personal blog is an exception, the percentage of “Direct Entries” is lower because it receives an extraordinary (compared to other sites) amount of referrals/SEO (I’m not sure why, but oh well), so it is an outlier.
Notes:
- Most websites are HTTP, some (my personal blog and “Org. P”) are HTTPS, and none of the websites have weird redirections going on (as far as I can tell from using wget or some of the random online “redirection check” tools, unless you have a better recommendation for a command to check this on Linux)
- Beyond the relative percentage, the absolute number of direct entries stays very consistent from year to year. Some years you might get fewer referral visits, yet the direct entries remain pretty much the same.
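For anyone who wants to double-check the “no weird redirections” note beyond wget, here is a minimal Python sketch that walks a redirect chain hop by hop and records each status code. The names `fetch_head` and `redirect_chain` are my own, purely for illustration, and it only sees HTTP-level redirects (301/302/303/307/308); it would not notice a meta refresh or a JavaScript redirect, and some servers answer HEAD requests differently from GET.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)

def fetch_head(url):
    """Send one HEAD request without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

def redirect_chain(url, fetch=fetch_head, max_hops=10):
    """Follow redirects manually and return a list of (url, status) hops."""
    chain = []
    for _ in range(max_hops):
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return chain

if __name__ == "__main__":
    # example.org is a placeholder; substitute a landing page URL,
    # ideally one that includes a ?pk_campaign=... parameter.
    for hop_url, status in redirect_chain("http://example.org/"):
        print(status, hop_url)
```

A chain with a single hop and a 200 status means no HTTP redirect is involved. If there are several hops, comparing the first and last URL shows whether the query string (and so the campaign parameter) survived.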
Considering I’ve done no TV/radio advertising (and that in the very rare cases where I do online advertising, I use piwik tracking links), I have a really hard time imagining that nearly half of my hundreds of thousands of visitors type my website URL into their browser instead of coming from another website, from a search engine, or social media. My websites are not well-known public brands, it’s not Apple’s website, nor Amazon or YouTube… For the “Org. P” website, in December and January I have run a couple of social media advertising campaigns (with the ?pk_campaign=some_campaign_name_here tracking links, of course), and the results were mostly the same: still roughly the same percentage of direct entries (between 40 and 50%), plus about a thousand visits tracked as being part of my ad “campaigns”, and the rest is just normal traffic.
Interestingly enough, the fact that I did have a couple thousand “campaign” visits, amounting to 7% of visits, separately from “social networks” visits, tells me that it’s not a case of “something on my website/server is stripping off Matomo’s campaign URL parameters”, because if that was the case then it would be 0%.
So, at this point, I’m wondering what the “direct entries” truly mean. It’s making me doubt the validity of Matomo’s data, as I don’t believe for a second that I’m getting 60 thousand people (out of 140 thousand) entering the website address “manually” into their browser each month. It defies common sense and social psychology; I’m quite sure people are lazier than that! I can see three possibilities then:
- I am wrong, and truly half of the trackable human population knows the brand of most of my websites (unlikely) and bothers typing it or accessing it from a bookmark (unlikely), month over month. Oh, and that also means Google Analytics is wrong.
- There is a bug in all Piwik/Matomo versions and it consistently misclassifies a significant portion of traffic as being “Direct Entries”
- The Internet is full of bots that ping every website out there all the time and they get counted as “Direct entry” (but then: why would that traffic spike at the same time as the referral traffic, how does Google Analytics deal with it while Matomo does not, and why would those bots be running the page’s javascript, etc.?), in which case that means I should completely ignore (segment out) all visits that are “direct entry” when looking at my website stats?
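To make the question concrete, here is how I picture a tracker bucketing a visit. This is a simplified sketch of my own understanding, not Matomo’s actual code: the function name `classify_visit` and the two host lists are made up for illustration, and the real logic surely handles many more cases.

```python
from urllib.parse import parse_qs, urlsplit

# Tiny illustrative lists; a real tracker ships far larger ones.
SEARCH_HOSTS = {"www.google.com", "duckduckgo.com"}
SOCIAL_HOSTS = {"www.facebook.com", "t.co"}

def classify_visit(landing_url, referrer):
    """Return a channel name for one visit (assumed, simplified rules)."""
    landing = urlsplit(landing_url)
    # 1. A campaign parameter on the landing URL wins over everything else.
    if "pk_campaign" in parse_qs(landing.query):
        return "campaign"
    # 2. No referrer at all: nothing left to go on, so call it direct.
    if not referrer:
        return "direct"
    ref_host = urlsplit(referrer).hostname
    # 3. A referrer from the site itself is not an external source.
    if ref_host == landing.hostname:
        return "direct"
    if ref_host in SEARCH_HOSTS:
        return "search"
    if ref_host in SOCIAL_HOSTS:
        return "social"
    return "website"

print(classify_visit("http://example.org/?pk_campaign=winter", ""))
print(classify_visit("http://example.org/", ""))
print(classify_visit("http://example.org/", "https://www.google.com/"))
```

If the real logic is anything like this, “direct” is simply the bucket for every visit that arrives without a referrer and without a campaign parameter, whatever the reason the referrer is missing, which is a much wider category than “typed the URL by hand”.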
Hypotheses I’ve considered and discarded:
- Could “something” (a redirection, a script, etc.) be “stripping off” the referrer information or stripping off campaign tracking URL parameters? Nope, because:
  - I did get 7% of visits from the campaigns I ran recently, and if something was stripping the parameters off then it should be 0%
  - I didn’t find anything odd going on with wget (or page redirection checking tools online)
  - During that time the “Direct entries” share didn’t change; it was still between 40 and 50%.
- Is it because of non-javascript visitors, or adblockers? No, because then I wouldn’t be seeing the “direct entries” numbers in Matomo any more than the other numbers… Note that Piwik/Matomo being blocked by EasyList is a fact of life since 2013 or so, I always presume visits on the website are 2-3x higher than reported, and so far I’ve been estimating that 80% of the visitors on the “Org. P” website are unseen by Matomo because they are using adblockers or some other technique.
So, yeah… I am totally puzzled by this. Why would “Direct Entries” represent such a big (>20%) portion of visitors on various websites? Did any of you encounter this problem, know other things to investigate or other possible explanations?