Is idvisitor unique?

Hello,

I’m having some difficulties understanding the Piwik’s mysql schema.

I’ve tried to extract the max time between two actions for a specific visitor, as follow:


mysql> select
           max(server_time) as max,
           min(server_time) as min,
           timediff(max(server_time),min(server_time)) as diff
       from piwik_log_link_visit_action
       where idsite=3 
       group by idvisitor
       order by diff desc
       limit 4;


+---------------------+---------------------+----------+
| max                 | min                 | diff     |
+---------------------+---------------------+----------+
| 2012-11-04 18:17:48 | 2012-11-04 12:39:36 | 05:38:12 |
| 2012-11-06 00:51:52 | 2012-11-05 19:46:44 | 05:05:08 |
| 2012-10-15 19:07:30 | 2012-10-15 14:48:46 | 04:18:44 |
| 2012-10-14 11:53:29 | 2012-10-14 08:56:16 | 02:57:13 |
+---------------------+---------------------+----------+

5 hours as the maximum interval between two actions for a visitor was a surprising result, so I experimented a similar query, but I used the IP in place of idvisitor to identify unique visitors:


mysql> select
         max(server_time) as max,
         min(server_time) as min,
         timediff(max(server_time),min(server_time)) as diff
       from piwik_log_link_visit_action lva
       LEFT JOIN  piwik_log_visit
       using (idvisitor)
       where lva.idsite=3
       group by location_ip
       order by diff desc
       limit 4 ;


+---------------------+---------------------+-----------+
| max                 | min                 | diff      |
+---------------------+---------------------+-----------+
| 2012-12-05 13:15:16 | 2012-09-28 15:05:07 | 838:59:59 |
| 2012-11-20 09:50:45 | 2012-09-28 15:05:31 | 838:59:59 |
| 2012-11-24 20:01:38 | 2012-09-28 15:05:38 | 838:59:59 |
| 2012-12-05 08:52:55 | 2012-09-28 15:06:43 | 838:59:59 |
+---------------------+---------------------+-----------+

(838:59:59 is the max value for the TIME type.)
Such a difference seems to me very strange. It suggests that “idvisitor” is not unique for a given visitor(!). Am I wrong somewhere?..

Thank you in advance for your help and time.

What is the reporting result you are after? What are you trying to quantify in relation to that report?

Hello,

We want to datamine the SQL database to find out:

  • how many times does a unique visitor come back to our site before converting (buying)
  • how long does it take to do so
  • which actions have been perform in between.

That’s why I started with the first query, and saw that there was 5 hours max between a first action and the last one on our client’s website, which is impossible.

Do you have an idea on why is it so?

Ahhh i see… No i dont but your methodology i would assume will be tricky as what happens when an idvisitor accesses your site from 2 different machines, or clears cookies, or changes browsers etc… I would be careful as this type of dupolication is imposible to always account for.

Maybe higher level analysis like looking at total UVs over a time frame compared to the number of completed buys.

When data mining sometimes getting too granular makes you miss a larger picture so to speak.

I am also guessing setting up goals wherebuy how many pages are viewed before a buy is performed will help as well.

good luck

Thank you for your insights! I will try to set such a goal.

In fact, my only guess is that a new idvisitor is created after each known visits, but from what I can understand of the code (Visit.php), it is not the case.

It’s a bit worrying: I hope our stats about unique visitors are not inaccurate…