Data integration question

Hi All,

I am tracing some user behavior through database and found something strange.
I was wondering if anyone know, for any ‘idvisitor’ data recorded in table piwik_log_link_visit_action, is there a mapped idvisitor data in table ‘piwik_log_visit’.

I think the answer should be yes? Since every visit should have a record in piwik_log_visit to store the visit_first_action_time and visit_last_action_time with its idvisitor value, and piwik_log_link_visit_action is the detail actions. Please correct me if I am wrong, thank you.

Best Regards,

Treker

Yes the idvisitor should map between these 2 tables

Hi Matt,

Thanks for the answer. So when I executed

SELECT COUNT( DISTINCT (idvisitor) ) FROM piwik_log_link_visit_action got 197426 records. (piwik_log_link_visit_action has 2,677,461 rows)

and

SELECT COUNT( DISTINCT (idvisitor) ) FROM piwik_log_visit got 65608 records. (piwik_log_visit has 142,867 rows)

Does that mean I have many visitor log lost? Any suggestion?

Thank you.

I run it on the demo and got: 3311260 VS 3255891
so there is not that much difference but still some difference
Then I restricted only to the last 1 million visits, and it looks much better


SELECT COUNT( DISTINCT (idvisitor) ) FROM `piwik_log_link_visit_action` where idvisit > 4224089;
598804 

SELECT COUNT( DISTINCT (idvisitor) ) FROM `piwik_log_visit` where idvisit > 4224089;
598810

So i would say “works for me”. The small difference can be explained by missing INSERTs… can you try to restrict to recent visits?

I have 142867 records in piwik_log_visit, so I tried to limit for > 140000 and executes the following sql statement.


mysql> SELECT COUNT( DISTINCT (idvisitor) ) FROM  `piwik_log_link_visit_action` WHERE idvisit > 140000;
+-------------------------------+
| COUNT( DISTINCT (idvisitor) ) |
+-------------------------------+
|                          4068 |
+-------------------------------+
1 row in set (0.04 sec)

mysql> SELECT COUNT( DISTINCT (idvisitor) ) FROM `piwik_log_visit` where idvisit > 140000;
+-------------------------------+
| COUNT( DISTINCT (idvisitor) ) |
+-------------------------------+
|                          1803 |
+-------------------------------+
1 row in set (0.00 sec)

Looks like the difference is still quite a lot, comment?

Do you use the standard JS code? how many websites do you track? anything special to say about the way you track?

See our new page, which lists the various limits on historical data, number of websites, users, database size, when using Piwik: Data Limits Analytics