Duplicated data in archive numeric tables


#1

Hi,

I’m doing a script to read data from archive tables and I have found that some data is duplicated.


mysql> describe piwik_archive_numeric_2014_05;
+-------------+---------------------+------+-----+---------+-------+
| Field       | Type                | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+-------+
| idarchive   | int(10) unsigned    | NO   | PRI | NULL    |       |
| name        | varchar(255)        | NO   | PRI | NULL    |       |
| idsite      | int(10) unsigned    | YES  | MUL | NULL    |       |
| date1       | date                | YES  |     | NULL    |       |
| date2       | date                | YES  |     | NULL    |       |
| period      | tinyint(3) unsigned | YES  | MUL | NULL    |       |
| ts_archived | datetime            | YES  |     | NULL    |       |
| value       | double              | YES  |     | NULL    |       |
+-------------+---------------------+------+-----+---------+-------+
8 rows in set (0.00 sec)

And the query:


mysql> select * from piwik_archive_numeric_2014_06 WHERE idsite=34 AND period=1 AND date1='2014-06-23' order by name;
+-----------+----------------------------------------------------+--------+------------+------------+--------+---------------------+-----------+
| idarchive | name                                               | idsite | date1      | date2      | period | ts_archived         | value     |
+-----------+----------------------------------------------------+--------+------------+------------+--------+---------------------+-----------+
|    104785 | Actions_nb_hits_with_time_generation               |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:37:26 |     71251 |
|    104785 | Actions_nb_keywords                                |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:37:26 |        23 |
|    104785 | Actions_nb_pageviews                               |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:37:26 |    105546 |
|    104785 | Actions_nb_searches                                |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:37:26 |        25 |
|    104785 | Actions_nb_uniq_pageviews                          |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:37:26 |     64012 |
|    104785 | Actions_sum_time_generation                        |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:37:26 | 159635.05 |
|    104785 | bounce_count                                       |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:36:51 |     10036 |
|    104786 | bounce_count                                       |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:38:07 |      7554 |
|    104785 | done                                               |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:38:06 |         1 |
|    104786 | donefea44bece172bc9696ae57c26888bf8a.VisitsSummary |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:38:07 |         1 |
|    104785 | max_actions                                        |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:36:51 |       670 |
|    104786 | max_actions                                        |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:38:06 |       670 |
|    104785 | nb_actions                                         |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:36:51 |    105752 |
|    104786 | nb_actions                                         |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:38:06 |     79774 |
|    104785 | nb_uniq_visitors                                   |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:36:51 |     20450 |
|    104786 | nb_uniq_visitors                                   |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:38:06 |     14342 |
|    104785 | nb_visits                                          |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:36:51 |     28978 |
|    104786 | nb_visits                                          |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:38:06 |     22112 |
|    104785 | Referrers_distinctCampaigns                        |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:37:27 |         1 |
|    104785 | Referrers_distinctKeywords                         |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:37:27 |         1 |
|    104785 | Referrers_distinctSearchEngines                    |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:37:27 |         1 |
|    104785 | Referrers_distinctWebsites                         |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:37:27 |         1 |
|    104785 | Referrers_distinctWebsitesUrls                     |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:37:27 |       657 |
|    104785 | sum_visit_length                                   |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:36:51 |   9050750 |
|    104786 | sum_visit_length                                   |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:38:07 |   7000751 |
|    104785 | UserCountry_distinctCountries                      |     34 | 2014-06-23 | 2014-06-23 |      1 | 2014-06-23 22:37:36 |        76 |
+-----------+----------------------------------------------------+--------+------------+------------+--------+---------------------+-----------+

As you can see, some fields like nb_visits are duplicated and I don’t know why neither what is the right value.

Why are these fields duplicated?


(Matthieu Aubry) #2

We recommend to only use APIs to read the data, rather than read directly from the database.