Greetings,
In Piwik 2.3, I import my logs once a day with import_logs.py and then run archiving, both from a cron job. The unique visits aren’t high (in the high hundreds per day), and archiving is not running when I view the visitor log pages. Even so, a visitor log page for a single day can sometimes take several minutes to load, or it times out with the “oops…” error.
When these slow or failed loads happen, the page views are also duplicated dozens or hundreds of times in the visitor log display. I’m attaching a screenshot showing this so you can see the numbers in question.
Is this expected? This server is a few years old, but it has plenty of RAM and disk space and handles all of its other web service work very well. I’ve lengthened the MySQL timeout, which helped because these slow loads always used to end in a MySQL timeout, but the timeout wasn’t the underlying cause of the slow or failed loads and the heavily duplicated pageviews.
Piwik is working great for me in every other respect, so it would be great to get some help figuring out the slowness and duplication. Right now I can’t really use Piwik, because pages are hard to load at all and, when they do load, they show wrong pageview numbers. I don’t see any error output in the Piwik logs or in the archive.log that the archive call writes to, but perhaps I’m looking in the wrong place.
Thank you for your help! Here is my script that is called by cron to do the log import and archive:
#!/bin/bash
#
/path_to_public_html/piwik/misc/log-analytics/import_logs.py \
--url=http://www.site.com/piwik \
--enable-reverse-dns \
--idsite=1 \
--exclude-path="/wp-login.php?action=register" \
--exclude-path="/wp-login.php" \
--exclude-path="/wp-admin/admin-ajax.php" \
--exclude-path="/robots.txt" \
--exclude-path="/wp-cron.php" \
--exclude-path="/ascript1.php" \
--exclude-path="/ascript2.php" \
--exclude-path="/ascript3.php" \
--exclude-path="/anrssfeedurl/" \
--exclude-path="/anotherrssfeedurl/" \
--exclude-path="/piwik/*" \
--exclude-path="/ascript4.php" \
--exclude-path="/somestuffnoonecaresabout/*" \
--exclude-path="/someotherstuffnoonecaresabout/*" \
--exclude-path="/wp-admin/*" \
/var/log/httpd/httpdlog-$(date -d 'yesterday' "+%Y%m%d").gz
php /pathtopiwik/console core:archive \
--url=http://www.site.com/piwik/ \
> /var/log/piwik/piwik-archive.log
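
A note on the logging in the script above: a plain `>` redirect captures only standard output, so anything a command prints to standard error would not end up in piwik-archive.log. That may be one reason the log looks clean. Below is a minimal sketch of capturing both streams. The `run_logged` helper, the temporary file, and the stand-in command are hypothetical and only there to make the example runnable without Piwik.

```shell
#!/bin/bash
# Hypothetical helper (not part of Piwik): run a command and write both
# its stdout and its stderr to a single log file, overwriting the file.
run_logged() {
    local logfile="$1"
    shift
    "$@" > "$logfile" 2>&1
}

# Stand-in for the real archive call: prints one line to each stream.
demo_log="$(mktemp)"
run_logged "$demo_log" bash -c 'echo "archiving done"; echo "a warning" >&2'

# Both lines are now in the log file.
cat "$demo_log"
```

In the cron script, the equivalent change would be ending the archive call with `> /var/log/piwik/piwik-archive.log 2>&1`.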