Hi,
I’m using Piwik to analyze 300+ sites at the moment (most of them low traffic, but about 5 with medium to high traffic).
I disabled the default archiving and use the cronjob instead.
So far so good… but because of the large number of websites and the database load during archiving, I only run the cronjob every 6 hours. The script takes about 20 minutes to archive all the sites.
For most of the sites that frequency is okay, but for some e-commerce sites a 6-hour interval might be too long.
My question:
Is it possible to archive single sites without destroying any collected data, or does the archiving process only work if all sites are processed at once? Could it, for example, be done with a modified version of the cronjob script? I’m thinking of running the archive script for all sites every 6 hours and, on top of that, starting a modified version at shorter intervals that uses an “idsite” argument to archive just a single site.
Something like:
# archive all 300+ sites every 6 hours
0 */6 * * * /path_to_piwik/misc/cron/archive.sh > /dev/null
# archive site 123 more often because we need the latest data
20,30,40,50 * * * * /path_to_piwik/misc/cron/archive.sh 123 > /dev/null
Modifying the script to take a site ID as an argument (or, if none is given, to use the API to get all sites, as it does at the moment) does not seem to be the problem, but I’m not sure whether that would mess with the archiving process itself, because it may have to process all the sites at once or some data won’t be archived at all.
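To make the idea concrete, here is a minimal sketch of the argument handling I have in mind. It is not the shipped archive.sh: the function names are hypothetical, and both the site list lookup and the per-site archiving call are stubbed out with placeholders where the existing script would talk to the API.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical and the two
# Piwik-specific steps are stubs, not real calls from archive.sh.

# Placeholder for however the existing script fetches all site IDs via the API.
fetch_all_site_ids() {
    echo "1 2 3"
}

# Print the site IDs to archive: the given argument if it consists only of
# digits, otherwise (no argument) every site.
resolve_site_ids() {
    if [ $# -eq 0 ] || [ -z "$1" ]; then
        fetch_all_site_ids
        return 0
    fi
    case "$1" in
        *[!0-9]*) echo "invalid site id: $1" >&2; return 1 ;;
        *) echo "$1" ;;
    esac
}

# Placeholder for the per-site archiving request the real script performs.
archive_site() {
    echo "archiving site $1"
}

main() {
    ids=$(resolve_site_ids "$@") || return 1
    for id in $ids; do
        archive_site "$id"
    done
}

main "$@"
```

Called without an argument it walks through every site as before; called as "archive.sh 123" it only handles site 123. This covers the argument handling only and says nothing about whether archiving one site on its own is safe, which is exactly what I am asking.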
I have not made a feature request out of this yet because archiving single sites might be easier than I imagine, or impossible altogether.
Thanks in advance for your replies,
Ruediger