Log-Analytics mode 3 Basic Questions

Hi all,

I’ve read everything I can find on importing Apache logs into Piwik, but there are a few basic things I’m still unclear on.

I’m trying to set up one Piwik website using JavaScript mode, and one or more additional websites using log analytics mode.

  1. Re-loading logs: To automate the loading of logs, is it safe to re-import the same log multiple times? For example, if I want to do a daily import of access.log but only run logrotate weekly, is Piwik smart enough to skip records it has already imported? Or will I end up with duplicate records in the Piwik log tables and inflated/incorrect stats? Of course, I can test this out, but someone must know the actual logic used during the import.

  2. Minimal/Verbose Piwik website definitions: I’m thinking I might want more than one log-mode Piwik website, so that I can run one with default/minimal settings and one with more complete logs (using --enable-bots --enable-static --enable-http-errors --enable-http-redirects). But is this really necessary? It seems to me it would be more efficient to load the full verbose logs once and then define filters within Piwik to show the level of detail wanted.

  3. Excluding paths after load: Somewhat related to 2), if after importing I see various paths that I would rather hide in the visitor action lists, is there a way to filter those out after the fact? Or do I have to purge everything for that Piwik website and reload from scratch using --exclude-path / --exclude-path-from?

Thanks.

  1. Re-loading logs: To automate the loading of logs, is it safe to re-import the same log multiple times?

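No, it isn’t safe to re-import the same log multiple times. Records that were already imported would be tracked again, so only import each log line once.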
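One way to keep a daily import safe with a weekly logrotate is to feed the importer only the lines added since the last run. Here is a rough sketch of that idea, not something that ships with Piwik: the function name `read_new_lines` and the state file are made up for illustration, and the rotation check (file shorter than the saved offset) is only a heuristic.

```python
import os


def read_new_lines(log_path, state_path):
    """Return the complete lines appended to log_path since the previous call.

    Hypothetical helper: state_path is a small file that stores the byte
    offset where the previous run stopped.
    """
    # Byte offset where the previous run stopped; 0 on the first run.
    offset = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            offset = int(f.read().strip() or 0)

    # Heuristic: a file shorter than the saved offset has been rotated,
    # so start again from the top.
    if os.path.getsize(log_path) < offset:
        offset = 0

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Keep only complete lines; a partially written last line is left
    # for the next run.
    end = data.rfind(b"\n") + 1
    new_lines = data[:end].splitlines(keepends=True)

    # Remember where this run stopped.
    with open(state_path, "w") as f:
        f.write(str(offset + end))
    return new_lines
```

You would write the returned lines to a temporary file and pass that file to the import script instead of the full access.log. Note that a rotated log that has already grown past the old offset would not be detected by the size check, so it is safer to run the import right before logrotate as well.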

  2. Minimal/Verbose Piwik website definitions:

Yes, you can do either of those. You could define filters to exclude “bots + static files + redirects” using a Custom Segment: http://piwik.org/docs/segmentation/

  3. Excluding paths after load: Or do I have to purge everything for that Piwik website and reload from scratch using --exclude-path / --exclude-path-from?

There is no way to delete only part of the imported logs. You would have to reload from scratch and also reprocess the archived reports.