Multiple points of log import


#1

I have a setup with several reverse proxies in front of the actual web and app servers, and I want to import their logs into Piwik. According to the docs for the import script, a log file should be sorted by time before importing. Also, the import job is split between recorders based on the client address.

Does this mean that if I wish to import log files from several proxies (for the same domain), I should first consolidate the files from all proxies into one file, sort it, and only then run the import on that log? It seems that if I run the import in parallel from different hosts for the same domain I get strange results (for example, duplicate sites; I run import_logs.py with --add-sites-new-hosts).

As a side question: it seems that Apache, for example, logs requests in order of completion but timestamps them with the time the request arrived, so the lines in a log file are not always sorted by time (slower requests that came in earlier get logged later than simple requests for static objects that are served almost instantly). Is that a problem for Piwik? Should the logs be re-sorted before importing because of this?

Best Regards
Mariusz


(Matthieu Aubry) #2

I would indeed recommend consolidating everything into one log file, ordered by time.

Side question: this is not a problem. Tracking requests are always fast to respond, so the out-of-order lines would not cause much inaccuracy. If you can sort by time, do it; it cannot hurt.
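
For anyone wanting to script the consolidation step, here is a minimal sketch in Python. It is not part of Piwik or import_logs.py. It assumes the proxies write Apache common/combined format with a bracketed timestamp such as [10/Oct/2000:13:55:36 -0700], that month names are in English (strptime's %b depends on the locale), and that all the logs fit in memory. The names merge_logs and parse_time are made up for this example. Because it sorts every line rather than interleaving already-sorted files, it also takes care of the out-of-order lines from the side question.

```python
import re
from datetime import datetime

# Bracketed timestamp of the Apache common/combined log format
# (assumed format), e.g. [10/Oct/2000:13:55:36 -0700]
TIMESTAMP_RE = re.compile(
    r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]"
)


def parse_time(line):
    """Return the timezone-aware request time of a log line, or None."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")


def merge_logs(paths, out_path):
    """Merge several access logs into out_path, sorted by request time.

    Lines without a recognisable timestamp are skipped.
    Returns the number of lines written.
    """
    entries = []
    for path in paths:
        # surrogateescape keeps undecodable bytes intact on the round trip
        with open(path, encoding="utf-8", errors="surrogateescape") as handle:
            for line in handle:
                when = parse_time(line)
                if when is None:
                    continue
                if not line.endswith("\n"):
                    line += "\n"  # last line of a file may lack a newline
                entries.append((when, line))

    # list.sort is stable, so lines with equal timestamps keep their
    # original relative order
    entries.sort(key=lambda entry: entry[0])

    with open(out_path, "w", encoding="utf-8", errors="surrogateescape") as out:
        out.writelines(line for _, line in entries)
    return len(entries)
```

Calling merge_logs(["proxy1.log", "proxy2.log"], "merged.log") (file names are placeholders) produces a single file, which can then be imported with one run of import_logs.py instead of parallel runs from each proxy.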