Obviously Piwik is not very fast at importing Apache access log files.
I tried to import a 350 MB access.log, and after 8 hours import_logs.py had still not finished.
I wonder what happens when you stop the import_logs.py process and start it again (with exactly the same parameters).
Does import_logs.py “recognize” how many lines it already imported in order to continue with the rest?
Which parameters and values do you use to invoke the log-importer script? Which version of PHP is running on your server? How many rows per second does the script import?
Unfortunately, I can’t answer your question since I haven’t tested the script in detail. But I’m sure somebody will answer this one soon.
The log importer is very resource intensive. How many cores does your server have? How much memory have you given to MySQL and PHP?
If you have enough memory, you will generally get about 200 lines/sec imported into the DB. Depending on what you have Apache logging, that should take anywhere from 30 minutes to 2 hours for one processor to import.
I suspect your system doesn’t have enough RAM, most likely for PHP. I get about 1,100 lines a second after giving PHP about 8 GB of RAM and 5 cores. It takes me about an hour to import a 1.5 GB Apache log.
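To put those rates in perspective, here is a rough back-of-the-envelope sketch in Python. It is not part of Piwik; the function names are made up for illustration, and it assumes the import rate stays constant for the whole run, which is unlikely in practice.

```python
# Rough estimate only: assumes a constant import rate for the whole run.
# These helper names are hypothetical and are not part of import_logs.py.

def import_duration_hours(line_count, lines_per_second):
    """Hours needed to import line_count lines at a steady rate."""
    return line_count / lines_per_second / 3600


def lines_imported(lines_per_second, hours):
    """Number of lines imported after the given time at a steady rate."""
    return int(lines_per_second * hours * 3600)


if __name__ == "__main__":
    # Rates quoted in this thread: about 200 lines/sec and about 1,100 lines/sec.
    for rate in (200, 1100):
        print(rate, "lines/sec for one hour:", lines_imported(rate, 1.0), "lines")
```

Count the lines in your own log (for example with `wc -l access.log`) and pass that number to `import_duration_hours` to see roughly how long the import should take at a given rate. If the real run takes far longer than the estimate, that points at a resource bottleneck rather than the size of the log.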
Theoretically, if the import fails, Piwik should purge the lines it imported from that log, and everything will start from scratch.