Logfile Analyzer import / Performance

Hi,

Piwik is apparently not very fast at importing Apache access log files.
I tried to import a 350 MB access.log, and after 8 hours import_logs.py had still not finished.

I wonder what happens when you stop the import_logs.py process and start it again (with exactly the same parameters).
Does import_logs.py “recognize” how many lines it already imported in order to continue with the rest?

Thanks and best regards
piwiktestit

Unfortunately, there is obviously nobody here who can answer such a basic question!

I thought that at least one of the developers would be able to answer such a question.

Best regards
piwiktestit

Did you read the docs about the log importer? Are there any errors in the logs? How many rows does the file have?

Hi,

the logfile I first tested Piwik with is 350 MB in size.
After several hours the data had been imported.

For me the most important question is:
"Does import_logs.py “recognize” how many lines it already imported in order to CONTINUE WITH THE REST?"

Yes, I read all the information I could find about import_logs.py!

I have an access.log and an error.log for each site. But I could not find out how to pass BOTH logs to import_logs.py.

Thanks and best regards
piwiktestit

Which parameters and values do you use to invoke the log importer script? Which version of PHP is running on your server? How many rows per second does the script import?

Unfortunately, I can’t answer your question since I haven’t tested the script in detail. But I’m sure somebody will answer this one soon.

The log importer is very resource intensive. How many cores does your server have? How much memory have you given to MySQL and PHP?
If you have enough memory, you will generally get about 200 lines/sec imported into the DB. Depending on what you have Apache logging, the import should take anywhere from 30 minutes to 2 hours for one processor.

I suspect your system doesn’t have enough RAM, most likely for PHP. I get about 1.1k lines a second with about 8 GB of RAM given to PHP and 5 cores. It takes me about an hour to import a 1.5 GB Apache log.
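
To check whether your own run is in the expected range, you can count the lines in the log and divide by the import rate. Here is a minimal sketch of that arithmetic. The function names are made up for this post and are not part of Piwik or import_logs.py, and the rate you pass in is whatever you actually observe (200 lines/sec and 1.1k lines/sec are just the figures mentioned above).

```python
def count_lines(path):
    """Count newline characters in a file, reading it in 1 MB chunks."""
    total = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            total += chunk.count(b"\n")
    return total


def estimate_import_seconds(line_count, lines_per_second):
    """Estimated import time in seconds at a constant import rate."""
    if lines_per_second <= 0:
        raise ValueError("lines_per_second must be positive")
    return line_count / lines_per_second


def format_duration(seconds):
    """Render a number of seconds as hours and minutes, e.g. '1h 30m'."""
    hours, remainder = divmod(int(round(seconds)), 3600)
    minutes = remainder // 60
    return f"{hours}h {minutes:02d}m"
```

For example, format_duration(estimate_import_seconds(count_lines("access.log"), 200)) gives a rough figure for a single import process. If the real import is far slower than that estimate, the rate you assumed is not being reached and memory or database settings are worth checking.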

Theoretically, if the import fails, Piwik should purge the lines it imported from that log, and the next run will start from scratch.
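
I have not verified whether import_logs.py keeps track of its position by itself. Just to illustrate what "continue with the rest" would involve in general, here is a sketch of a line-number checkpoint. This is not how import_logs.py is implemented; process_remaining, the checkpoint file, and handle_line are all hypothetical names made up for this example.

```python
import os


def read_checkpoint(checkpoint_path):
    """Return the number of lines already processed, or 0 if none recorded."""
    try:
        with open(checkpoint_path) as handle:
            return int(handle.read().strip() or 0)
    except FileNotFoundError:
        return 0


def write_checkpoint(checkpoint_path, line_number):
    """Store the line number via a temp file so the checkpoint is never half-written."""
    tmp_path = checkpoint_path + ".tmp"
    with open(tmp_path, "w") as handle:
        handle.write(str(line_number))
    os.replace(tmp_path, checkpoint_path)


def process_remaining(log_path, checkpoint_path, handle_line, batch_size=1000):
    """Process only the lines after the stored checkpoint.

    Returns the number of lines handled in this run. If the process is
    killed between two checkpoint writes, up to batch_size - 1 lines are
    handled again on the next run, so handle_line should tolerate repeats.
    """
    done = read_checkpoint(checkpoint_path)
    processed = done
    with open(log_path, errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if number <= done:
                continue  # already handled in an earlier run
            handle_line(line.rstrip("\n"))
            processed = number
            if processed % batch_size == 0:
                write_checkpoint(checkpoint_path, processed)
    write_checkpoint(checkpoint_path, processed)
    return processed - done
```

This only works if the log file is appended to and never rewritten between runs. After a log rotation the stored line number no longer refers to the same file, so the checkpoint would have to be reset.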