Logfile Analyzer import / Performance


#1

Hi,

Piwik is apparently not very fast at importing Apache access log files.
I tried to import a 350 MB access.log, and after 8 hours import_logs.py had still not finished.

I wonder what happens when you stop the import_logs.py process and start it again (with exactly the same parameters).
Does import_logs.py “recognize” how many lines it already imported in order to continue with the rest?

Thanks and best regards
piwiktestit


#2

Unfortunately, it seems nobody here can answer such a basic question!

I thought that at least one of the developers would be able to answer such a question.

Best regards
piwiktestit


(Peterbo) #3

Did you read the docs about the log importer? Are there any errors in the logs? How many rows does the file have?


#4

Hi,

the log file I first tested Piwik with is 350 MB.
After several hours the data were imported.

For me the most important question is:
"Does import_logs.py “recognize” how many lines it already imported in order to CONTINUE WITH THE REST?"

Yes, I read all the information I could find about import_logs.py!

I have an access.log and an error.log for each site. But I could not find out how to pass BOTH logs to import_logs.py.

Thanks and best regards
piwiktestit


(Peterbo) #5

Which parameters and values do you use to invoke the log-importer script? Which version of PHP is running on your server? How many rows per second does the script import?

Unfortunately, I can’t answer your question since I didn’t test the script in detail. But I’m sure somebody will answer this one soon.


#6

The log importer is very resource intensive. How many cores does your server have? How much memory have you given to MySQL and PHP?
If you have enough memory, you will generally get about 200 lines/sec imported into the DB. Depending on what you have Apache logging, that should take anywhere from 30 minutes to 2 hours for one processor to import.

I suspect your system doesn’t have enough RAM, most likely for PHP. I get about 1.1k lines a second after giving PHP about 8 GB of RAM and 5 cores. It takes me about an hour to import a 1.5 GB Apache log.
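
As a rough sanity check on these numbers, here is a back-of-the-envelope sketch in Python. The average line length of 250 bytes is only an assumption (it depends on your log format), and `estimate_import_seconds` is a made-up helper name, not part of Piwik.

```python
def estimate_import_seconds(file_size_bytes, avg_line_bytes, lines_per_second):
    """Rough import duration for a log file at a steady import rate."""
    line_count = file_size_bytes / avg_line_bytes
    return line_count / lines_per_second


MB = 1024 * 1024

# Assumption: an average Apache log line is about 250 bytes long.
seconds = estimate_import_seconds(350 * MB, 250, 200)
print(f"about {seconds / 3600:.1f} hours")
```

With those assumptions a 350 MB log comes out at roughly two hours at 200 lines/sec, so an import that is still running after 8 hours is probably well below that rate.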

Theoretically, if the import fails, Piwik should purge the lines it imported from that log, and everything will start from scratch.
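
On the original question of continuing with the rest: the general idea of resuming by line count can be sketched as below. This is only an illustration under stated assumptions, not how import_logs.py works internally. The checkpoint file and the names `read_checkpoint`, `write_checkpoint` and `process_log` are hypothetical. Some versions of import_logs.py list a `--skip` option in their `--help` output for starting at a given line; whether yours does is worth checking before relying on it.

```python
import os


def read_checkpoint(path):
    """Return the number of lines already processed, or 0 if there is no checkpoint."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        text = f.read().strip()
    return int(text) if text else 0


def write_checkpoint(path, line_count):
    """Record how many lines have been processed so far."""
    with open(path, "w") as f:
        f.write(str(line_count))


def process_log(log_path, checkpoint_path, handle_line, save_every=1000):
    """Process log_path, skipping the lines an earlier run already recorded.

    Returns the number of lines handled in this run.
    """
    done = read_checkpoint(checkpoint_path)
    count = done
    with open(log_path, errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if number <= done:
                continue  # recorded as processed by a previous run
            handle_line(line)
            count = number
            if count % save_every == 0:
                write_checkpoint(checkpoint_path, count)
    write_checkpoint(checkpoint_path, count)
    return count - done
```

Because the checkpoint is only written every `save_every` lines, a crash can leave up to `save_every - 1` lines that get handled a second time on the next run, so the handler would need to tolerate duplicates.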