Import Logs: Visits / Actions Count differs from script output

Hello everyone!

I have a question about the import_logs.py Python script in Piwik.

I want to import an Apache access log with 107,900 lines into Piwik using the import_logs.py script. After the import, the script reports:

107900 requests imported successfully
0 requests were downloads
5 requests ignored:
    0 HTTP errors
    0 HTTP redirects
    0 invalid log lines
    5 requests did not match any known site
    0 requests did not match any --hostname
    0 requests done by bots, search engines…
    0 requests to static resources (css, js, images, ico, ttf…)
    0 requests to file downloads did not match any --download-extensions

That looks right: the log file contains exactly 107,900 lines.

But when I count the records in the database tables for actions and visits, the counts differ from both the number of log lines and the script output.
The piwik_log_link_visit_action table holds only about 64,000 rows.

In particular, I would expect the actions table to contain as many records as the Apache log file has lines.
Is that correct?
Does every request create a new record in the actions table?

I don’t know whether this is a problem or not. I would expect the table to hold as many records as the log file has lines, but that is not the case here.

Can someone help, please?
Thanks a lot!

Can no one help?

I think there is a problem with the log import.

I have never used the log import, so I may be wrong, but I think everything is working as intended.

You may want to look at this chapter:
https://piwik.org/docs/log-analytics-tool-how-to/#how-to-import-more-data-including-bots-static-files-and-http-errors-tracking

By default, the log importer ignores bots and static files (CSS, JS, images, …).

The latter in particular can make up a large portion of a log file, so that may explain the difference.
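
To get a feel for how large that portion is in a given log, one could tally the requests by file extension. Below is a minimal sketch in Python, assuming the log is in Apache common/combined format. The function name `static_share` and the extension list are made up for illustration and are not taken from import_logs.py, whose own list may differ.

```python
import re

# Hypothetical extension list for illustration only; the importer's own
# list of static resources may be different.
STATIC_EXTENSIONS = {"css", "js", "gif", "jpg", "jpeg", "png", "ico", "ttf"}

# Matches the quoted request part of a log line, e.g. "GET /path HTTP/1.1"
REQUEST_RE = re.compile(r'"(?:[A-Z]+) (\S+) HTTP/[\d.]+"')


def static_share(lines):
    """Return (static_count, parsed_count) for an iterable of log lines."""
    static = parsed = 0
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like a request
        parsed += 1
        path = match.group(1).split("?", 1)[0]  # drop the query string
        filename = path.rsplit("/", 1)[-1]
        ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
        if ext in STATIC_EXTENSIONS:
            static += 1
    return static, parsed
```

Calling it as `static_share(open("access.log"))` (the file name is a placeholder) returns how many parsed requests pointed at static files and how many requests were parsed in total.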

That makes perfect sense!
Thanks for your response.

But the script itself produces this output:

107900 requests imported successfully
0 requests were downloads
5 requests ignored:
    0 HTTP errors
    0 HTTP redirects
    0 invalid log lines
    5 requests did not match any known site
    0 requests did not match any --hostname
    0 requests done by bots, search engines…
    0 requests to static resources (css, js, images, ico, ttf…)
    0 requests to file downloads did not match any --download-extensions

This means that only 5 requests could not be recorded, right?
The vast majority should have been imported into the DB.
And the request in every log line is “GET /piwik.php”.
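
A claim like that can be double-checked with a quick tally of method and path over the log. This is only a sketch, assuming Apache common/combined log format. The helper name `count_request_paths` is made up for illustration.

```python
import re
from collections import Counter

# Captures method and request target from the quoted request part,
# e.g. "GET /piwik.php?idsite=1 HTTP/1.1"
REQUEST_RE = re.compile(r'"([A-Z]+) (\S+) HTTP/[\d.]+"')


def count_request_paths(lines):
    """Count log lines per 'METHOD /path', ignoring the query string."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            counts["<unparsed>"] += 1  # keep track of lines we could not read
            continue
        method, target = match.groups()
        path = target.split("?", 1)[0]
        counts[f"{method} {path}"] += 1
    return counts
```

If `count_request_paths(open("access.log"))` (placeholder file name) has “GET /piwik.php” as its only key, every line really is a request to that path.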

Could there be another problem?

One last bump for this thread…

If this is intended behaviour, then there is no problem. But I need confirmation… :wink: