import_logs.py - Error when connecting to Piwik

Do you have a two-line log file that reproduces the error? The script works fine for hundreds of users, so we need to know why it doesn't for you.

Hi

Yes, I sent a log to Fabian.

Thanks

Ed

Any news on this yet?

:o

This is still happening on the latest version of import_logs.py from the development build.

Do you have a two-line log file that reproduces the error? The script works fine for hundreds of users, so we need to know why it doesn't for you.

Please post it here.

I have added this to GitHub, under tests, as ‘NCSA Import issue’.

Edit: this problem occurs in the Piwik 1.11.1 import script as well as in the git version of it.

I have the same problem. Output of import script:

<…>/piwik/misc/log-analytics/import_logs_git.py --idsite=4 --recorders=1 --show-progress --url= -dddd --disable-bulk-tracking --enable-static --enable-bots --enable-http-errors --enable-http-redirects my.access.log.201302-500
2013-04-03 12:54:34,996: [DEBUG] Accepted hostnames: all
2013-04-03 12:54:34,997: [DEBUG] Piwik URL is:
2013-04-03 12:54:34,997: [DEBUG] No token-auth specified
2013-04-03 12:54:34,997: [DEBUG] No credentials specified, reading them from "<…>/piwik/config/config.ini.php"
2013-04-03 12:54:34,998: [DEBUG] Using credentials: (login = piwikadm, password = …)
2013-04-03 12:54:35,177: [DEBUG] Authentication token token_auth is: f73539f1f91a…
2013-04-03 12:54:35,178: [DEBUG] Resolver: static
0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2013-04-03 12:54:35,345: [DEBUG] Launched recorder
Parsing log my.access.log.201302-500…
2013-04-03 12:54:35,346: [DEBUG] Detecting the log format
2013-04-03 12:54:35,346: [DEBUG] Format iis does not match
2013-04-03 12:54:35,346: [DEBUG] Format s3 does not match
2013-04-03 12:54:35,347: [DEBUG] Format common_complete does not match
2013-04-03 12:54:35,347: [DEBUG] Format common matches
2013-04-03 12:54:35,347: [DEBUG] Format common_vhost does not match
2013-04-03 12:54:35,347: [DEBUG] Format ncsa_extended matches
2013-04-03 12:54:35,348: [DEBUG] Format ncsa_extended is the best match
2013-04-03 12:54:35,565: [DEBUG] Error when connecting to Piwik:
500 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
500 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2013-04-03 12:54:37,628: [DEBUG] Error when connecting to Piwik:
500 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
500 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2013-04-03 12:54:39,688: [DEBUG] Error when connecting to Piwik:
Fatal error: None
You can restart the import of " my.access.log.201302-500" from the point it failed by specifying --skip=0 on the command line.

BTW: how am I supposed to provide an example log file when the only valid attachments are images or PDFs? (I just created a PDF of the log file.)

[quote=Mako77]
I have added this to GitHub, under tests, as ‘NCSA Import issue’.[/quote]

Can you post the URL? The issue I’m experiencing is similar. I have found that to get the script to resume, you must specify the number of lines parsed as last reported. The number reported for --skip is incorrect, and using it fails on subsequent attempts.

[quote=sfreiberg]
I have the same problem. Output of import script:

[…]
500 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2013-04-03 12:54:39,688: [DEBUG] Error when connecting to Piwik:
Fatal error: None
You can restart the import of " my.access.log.201302-500" from the point it failed by specifying --skip=0 on the command line.[/quote]

I’m experiencing a similar issue with the ncsa_extended format; it seems the value reported for --skip is incorrect. A possible workaround is to resume the script not with the reported --skip value but with the last reported number of lines parsed. In this case, instead of --skip=0, use --skip=500. However, while this does let the import resume, I do not know whether valid log lines are being passed over.

This isn’t very practical either if you want this to run automatically with no user intervention. The root cause of the problem needs to be addressed. I’m not a Python programmer, so unfortunately I can’t provide much input.

[quote=Mako77]
Output is as follows:

2013-03-07 21:35:02,334: [DEBUG] Resolver: static
0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2013-03-07 21:35:02,552: [DEBUG] Launched recorder
Parsing log log-Feb-2009…
2013-03-07 21:35:02,568: [DEBUG] Detecting the log format
2013-03-07 21:35:02,584: [DEBUG] Format ncsa_extended matches
3221 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2013-03-07 21:35:03,865: [DEBUG] Error when connecting to Piwik:
5071 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
5071 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
5071 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2013-03-07 21:35:06,818: [DEBUG] Error when connecting to Piwik:
5071 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
5071 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
5071 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2013-03-07 21:35:10,395: [DEBUG] Error when connecting to Piwik:
Fatal error: None
You can restart the import of “log-Feb-2009” from the point it failed by specifying --skip=79 on the command line.

If I use the parameter --disable-bulk-tracking then it seems to say more about lines recorded but still has the urlopen error problem.

Would love to see this working :S[/quote]

To get this particular import to resume, use --skip=5071. In other words, don’t use the value actually reported for --skip; go by the lines parsed instead. Mind you, I don’t know whether valid log lines are passed over by doing this. We’ll have to wait for an actual bug fix.
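For anyone who wants to script the workaround described above, here is a minimal sketch in Python 3. It assumes the console output of import_logs.py has been captured as text, and it only reads the two kinds of lines shown in this thread (the "lines parsed" progress line and the final --skip hint). The function name `resume_points` is made up for this example and is not part of import_logs.py. It does not address the open question of whether resuming this way skips valid log lines.

```python
import re

# Progress lines as printed in this thread, e.g.
# "5071 lines parsed, 0 lines recorded, 0 records/sec (avg), ..."
PROGRESS_RE = re.compile(r"^(\d+) lines parsed, (\d+) lines recorded")
# The resume hint printed after "Fatal error".
SKIP_RE = re.compile(r"--skip=(\d+)")


def resume_points(output):
    """Return (reported_skip, last_lines_parsed) from captured output.

    Either value is None if no matching line was found.
    """
    reported_skip = None
    last_parsed = None
    for line in output.splitlines():
        progress = PROGRESS_RE.match(line.strip())
        if progress:
            # Keep overwriting so we end up with the last progress line.
            last_parsed = int(progress.group(1))
            continue
        skip = SKIP_RE.search(line)
        if skip:
            reported_skip = int(skip.group(1))
    return reported_skip, last_parsed


# Sample taken from the output quoted earlier in this thread.
sample = """\
2013-03-07 21:35:06,818: [DEBUG] Error when connecting to Piwik:
5071 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2013-03-07 21:35:10,395: [DEBUG] Error when connecting to Piwik:
Fatal error: None
You can restart the import of "log-Feb-2009" from the point it failed by specifying --skip=79 on the command line.
"""

reported, parsed = resume_points(sample)
print("reported --skip=%s, workaround --skip=%s" % (reported, parsed))
```

If the captured output also echoes a command line that already contains --skip, the last match wins, so it is safest to feed in only the output of a single run.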

[quote=Mako77]
This isn’t very practical either if you want this to run automatically with no user intervention. The root cause of the problem needs to be addressed. I’m not a Python programmer, so unfortunately I can’t provide much input.[/quote]

Totally understandable. I posted this hoping it could help identify the issue faster. In my case, it’s acceptable as a workaround until the issue is resolved, but it’s not ideal because I’ll end up re-importing 7 GB of log data.

Posted the issue on GitHub: “import_logs.py cannot resume with line number reported by skip for ncsa_extended log format” (Issue #3867, matomo-org/matomo).

Hi,

Can you let me know if you have received the logs correctly on GitHub? I can no longer see them myself.

Thanks

Ed

Posted an example log on GitHub.

Any updates? :S

[quote=Mako77]
Any updates? :S[/quote]

That is what I am interested in, too. No updates?

You should probably look into adjusting your server config. That resolved the issue for me, even though the cause wasn’t apparent from the logs. I’m running nginx, so I can only speak to configuration for that.
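Since the debug output in this thread shows nothing after "Error when connecting to Piwik:", one way to find out what the server is actually answering is to request the tracker URL directly and print the failure. The following is a rough Python 3 sketch, not code from import_logs.py. The names `describe_failure` and `probe` are invented for illustration, and it assumes the tracker endpoint is piwik.php under the address you pass to --url.

```python
import urllib.error
import urllib.request


def describe_failure(exc):
    """Turn an exception raised by urlopen into a readable one-line summary."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        # The server answered, but with an error status.
        return "HTTP %d: %s" % (exc.code, exc.reason)
    if isinstance(exc, urllib.error.URLError):
        # No HTTP response at all (DNS failure, refused connection, ...).
        return "connection problem: %s" % (exc.reason,)
    return "unexpected %s: %s" % (type(exc).__name__, exc)


def probe(url, timeout=10):
    """Request `url` once and report the status code or the failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return "HTTP %d" % response.status
    except Exception as exc:  # broad on purpose: this is only a diagnostic
        return describe_failure(exc)


# Example call (hypothetical host; substitute your own --url value):
# print(probe("http://piwik.example.org/piwik.php"))
```

An HTTP status in the result means the web server answered, so its config and error log are the next place to look. A "connection problem" points at the URL, DNS, or network instead.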

[quote=rdux]
You should probably look into adjusting your server config. That resolved the issue for me, even though the cause wasn’t apparent from the logs. I’m running nginx, so I can only speak to configuration for that.[/quote]

Hi. I have the same problem as described in this topic, but with nginx. I have been trying to fix it for more than 3 days and can’t work out what to do. Can you help me?
Nginx 1.2.7
Python 2.6.6
Piwik 1.12

python /mnt/data/www/piwik/misc/log-analytics/import_logs.py -d --url=http:///mnt/data/importlog//_access.log.2013080510 --idsite=4 --recorders=12 --enable-http-errors --enable-http-redirects --enable-static --enable-bots
2013-08-07 11:57:12,370: [DEBUG] Accepted hostnames: all
2013-08-07 11:57:12,370: [DEBUG] Piwik URL is: http://

2013-08-07 11:57:12,370: [DEBUG] No token-auth specified
2013-08-07 11:57:12,370: [DEBUG] No credentials specified, reading them from "/mnt/data/www/piwik/config/config.ini.php"
2013-08-07 11:57:12,372: [DEBUG] Using credentials: (login = , password = )
2013-08-07 11:57:12,413: [DEBUG] Authentication token token_auth is: *******
2013-08-07 11:57:12,414: [DEBUG] Resolver: static
0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2013-08-07 11:57:12,449: [DEBUG] Launched recorder
2013-08-07 11:57:12,450: [DEBUG] Launched recorder
2013-08-07 11:57:12,450: [DEBUG] Launched recorder
2013-08-07 11:57:12,451: [DEBUG] Launched recorder
2013-08-07 11:57:12,451: [DEBUG] Launched recorder
2013-08-07 11:57:12,452: [DEBUG] Launched recorder
2013-08-07 11:57:12,452: [DEBUG] Launched recorder
2013-08-07 11:57:12,452: [DEBUG] Launched recorder
2013-08-07 11:57:12,453: [DEBUG] Launched recorder
2013-08-07 11:57:12,453: [DEBUG] Launched recorder
2013-08-07 11:57:12,454: [DEBUG] Launched recorder
2013-08-07 11:57:12,454: [DEBUG] Launched recorder
Parsing log /mnt/data/importlog/************/************_access.log.2013080510…
2013-08-07 11:57:12,454: [DEBUG] Detecting the log format
2013-08-07 11:57:12,455: [DEBUG] Format icecast2 does not match
2013-08-07 11:57:12,455: [DEBUG] Format iis does not match
2013-08-07 11:57:12,455: [DEBUG] Format s3 does not match
2013-08-07 11:57:12,455: [DEBUG] Format common_complete does not match
2013-08-07 11:57:12,455: [DEBUG] Format common matches
2013-08-07 11:57:12,455: [DEBUG] Format common_vhost does not match
2013-08-07 11:57:12,455: [DEBUG] Format ncsa_extended matches
2013-08-07 11:57:12,455: [DEBUG] Format ncsa_extended is the best match
2013-08-07 11:57:13,158: [DEBUG] Error when connecting to Piwik:
2013-08-07 11:57:13,163: [DEBUG] Error when connecting to Piwik:
2013-08-07 11:57:13,243: [DEBUG] Error when connecting to Piwik:
2013-08-07 11:57:13,296: [DEBUG] Error when connecting to Piwik:
2420 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2420 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2013-08-07 11:57:15,232: [DEBUG] Error when connecting to Piwik:
2013-08-07 11:57:15,263: [DEBUG] Error when connecting to Piwik:
2013-08-07 11:57:15,335: [DEBUG] Error when connecting to Piwik:
2013-08-07 11:57:15,442: [DEBUG] Error when connecting to Piwik:
2420 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2420 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2013-08-07 11:57:17,292: [DEBUG] Error when connecting to Piwik:
Fatal error: None
You can restart the import of “/mnt/data/importlog/************/************_access.log.2013080510” from the point it failed by specifying --skip=7 on the command line.

Waiting for your response; thanks in advance!