Log_importer and Pure-ftpd CLF log file import problem


I’m trying to use log-importer (import_logs.py) to import pure-ftpd log files. The logs are in CLF format; here’s a sample line:

a.b.c.d - ftp [10/Aug/2017:12:20:44 +0200] "GET /some/path/to/some/file" 200 789201

According to pure-ftpd, this is the format best suited to web log analyzers, and I was previously using this same format with a different log analyzer.

I’m using the parameter --log-format-name=common (it seems the most plausible option to me), but I’m getting error messages such as this:

2017-08-10 12:34:07,028: [DEBUG] Invalid line detected (line did not match): a.b.c.d - ftp [10/Aug/2017:12:20:44 +0200] "GET /some/path/to/some/file" 200 789201

Apart from the missing Referer and User-Agent information, I can’t see why this line wouldn’t match.

Any help on how to process this line format?

Thanks in advance!


Edit: Forgot to say… I’m using Piwik 3.0.4 and Python 2.7 for the importer script.

Reply to myself…

Many months after my original post, I finally had the opportunity to look into this.

Logs produced by pure-ftpd in CLF (Common Logfile Format) lack the protocol version in the request field (as in “GET /somepath HTTP/1.0”), which is why the lines did not match.
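
To illustrate why the missing protocol matters, here is a minimal Python sketch. The pattern is my own simplified stand-in, not the regex that import_logs.py actually uses; it only assumes that the request field is expected to hold a method, a path and a protocol token. The name WITH_PROTOCOL is made up for this example.

```python
import re

# Illustrative pattern only (an assumption, NOT copied from import_logs.py):
# the quoted request field must contain method, path AND a protocol token.
WITH_PROTOCOL = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<date>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>.*?) (?P<protocol>\S+)" '
    r'(?P<status>\d+) (?P<length>\S+)'
)

# The sample line as pure-ftpd writes it (no protocol in the request field).
pureftpd_line = ('a.b.c.d - ftp [10/Aug/2017:12:20:44 +0200] '
                 '"GET /some/path/to/some/file" 200 789201')

# The same line with a fake protocol appended to the request field.
patched_line = ('a.b.c.d - ftp [10/Aug/2017:12:20:44 +0200] '
                '"GET /some/path/to/some/file FTP/1.0" 200 789201')

print(WITH_PROTOCOL.match(pureftpd_line) is not None)  # original line
print(WITH_PROTOCOL.match(patched_line) is not None)   # patched line
```

With this stand-in pattern, the original line is rejected because nothing between the path and the closing quote can serve as the protocol token, while the patched line is accepted.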

I have modified my logs to add the fake protocol “FTP/1.0”. A simple (and fast) sed one-liner does the job:

sed -i -e "s,\" 200, FTP/1.0\" 200,g" /var/log/pureftpd.log.1

I know this might not be the most elegant solution; ideally, either pure-ftpd would add the protocol to its log lines, or log-importer would accept lines where the protocol version is missing. In any case, this solves my problem.
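
The sed line above only rewrites entries with status 200 and edits the log in place. As a rough alternative, here is a minimal Python sketch that appends the fake protocol for any three-digit status and leaves lines alone if they already carry a protocol. The function name add_fake_protocol and both patterns are hypothetical, and the sketch assumes every line ends with the quoted request, a status code and a size, as in my sample.

```python
import re

# Assumption: each pure-ftpd CLF line ends with: "METHOD /path" STATUS SIZE
TAIL = re.compile(r'" (\d{3}) (\S+)$')

# A request field that already ends in something like HTTP/1.0 or FTP/1.0.
PROTOCOL = re.compile(r' [A-Z]+/\d+(\.\d+)?" \d{3} \S+$')


def add_fake_protocol(line, protocol="FTP/1.0"):
    """Return the line with a fake protocol inserted before the closing quote.

    Lines that already have a protocol, or that do not end in the expected
    '" STATUS SIZE' layout, are returned unchanged. The line must not carry
    a trailing newline.
    """
    if PROTOCOL.search(line) or not TAIL.search(line):
        return line
    # A function is used as the replacement so the protocol string is
    # inserted literally, without regex escape processing.
    return TAIL.sub(
        lambda m: ' %s" %s %s' % (protocol, m.group(1), m.group(2)),
        line,
        count=1,
    )


sample = ('a.b.c.d - ftp [10/Aug/2017:12:20:44 +0200] '
          '"GET /some/path/to/some/file" 200 789201')
print(add_fake_protocol(sample))
```

Running each line of the log through this function and writing the result to a copy keeps the original file untouched, and running it twice on the same data does no harm, since lines that already have a protocol pass through unchanged.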
