As I mentioned in my original post, I’m not a technical person, but I have read the FAQ and, to be honest, I do not see the issue of duplicates mentioned anywhere. I apologize for asking something that may be obvious, but does it mean that import_logs.py has a built-in mechanism to prevent duplicates?
A response from matthieu in a 2012 thread, “Importing apache logs as long-term strategy”,
suggests that “duplicates are not ignored, this is a missing feature”, but has it been fixed since? Further responses, even from 2015, imply that this is still an issue.
I haven’t tried it but the FAQ suggests that you “likely would import log files hourly or daily into Piwik” and shows the command to put in a cron job.
Yep, and it works. I tried it myself, no problem there. The issue I’m concerned with is that my logs are rotated on a monthly basis. If I run a cron job once a day or once an hour, it would import the same data multiple times, and I’m just not knowledgeable enough to figure out how to handle it.
In that case I’ll have to defer to someone else who may know.
Any updates on preventing duplicates in log analytics? Is setting up a pre-processor to split these appending log files into unique hourly / daily / etc. log files currently the best practice? It would be nice if Matomo could handle this built in.
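For what it's worth, the pre-processor idea above could be sketched roughly like this in Python. This is only a minimal sketch, not anything that ships with Matomo: the function name `read_new_lines` and the state file are made up for illustration. It remembers the byte offset reached on the previous run and returns only the lines appended since then, so each cron run would see every log line once. It assumes the log is only ever appended to, and it treats a file that has become smaller than the saved offset as freshly rotated, which is a heuristic that can miss a rotation if the new file has already grown past the old offset.

```python
import json
import os


def read_new_lines(log_path, state_path):
    """Return the complete lines appended to log_path since the last call.

    Hypothetical helper: the byte offset reached so far is kept in a small
    JSON file at state_path (an assumption of this sketch).
    """
    offset = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            offset = json.load(f).get("offset", 0)

    # A file smaller than the saved offset is assumed to have been rotated,
    # so reading starts again from the beginning.
    if os.path.getsize(log_path) < offset:
        offset = 0

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Consume only complete lines; a partially written last line is left
    # for the next run.
    end = data.rfind(b"\n") + 1
    new_lines = data[:end].decode("utf-8", errors="replace").splitlines()

    with open(state_path, "w") as f:
        json.dump({"offset": offset + end}, f)

    return new_lines
```

The returned lines could then be written to a temporary file and that file handed to import_logs.py instead of the full monthly log, so the importer never sees the same line twice.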