Rsyslog + nginx live log import silently fails

This looks like the same goal as "Unable to import logs with import_logs.py script", but not the same issue:

I’ve followed the rsyslog part of this:

After manually creating some folders and replacing /var/cache/nginx/access.socket with /dev/log in nginx.conf and 10-matomo.conf, the logs do appear in /var/tmp/nginx.tmp when I uncomment the debug line local0.* /var/tmp/nginx.tmp;matomo.

If I try to import them with matomo.sh, the script runs but imports nothing.

If I run import_logs.py directly, it works.

So I suspect something is not working in /usr/local/matomo/matomo.sh.

 cat /var/tmp/nginx.tmp | ./misc/log-analytics/import_logs.py --url=***** --token-auth=******  --enable-http-errors --enable-http-redirects --enable-static --enable-bots --recorders=4 --log-format-name=nginx_json -

finds nothing to import, but:

./misc/log-analytics/import_logs.py --url=***** --token-auth=******  --enable-http-errors --enable-http-redirects --enable-static --enable-bots --recorders=4 --log-format-name=nginx_json /var/tmp/nginx.tmp

works.
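
For context, many command-line tools treat a lone `-` argument as "read from standard input". The sketch below illustrates that convention only; `open_log_source` and `count_lines` are hypothetical helpers, not the actual code in import_logs.py. If both call styles end up on the same line-reading path, piping a file and passing its path should give identical results, which is what one would expect from the two commands above.

```python
import sys
from typing import TextIO


def open_log_source(path: str) -> TextIO:
    """Return a text stream for `path`, treating "-" as standard input.

    Hypothetical helper: it illustrates the usual "-" convention only.
    """
    if path == "-":
        return sys.stdin
    return open(path, "r", encoding="utf-8", errors="replace")


def count_lines(path: str) -> int:
    """Count the lines available from a file path or from stdin ("-")."""
    stream = open_log_source(path)
    try:
        return sum(1 for _ in stream)
    finally:
        # Never close stdin; only close files we opened ourselves.
        if stream is not sys.stdin:
            stream.close()
```

Running `count_lines` once with the file path and once with `-` while piping the same file is a quick way to check that both input routes deliver the same number of lines.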

Is there something to fix in the README.md documentation about the syntax of the rsyslog script /usr/local/matomo/matomo.sh:

echo "${@}" | /***MY_MATOMO_FOLDER***/misc/log-analytics/import_logs.py
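
To make it easier to see what that wrapper line feeds to the importer, here is a rough Python equivalent, offered as a sketch. It joins the arguments it receives with spaces, appends a newline (as `echo` normally does), and writes the result to the standard input of a child process. `pipe_args_to_command` and `ECHO_STDIN` are hypothetical names; the child process here is only a stand-in that echoes its input back, not import_logs.py.

```python
import subprocess
import sys
from typing import List


def pipe_args_to_command(args: List[str], command: List[str]) -> subprocess.CompletedProcess:
    """Join `args` with spaces, add a newline, and feed that to `command` on stdin.

    This mirrors `echo "${@}" | command` in a shell wrapper.
    """
    payload = " ".join(args) + "\n"
    return subprocess.run(
        command, input=payload, text=True, capture_output=True, check=False
    )


# Stand-in for the importer: a child Python process that echoes its stdin back.
ECHO_STDIN = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
```

Swapping `ECHO_STDIN` for the real importer command (with a trailing `-`) would be one way to reproduce the wrapper's behaviour outside rsyslog and inspect the exit code and error output.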

or is it something to fix in misc/log-analytics/import_logs.py?

or anything else I haven’t spotted?

PS: I posted this on GitHub 10 days ago, but perhaps this forum is a better place for it?

After reading this: https://github.com/matomo-org/matomo-log-analytics/issues/282#issuecomment-716676600
I’ve updated import_logs.py to the latest version from the matomo-log-analytics repository.
I’ve changed the first line from #!/usr/bin/python to #!/usr/bin/python3,
and it now behaves the same with both ways of calling it:

cat /var/tmp/nginx.tmp | ./misc/log-analytics/import_logs.py [...] -
./misc/log-analytics/import_logs.py [...] /var/tmp/nginx.tmp

But it doesn’t import anything:

Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "******/misc/log-analytics/import_logs.py", line 1849, in _run_bulk
    self._record_hits(hits)
  File "******/misc/log-analytics/import_logs.py", line 1995, in _record_hits
    'requests': [self._get_hit_args(hit) for hit in hits]
  File "******/misc/log-analytics/import_logs.py", line 1995, in <listcomp>
    'requests': [self._get_hit_args(hit) for hit in hits]
  File "******/misc/log-analytics/import_logs.py", line 1891, in _get_hit_args
    site_id, main_url = resolver.resolve(hit)
  File "******/misc/log-analytics/import_logs.py", line 1775, in resolve
    return self._resolve_by_host(hit)
  File "******/misc/log-analytics/import_logs.py", line 1758, in _resolve_by_host
    site_id = self._resolve(hit)
  File "******/misc/log-analytics/import_logs.py", line 1726, in _resolve
    res = self._get_site_id_from_hit_host(hit)
  File "******/misc/log-analytics/import_logs.py", line 1681, in _get_site_id_from_hit_host
    url=hit.host,
  File "******/misc/log-analytics/import_logs.py", line 1633, in call_api
    return self._call_wrapper(self._call_api, None, None, method, **kwargs)
  File "******/misc/log-analytics/import_logs.py", line 1585, in _call_wrapper
    response = func(*args, **kwargs)
  File "******/misc/log-analytics/import_logs.py", line 1574, in _call_api
    return json.loads(res)
    return json.loads(res)
  File "/usr/lib/python3.5/json/__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'

1365 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
[... Ctrl+C to stop it ...]
1365 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)

Logs import summary
-------------------

    0 requests imported successfully
    17 requests were downloads
    0 requests ignored:
        0 HTTP errors
        0 HTTP redirects
        0 invalid log lines
        0 filtered log lines
        0 requests did not match any known site
        0 requests did not match any --hostname
        0 requests done by bots, search engines...
        0 requests to static resources (css, js, images, ico, ttf...)
        0 requests to file downloads did not match any --download-extensions

Website import summary
----------------------

    0 requests imported to 0 sites
        0 sites already existed
        0 sites were created:

    0 distinct hostnames did not match any existing site:



Performance summary
-------------------

    Total time: 149 seconds
    Requests imported per second: 0.0 requests per second

Processing your log data
------------------------

    In order for your logs to be processed by Matomo, you may need to run the following command:
     ./console core:archive --force-all-websites --force-all-periods=315576000 --force-date-last-n=1000 --url='****'

Do you have any ideas?
Do you need any other logs to be able to help me?

I’ve just updated to Matomo 4 and I still have this problem. Does anybody use matomo-log-analytics?

Hi,

That rsyslog trick was contributed seven years ago, so I can’t promise that it works at all anymore. If you find out more about this, it would be great if you could contribute a fix for it. Maybe this uses a part of the code that nothing else uses, isn’t covered by tests, and broke during the Python 2 to Python 3 migration.

Updating matomo-log-analytics and Python to the latest stable version (3.9 now) looks to have fixed my problem
(after editing misc/log-analytics/import_logs.py so that it uses the latest Python version).
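
A possible explanation for why the newer interpreter helped, offered as an assumption rather than a confirmed diagnosis: the traceback above ends with `json.loads(res)` raising `TypeError: the JSON object must be str, not 'bytes'` on Python 3.5. Since Python 3.6, `json.loads` also accepts `bytes` and `bytearray`, so the same call no longer fails on a newer version. On an older interpreter, decoding the response first avoids the error. The sketch below shows the idea; `parse_api_response` is a hypothetical name, not a function from import_logs.py, and it assumes the response body is UTF-8.

```python
import json
from typing import Any, Union


def parse_api_response(res: Union[str, bytes, bytearray]) -> Any:
    """Parse a JSON API response that may arrive as bytes or as str."""
    if isinstance(res, (bytes, bytearray)):
        # Assumption: the response body is UTF-8 encoded.
        res = res.decode("utf-8")
    return json.loads(res)
```

Decoding explicitly keeps the behaviour the same across Python 3 versions instead of relying on what a particular `json.loads` accepts.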

I’ll wait a few days to see if it really works, and I’ll mark this as solved if it does (and perhaps I will open a pull request to update the documentation).

But my approach breaks the Matomo integrity check.

Perhaps the main Matomo release should upgrade the included matomo-log-analytics to the latest version?
And misc/log-analytics/import_logs.py should begin with

#!/usr/bin/python3

rather than:

#!/usr/bin/python

Hi,

The version included in Matomo 4 is the latest one, only missing these two unrelated changes that were merged afterwards:

Hm,

Do you know how common it is for a python3 binary to exist (or be a symlink)? I know it is a convention on Debian-like systems, but am not so familiar with other ones.
And personally I never use the shebang with python scripts, but rather call it with the python binary directly.

But it might be worth changing if it doesn’t break anything and helps a bit.
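
One way to check a given system before changing the shebang is to see which interpreter names resolve on the PATH. This is a minimal sketch using the standard library's `shutil.which`; `find_interpreters` is a hypothetical helper name, and the result only describes the machine it runs on.

```python
import shutil
from typing import Dict, Optional, Sequence


def find_interpreters(names: Sequence[str] = ("python", "python3")) -> Dict[str, Optional[str]]:
    """Map each interpreter name to its resolved path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_interpreters().items():
        print(f"{name}: {path or 'not found'}")
```

If `python3` resolves but `python` does not (or points to Python 2), a `#!/usr/bin/python` shebang would fail or pick the wrong interpreter on that machine.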