Tor exit nodes?


#1

In the visitors’ log, is there any plugin to mark users visiting the website via a Tor exit node? If not, would it be possible to implement this as a new feature in one of the next Piwik versions?

Some of my websites are regularly visited by people using Tor. However, I have no idea how to analyze Tor traffic using Piwik. According to the Tor documentation (see: tor - Tor’s source code, section “Query type 1 - General IP/Port”), it is perfectly possible to figure out whether a visitor is coming via Tor or not.

So, I’d like to know what you think about my suggestion.


(vipsoft) #2

That document is probably 5+ years old. The solution it proposes requires that you run a Tor server in order to get updates on the network topology.

Unless someone knows of a web service, it will be simpler (but still non-trivial) to use TorStatus’ published list of exit nodes (by IP address), and periodically refresh it.

Note: the list doesn’t specify ports (or port ranges) used by the exit node.
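
A minimal sketch of the list-based approach, assuming the published list has already been downloaded as plain text with one IP address per line (the actual format of the list is an assumption here, and the function names parse_exit_list, is_listed_exit_node and needs_refresh are purely illustrative):

```php
<?php
// Parse a newline-separated list of exit node IPs into a lookup table.
// ASSUMPTION: one IP per line; blank lines and lines starting with '#' are skipped.
function parse_exit_list(string $text): array
{
    $set = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        // Keep only syntactically valid IP addresses.
        if (filter_var($line, FILTER_VALIDATE_IP) !== false) {
            $set[$line] = true;
        }
    }
    return $set;
}

// Check whether a visitor IP appears in the parsed list.
function is_listed_exit_node(array $set, string $ip): bool
{
    return isset($set[$ip]);
}

// Decide whether a locally cached copy of the list is missing or too old.
function needs_refresh(string $cacheFile, int $maxAgeSeconds): bool
{
    return !is_file($cacheFile) || (time() - filemtime($cacheFile)) > $maxAgeSeconds;
}
```

Because the list carries no port information, a match only says that the address is an exit node, not that it allows exiting to your server’s port.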


#3

Not meant to sound disrespectful, but that is not a valid argument. What matters is not the age of a document but whether its content is (still) valid. In this case, it is.

Nor does the solution require a Tor server. The document clearly indicates (although I must admit that it does not go into detail) that every Tor exit node has a unique DNS name. You don’t need a Tor server for that, as all you need is one DNS query to any DNS server. All you need to query is the IP of:

[reversed visitor IP].[port].[reversed server IP].ip-port.exitlist.torproject.org

which exists for any running Tor exit node (or a recently running one if the node itself is no longer active but the TTL of the DNS zone file has not yet expired). Also if a new node comes up, it may sometimes take a short while until the DNS query will yield a result different from “NXDOMAIN”.

Therefore, the following (quickly coded) chunks of PHP code, based on section “Query type 1 - General IP/Port” of the document mentioned above, will reveal whether a visitor comes via a Tor exit node or not:

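// Returns true if the visitor's IP is a Tor exit node that allows
// exiting to this server's IP address and port.
function is_Tor_exitnode() {
  $query = ip_reverse($_SERVER['REMOTE_ADDR']) . "." . $_SERVER['SERVER_PORT'] . "."
         . ip_reverse($_SERVER['SERVER_ADDR']) . ".ip-port.exitlist.torproject.org";
  // gethostbyname() returns the unmodified hostname on failure (e.g. NXDOMAIN).
  return gethostbyname($query) == "127.0.0.2";
}

// Reverses the octets of a dotted IPv4 address, e.g. 1.2.3.4 becomes 4.3.2.1.
function ip_reverse($ip) {
  return implode(".", array_reverse(explode(".", $ip)));
}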

I am using something like this in other projects, and I dare say it works quite well. It will not yield a valid result all of the time (i.e. a few exit nodes are not detected, see above), but on the other hand it will definitely not be worse than finding out (better: “guessing”) the provider information when the AnonymizeIP plugin is active. Therefore, I wanted to hear opinions on whether this could be considered as an option for future Piwik versions, rather than have it brushed off with “that document is probably 5+ years old.”
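
The octet reversal above only makes sense for IPv4 addresses, so it may be worth building the query name separately from the lookup and rejecting anything else. A hedged sketch of that idea follows; the function name tor_exitlist_hostname is hypothetical, and passing the addresses as parameters instead of reading $_SERVER is a design choice made here to keep the function testable:

```php
<?php
// Build the DNS name to query for a given visitor/server pair.
// Returns null if either address is not a valid IPv4 address.
// ASSUMPTION: the name format is the one used in is_Tor_exitnode() above.
function tor_exitlist_hostname(string $visitorIp, int $serverPort, string $serverIp): ?string
{
    foreach ([$visitorIp, $serverIp] as $ip) {
        if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
            return null;
        }
    }

    // Reverse the octets of a dotted IPv4 address (1.2.3.4 becomes 4.3.2.1).
    $reverse = function (string $ip): string {
        return implode('.', array_reverse(explode('.', $ip)));
    };

    return $reverse($visitorIp) . '.' . $serverPort . '.'
         . $reverse($serverIp) . '.ip-port.exitlist.torproject.org';
}
```

The result can then be passed to gethostbyname() and compared with "127.0.0.2" exactly as before; a null result would simply be treated as “not a Tor exit node”.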


(vipsoft) #4

Sorry, I misread. I’ve created a ticket in trac.

Thanks for the proof of concept.


(Matthieu Aubry) #5

Thanks for the suggestion! I think we could set a custom variable Name=TOR,Value=$IP.
This could be done as a simple plugin, by hooking into tracking and doing the reverse lookup.
It can’t be done in core since it requires a new DNS request.
Please let’s continue the discussion in: Identify visits from Tor exit nodes · Issue #3284 · matomo-org/matomo · GitHub
Thanks :)