Public Stats API

Admittedly, after reading a little deeper into the issues Allow Customization of Map Widget,
Wordpress: importing piwik data,
and Expose or return geolocation result,
I have given the common denominator of these issues some more thought.

I am aware that the following suggestion is a big effort, but I’d like to offer my view, hoping it may be perceived as constructive. I’ll try to be brief.

PIWIK is great at collecting all kinds of data and turning that into reports.
As I gather, the “data in -> reports out” flow is somewhat unidirectional, based on the assumption that the tracked visitors have no interest in the statistics.
This is generally true as far as most metrics are concerned, but there are quite a lot of cases in which one would like to expose certain statistics to the end users.
Let me state some:

  • number of page or content views on user-contributed content sites
  • a public map of online users
  • public media analytics stats on user-contributed content sites (media statistics are SoundCloud’s primary incentive for paid accounts… people seem to love stats)
  • even though it may not be displayed as such, the PIWIK geolocation result would also be a useful piece of data right on the visitor’s end of things.

PIWIK can obviously record & generate all the data, but there seems to be no way to access it conveniently & quickly if all I want is to retrieve one video’s view count.
I am aware that this example in particular would be tricky (and time-consuming if there were no pre-processed report on content views, so there might be a need to configure individual pre-processed reports which could then be accessed via the API…).
What I am trying to point out is:
There is data which could be deliberately set to public and made available through an API, giving PIWIK a new dimension of usefulness.

As things are right now, I have to double-track each video view, once in PIWIK, and once in my own database, just to provide the counter.
PIWIK is definitely better than me at weeding out bot hits, handling many inserts (and 1001 other things).
I have seriously considered purchasing the PIWIK media analytics plug-in, but if the gathered stats cannot be offered to the creative users, then it is of much less interest to me.
Since PIWIK doesn’t share its geoIP detection result, I’m running a second detection of my own for the details I need to provide the site. These duplicate efforts could be avoided with a public PIWIK API module.
PIWIK has custom variables going IN; I’d love to be able to configure custom variables being returned, too :)

Depending on the choice of data to be exposed, one might not even need any security / access key tools (the view stats are going public anyway, why worry?), but applying some general restrictions (like letting only certain domains request the API, or expecting one key pair used for all public requests) shouldn’t be too hard.

I admit, I’m encouraging you to develop something that will eventually let PIWIK take care of all the statistics aspects I’d otherwise have to deal with myself. But I do believe there might be more lazy folks like me on this planet who would appreciate that.
You guys rule at stats… please keep going & thanks for the effort :)


I like your idea, but I think this is already possible today.

An example:

For whatever reason (probably a good search engine ranking for a popular error) this forum page is by far the most popular:
Now Discourse (the software behind this forum) shows the number of visitors on this page, but if it didn’t, it would be possible to use Piwik data for this.

We could use the reporting API and pass it the URL:

and Piwik will respond with the number of visitors this month and some other useful stats:

    "label": "/261",
    "nb_visits": 414,
    "nb_hits": 449,
    "sum_time_spent": 17133,
    "nb_hits_with_time_generation": 449,
    "min_time_generation": "0.002",
    "max_time_generation": "28.721",
    "entry_nb_visits": 412,
    "entry_nb_actions": 456,
    "entry_sum_visit_length": 17173,
    "entry_bounce_count": 373,
    "exit_nb_visits": 406,
    "sum_daily_nb_uniq_visitors": 397,
    "sum_daily_entry_nb_uniq_visitors": 395,
    "sum_daily_exit_nb_uniq_visitors": 389,
    "avg_time_on_page": 38,
    "bounce_rate": "91%",
    "exit_rate": "98%",
    "avg_time_generation": 0.872,
    "url": ""

So to sum up, on your website you could, e.g. on every request:

  • fetch the visitor count
  • cache it aggressively so that a request to Piwik is only made every X hours
  • display it on your website when the page is generated

A bonus of doing everything server-side (besides lower resource usage) is that you don’t have to worry about data privacy, as your server will only forward the data you want to make public.
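
The fetch, cache and display steps above could be sketched roughly like this in Python. This is only a minimal sketch under assumptions: the Piwik address, site ID and token are placeholders, the cache is a plain in-memory dict, and the helper names (`build_request_url`, `CachedVisitCounter`) are made up for illustration. It assumes the Reporting API answers `Actions.getPageUrl` with a JSON list whose first row carries `nb_visits`, as in the response shown above.

```python
import json
import time
import urllib.parse
import urllib.request

# Placeholders (assumptions): adjust to your own installation.
PIWIK_URL = "https://piwik.example.org/index.php"
TOKEN_AUTH = "YOUR_TOKEN_HERE"
ID_SITE = 1


def build_request_url(page_url, period="month", date="today"):
    """Build a Reporting API request for the stats of one page URL."""
    params = {
        "module": "API",
        "method": "Actions.getPageUrl",
        "pageUrl": page_url,
        "idSite": ID_SITE,
        "period": period,
        "date": date,
        "format": "JSON",
        "token_auth": TOKEN_AUTH,
    }
    return PIWIK_URL + "?" + urllib.parse.urlencode(params)


def fetch_rows(page_url):
    """Ask Piwik for the report rows of one page (network call)."""
    with urllib.request.urlopen(build_request_url(page_url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


class CachedVisitCounter:
    """Return nb_visits per page, asking Piwik at most once per TTL."""

    def __init__(self, fetch=fetch_rows, ttl_seconds=6 * 3600, clock=time.time):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # page_url -> (timestamp, visit count)

    def visits(self, page_url):
        now = self._clock()
        cached = self._cache.get(page_url)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        rows = self._fetch(page_url)
        # Only the single number is kept, so nothing else can leak to visitors.
        if isinstance(rows, list) and rows:
            count = int(rows[0].get("nb_visits", 0))
        else:
            count = 0
        self._cache[page_url] = (now, count)
        return count
```

On page generation one would then call something like `counter.visits(url_of_this_page)` and print the number into the template. A real site would probably want a shared cache (file, database, memcached) instead of a per-process dict.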

Of course one could write a Piwik plugin which helps with this and maybe preprocesses the data somehow, but the basic idea should be implementable with a bit of work.

Hey Lukas,
I spent some time looking into the suggested solution for the simple visitor counter,
and I do agree that it is one way of getting there if all you want to do is fetch the visits for one fixed URI.

What I was trying to point at is a little more difficult, since my counter is not bound to the page names (content appears on newsfeeds and content pages). So I went and looked into the ContentTracking API, and have not found a way to retrieve the impressions per content ID in the API documentation (Reporting API Reference, Piwik Developer Docs v3); all I can retrieve seems to be a full list of all tracked contents.
Maybe that could be worked around by setting additional filters on the request, but if these only filter the returned data, then the database load would be huge for retrieving a single content’s view count.

The second issue I noticed is that I’d have to use time ranges if I wanted to display the all-time view count (instead of the “current month” approach you suggested), and since some of the tracked content dates back to 2011, that range is quite, er, large.
I tested API calls using the Actions.getPageUrl method with date ranges and received quite a lot of completely empty results once the date range got too large. That might be due to older PIWIK data being somewhat incompatible with the API, or to some reports having been lost when moving the installation a few years ago, so that might be my fault.
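
For reference, this is roughly how such range requests can be put together, sketched in Python. It is a sketch under assumptions: it uses the Reporting API's `period=range` with a `date=start,end` pair, the helper names are made up, and splitting the span into yearly chunks and adding up `nb_hits` is only a workaround idea for the empty results, not something I have confirmed fixes them. Adding up only makes sense for plain hit counts; unique visitor numbers cannot be summed this way.

```python
from datetime import date


def yearly_ranges(first_day, last_day):
    """Split [first_day, last_day] into chunks that never cross a year boundary."""
    ranges = []
    start = first_day
    while start <= last_day:
        end = min(date(start.year, 12, 31), last_day)
        ranges.append((start, end))
        start = date(start.year + 1, 1, 1)
    return ranges


def range_params(page_url, start, end, id_site=1):
    """Query parameters for one Actions.getPageUrl request over a date range."""
    return {
        "module": "API",
        "method": "Actions.getPageUrl",
        "pageUrl": page_url,
        "idSite": id_site,  # placeholder site ID (assumption)
        "period": "range",
        "date": start.isoformat() + "," + end.isoformat(),
        "format": "JSON",
    }


def sum_hits(rows_per_chunk):
    """Add up nb_hits from the first row of each chunk's response."""
    total = 0
    for rows in rows_per_chunk:
        # Empty or error responses simply contribute nothing.
        if isinstance(rows, list) and rows:
            total += int(rows[0].get("nb_hits", 0))
    return total
```

Each parameter dict would be URL-encoded onto the installation's `index.php` (plus a `token_auth` if the report is not public), and the decoded JSON responses passed to `sum_hits`.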

Bottom line: this doesn’t really feel convenient at this time.
The three offered APIs for Tracking, Reporting and Live, combined creatively, may cover my needs,
but a dedicated PublicContentStats API would make my life a lot easier.

To recap the demand:
I’d like to

  • Track content impressions, adding a custom parameter indicating the content’s author (that should be possible right now as far as I know)
  • Retrieve a single number of per-content views by content ID, ideally from a pre-processed DB that updates frequently (think 5 minutes)
  • Be able to present content authors with personalized reports about the number of impressions of their individual content. This may be spiced up with Media Data and any other kinds of metrics eventually, but a “total lifetime impressions per author” value would be a first starting point.
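
To make the recap a bit more concrete, here is roughly what I end up doing on my side today, sketched in Python. Everything here is an assumption for illustration: that the full content list comes back as rows with a `label` and an `nb_impressions` field, and that the author is encoded into the content name as `author/content` (my own hypothetical naming convention, not a PIWIK feature). Note that it still needs the full list of tracked contents, which is exactly the load issue described above.

```python
from collections import defaultdict


def impressions_for(rows, content_name):
    """Pick the impression count of one content out of the full list."""
    for row in rows:
        if row.get("label") == content_name:
            return int(row.get("nb_impressions", 0))
    return 0


def impressions_by_author(rows, separator="/"):
    """Total impressions per author, assuming labels like 'author/content'."""
    totals = defaultdict(int)
    for row in rows:
        author, sep, _ = row.get("label", "").partition(separator)
        if not sep:
            # Label does not follow the assumed convention (e.g. a banner).
            continue
        totals[author] += int(row.get("nb_impressions", 0))
    return dict(totals)
```

A dedicated public stats API would ideally make both helpers unnecessary by returning the single number directly.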

To dream that through all the way:
I imagine it would be convenient if the setup procedure looked like this:
Instead of reusing PIWIK’s existing ContentTracking functionality,
it might be wise to introduce a second ContentCounting segment to branch off the different use case right at the doorstep (no need to mix up banner impressions and other data that will not be used for generating public stats).
On the PIWIK interface, I’d expect to have a newly added segment, let’s call it “Public Stats” for now.
There it should be possible to configure which data is made available to the general public (one could include the live map widget’s settings here to get away from the full access issue).
Ideally, I’d be able to pre-configure which data to return in preconfigured responses. The configuration would be somewhat similar to setting up a segment in PIWIK, and might offer the auto-generated request to call once it’s done.
For example, I might want to preconfigure:
Report 1: Returns all-time content views by content ID
A simplified request could then look like:

Report 2: Returns current month’s content views per user
Simplified Request:

Report 3: Returns content impression evolution over the past 30 days for one content, as a graph
Simplified Request:

etc etc.

The beauty of such a setup should (hopefully) be obvious, and one might add configuration for pre-processing and caching of these predictable and publicly available report requests to make sure the responses come as quickly as possible.
As you may imagine, creatives & content creators love looking at stats, and offering PIWIK’s whole graph display, pivoting etc. etc. magic to individual users without giving them full access to PIWIK data is something that would IMO add a lot of value to PIWIK.

Establishing such a PublicContentStats module as a separate area might attract some interest once it is documented & presented. Leaving it at “yes, you can do that by combining 23 existing features and coding a custom module” is a solution, but doesn’t really qualify as an argument for why PIWIK is the most recommendable stats tool out there :)

I just felt like leaving the idea here, as my humble attempt to contribute to this great project.