Archiving vs. zipping

Hi,

I’m not sure I’ve understood the archiving concept correctly:

By default, every time someone opens a statistics view, Piwik saves the data into the database tables xxx_archive_numeric_yyyy_mm and xxx_archive_blob_yyyy_mm.

You can force Piwik to save this data at regular intervals by running a cron job.

The data in xxx_archive_numeric_yyyy_mm and xxx_archive_blob_yyyy_mm, is it zipped or not?

If yes, why is there so much data?
(e.g.:
export 2010_03.sql.gz (numeric and blob) via phpmyadmin -> 2.2 MB
unzip 2010_03.sql.gz to 2010_03.sql -> 34.6 MB
Piwik -> Settings -> database usage -> archive_numeric_2010_03 = 8Mb, archive_blob_2010_03 = 17.8 Mb)
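To put those figures side by side, here is a minimal Python sketch of the arithmetic. The numbers are the ones quoted above; the function name is only illustrative and has nothing to do with Piwik itself.

```python
def compression_ratio(compressed_mb, uncompressed_mb):
    """Return how many times larger the uncompressed data is."""
    return uncompressed_mb / compressed_mb

# Figures from the phpMyAdmin export quoted above
dump_gz_mb = 2.2    # 2010_03.sql.gz
dump_sql_mb = 34.6  # 2010_03.sql after unzipping

# Figures from Piwik -> Settings -> database usage
numeric_mb = 8.0
blob_mb = 17.8

ratio = compression_ratio(dump_gz_mb, dump_sql_mb)
tables_total_mb = numeric_mb + blob_mb

print("gzip shrinks the SQL dump by a factor of about %.1f" % ratio)
print("the two archive tables together use about %.1f MB" % tables_total_mb)
```

The uncompressed dump is larger than the table sizes Piwik reports, which is not surprising: an SQL dump is a text rendering of the rows, not the storage format itself.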

BTW: Is it really Mb(it), not MB? My web hosting dashboard shows usage of xx MB(yte)…

And if not zipped, how can I force Piwik to do this?

Kind regards from Germany
Michael

Hi,

OK, after digging through a lot of tables and columns, there’s a little light:

Archiving doesn’t mean Archiving (knickknack, knowhattamean, knickknack), but Preparation for a faster dashboard view…

There’s no zipping, saving, whatever. Alas!

So it seems Piwik urgently needs a “tidy up” button or automated “clear archive tables” functionality at the end of a month.

My second wish (if I have another…) would be real archiving functionality for the log files, which are the basis for the archive tables created on demand. It could be something like automatically saving all entries of the past month.

Seriously: Piwik is a wonderful piece of software! But database management must not result in buying more expensive web hosting packages because of the need for a lot more database space!

Kind regards
Michael

P.S.: This seems to be more of a feature-related post, so an admin may want to move it to Piwik Feature Suggestions.

There is a ticket for this feature request: http://dev.piwik.org/trac/ticket/53

And yes, reports are zipped before they are saved in the DB. However, there is a lot of data to store, and it is duplicated to some extent for days, weeks, months and the year. Note that DB size does not impact performance in this case, but I agree that it might mean buying extra DB space in a shared hosting environment for those using Piwik with non-trivial websites.
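To illustrate why the tables can still grow even though each report is compressed, here is a rough Python sketch. It is not Piwik's code (Piwik is written in PHP). JSON plus zlib stands in for whatever serialization and compression Piwik actually uses, and the report contents, period names and function names are all made up for the example.

```python
import json
import zlib

def pack_report(report):
    """Serialize a report and compress it, as a stand-in for a stored blob."""
    raw = json.dumps(report).encode("utf-8")
    return zlib.compress(raw)

def unpack_report(blob):
    """Reverse pack_report: decompress and deserialize."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))

# Hypothetical report: per-page counters for 500 pages
report = {"/page/%d" % i: {"visits": i, "actions": 2 * i} for i in range(500)}

raw_size = len(json.dumps(report).encode("utf-8"))
blob = pack_report(report)

# Assumption: one blob is stored per period type, so similar data is kept
# several times even though each copy is compressed.
periods = ["day", "week", "month", "year"]
stored = {period: pack_report(report) for period in periods}
total_stored = sum(len(b) for b in stored.values())

print("raw report: %d bytes, compressed: %d bytes" % (raw_size, len(blob)))
print("stored for %d periods: %d bytes" % (len(periods), total_stored))
```

In a real installation the week, month and year reports are aggregates rather than exact copies, so the duplication is only partial, but the effect is the same: compression reduces each blob, while the number of stored blobs keeps growing.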