Form Analytics doesn't cleanup its old form_field/form_page log data

I’m using Matomo (3.14.0) with Form Analytics (3.1.28). Both “delete old raw data” and “delete old aggregated report data” policies are in place (with core:archive cron job). Still, in practice, I see that Form Analytics purges records from the log_form db table only, leaving log_form_field and log_form_page tables filled with all the historical data from the beginning of time.

Here is my table size summary:

MySQL [matomo-prod]> SELECT TABLE_NAME AS `Table`, ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024) AS `Size (MB)` FROM information_schema.TABLES WHERE TABLE_SCHEMA = "matomo-prod" ORDER BY (DATA_LENGTH + INDEX_LENGTH) DESC limit 11;
+------------------------------+-----------+
| Table                        | Size (MB) |
+------------------------------+-----------+
| matomo_log_form_field        |      9185 |
| matomo_log_form_page         |      1405 |
| matomo_archive_blob_2020_06  |      1195 |
| matomo_archive_blob_2020_08  |      1101 |
| matomo_archive_blob_2020_05  |      1093 |
| matomo_archive_blob_2020_07  |      1039 |
| matomo_log_action            |       704 |
| matomo_log_visit             |       641 |
| matomo_log_link_visit_action |       363 |
| matomo_archive_blob_2020_09  |       200 |
| matomo_log_form              |       116 |
+------------------------------+-----------+

I can confirm that old records are correctly purged from log_form but not from log_form_page/log_form_field tables, matching them by idvisit/idlogform. How can I clean that old data and prevent its accumulation in the future?

Good afternoon.
We faced the same problem. The lack of the ability to automatically clear such data in the database greatly upsets us and forces us to perform additional manual actions to monitor free space and directly clean the data. For what reason is this data not cleared?

Thank you for the report! we have investigated and can confirm this is an issue. We’ve created a bug report in https://github.com/matomo-org/matomo/issues/16529 and you can follow progress here. It’s not currently a priority to fix this, but we’ll get to it next year.

Hi, Matthieu, does Form analytics work with apps? Or it could be used only on websites? Like Heatmaps

Hi there,

Form analytics will not work on native apps. For a list of plugins that are not compatible with native apps please check the FAQ here: https://matomo.org/faq/plugins/faq_32898/

Albeit there is no fix available as of now, I managed to write these two stored procedures to cleanup log_form_field / log_form_page tables. The deletion is performed gracefully by chunks not to disrupt a running Matomo website. I’ve used it to wipe 100M accumulated “dead” records successfully on a live production instance. One can also set up a scheduler event to trigger the cleanup daily. Here it is:

DROP PROCEDURE IF EXISTS matomo_cleanup_lff;
DELIMITER $$
CREATE PROCEDURE matomo_cleanup_lff()
BEGIN
REPEAT
DO SLEEP(1);
DELETE lff1 FROM matomo_log_form_field AS lff1
    JOIN
    (SELECT 
        lff.idlogform, lff.field_name, lff.idformview
    FROM
        matomo_log_form_field lff
    LEFT JOIN matomo_log_form lf ON lff.idlogform = lf.idlogform

    WHERE
        lf.idlogform IS NULL
    LIMIT 20000) AS lff2 ON lff1.idlogform = lff2.idlogform
        AND lff1.field_name = lff2.field_name
        AND lff1.idformview = lff2.idformview;
UNTIL ROW_COUNT() = 0 END REPEAT;
END$$
DELIMITER ;

DROP PROCEDURE IF EXISTS matomo_cleanup_lfp;
DELIMITER $$
CREATE PROCEDURE matomo_cleanup_lfp()
BEGIN
REPEAT
DO SLEEP(1);
DELETE lfp1 FROM matomo_log_form_page AS lfp1
    JOIN
    (SELECT 
        lfp.idlogformpage
    FROM
        matomo_log_form_page lfp
    LEFT JOIN matomo_log_form lf ON lfp.idlogform = lf.idlogform

    WHERE
        lf.idlogform IS NULL
    LIMIT 20000) AS lfp2 ON lfp1.idlogformpage = lfp2.idlogformpage;
  UNTIL ROW_COUNT() = 0 END REPEAT;
END$$
DELIMITER ;

DROP EVENT IF EXISTS matomo_form_cleanup;
DELIMITER $$
CREATE EVENT matomo_form_cleanup
ON SCHEDULE EVERY 1 DAY STARTS CURDATE() + INTERVAL 2 HOUR
DO BEGIN
  CALL matomo_cleanup_lff();
  CALL matomo_cleanup_lfp();
END$$
DELIMITER ;