Completely deleting data before a given date

Hello,

Unfortunately, I can't make sense of the FAQ. I found a few errors in a dataset imported from Google Analytics and now want to delete, for the website (id=4), the data that lies either within a date range or before a certain date. It should be removed completely so that I can start the re-import again.

What is the best way to go about this?

Thanks for any tips!

Does nobody have an idea? This must be doable somehow!?

Have you tried the GDPR tool?

Hello Tom, and thanks for the reply!

That is the part I didn't understand.
As far as I can tell, I can only delete individual segments of the data there, but not the complete dataset retroactively from date X?

Unfortunately, it doesn't find any records, even though some exist from the Google Analytics import.

Hello,

The Google Analytics import cannot import raw data, only the aggregated reports. That is why there is nothing to delete in the GDPR tool.

I don't know how to delete the reports.

@diosmosis Do you know an easy way to delete imported GA data for a date range so that it can be imported again?

I don’t think I’ve ever checked specifically for this, but if you schedule a re-import, the new import should overwrite the old data. If not, you’d have to delete the archive data and then schedule a re-import.

Hi @diosmosis,

to delete archive data, is this the correct command?

./console core:delete-logs-data --dates=2015-01-01,2015-01-31 --idsite=42

No, that deletes raw (log) data, not archive data. @Lukas do you know of a way to delete archive data through a command or does it have to be done via the database? I don’t think the archive purging command will do it.

@Lukas
No idea how to delete the date range?

@Lukas, @diosmosis
No ideas?

@Miches we’ve seen the patch doesn’t work, though it should. I’m looking into why it’s not working, when there’s a fix I’ll let everyone know.

Sorry @Miches that was for a different thread, if you want to delete archive data from the table you have to use SQL to delete the archive rows from the table for the site & date. Or you can just try re-importing the data you want to re-import for a day and see if it shows the new data or not. If not, then you have to delete the data via SQL.

@diosmosis Ah, ok!
Is this the correct SQL statement, and where can I set the site ID?

DELETE log_visit, log_link_visit_action, log_conversion 
FROM log_visit 
LEFT JOIN log_link_visit_action ON log_visit.idvisit = log_link_visit_action.idvisit 
LEFT JOIN log_action ON log_action.idaction = log_link_visit_action.idaction_url 
LEFT JOIN log_conversion ON log_visit.idvisit = log_conversion.idvisit 
WHERE config_resolution = 'unknown' 
AND visit_last_action_time >= '2019-05-01' 
AND visit_last_action_time <= '2019-05-16';

You have to delete them from the archive tables (the tables that contain archive_ in their name). The tables in your query are the log tables (they contain log_ in their name). Why don’t you try just re-importing first?

I tried that, but it doesn't work.

Thanks for checking, I’ll create an issue for that.

To delete the archive rows you should run SQL like:

DELETE FROM $table WHERE idsite = $idSite AND date1 = $startPeriodDate AND date2 = $endPeriodDate AND period = $period

$table is the archive table the archive is in (archive tables are segmented by month). $startPeriodDate/$endPeriodDate are the inclusive boundaries of the period. $period is the integer period value (1 for day, 2 for week, 3 for month, 4 for year, 5 for range). It should be enough to delete the day period and re-import.
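
As a minimal sketch of the template above, the following Python helper fills in the placeholders and returns a parameterized statement. The names `build_archive_delete` and `PERIOD_IDS` are made up for illustration; the period integers come from the explanation above, and the table name is the one that appears later in this thread. The `%s` placeholder style is an assumption about the MySQL driver in use.

```python
import re

# Integer period values as described above.
PERIOD_IDS = {"day": 1, "week": 2, "month": 3, "year": 4, "range": 5}


def build_archive_delete(table, idsite, date1, date2, period):
    """Return (sql, params) deleting one archive period for one site.

    Hypothetical helper: `table` is the monthly archive table,
    `date1`/`date2` are the inclusive 'YYYY-MM-DD' period boundaries.
    """
    # Table names cannot be bound as parameters, so only allow safe characters.
    if not re.fullmatch(r"\w+", table):
        raise ValueError(f"unexpected table name: {table!r}")
    sql = (
        f"DELETE FROM {table} "
        # %s placeholders assume a driver such as PyMySQL or mysqlclient.
        "WHERE idsite = %s AND date1 = %s AND date2 = %s AND period = %s"
    )
    return sql, (idsite, date1, date2, PERIOD_IDS[period])


sql, params = build_archive_delete(
    "matomo_286_ST_archive_numeric_2016_02", 4, "2016-02-19", "2016-02-19", "day"
)
print(sql)
print(params)
```

For a day archive, both boundaries are the same date, so `date1` and `date2` are passed the same value here.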

I would do the following:

  • backup your database or just the archive table in question just in case
  • look at the data currently and note some values that are different in GA
  • delete the archive data as stated above
  • check in the UI for those dates that there is no longer report data for the date
  • if there isn’t, re-import for the days
  • check the new reports and look for the data that is in GA
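
The delete-and-verify part of these steps can be rehearsed on a throwaway copy before touching the real database. The sketch below uses an in-memory SQLite table as a stand-in for one archive table. The unprefixed table name, the sample rows, and the simplified column set are all invented for illustration; a real archive table lives in MySQL and has more columns.

```python
import sqlite3

# In-memory stand-in for one monthly archive table (simplified, hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE archive_numeric_2016_02 "
    "(idsite INTEGER, date1 TEXT, date2 TEXT, period INTEGER, name TEXT, value REAL)"
)
rows = [
    (4, "2016-02-18", "2016-02-18", 1, "nb_visits", 10),
    (4, "2016-02-19", "2016-02-19", 1, "nb_visits", 12),
    (4, "2016-02-20", "2016-02-20", 1, "nb_visits", 7),
    (5, "2016-02-19", "2016-02-19", 1, "nb_visits", 3),  # a different site
]
conn.executemany("INSERT INTO archive_numeric_2016_02 VALUES (?, ?, ?, ?, ?, ?)", rows)

where = "idsite = ? AND date1 = ? AND date2 = ? AND period = ?"
target = (4, "2016-02-19", "2016-02-19", 1)  # site 4, one day archive

count_sql = f"SELECT COUNT(*) FROM archive_numeric_2016_02 WHERE {where}"
before = conn.execute(count_sql, target).fetchone()[0]
deleted = conn.execute(
    f"DELETE FROM archive_numeric_2016_02 WHERE {where}", target
).rowcount
after = conn.execute(count_sql, target).fetchone()[0]
remaining = conn.execute("SELECT COUNT(*) FROM archive_numeric_2016_02").fetchone()[0]

print(before, deleted, after, remaining)
```

Counting with the same WHERE clause before deleting shows how many rows the statement will hit, and the total afterwards confirms that other days and other sites were left alone.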

@diosmosis Thanks for reply!

Is this the correct syntax?

DELETE FROM matomo_286_ST_archive_numeric_2016_02  WHERE idsite = 4 AND date1 = 2016-02-19 AND date2 = 2016-02-29 AND period = 1

You’ll need quotes around the string values, but each archive is for one day. If you want to delete everything between 02-19 & 02-29, use:

DELETE FROM matomo_286_ST_archive_numeric_2016_02 WHERE idsite = 4 AND date1 >= '2016-02-19' AND date1 <= '2016-02-29' AND period = 1
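
Because archive tables are split by month, a range that crosses a month boundary needs one statement per monthly table. Here is a rough sketch that generates those statements. The helper name `daily_archive_deletes` is hypothetical, and the prefix `matomo_286_ST_` is taken from the statement above. Covering the matching `archive_blob_*` tables alongside `archive_numeric_*` is an assumption about the schema and should be checked against the actual database.

```python
from datetime import date


def daily_archive_deletes(prefix, idsite, start, end, kinds=("numeric", "blob")):
    """Yield (sql, params) pairs covering day archives from start to end inclusive.

    Hypothetical helper; `prefix` is trusted input and is not validated here.
    """
    if start > end:
        raise ValueError("start must not be after end")
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        for kind in kinds:
            table = f"{prefix}archive_{kind}_{year:04d}_{month:02d}"
            sql = (
                f"DELETE FROM {table} "
                # period = 1 restricts the delete to day archives.
                "WHERE idsite = %s AND date1 >= %s AND date1 <= %s AND period = 1"
            )
            yield sql, (idsite, start.isoformat(), end.isoformat())
        # Advance to the next month, rolling over the year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


statements = list(
    daily_archive_deletes("matomo_286_ST_", 4, date(2016, 2, 19), date(2016, 3, 2))
)
for sql, params in statements:
    print(sql, params)
```

Each generated statement carries the full date range, so running it against a monthly table only removes the days of that month that fall inside the range.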