Lightweight delete in ClickHouse
Clickhouse now offers “lightweight deletes”. The feature is best described by developers:
The idea behind Lightweight Delete is that when a
DELETE FROM table ...
query is executed ClickHouse only saves a mask where each row is marked as either “existing” or as “deleted”. Those “deleted” rows become invisible for subsequent queries, but physically the rows are removed only later by subsequent merges. Writing this mask is usually much more lightweight than what is done byALTER table DELETE ...
query.
delete from [db.]table [on cluster cluster] where expr
In order to test it and see results, remember to run OPTIMIZE .. FINAL
. It
will actually remove the rows otherwise they will be removed at some point in
the future.
Alternatives - Mutations
If you feel brave (or you operate at smaller scale), you can still use mutations in order to remove data:
alter table sessionstats.impressions delete where customerid = '1002'
Alternatives - Dropping partitions
Remember that sometimes it might be easier to remove entire partition like so:
alter table table1 drop partition '...';
You can check partitions you have with:
select partition
from system.parts
where table = 'table1'
or something more fancy:
select table, partition, count()
from system.parts
where table in (
'table1',
'table2',
)
and partition in (
'20230518',
'20230517'
)
group by table, partition
order by table, partition;