High parts/partitions warnings shouldn’t be ignored. They can indicate a systemic problem with ClickHouse or your ingestion pipeline. Here’s how to diagnose it.

Check actual part counts

Start by verifying partition and part numbers across the cluster:

SELECT
    `table`,
    uniq(partition_id),
    count()
FROM clusterAllReplicas(default, system.parts)
WHERE database = 'my_database'
GROUP BY ALL
    ┌─table──────────────────┬─uniq(partition_id)─┬─count()─┐
 1. │ samples                │                  1 │     837 │
 2. │ hourly__agg            │                 83 │    6363 │
 3. │ daily__agg             │                 83 │    6065 │
 4. │ rt_events_src          │                  3 │   57206 │
 5. │ rt_stats_hourly__agg   │                  1 │   52850 │
    └────────────────────────┴────────────────────┴─────────┘

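For non-clustered setups, drop clusterAllReplicas() and query system.parts directly. Two things to keep in mind when reading the numbers: clusterAllReplicas() sums parts over every replica in the cluster (here the cluster is named default), and system.parts also lists inactive parts that have already been merged and are waiting for cleanup. Add AND active to the WHERE clause to count only live parts.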

In this example, the rt_ tables (real-time ingestion) stand out with more than 50,000 parts each. rt_stats_hourly__agg is the worst case: 52,850 parts in a single partition.
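
To see where the parts are concentrated, it can help to break the count down per host and partition and to look at active parts only. The query below is a sketch along those lines; it reuses the default cluster and my_database placeholders from above, and the column aliases are just illustrative names.

```sql
-- Active parts per host, table and partition, worst offenders first.
-- 'default' (cluster) and 'my_database' are the placeholders used in this article.
SELECT
    hostName() AS host,
    `table`,
    partition_id,
    count() AS active_parts,
    sum(rows) AS total_rows,
    formatReadableSize(sum(bytes_on_disk)) AS size_on_disk
FROM clusterAllReplicas(default, system.parts)
WHERE database = 'my_database' AND active
GROUP BY host, `table`, partition_id
ORDER BY active_parts DESC
LIMIT 20
```

Many small active parts in one partition usually point at insert frequency rather than at the partition key.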

Check replication queue

See if the replication queue is backed up:

SELECT count()
FROM clusterAllReplicas(default, system.replication_queue)
WHERE database = 'my_database'
   ┌─count()─┐
1. │   10296 │
   └─────────┘

Run this a few times over several minutes. The count should be decreasing. If it’s flat or growing, replicas are falling behind: queued merges and fetches can’t keep up with ingestion.
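
A bare count doesn’t say what is stuck. As a rough sketch, grouping the queue by table and entry type shows whether the backlog is mostly merges or fetches, and whether entries are failing repeatedly. The aliases are illustrative; the columns come from system.replication_queue.

```sql
-- Replication queue broken down by table and entry type (e.g. merges vs fetches).
SELECT
    `table`,
    type,
    count() AS entries,
    max(num_tries) AS max_tries,
    min(create_time) AS oldest_entry,
    -- one non-empty exception message per group, if any entry has failed
    anyIf(last_exception, last_exception != '') AS sample_exception
FROM clusterAllReplicas(default, system.replication_queue)
WHERE database = 'my_database'
GROUP BY `table`, type
ORDER BY entries DESC
```

A high max_tries together with a non-empty sample_exception suggests entries that are retrying and failing rather than simply waiting their turn.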

Check active merges

SELECT count()
FROM clusterAllReplicas(default, system.merges)
WHERE database = 'my_database'
   ┌─count()─┐
1. │      10 │
   └─────────┘

You want small numbers here - dozens, not hundreds. If merges are piling up, ClickHouse is struggling to consolidate parts.
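
If the count looks high, the individual rows in system.merges show which merges are slow. This is a minimal sketch with the same placeholders as above; the aliases are only for readability.

```sql
-- Longest-running merges and mutations, with progress and resource usage.
SELECT
    `table`,
    round(elapsed) AS elapsed_s,
    round(progress * 100, 1) AS progress_pct,
    num_parts,
    is_mutation,
    formatReadableSize(total_size_bytes_compressed) AS compressed_size,
    formatReadableSize(memory_usage) AS memory
FROM clusterAllReplicas(default, system.merges)
WHERE database = 'my_database'
ORDER BY elapsed DESC
LIMIT 20
```

A few large merges that have been running for a long time are normal; many merges with low progress and high elapsed time are more consistent with an I/O or CPU bottleneck.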

Check the logs

Look for serious errors in ClickHouse logs:

tail -n 100 -f /var/log/clickhouse-server/clickhouse-server.err.log

You’re looking for:

  • Memory errors (OOM, RSS limits)
  • Large stack traces
  • Repeated failures
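
If you’d rather stay in SQL, the system.errors table keeps per-error counters since server start. The sketch below assumes a server version recent enough to have the last_error_time and last_error_message columns; the one-hour window is an arbitrary choice.

```sql
-- Error counters since server start: memory and too-many-parts errors,
-- plus anything raised in the last hour.
SELECT
    name,
    code,
    value AS occurrences,
    last_error_time,
    last_error_message
FROM system.errors
WHERE name IN ('MEMORY_LIMIT_EXCEEDED', 'TOO_MANY_PARTS')
   OR last_error_time > now() - INTERVAL 1 HOUR
ORDER BY last_error_time DESC
LIMIT 20
```

This complements the log file rather than replacing it: stack traces and surrounding context are still only in the log.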

These warnings are normal and can be ignored:

<Warning> ... (ReplicatedMergeTreePartCheckThread): We have part 20251027_129034_129044_2 covering part 20251027_129040_129040_1, will not check

This just means a merge already happened and the old part is covered by a newer merged part.
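
A part name encodes the partition ID, the minimum and maximum block numbers, and the merge level, so 20251027_129034_129044_2 spans blocks 129034 to 129044 and therefore contains block 129040. To confirm this on a live server, a query like the following sketch lists the parts overlapping that block range. The values are taken from the sample warning above; the covered part may already have been cleaned up and not appear at all.

```sql
-- Parts in partition 20251027 that overlap blocks 129034..129044.
-- The covering part should be active; the covered one inactive or already removed.
SELECT
    name,
    min_block_number,
    max_block_number,
    level,
    active
FROM system.parts
WHERE database = 'my_database'
  AND partition_id = '20251027'
  AND min_block_number <= 129044
  AND max_block_number >= 129034
ORDER BY level DESC
```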

Common causes

If parts are accumulating:

  1. Ingestion too fast - you’re inserting faster than merges can consolidate
  2. Too many partitions - use a coarse partition key (by day or month), not by hour or minute
  3. Merge threads starved - check the background_pool_size setting
  4. Disk I/O bottleneck - merges are I/O heavy
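
To tell the first two causes apart, it can help to look at how fast new parts are created and how big they are. The sketch below assumes system.part_log is enabled on your server (it is configured in the server config and may be off); the one-hour window and the aliases are arbitrary choices.

```sql
-- New parts created per minute over the last hour, with average part size in rows.
-- Many new parts per minute with few rows each suggests inserts that are too small.
SELECT
    `table`,
    toStartOfMinute(event_time) AS minute,
    count() AS new_parts,
    round(avg(rows)) AS avg_rows_per_part
FROM system.part_log
WHERE database = 'my_database'
  AND event_type = 'NewPart'
  AND event_time > now() - INTERVAL 1 HOUR
GROUP BY `table`, minute
ORDER BY new_parts DESC
LIMIT 20;

-- Partition keys of the MergeTree-family tables, to spot overly fine partitioning.
SELECT name, engine, partition_key
FROM system.tables
WHERE database = 'my_database' AND engine LIKE '%MergeTree'
```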

Quick fixes

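Force a merge on a specific table. OPTIMIZE ... FINAL rewrites the table’s data and can be expensive on a large table, so run it off-peak and treat it as a stopgap rather than a routine: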

OPTIMIZE TABLE my_database.my_table FINAL
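
If only one partition is affected, a narrower option is to merge just that partition. In this sketch the partition ID '20251027' is a hypothetical value borrowed from the sample log line above; substitute one reported by system.parts.

```sql
-- Merge a single partition instead of the whole table.
-- '20251027' is a placeholder partition ID, not a value from your system.
OPTIMIZE TABLE my_database.my_table PARTITION ID '20251027' FINAL
```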

Check merge settings:

SELECT name, value
FROM system.settings
WHERE name LIKE '%merge%' OR name LIKE '%background%'
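
system.settings only covers query and session level settings. The part-count thresholds that trigger the warnings are MergeTree settings and live in system.merge_tree_settings. In recent ClickHouse versions, server-wide settings such as background_pool_size are exposed in system.server_settings; on older versions that table may not exist, so treat the second query as an assumption about your version.

```sql
-- MergeTree thresholds that delay or reject inserts when parts pile up.
SELECT name, value, changed
FROM system.merge_tree_settings
WHERE name IN (
    'parts_to_delay_insert',
    'parts_to_throw_insert',
    'max_parts_in_total',
    'old_parts_lifetime'
);

-- Server-level background pool sizes (assumes a version that has system.server_settings).
SELECT name, value, changed
FROM system.server_settings
WHERE name LIKE 'background%pool_size'
```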

If you’re consistently hitting high part counts, you likely need to either slow down ingestion, batch inserts more aggressively, or tune your merge settings.
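
When the client can’t easily batch on its own, one option is to let the server buffer small inserts with asynchronous inserts. The statement below is a sketch only: the ts and value columns are hypothetical, and wait_for_async_insert = 1 makes the client wait until the buffered data has been flushed.

```sql
-- Server-side batching of small inserts via asynchronous inserts.
-- my_table's columns (ts, value) are hypothetical placeholders.
INSERT INTO my_database.my_table (ts, value)
SETTINGS async_insert = 1, wait_for_async_insert = 1
VALUES (now(), 42)
```

Each flush of the buffer produces parts in one go instead of one part per small insert, which reduces the merge load.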