Managed ClickHouse scaling challenges
Product manager at Firebolt on scaling challenges and ACID compliance in OLAP databases
The real issue is that managed ClickHouse does not make capacity planning disappear; it mostly changes who clicks the buttons. ClickHouse Cloud separates compute and storage in its Scale tier and supports automatic vertical scaling, but horizontal scaling is still manual. Teams therefore still need to size clusters for peak query bursts, ingestion spikes, and noisy-neighbor risk, instead of letting the system smoothly add and remove capacity on its own.
-
In practice, the pain shows up as overprovisioning. A team may need enough compute only for the one hour when a dashboard refresh storm or backfill job hits, but if scaling is coarse or manual they keep that larger footprint running far longer than needed, which turns cloud convenience into an ongoing compute tax.
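To make the compute tax concrete, here is a back-of-the-envelope sketch with illustrative numbers (node counts and the per-node-hour price are assumptions, not real ClickHouse Cloud quotes): a cluster sized for a one-hour peak that runs at peak size all day, versus capacity that tracked demand exactly.

```python
# Hypothetical numbers to illustrate the "compute tax" of coarse scaling.
PEAK_NODES = 16              # nodes needed during the one-hour refresh storm (assumed)
BASELINE_NODES = 4           # nodes needed the other 23 hours (assumed)
PRICE_PER_NODE_HOUR = 2.50   # illustrative price, not a real quote

# Cost when the peak-sized cluster runs all 24 hours (coarse/manual scaling)
static_cost = PEAK_NODES * 24 * PRICE_PER_NODE_HOUR

# Cost if capacity followed demand exactly (ideal elastic scaling)
elastic_cost = (PEAK_NODES * 1 + BASELINE_NODES * 23) * PRICE_PER_NODE_HOUR

overprovision_tax = static_cost - elastic_cost
print(f"static: ${static_cost:.2f}, elastic: ${elastic_cost:.2f}, "
      f"tax: ${overprovision_tax:.2f} ({overprovision_tax / static_cost:.0%})")
```

With these made-up inputs, the peak-sized cluster costs roughly 3.5x the demand-tracking ideal, and the gap grows with burstier workloads.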
-
The "oh no" moment for less technical teams is usually day-two operations, not day-one setup. The first upgrade, failover, backup change, or cluster migration forces them to understand sharding, replication, and version compatibility, which is exactly the work they expected the cloud product to abstract away.
-
This is why Snowflake feels easier to many analytics teams. Its multi-cluster warehouses can automatically start and stop extra clusters under load, while ClickHouse users still lean more on manual tuning, materialized views, sharding choices, and workload engineering to hold latency steady at low cost.
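The multi-cluster behavior described above can be sketched as a simple control loop. This is an assumed, minimal model of queue-driven scale-out and idle-driven scale-in, not Snowflake's actual algorithm; all function and parameter names are hypothetical.

```python
# Minimal sketch of queue-driven cluster autoscaling (assumed behavior):
# add a cluster when queries are queueing, remove one when clusters sit idle.
def scale(running_clusters: int, queued_queries: int, idle_clusters: int,
          min_clusters: int = 1, max_clusters: int = 4) -> int:
    """Return the new cluster count for one scheduling tick."""
    if queued_queries > 0 and running_clusters < max_clusters:
        return running_clusters + 1   # scale out under load
    if idle_clusters > 0 and running_clusters > min_clusters:
        return running_clusters - 1   # scale in when idle
    return running_clusters           # steady state

print(scale(1, queued_queries=5, idle_clusters=0))  # → 2
print(scale(3, queued_queries=0, idle_clusters=2))  # → 2
print(scale(1, queued_queries=0, idle_clusters=1))  # → 1 (respects min_clusters)
```

The point of the comparison is that in the Snowflake model this loop runs inside the platform, while with today's managed ClickHouse the equivalent decisions (when to reshard, when to add replicas) land on the customer.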
Going forward, the winners in real-time analytics will be the products that keep ClickHouse-class speed while hiding cluster math from customers. If managed OLAP systems can make scaling invisible instead of merely hosted, they unlock a much larger buyer base than core database experts and platform teams.