ClickHouse tiered storage upsell

From an interview with a product manager at Firebolt on scaling challenges and ACID compliance in OLAP databases:

"Tiered storage is one of the mechanisms that ClickHouse keeps proprietary; they haven't pushed that down to the open source product."

Keeping tiered storage out of open source turns a core cost lever into a cloud upsell. In practice, that matters less for many log workloads than it sounds: ClickHouse compresses repetitive event data well enough that teams can often keep 30-day retention on attached disks, then approximate a hot/warm layout by pairing SSDs with cheaper, slower volumes on the same nodes instead of tiering old data out to S3 or Azure Blob Storage.
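The same-node hot/warm split can be expressed with MergeTree storage policies in the server config plus a table TTL. A minimal sketch; disk names, mount paths, and the 7-day hot window are illustrative choices, not values from the source:

```xml
<!-- config.xml fragment: two local disks grouped into one policy -->
<clickhouse>
  <storage_configuration>
    <disks>
      <disk_ssd>
        <path>/mnt/fast_ssd/clickhouse/</path>  <!-- hypothetical mount -->
      </disk_ssd>
      <disk_hdd>
        <path>/mnt/slow_hdd/clickhouse/</path>  <!-- hypothetical mount -->
      </disk_hdd>
    </disks>
    <policies>
      <hot_warm>
        <volumes>
          <hot>
            <disk>disk_ssd</disk>
          </hot>
          <warm>
            <disk>disk_hdd</disk>
          </warm>
        </volumes>
        <!-- start moving parts off a volume when it is ~80% full -->
        <move_factor>0.2</move_factor>
      </hot_warm>
    </policies>
  </storage_configuration>
</clickhouse>
```

A table then opts into the policy and ages parts from SSD to HDD before deleting them:

```sql
CREATE TABLE logs
(
    event_time DateTime,
    message    String CODEC(ZSTD(3))
)
ENGINE = MergeTree
ORDER BY event_time
TTL event_time + INTERVAL 7 DAY TO VOLUME 'warm',
    event_time + INTERVAL 30 DAY DELETE
SETTINGS storage_policy = 'hot_warm';
```

This keeps everything on node-attached storage, which is exactly why it stretches open source without the cloud-only object-storage tier.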

  • For observability data, the first win usually comes from compression, not from object storage. In the case discussed, moving from OpenSearch to ClickHouse cut cluster size to roughly a third, extended retention from 7 to 30 days, and still lowered cost, which is why missing native tiered storage was manageable.
  • ClickHouse Cloud uses separated storage and compute with object storage as the primary durable layer, and markets long retention as part of that design. That makes tiered storage less a nice-to-have and more a product boundary between self-hosted open source and the managed service.
  • This fits the broader ClickHouse playbook: open source gets teams in the door for fast, cheap, append-heavy analytics, then proprietary cloud features, plus less operational work, become the reason to upgrade once datasets, compliance needs, or scaling pain get big enough.
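Whether compression alone is delivering that kind of win is checkable from ClickHouse's own part metadata. A sketch of the per-table ratio query; the table name `logs` is a placeholder:

```sql
-- Compressed vs. uncompressed size of all active parts of one table
SELECT
    table,
    formatReadableSize(sum(data_compressed_bytes))   AS on_disk,
    formatReadableSize(sum(data_uncompressed_bytes)) AS raw,
    round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 2) AS ratio
FROM system.parts
WHERE active AND table = 'logs'
GROUP BY table;
```

Repetitive log and event data commonly compresses by an order of magnitude, which is what makes 30-day retention on attached disks plausible in the first place.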

Going forward, the line between local-disk tricks and true cloud storage separation will matter more as customers want longer retention without babysitting clusters. That pushes ClickHouse further toward monetizing cloud-only storage economics, while self-hosted users keep stretching open source with compression, attached warm disks, and careful data layout until the operational burden outweighs the savings.