ClickHouse as AstraZeneca's Speed Layer
AI program manager at AstraZeneca on running self-hosted ClickHouse
The strategic point is that AstraZeneca was not just buying a faster database; it was separating real-time retrieval from batch-oriented data engineering. Databricks still fit earlier machine learning and large-scale data processing work, but once AI agents and live dashboards needed answers from billions of rows in under a second, ClickHouse became the speed layer. That turned patient record search and dashboard refresh from a waiting task into an interactive workflow.
-
The workload shape mattered. AstraZeneca describes simple aggregations on ClickHouse returning in 30 to 40 milliseconds and complex groupings across several petabytes in under 200 milliseconds, while the same dashboard-style queries took minutes on Databricks. That is the difference between a dashboard that feels live and one that feels broken.
-
The product architecture explains the gap. ClickHouse was built for high-concurrency, append-heavy analytical workloads like logs, clickstreams, and user-facing dashboards. In practice that means columnar storage, vectorized execution, sharding, and materialized views tuned for repeated aggregations, rather than a heavier lakehouse stack optimized first for ETL and ML workflows.
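A toy sketch can make two of those ideas concrete: storing data column by column, and keeping a rollup updated at insert time so that a repeated aggregation never rescans raw rows. This is plain Python and not ClickHouse's implementation; the `EventsTable` class, its `region` and `latency_ms` columns, and the method names are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class EventsTable:
    """Toy append-only table stored column by column (hypothetical schema)."""

    # Each column is its own list, so a query touches only the columns it needs.
    region: list = field(default_factory=list)
    latency_ms: list = field(default_factory=list)
    # Rollup state updated on every insert, in the spirit of a materialized view.
    _sum_by_region: dict = field(default_factory=lambda: defaultdict(float))
    _count_by_region: dict = field(default_factory=lambda: defaultdict(int))

    def insert(self, rows):
        """Append a batch of (region, latency_ms) rows and update the rollup."""
        for region, latency in rows:
            self.region.append(region)
            self.latency_ms.append(latency)
            self._sum_by_region[region] += latency
            self._count_by_region[region] += 1

    def avg_latency_full_scan(self):
        """Recompute the average per region by scanning the raw columns."""
        sums, counts = defaultdict(float), defaultdict(int)
        for region, latency in zip(self.region, self.latency_ms):
            sums[region] += latency
            counts[region] += 1
        return {r: sums[r] / counts[r] for r in sums}

    def avg_latency_from_rollup(self):
        """Answer the same question from the precomputed rollup."""
        return {
            r: self._sum_by_region[r] / self._count_by_region[r]
            for r in self._sum_by_region
        }


if __name__ == "__main__":
    table = EventsTable()
    table.insert([("eu", 10.0), ("us", 30.0)])
    table.insert([("eu", 20.0)])
    # Both paths agree; only the rollup path avoids reading every row.
    print(table.avg_latency_from_rollup())
```

The full-scan path costs time proportional to the number of rows, while the rollup path costs time proportional to the number of distinct groups. That is the rough intuition for why a dashboard that repeats the same grouping benefits from pre-aggregation, under the simplifying assumptions of this sketch.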
-
The trade-off was operational ownership. AstraZeneca self-hosts for compliance reasons, runs separate systems for governance and reporting versus speed, and accepts a steeper learning curve around cluster management, indexing, table engines, backups, and release testing. The payoff was much lower operating effort once tuned, with roughly 25 people doing work that previously needed around 100.
This points to a broader split in enterprise data stacks. Databricks and Snowflake remain the systems of record for transformation, governance, and broad analytics, while ClickHouse and similar OLAP engines are increasingly where teams land latency-sensitive AI retrieval and customer-facing analytics. As more AI products depend on sub-second lookups, this speed layer becomes a permanent part of the architecture.