Data path control determines winners

Diving deeper into: Charles Chretien, co-founder of Prequel, on the modern data stack’s ROI problem

Interview
If they want to keep on expanding, they've got to try and shift left, which is going to be harder for them to pull off.

This is a distribution problem disguised as product expansion. Databricks began closer to the raw data and compute layer, where one workload naturally leads to the next, so it can add warehousing, ML, AI, and application database products onto an existing engineering workflow. Snowflake began with the warehouse itself, so moving left means persuading teams to adopt tools that sit earlier in the data pipeline and are tied more tightly to developer and infrastructure choices.

  • Databricks started as managed Spark and now sells a broader lakehouse stack. Recent research shows only about 40% of its revenue comes from expansion products like warehousing and AI, which means its original platform still acts as the landing zone that pulls customers into new products.
  • Snowflake has expanded mainly around the warehouse, with moves into app building through Streamlit and AI through Cortex. To go further, it has started buying upstream assets like Crunchy Data for Postgres, which shows how much harder it is to move from analytics outward into operational workloads.
  • The broader market has shifted from stitching together many point tools to buying integrated platforms. That favors the vendor that already owns more of the data path, because each added product can be sold as the next step in an existing workflow, not as a separate migration.

Going forward, the winners in data infrastructure will be the ones that control more of the path from application data creation to analysis and AI use. Databricks is better positioned for that full-stack pull. Snowflake is now racing to become credible earlier in the stack, where developer habits and database choices are formed first.