dbt DAG Unifies SQL and Python

Julia Schottenstein, Product Manager at dbt Labs, on the business model of open source

Interview
with the introduction of Python models in the broader dbt DAG, we make the whole workflow a little bit smoother.

Adding Python into the same dbt DAG turns dbt from a SQL transformation tool into a shared control plane for mixed analyst and data science work. In practice, an analytics engineer can build cleaned warehouse tables in SQL, then a data scientist can reference those upstream models in a Python file inside the same project, with shared lineage, testing, scheduling, and deployment instead of handing work off into a separate notebook or custom pipeline.
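To make that handoff concrete, here is a minimal sketch of what a Python model file can look like. The `model(dbt, session)` entry point, `dbt.config()`, and `dbt.ref()` are part of dbt's Python model interface; the file name, the `stg_orders` model, and the `amount` column are hypothetical. The sketch assumes a platform whose DataFrame accepts a SQL string in `filter()`, as Snowpark and PySpark DataFrames do. The `FakeDbt` and `FakeFrame` classes are stand-ins invented here so the function can be exercised outside a warehouse; they are not part of dbt.

```python
# models/order_features.py (hypothetical file name inside a dbt project)


def model(dbt, session):
    """dbt calls this function and materializes the DataFrame it returns."""
    # Ask dbt to persist the result as a table in the warehouse.
    dbt.config(materialized="table")

    # dbt.ref() loads the upstream model and records the dependency edge,
    # so this file joins the same DAG as the SQL models.
    orders = dbt.ref("stg_orders")  # hypothetical upstream SQL model

    # Assumption: the platform DataFrame accepts a SQL string condition.
    return orders.filter("amount > 0")  # hypothetical column name


# --- Local stand-ins (not part of dbt), only for trying the sketch ---
class FakeFrame:
    """Records filter conditions instead of touching real data."""

    def __init__(self, name, filters=()):
        self.name = name
        self.filters = tuple(filters)

    def filter(self, condition):
        return FakeFrame(self.name, self.filters + (condition,))


class FakeDbt:
    """Records config settings and referenced model names."""

    def __init__(self):
        self.settings = {}
        self.refs = []

    def config(self, **kwargs):
        self.settings.update(kwargs)

    def ref(self, name):
        self.refs.append(name)
        return FakeFrame(name)


if __name__ == "__main__":
    fake = FakeDbt()
    result = model(fake, session=None)
    print("upstream models:", fake.refs)
    print("config:", fake.settings)
    print("filters applied:", result.filters)
```

In a real project only the `model` function would live in the file; dbt supplies the `dbt` and `session` arguments at run time, and the exact DataFrame type depends on the data platform.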

  • Before this, the common pattern was SQL models in dbt, then a separate Python workflow in notebooks, warehouse stored procedures, or Spark jobs. That split made ownership messy, because one team defined tables in dbt while another rebuilt logic elsewhere. Python models let both steps live in one dependency graph.
  • This fits dbt's larger product logic. dbt Core stays the open framework where teams define business logic in code, while dbt Cloud sells the workflow around it: the IDE, CI checks, scheduler, docs, and governance. Python inside the DAG makes those paid workflow tools useful to a broader set of users than SQL analysts alone.
  • The competitive angle matters as well. Snowflake and Databricks both let teams run Python natively, but only on their own platforms, so those tools are tied to one warehouse or compute environment. dbt's advantage is that the same project structure can span SQL and Python while staying vendor-neutral across supported platforms such as Snowflake and Databricks.
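The idea of SQL and Python steps living in one dependency graph can be illustrated with a small sketch. This is not dbt's implementation; it is a toy model of the concept using the standard library's `graphlib`, and every model name in it is hypothetical. Each model declares its language and its upstream dependencies, and a single topological sort produces one build order for both kinds of model.

```python
from graphlib import TopologicalSorter

# Hypothetical project: model name -> (language, upstream model names).
MODELS = {
    "stg_orders": ("sql", []),
    "stg_customers": ("sql", []),
    "customer_orders": ("sql", ["stg_orders", "stg_customers"]),
    "churn_features": ("python", ["customer_orders"]),
    "churn_report": ("sql", ["churn_features"]),
}


def build_order(models):
    """Return one build order in which every model follows its upstreams."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(upstream) for name, (_, upstream) in models.items()}
    return list(TopologicalSorter(graph).static_order())


def upstream_of(models, name):
    """Collect every model that `name` depends on, directly or indirectly."""
    seen = set()
    stack = list(models[name][1])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(models[current][1])
    return seen


if __name__ == "__main__":
    for name in build_order(MODELS):
        print(f"{MODELS[name][0]:<6} {name}")
    print("lineage of churn_report:", sorted(upstream_of(MODELS, "churn_report")))
```

Because the Python step is just another node, the lineage of the final SQL report passes through it, which is the property that a separate notebook or Spark job would break.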

The next step is for dbt to own more of the daily workflow around warehouse data, not just table creation. As more teams mix SQL models, Python transformations, metrics, orchestration, and AI or ML steps in one project, the winning product will be the one that keeps cross-functional work inside a single governed graph instead of scattering it across separate tools.