dbt and the Modern Data Stack: How Data Engineering Changed in Three Years
dbt transformed analytics engineering. Together with Snowflake, BigQuery, Fivetran, and an emerging AI layer, it now forms what has become a genuine platform: the modern data stack.
When dbt (data build tool) appeared in 2016, it solved a specific but pervasive problem: analytics teams needed a software engineering workflow for their SQL transformations, and they didn’t have one.
What dbt Actually Does
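dbt transforms raw data that has already been loaded into your warehouse, using SQL. Each transformation is a model: a SELECT statement saved in its own file. Models reference one another with ref(), and dbt uses those references to work out the order in which to build them. It also runs data quality tests against the results and generates documentation from the project.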
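To make the dependency management concrete, here is a minimal sketch of how a build order can be derived from ref() calls. It is not dbt's actual implementation (dbt renders the Jinja rather than pattern-matching it), and the model names, SQL, and helper functions are hypothetical, chosen only for illustration.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models: name -> SQL text. Names and SQL are illustrative only.
MODELS = {
    "stg_orders": "select id, customer_id, amount from raw.orders",
    "stg_customers": "select id, name from raw.customers",
    "customer_revenue": (
        "select c.id, sum(o.amount) as revenue "
        "from {{ ref('stg_customers') }} c "
        "join {{ ref('stg_orders') }} o on o.customer_id = c.id "
        "group by c.id"
    ),
}

# Simplification: find {{ ref('model_name') }} with a regex instead of rendering Jinja.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def find_refs(sql: str) -> set[str]:
    """Return the names of the models that this SQL text depends on."""
    return set(REF_PATTERN.findall(sql))


def build_order(models: dict[str, str]) -> list[str]:
    """Order models so that every model comes after the models it references."""
    graph = {name: find_refs(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


order = build_order(MODELS)
print(order)
```

Both staging models come out ahead of customer_revenue because it references them. A circular reference would make TopologicalSorter raise a CycleError, which is roughly the failure you want in that situation.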
The analytics engineering role, which applies software engineering discipline to data work, emerged largely around dbt. Teams that adopted dbt stopped depending on the single analyst who knows how all the tables work.
The Modern Data Stack
dbt sits in the transformation layer of a broader stack: Fivetran or Airbyte for extraction and loading; Snowflake, BigQuery, or Databricks as the warehouse; dbt for transformation; and Looker or Metabase for visualization.
The stack has matured. The main complexity now is governance: who owns what, who can change what, and what data contracts hold between producers and consumers.
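A data contract can be as simple as an agreed list of columns and types with a named owner. The sketch below shows one way to check a producer's table against such a contract. The Contract class, the violations function, and the table and team names are all hypothetical, and this is not any particular tool's contract format.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    """Hypothetical contract: who owns the table and which columns consumers rely on."""
    table: str
    owner: str
    columns: dict[str, str]  # column name -> expected type


def violations(contract: Contract, actual_columns: dict[str, str]) -> list[str]:
    """List the ways the producer's current schema breaks the contract."""
    problems = []
    for name, expected_type in contract.columns.items():
        if name not in actual_columns:
            problems.append(f"missing column: {name}")
        elif actual_columns[name] != expected_type:
            problems.append(
                f"type changed: {name} {expected_type} -> {actual_columns[name]}"
            )
    # Extra columns are allowed here: additive changes do not break consumers.
    return problems


orders_contract = Contract(
    table="fct_orders",
    owner="payments-team",
    columns={"order_id": "integer", "amount": "numeric", "status": "text"},
)

# Assumed current schema: the producer dropped `status` and changed `amount`.
current_schema = {"order_id": "integer", "amount": "text", "discount": "numeric"}
problems = violations(orders_contract, current_schema)
print(problems)
```

Run in CI against the producer's schema, a check like this turns "who can change what" into a failing build rather than a broken dashboard.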
Where It’s Heading
Semantic layer: dbt’s semantic layer aims to give each business metric a single definition that every downstream tool queries, preventing the “different dashboards show different revenue numbers” problem.
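The idea behind a semantic layer can be sketched in a few lines: define the metric once and generate every query from that definition. This is an illustration of the concept only, not dbt's semantic layer API; the METRICS registry, the metric_query function, and the table and column names are hypothetical.

```python
# Hypothetical metric registry: one shared definition per business metric.
METRICS = {
    "revenue": {
        "table": "fct_orders",
        "expression": "sum(amount)",
        "filter": "status = 'completed'",
    },
}


def metric_query(metric: str, group_by: str) -> str:
    """Build a SQL query for a metric from its single shared definition."""
    m = METRICS[metric]
    return (
        f"select {group_by}, {m['expression']} as {metric} "
        f"from {m['table']} "
        f"where {m['filter']} "
        f"group by {group_by}"
    )


# Two dashboards slice revenue differently but share the same definition.
by_month = metric_query("revenue", "order_month")
by_region = metric_query("revenue", "region")
print(by_month)
print(by_region)
```

Because both queries come from the same entry, a change to what counts as revenue (say, excluding refunds) is made in one place and reaches every dashboard.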
AI in the data stack: Natural-language-to-SQL tools are being layered on top of the modern data stack. Accuracy is high on simple queries, but complex multi-join queries remain challenging. The stack is becoming a foundation for AI-powered data products.