Building the next-gen dbt engine: How SDF levels up data tooling

Two weeks ago we shared the news that dbt Labs has acquired SDF Labs. We heard the same reaction from many of our customers, partners, and community members:

  1. This is very exciting!
  2. What does it mean, exactly?

So today, we wanted to take some more time to explain.

What will this do?

With the integration of SDF under the hood, dbt will be both much faster and significantly more cost-efficient, while unlocking new metadata use cases like true column-level lineage.

As a standalone tool, SDF made modern data development easier than ever, thanks to some seismic technical innovations.

Over the past week, we began peeling back the layers on those innovations for you in a series of posts, sharing what makes them possible, and what they can unlock in your data workflows.

In the first post, we covered SQL comprehension — why it matters, and how it can be done at three distinct levels of precision.

The 3 Levels of SQL comprehension

Then we followed up by diving into why achieving the highest level of SQL comprehension is a devilishly complex technical challenge, which requires several distinct layers of underlying technology.

In this weekend’s Roundup, we'll unpack the power of a compiler: how the logical plan enables it, and what it might do for data transformation workflows.

A lot of technical explanations! But now you have the context to know why this matters.

dbt should know about SQL

Thanks to the strength of the dbt Community, over the past 9 years, dbt has achieved widespread adoption and defined the modern data development experience.

And as dbt has scaled to be the standard worldwide, we’ve had the opportunity to learn — collectively and in public — about a lot of things that have worked and are worth keeping, and some things that are ready for an update. We also get the privilege of rethinking foundational choices from 2016 with the benefit of 2025 technology.

We’ve been specifically following the topic and tooling around SQL Comprehension for some time now. This is a problem we’ve been excited to solve so that we could elevate the dbt developer experience further.

But we didn’t just want to solve it. We wanted to do it right, with immediate wins for dbt users and real technical depth. And we needed to know that we were introducing a durable solution that is going to stand the test of time.

A new engine for dbt

Last year, we met the team at SDF, and we knew they had something special. Their approach to SQL comprehension operates at all three of the levels of comprehension we shared in the first post above:

  1. SDF is a parser, with syntactic support for several major dialects (and more on the way)
  2. It’s a compiler, capable of precise validation and calculation of column-level lineage that’s fast at scale
  3. And it’s an executor, leveraging Apache DataFusion to enable local development workflows
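To make the three levels concrete, here is a minimal toy sketch in Python. It is not SDF's implementation and does not use SDF or DataFusion APIs; the schema, the data, and the function names (`parse`, `compile_query`, `execute`) are all hypothetical, and it only handles a single `SELECT ... FROM ...` shape. It simply shows how each level builds on the one before it.

```python
import re

# Hypothetical schema and local data, for illustration only.
SCHEMA = {"orders": ["id", "amount", "customer_id"]}
ROWS = {"orders": [{"id": 1, "amount": 20, "customer_id": 7},
                   {"id": 2, "amount": 35, "customer_id": 9}]}

def parse(sql):
    """Level 1 (parser): check syntax and extract structure, nothing more."""
    m = re.fullmatch(r"\s*select\s+(.+?)\s+from\s+(\w+)\s*;?\s*", sql, re.IGNORECASE)
    if m is None:
        raise SyntaxError(f"unsupported or malformed query: {sql!r}")
    select_items = []
    for item in m.group(1).split(","):
        # Split "column AS alias"; without an alias the column keeps its name.
        parts = re.split(r"\s+as\s+", item.strip(), flags=re.IGNORECASE)
        source = parts[0].strip()
        alias = parts[1].strip() if len(parts) == 2 else source
        select_items.append((source, alias))
    return {"table": m.group(2), "select": select_items}

def compile_query(ast, schema):
    """Level 2 (compiler): validate names against a schema and derive column-level lineage."""
    table = ast["table"]
    if table not in schema:
        raise NameError(f"unknown table: {table}")
    lineage = {}
    for source, alias in ast["select"]:
        if source not in schema[table]:
            raise NameError(f"unknown column: {table}.{source}")
        lineage[alias] = f"{table}.{source}"
    return lineage

def execute(ast, rows):
    """Level 3 (executor): run the query against local, in-memory data."""
    return [{alias: row[source] for source, alias in ast["select"]}
            for row in rows[ast["table"]]]

ast = parse("SELECT id, amount AS total FROM orders")
lineage = compile_query(ast, SCHEMA)   # maps each output column to its source column
result = execute(ast, ROWS)            # rows computed locally, no warehouse involved
```

In this sketch, a query that references a column missing from the schema parses fine but fails in `compile_query`, before any data is touched. That is the kind of early, precise validation the compiler level is meant to provide.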

SDF was built by a hyper-talented team of people with world-class expertise in the technologies required to enable SQL comprehension at every level. They know how to build an engine of the technical depth that befits being the industry standard.

And it’s these same people who are building a new engine for dbt.

Importantly, this will all be done under the same code authoring layer that we’ve all spent the last decade building together — the one learned by practitioners the world over and adopted by tens of thousands of companies: the One dbt standard that enables collaboration across dbt Cloud and dbt Core.

Right now the dbt Labs team is heads down on this integration to make it the best it can be. We’re motivated to start sharing progress soon: tune in to our Spring Launch event on March 19th for more info. We’re excited about just how much this new engine will unlock when it’s ready: many things we’ve all dreamed of having in dbt, soon to be within our grasp.

Hold on, folks: this is going to be a fun one.

Last modified on: Jan 31, 2025
