dbt
Blog Refactoring legacy SQL to dbt

Refactoring legacy SQL to dbt

As an analytics engineer on the professional services team at dbt Labs, I've helped a dozen or so data teams refactor from a legacy SQL transformation workflow to modular data modeling in dbt.

I've noticed a clear pattern: rare is the team that starts with a completely clean slate. More often than not, teams have accumulated years of SQL code powering business-critical, often complex data transformations.

These SQL transformations may run in a stored procedure, in an Airflow SQL operator, or some other scripting environment---but regardless of where it runs, that legacy SQL code must be migrated!

How can that migration be done with a) minimal headache and b) maximum quality?

Announcing a new (free!) on-demand course

Over the last 18 months, the dbt Labs professional services team has evolved a process to help teams migrate legacy SQL queries to dbt SQL.

And as of today, you can now learn + implement that same refactoring process, with a new on-demand course.

If you're completely new to dbt, you may want to start with the dbt Fundamentals course before diving into this one.

Everyone learns a bit differently, so you can pick up the course materials in a few different formats:

Our SQL refactoring process, in short

For a bit of background, the migration workflow that we follow internally is:

  1. Migrate your legacy SQL code, unchanged---make sure it runs and produces an identical output.
  2. Implement dbt sources (as opposed to raw table references).
  3. Choose a refactoring strategy.
  4. Implement clean CTEs (we ❤️ CTEs).
  5. Separate transformation logic into standardized data model layers (often the longest step).
  6. Audit the output (often using the audit_helper dbt package).

There is, of course, no one true way to refactor legacy SQL into dbt models!

I'd love to hear from you (on Twitter or in #learn-on-demand in dbt Community Slack) how your approach varies from the one we lay out in the course.

Last modified on: Oct 15, 2024

Early Bird pricing is live for Coalesce 2025

Save $1,100 when you register early for the ultimate data event of the year. Coalesce 2025 brings together thousands of data practitioners to connect, learn, and grow—don’t miss your chance to join them.

Set your organization up for success. Read the business case guide to accelerate time to value with dbt Cloud.

Read now

Recent Posts

Great data professionals never work alone

Every industry leader understands one thing: you need the right network to grow. The dbt Community connects you with 100,000+ data professionals—people who share your challenges, insights, and ambitions.

If you’re looking for trusted advice, expert discussions, and real career growth, this is the place for you.

Solve your toughest challenges

Join today and get real-world advice from experienced pros.

Expand your network

Foster connections with meetups, local groups, and like-minded peers.

Advance your career

The dbt community is full of learning opportunities and shared job postings.