
WHOOP improves efficiency by implementing dbt Core and migrating to dbt Cloud

At WHOOP, every decision starts with data. Migrating from dbt Core to dbt Cloud was critical for improving data integrity, accuracy, and governance at scale.

1 day to migrate from dbt Core to dbt Cloud
3 months to migrate from Redshift to Snowflake
32+ hours saved per month resolving data errors and issues

Tracking data to power performance and health

WHOOP is a wearable fitness tracker that helps people monitor their sleep, activity levels, and health. That also makes WHOOP a data company. Internally, its algorithms capture and analyze biometric data for people all over the world, 24/7. Across the business, every decision starts with data.

“Access to accurate data is critical,” says Matt Luzzi, Senior Director of Analytics for WHOOP. “It allows us to improve the customer experience and increase retention, lifetime value, and profitability.”

To manage the data, the WHOOP data analytics team initially adopted dbt Core. They quickly stood up a single, centralized layer for orchestration and transformation to increase visibility.

For a lean and technical team, dbt Core was effective. But as the team grew, they required a more scalable solution. They chose to migrate from dbt Core to dbt Cloud.

Data quality issues made it difficult to trust the data

While dbt Core provided a single, systematic approach to data transformation, the team encountered a few challenges as they grew.

For one, there was no centralized governance. Analysts could independently create nearly identical dbt models for the same data, without visibility into each other’s work. For another, they lacked built-in scheduling and orchestration. Because dbt Core doesn’t provide a way to schedule or deploy dbt models on its own, the team relied on external tooling for orchestration.

“As we grew, we experienced bottlenecks from relying on other engineers just to do our work,” says Luzzi. “That meant stakeholders weren’t getting consistent answers to their questions or able to make decisions as quickly as they should.”

This setup also made it difficult to trust their data. If a model failed or a downstream model was skipped, it was hard to troubleshoot the issue or vouch for the data’s integrity. Resolving these errors took Luzzi as much as a day every week, an unsustainable dedication of resources.

“The business was keen to invest in AI, but our models could only be as good as the data they're trained on,” says Luzzi. “To drive the business forward with AI, we needed data quality. We couldn’t be scrambling to fix inconsistent data.”

Finally, as the team prepared to migrate from AWS Redshift to Snowflake, they needed to ensure that the incoming data was clean and well-governed. That required strategic management; simply “lifting and shifting” their Redshift database wouldn’t cut it.

“I was already running into problems where the data wasn’t accurate or even available,” says Luzzi. “For our migration, trust in our data was our most important measure of success. When people lose trust, it's hard to win it back.”

Establishing a clean database structure on dbt Cloud

For the migration, data accuracy and integrity were the team’s foremost priorities. Given the technical debt they had accumulated, they decided to build a new project from scratch in dbt Cloud.

The team started with first principles: they identified key metrics to track, and then determined the tables and DAGs required to support these metrics. As a result, the actual migration from dbt Core to dbt Cloud took just one day—while ensuring they had all of the data they needed to create a single source of truth.

By migrating to dbt Cloud, the team also achieved the following:

  • A deliberate, incremental migration. To make the most out of their prior investment in dbt Core, the team brought new workloads onto dbt Cloud—while preserving existing infrastructure on dbt Core, like CI/CD processes, dev/test environments, and custom macros, tools, and command-line utilities.

  • Data governance best practices. The team established a weekly production release cadence based on software development practices: every Friday, they fork their dbt code off of production. An analytics engineer then implements the week’s QA changes into a single release, gets code-owner review, and runs unit tests to ensure data integrity. After pushing the update to production, they share release notes with the business detailing the changes. “All of this is possible because of the dbt ecosystem and its developer framework,” says Luzzi.

  • Breaking down data silos. The analytics team created a WHOOP Commons dbt project containing company-wide data and reusable code (e.g., macros) for use across data projects. With dbt Mesh, every team can discover these common assets in dbt Explorer and reference them in their own projects, as sketched below. They no longer need workarounds to bring in models from other projects; it’s all native to dbt Cloud, and every model is transparent.
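
To illustrate what this looks like in practice, here is a minimal sketch of a dbt Mesh cross-project reference. The project, file, and model names (whoop_commons, dim_members, fct_recovery_summary) are hypothetical placeholders, not WHOOP’s actual assets.

    # whoop_commons project: mark a shared model as public so other
    # projects can reference it (e.g., models/commons/_models.yml)
    models:
      - name: dim_members
        access: public

    # Downstream project: declare the upstream project in dependencies.yml
    projects:
      - name: whoop_commons

    -- Downstream project: models/fct_recovery_summary.sql
    -- The two-argument ref() resolves the public model owned by whoop_commons
    select *
    from {{ ref('whoop_commons', 'dim_members') }}

Because the dependency is declared in code, dbt Explorer can surface the cross-project lineage, and every downstream team can see changes to public models.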

“dbt Cloud simplified our data transformation,” says Luzzi. “Now we only need a single analytics engineer to maintain dbt Cloud, which would have been unsustainable with dbt Core.”

Another benefit of moving to dbt Cloud has been achieving 99% documentation coverage. To maintain that, they use dbt Copilot.

"dbt Copilot has streamlined our process by cutting PR review times from thirty minutes to five—making it easy to maintain our 99% documentation coverage inertia,” says William Tsu, Senior Analytics Engineer at WHOOP. “This efficiency allows our analysts to focus on developing robust data models. It’s a significant step forward in our data warehouse strategy."

Scaling data transformation and improving business efficiency

As a result of the migration to dbt Cloud, the data analytics team is operating more efficiently.

“Now that we’ve implemented a weekly release process, we don’t spend time addressing ad-hoc data-update requests,” says Luzzi. “Our team is able to focus on data analysis, not building a data pipeline.”

It has resulted in greater trust, too. Previously, the team experienced at least one issue per week. Today, they experience no unexpected changes to historical data, no dbt production job failures, and no accidental errors in production code. The data team is confident in the accuracy of their data; most importantly, the business trusts the integrity of the data ecosystem.

“I can sleep at night knowing that I'm not going to wake up to a Slack message from dbt Cloud saying that a production job failed,” says Luzzi. “Even when we make changes to the database, our stakeholders can always trust the data they pull to make decisions for the business.”

The migration to dbt Cloud also set the team up to adopt AI quickly and effectively.

“We've been one of the fastest companies in the world to adopt some of the most cutting-edge AI technologies and actually put them into production,” says Luzzi. “That’s a direct result of solving our data quality problem with dbt Cloud.”

As the team looks ahead, they’re exploring how the dbt Semantic Layer can support AI use cases such as natural language chatbots. If a machine can understand the semantics of their models, a user could query the data without knowing SQL or any other coding language.
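
As a rough sketch of what that could look like, the YAML below defines a semantic model and a simple metric for the dbt Semantic Layer (MetricFlow). The model, column, and metric names (fct_memberships, membership_id, active_memberships) are illustrative assumptions, not WHOOP’s actual schema.

    # Hypothetical Semantic Layer definitions; names are illustrative only.
    semantic_models:
      - name: memberships
        model: ref('fct_memberships')
        defaults:
          agg_time_dimension: signed_up_at
        entities:
          - name: membership_id
            type: primary
        dimensions:
          - name: signed_up_at
            type: time
            type_params:
              time_granularity: day
        measures:
          - name: membership_count
            agg: count_distinct
            expr: membership_id

    metrics:
      - name: active_memberships
        label: Active memberships
        type: simple
        type_params:
          measure: membership_count

With metrics defined once in code like this, a chatbot or BI tool can ask the Semantic Layer for “active memberships by month” and get a consistent, governed answer without anyone writing SQL.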

“dbt Semantic Layer will be a huge player in AI,” says Luzzi. “We’ve seen exciting new features from dbt Labs, and their roadmap is aligned with what we want to build next.”
