Hello, we’re Chris and Rachael, two data analysts at dbt 👋. Today, we're reflecting on the things we wish we had known when we first learned dbt. Unsurprisingly, we think dbt is pretty great. But it wasn’t always smooth sailing.
To be frank, when we first encountered dbt, we each felt confused and uncertain. We knew SQL and were semi-comfortable with data modeling concepts and git, but dbt’s technical terminology induced some serious impostor syndrome. We didn’t identify as engineers, or even pretengineers. YAML? Jinja? Unit tests? We had gotten used to transforming our data via simple internal tooling and cron jobs. We didn’t quite understand what all the dbt hype was about.
Despite our initial hesitation, we ultimately each had to take the plunge. For Rachael, the tipping point was moving to a new startup with “the modern data stack”. For Chris, his team had reached the limits of their legacy infrastructure and realized they needed a more robust solution. In learning dbt, it took time to connect the dots, to grok project structure, and to wade through the breadth of resources and find the ones we needed. But once everything clicked, it was transformative. Things like environment separation went from intimidating concepts to common sense, and our workflows improved dramatically in a way we hadn’t anticipated. We wished we had done this years earlier.
This post covers the specifics of what we wish had clicked earlier for us and would have made this journey smoother. If you’re an analyst who doesn’t quite “get” dbt yet, this might be for you. We’ll first share what makes dbt valuable to us as analysts, then we'll dig into how some of the main concepts fit together (no engineering background required).
Why dbt helps us as analysts
When learning dbt, we found the biggest challenge to be in revising how we thought about data transformation. Instead of viewing queries and analyses as one-off scripts, dbt pushed us to learn best practices around scalable architecture and a robust data ecosystem. This paradigm shift transformed how we worked as data analysts. It brought structure, reliability, and automation to our transformations, unlocking value in four key ways:
1. Less manual work, more impactful analysis
Before dbt: We were running queries manually or at set times, exporting CSVs, and passing cleaned data to stakeholders. We spent so much time just wrangling data, taking bandwidth away from deeper analysis.
With dbt: Our transformations are automated, version-controlled, and repeatable. If we need to adjust a calculation, fix a data issue, or update a data table, we update the relevant model in our repository and let dbt handle everything else downstream. No more manual pipeline dependency management required.
Example: We build a “customer lifetime value” model in dbt. When marketing wants an updated report, all we do is run dbt—it pulls in the latest data, recalculates everything, and updates the results in the report.
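As a rough sketch of what a model like that looks like (the model name, source table, and columns below are hypothetical, not from our actual project), a simplified lifetime value model is just a SQL file:

```sql
-- models/customer_lifetime_value.sql
-- Hypothetical simplified example; table and column names are illustrative.
select
    customer_id,
    min(order_date) as first_order_date,
    count(*) as lifetime_orders,
    sum(order_total) as lifetime_value
from {{ ref('stg_orders') }}
group by customer_id
```

The `{{ ref() }}` call is what lets dbt infer the dependency graph, so `dbt run --select customer_lifetime_value+` rebuilds this model and everything downstream of it.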
2. Transparent business logic (that we can explain)
Before dbt: We often lost business logic to messy SQL scripts, spreadsheets, dashboards, or spaghetti notebooks. If someone questioned a number, tracking down its source felt like detective work and became a huge time suck.
With dbt: Every data transformation is documented, version-controlled, and transparent. The logic behind metrics lives in one place—our dbt project. This makes explaining “how the sausage gets made” straightforward and accessible to anyone with access to our repo or dbt.
Example: When a stakeholder asks, “How did you calculate monthly active users?”, we now point them directly to a tested and documented dbt model in our repository instead of scrambling through five different queries or a stale notebook.
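For instance, the documentation and tests for a model like that live right alongside it in a YAML file (the model and column names below are hypothetical):

```yaml
# models/schema.yml (hypothetical example)
models:
  - name: monthly_active_users
    description: "Distinct users with at least one qualifying event per calendar month."
    columns:
      - name: month_start
        description: "First day of the calendar month."
        tests:
          - unique
          - not_null
```

Because the description and tests are version-controlled with the model itself, the answer to “how is this calculated?” is always in one place and always current.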
3. Collaboration without chaos
Before dbt: SQL scripts often lived across personal folders, BI tools, and Slack, making collaboration difficult and error-prone.
With dbt: Our whole team works in the same version-controlled repository. Every change is reviewed through pull requests, and we can easily roll back changes when needed. No more wondering who changed what, or where the most recently agreed-upon business logic lives.
Example: When we work on the same report, we can build models in parallel, knowing git will merge our changes smoothly once they’re reviewed and approved.
4. Greater accuracy and reliability
Before dbt: We validated results manually, using spot-checking or a gut feeling to ensure our data looked right. Sometimes we didn’t realize when things were broken until our stakeholders flagged it in a report. Human error is hard to avoid.
With dbt: Continuous integration (CI) checks show us exactly what each change affects, so we can fix bugs before they go live. Automated tests run as we deploy, ensuring that our data meets specific criteria, whether it’s checking row counts, unique values, or more complex custom logic. We know immediately if and where something breaks.
Example: We add a test ensuring that every transaction has a valid customer ID. If the test fails, dbt tells us exactly where the issue is—before it makes its way into a report. Now our team can proactively roll out a fix without manually inspecting our data pipeline piece by piece.
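In dbt, that kind of check can be expressed with the built-in `not_null` and `relationships` generic tests in a YAML file (the model and column names here are illustrative):

```yaml
# models/schema.yml (illustrative names)
models:
  - name: transactions
    columns:
      - name: customer_id
        tests:
          - not_null
          - relationships:
              to: ref('customers')
              field: customer_id
```

If any transaction row has a missing customer ID, or one that doesn’t exist in the customers model, `dbt test` fails and reports exactly which model and test broke.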
Four main reasons we’re now so big on dbt
- Automation: We’ve eliminated time spent on repetitive tasks, manual queries, and data testing.
- Transparency: Everyone can see and explain how every metric is calculated.
- Collaboration: No more duplicative queries, lost scripts, or reading .sql files in Slack.
- Correctness: We know our data is more accurate with built-in testing and checks.
How we do our work in dbt
Hopefully by now, we’ve painted a picture of why we've found dbt so helpful in our analyst work. dbt has a lot of functionality, and we’re certainly not going to cover it all today. But we want to chat about the three stages of the analytics development lifecycle (ADLC) we execute on most as analysts in dbt, along with the relevant terminology, to help build a better mental model of the product (caveat: we use dbt Cloud).
1. Develop
Everyone using dbt Core, plus many folks in dbt Cloud, is used to a command-line interface (CLI) experience when writing code. That's one of the main interfaces for developing in dbt (i.e., building the tables and views known as models). However, it’s not the only one. While Chris tends to invoke dbt commands in his local CLI with VS Code, Rachael prefers to run commands and build models in the dbt Cloud IDE; she finds it an easier interface coming from other GUIs like RStudio. dbt is also launching a new no-code interface, the Visual Editor, next year for users who are less comfortable with SQL.
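For anyone curious what that CLI loop looks like, a typical development session uses standard dbt commands like these (the model name is a placeholder):

```shell
dbt run --select my_model      # build just this model
dbt test --select my_model     # run this model's tests
dbt build --select +my_model   # build and test the model plus its upstream dependencies
```

The `--select` syntax (with `+` for upstream or downstream dependencies) is the same regardless of which interface you develop in.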
2. Deploy
Once we develop and test a model, we deploy it to our production environment. With the dbt Cloud orchestration functionality (or another product if you’re using dbt Core), we can also use a job to keep it up to date. There are many ways to run jobs. Sometimes it can be as simple as setting them to run at a certain time, but sometimes we want them only to run after other jobs complete, or run when we merge a change.
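As one sketch of how those triggers differ (the selectors and flags below are standard dbt CLI features, but the setup is hypothetical), a scheduled production job might simply run everything, while a merge-triggered job rebuilds only what changed:

```shell
# Scheduled production job: build and test everything
dbt build

# Merge-triggered job: rebuild only modified models and their downstream
# dependents, comparing against artifacts from the last production run
dbt build --select state:modified+ --state ./prod-artifacts
```

State comparison is what keeps merge-triggered jobs fast: dbt only touches the parts of the DAG your change actually affects.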
3. Explore
Data architecture gets complex with time. It’s hard for us to remember every model or column name, the code that builds them, or what they represent. Thus, we're very often poking around dbt Cloud Explorer to view model- and column-level lineage for upstream/downstream dependencies, reference documentation, or peek at data health.
4. And more (…but not for today)
We haven’t even scratched the surface of how we do testing, how we scale our architecture across projects with Mesh, how we dynamically codify metric breakdowns with the Semantic Layer, and more. We acknowledge there's much more analyst-on-dbt ground to cover in future posts.
What’s next
We’ve covered a lot of content in this short post. We've recapped why we find dbt valuable, explained how we actually use it in our main day-to-day, and hinted at more to come. But Rome dbt models weren’t built in a day. From here, if you want to learn more, we recommend the dbt Fundamentals course. Or if you’re more of a learn-by-doing type of person, dbt Cloud has a free plan where you can jump in and start trying things out. We’d love for you to catch us in the Community Slack and tell us what else you’d like to see as well.
Last modified on: Dec 13, 2024