Proactively improve your dbt projects with new dbt Explorer features
Feb 13, 2024
It all starts with a question.
An executive may ask, “how did sales last week compare to a year ago?” Marketing ops pops in with, “which customers are based in Atlanta?” And now, the product lead inquires, “what impact did this new feature have on user engagement?”
How do you know where to look? What’s the source of truth? What story does that data tell? Do you trust the data? If not, what steps can you take to improve data quality?
To make the most of your data, simply building transformations isn’t enough. Data producers and consumers need to know where to look, what to trust, who to talk to, and what to do next.
Enter dbt Explorer, dbt Cloud’s knowledge base and lineage visualization experience. dbt Explorer helps dbt developers understand data lineage so they can build and improve their dbt projects. Downstream analysts use dbt Explorer to navigate and leverage data products so they can deliver trustworthy insights.
Learn more about how dbt Explorer works and how it improves upon legacy dbt Docs.
Today, we’re excited to announce the release of new features in dbt Explorer that make it even easier for data developers and analysts to navigate, understand, and improve their dbt projects: improved search and lineage, model performance analysis, project recommendations, and column-level lineage.
Let’s dive into these new features and how they help our customers build organizational knowledge to arrive at better decisions.
"dbt has always been an essential tool for managing table lineage in cloud data platforms, but the addition of column-level lineage is a game changer for analytics engineers. With column-level lineage, dbt Cloud customers can rapidly identify the potential downstream impacts of table changes, or work backwards to quickly understand the root cause of an incident. Integrating with dbt Explorer is only the beginning, and we’re excited to incorporate column-level lineage across other parts of the platform in the months ahead."
– Drew Banin, Co-founder at dbt Labs
What’s new in dbt Explorer
Below is a quick look at the new features in dbt Explorer:
Improved search and lineage
We’ve improved the keyword search interface to make it more intuitive to find the resources you need. You can also filter models by access and materialization, and even search columns within resources. Lineage is also more performant at scale, providing more navigation options and details.
Model performance analysis
Who among us doesn’t have a dbt project in need of a little glow-up? Now, it’s easy to identify low-hanging fruit for optimization so you can keep your dbt estate performing as efficiently (and cost effectively) as possible.
"dbt Explorer makes it easy for our consumers to understand the entire lineage from the source to reporting—and all of the data quality checks or issues along the way—without having to “go ask a dev.” The performance feature saved my team hours (if not days) analyzing model build durations over time so we could make intelligent decisions about job orchestration and scheduling."
– Robert Goodman, Lead Developer of Enterprise Data Analytics at Lennar
Project recommendations
No one likes an urgent fire-drill alerting you to an outage or quality issue in your data pipeline. Project recommendations uses dbt Cloud metadata to proactively surface ways to improve the test coverage, documentation, and overall project health of your dbt models. Now, data teams can tackle project improvements on their own time to ensure that pipelines are performant and data trust remains solid.
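To make the idea concrete, here is a minimal sketch of the kind of check a recommendation could be built on: flagging models that lack a description or have no tests attached. It is not how dbt Explorer computes its recommendations. The NODES dictionary, the project name "shop", and the recommend function are all hypothetical, with field names only loosely modeled on dbt project metadata.

```python
# Hypothetical, trimmed-down project metadata, invented for illustration.
# Field names are loosely modeled on dbt project metadata (assumption).
NODES = {
    "model.shop.stg_orders": {
        "resource_type": "model",
        "description": "Cleaned orders.",
    },
    "model.shop.fct_sales": {
        "resource_type": "model",
        "description": "",
    },
    "model.shop.dim_customers": {
        "resource_type": "model",
        "description": "One row per customer.",
    },
    "test.shop.not_null_stg_orders_id": {
        "resource_type": "test",
        "depends_on": {"nodes": ["model.shop.stg_orders"]},
    },
}


def recommend(nodes):
    """Return (model_id, issue) pairs for undocumented or untested models."""
    # Collect every node that at least one test depends on.
    tested = set()
    for node in nodes.values():
        if node["resource_type"] == "test":
            tested.update(node.get("depends_on", {}).get("nodes", []))

    recommendations = []
    for unique_id, node in sorted(nodes.items()):
        if node["resource_type"] != "model":
            continue
        if not node.get("description", "").strip():
            recommendations.append((unique_id, "missing description"))
        if unique_id not in tested:
            recommendations.append((unique_id, "no tests"))
    return recommendations


for model_id, issue in recommend(NODES):
    print(f"{model_id}: {issue}")
```

In this toy project, stg_orders is documented and tested, so it produces no recommendation, while fct_sales is flagged on both counts.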
Column-level lineage
At long last…it’s here! 🎉 dbt Explorer and the Discovery API now provide column-level lineage for models, sources, and snapshots within a dbt Cloud project. The lineage reflects the latest applied production state of the columns and how they’re used in the project. Column-level lineage can be used to improve many data development workflows including:
- Auditing: See how data is being used and flows through your dbt projects
- Root cause analysis: Get to the bottom of an issue and quickly trace it back to its source
- Impact analysis: Anticipate how a change to a model affects downstream consumers
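Impact analysis and root cause analysis are the same graph walk in opposite directions. The sketch below illustrates that with a hypothetical in-memory list of column-to-column edges. It does not use the Discovery API or its schema, and every model name, column name, and function name in it is invented for illustration.

```python
from collections import defaultdict, deque

# Hypothetical column-level edges: (upstream column, downstream column).
# Model and column names are invented for illustration only.
EDGES = [
    ("raw_orders.amount", "stg_orders.amount_usd"),
    ("stg_orders.amount_usd", "fct_sales.revenue"),
    ("stg_orders.amount_usd", "fct_refunds.refund_base"),
    ("fct_sales.revenue", "weekly_sales.total_revenue"),
    ("raw_customers.city", "dim_customers.city"),
]


def build_graph(edges, reverse=False):
    """Build an adjacency map; reverse=True points edges upstream."""
    graph = defaultdict(set)
    for upstream, downstream in edges:
        if reverse:
            graph[downstream].add(upstream)
        else:
            graph[upstream].add(downstream)
    return graph


def reachable(graph, start):
    """Breadth-first walk returning every column reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        column = queue.popleft()
        for neighbor in graph.get(column, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def impact_analysis(edges, column):
    """Columns that could be affected by a change to `column`."""
    return sorted(reachable(build_graph(edges), column))


def root_cause_candidates(edges, column):
    """Columns that feed into `column`, directly or indirectly."""
    return sorted(reachable(build_graph(edges, reverse=True), column))


print(impact_analysis(EDGES, "stg_orders.amount_usd"))
print(root_cause_candidates(EDGES, "weekly_sales.total_revenue"))
```

Walking downstream from stg_orders.amount_usd lists the columns a change might break. Walking upstream from weekly_sales.total_revenue lists the columns worth inspecting when that metric looks wrong.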
"dbt's column-level lineage has transformed our analytics workflow by not just tracking data lineage, but understanding the journey of individual columns from raw input to final analytical models. Engineering teams can now visualize the exact origins and transformations of columns they use, fostering trust and understanding. The level of granularity saves us hours of investigation, preventing inaccurate decision-making."
– Saravanan Manoharan, Senior Data Analyst at Novo Nordisk
"Column-level lineage in dbt Explorer streamlines root-cause and impact analysis. Rather than painstakingly tracing a column forward or backward in our lineage graph, we’re now able to easily follow it up and downstream. As a result, we’ll be able to troubleshoot issues more quickly, and develop a more accurate understanding of potential data model changes."
– Katie Claiborne, Staff Analytics Engineer at Cityblock Health
Showcasing workflows for data developers and data analysts
dbt Explorer provides a shared canvas for data developers and data analysts to collaborate on data, with rich features that help each user solve their unique problems. Let’s walk through a few examples of how various users could interact with dbt Explorer to build, discover, and leverage dbt assets to deliver trusted data, faster.
Data developer: Create, reuse, and improve data products for analysts
Discover and (re)use
As a data developer, you spend much of your time building data models for downstream consumption. It’s best practice to keep code DRY, and so before building, you want to understand what already exists, where it came from, and how it’s being used.
Using dbt Explorer, you get a holistic view into your dbt projects, models, and their lineage, with rich additional detail for every resource in your DAG. You can search for specific objects, filter on attributes like resource type, access level, and materialization, and even explore your assets in a file tree view.
Diagnose and improve
Inevitably, you spend some time triaging fires, and so you need a way to quickly spot and diagnose issues in the data pipeline. You can filter tests by status to quickly assess what needs your attention and use column-level lineage to understand downstream impact.
And ideally you’re getting into a flow state where bugs and quality issues are addressed proactively—before an end user alerts you to the issue. Built-in features like recommendations and model performance deliver the guidance you need to proactively improve your dbt projects.
"dbt Explorer has been instrumental in elevating our data modeling and analytics processes. We gained valuable insights into project data quality and adherence to dbt best practices. It not only helped us pinpoint areas for code enhancement but also significantly improved our documentation practices. We achieved substantial enhancements in data quality percentages, effectively mitigating data errors in the bronze/silver layer and ensuring a higher standard of data quality for our end consumers. dbt Explorer is an indispensable ally for any data-driven organization aiming for excellence in their analytics workflows."
– Shravan Banda, Solutions Architect at World Bank
Data analyst: Find trustworthy answers and deliver insights to decision makers
Find, examine, trust, leverage
Data analysts are downstream from the data developers, and need a way to first discover and then leverage existing data assets to provide actionable insights to business stakeholders. A primary concern is ensuring the data they’re using is trustworthy.
Using dbt Explorer, you’re empowered to navigate data resources and their lineage, with the ability to easily search for relevant models, filter down to metrics of interest, and visualize dependencies so you can build confidence in how the data is flowing from source to output. You can dive into metadata that gives you context into definitions, data quality, data freshness, and more so you’re sure that decisions are being made from trusted data.
Looking ahead
We have big plans for dbt Explorer, and if you’re interested in learning more about our product vision, keep reading! We also love gathering user feedback, so please keep it coming.
Making it easier to find the resources you need
We’re committed to making continuous improvements to dbt Explorer that make it easier to navigate your dbt projects. A few priorities that are top of mind:
- Richer metadata: Capturing more timely and higher fidelity metadata from your dbt projects, especially warehouse table metadata and project definition changes, for insights into the latest state of your sources and models. In addition, lineage will expand to active sources and active exposures, for more end-to-end visibility from common source systems to data consumption use cases.
- Lineage layers: Incorporating lineage layers to make it easy to distinguish nodes and identify issues. With added relevant context like latest model execution status and materialization type, lineage layers help your dbt DAG represent a clearer and more actionable map of your data pipeline.
- Improved search: Making it easier to find relevant resources—especially models and columns—by searching using keywords based on resource name, column name, descriptions, code, and relation; introducing new filters like model type (e.g., marts); and supporting more selectors in lineage search.
Viewing project state in development
We also plan to have dbt Explorer support non-production environments. You’ll be able to view your dbt Cloud project’s metadata in staging and development and get a holistic understanding of how it evolves. That way you can understand changes like a new model build and catch issues like a failing test before they hit production. dbt Explorer will also natively integrate with your dbt Cloud development workflows; think of editing a resource in the IDE from dbt Explorer, or seamlessly navigating from the Cloud CLI to dbt Explorer.
Enhancing collaboration
Today, dbt Explorer is a launchpad for data developers to discover new ways to improve their projects, save time, reduce costs, and improve data quality. We’re focused on investing in new capabilities that further empower downstream analysts to self-serve data and get the context they need to definitively trust that data so they can be a strong partner to the business in driving better decisions. Some examples include:
- Tailored project overviews: Providing a distilled view of relevant models and metrics personalized for specific user types.
- Accessibility and permissions: Making it simple to onboard your colleagues to collaborate in dbt Explorer.
- Embedding and sharing: Sharing views to surface insights, align on trust signals, and communicate the value of data products to more collaborators.
Getting started
dbt Explorer is currently available in public preview to dbt Cloud Team and Enterprise plan customers. Column-level lineage is available in public beta to Enterprise plan customers only.
To get started, navigate to the “Explore” tab in dbt Cloud or check out the docs, and be sure to sign up for our webinar to see these features in action.
Last modified on: Sep 11, 2024