Why managing data source changes in Tableau is challenging (and how dbt can help)
May 30, 2024
Tableau is considered by many to be best-in-class for data visualization. It’s a popular tool in the world of data analysis and business intelligence, enabling users to create compelling and interactive dashboards and reports. Its user-friendly interface and powerful processing capabilities make it a top choice among professionals for transforming complex data into actionable insights.
Published data sources are one of the most widely used and powerful features of Tableau. They're also a source of considerable tech debt in virtually all Tableau Server deployments. Data duplication becomes rampant, extracts fall out of use but continue to consume resources, and the functionality of published data sources is under constant threat from changes to the data they depend on. This amounts to a lack of data consistency, impaired data trust, and burnt-out data teams.
That’s the bad news. The good news is that there is a better way: dbt Cloud makes it possible to serve up more relevant, efficient, current, and agile data sources. When Tableau is coupled with the dbt Semantic Layer, these data consistency and velocity issues evaporate. The dbt Semantic Layer gives data teams a governed and scalable way to define metrics and offers a first-class integration into Tableau so downstream consumers can get answers to their questions in an interface that’s familiar, flexible, and accessible.
Using the dbt Semantic Layer and Tableau together has numerous benefits, and in this blog post, we'll focus on how these solutions improve how teams manage changes to underlying data.
What is the dbt Semantic Layer?
The dbt Semantic Layer is a technology that allows you to centralize your metric definitions and make them available to users across an enterprise through a consistent, tool-agnostic interface. In other words, it lets you create the coveted “single source of truth” for all your organization’s metrics. That single source of truth makes it possible to guarantee consistency wherever and however your end users consume data.
The dbt Semantic Layer not only makes this single source of truth possible, it also makes it easy to implement. Through a simple, declarative interface, users can model the metrics, dimensions, and data relationships that they wish to expose to their data consumers. The dbt Semantic Layer then uses that semantic modeling to enable dynamic querying and optimized SQL generation, including the automatic (and correct!) handling of joins to satisfy user requests.
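To make the idea of declarative modeling concrete, here is a minimal Python sketch. It is not the dbt Semantic Layer's actual interface or SQL generator (in dbt, semantic models and metrics are declared in your project's YAML). The class names, the compile_query helper, and the customers, transactions, region, and amount names are all hypothetical. The sketch only illustrates the pattern: declare entities, dimensions, and measures once, and let a compiler derive the join.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    name: str  # metric name exposed to consumers
    agg: str   # SQL aggregate function, e.g. "SUM"
    expr: str  # column the aggregate is applied to


@dataclass(frozen=True)
class SemanticModel:
    table: str              # physical table in the warehouse (hypothetical)
    entity_key: str         # column used to join this model to others
    dimensions: tuple = ()  # columns consumers may group by
    measures: tuple = ()    # aggregations consumers may request


def compile_query(metric, group_by, fact, dim):
    """Build SQL for `metric` from `fact`, grouped by a dimension of `dim`."""
    measure = next(m for m in fact.measures if m.name == metric)
    if group_by not in dim.dimensions:
        raise ValueError(f"unknown dimension: {group_by}")
    # The join condition comes from the declared entity keys,
    # not from anything the consumer writes by hand.
    return (
        f"SELECT d.{group_by}, {measure.agg}(f.{measure.expr}) AS {measure.name}\n"
        f"FROM {fact.table} AS f\n"
        f"JOIN {dim.table} AS d ON f.{fact.entity_key} = d.{dim.entity_key}\n"
        f"GROUP BY d.{group_by}"
    )


# Declared once, centrally (illustrative names only).
customers = SemanticModel(
    table="customers", entity_key="customer_id", dimensions=("region",)
)
transactions = SemanticModel(
    table="transactions",
    entity_key="customer_id",
    measures=(Measure(name="revenue", agg="SUM", expr="amount"),),
)

sql = compile_query("revenue", "region", fact=transactions, dim=customers)
print(sql)
```

The consumer asks only for "revenue by region". Which tables to join, and on which key, is decided by the shared definitions rather than by each dashboard author.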
Tableau enjoys a first-class integration with the dbt Semantic Layer through a custom live connector, which is generally available in Tableau Desktop and Tableau Server (see our Tableau Exchange listing) and will be available in Tableau Cloud in the future. You can connect to your dbt Semantic Layer and guarantee that your Tableau consumers are always getting the same answer to the same question, every time.
How the dbt Semantic Layer improves data change management
As noted above, most data consumed in Tableau dashboards is served up in the form of published data sources. With many Tableau users building data sources and dashboards that others depend on, the data landscape eventually comes to resemble a “Wild West” of content that's difficult to manage at scale. Duplicate data sources lead to confusion about which one is correct to build on and maintain, and teams spend unneeded effort updating redundant components. Additionally, the resulting content sprawl means that managing any underlying data change becomes a tedious effort to keep affected Tableau data sources up to date. There must be a better way, and fortunately, there is.
Without the dbt Semantic Layer, standard Tableau workflows include content creators connecting to their data and spending time establishing the logic and relationships for how the underlying data relates (e.g., how customer relates to transactions through customer_id). The logic created from these data sources is often reused, making it increasingly difficult to manage what’s correct and what’s not.
Although this is a fairly common workflow, it isn't how things should work. Having users define joins and create relationships within their data is error-prone and will inevitably lead to inconsistency, duplicate logic, and expensive queries.
And on top of all of that, what happens if the upstream data that’s feeding these Tableau data sources changes?
Consider a common scenario in data management where an organization decides to change the structure of a primary data source, such as a sales database. In Tableau, you'd need to track down every instance of this data source, find everywhere the changed data was used, and completely refactor all of the affected relationships. Such a scenario often leads to significant administrative overhead, where a single change can ripple through the entire analytical framework and require extensive time and effort to return to a consistent state.
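The "tracking down" step in this scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inventory below is hypothetical and hand-written, and the data source, table, and column names are invented. In a real deployment you would first have to assemble such an inventory from whatever lineage metadata you have available.

```python
def affected_sources(inventory, table, column):
    """Return the names of data sources that reference table.column, sorted."""
    target = (table, column)
    return sorted(name for name, refs in inventory.items() if target in refs)


# Hypothetical inventory: data source name -> set of (table, column) it uses.
inventory = {
    "Sales Overview": {("sales", "order_total"), ("sales", "order_date")},
    "Regional Sales (copy)": {("sales", "order_total"), ("customers", "region")},
    "Customer Directory": {("customers", "region"), ("customers", "customer_id")},
}

# Suppose sales.order_total is about to be renamed or restructured upstream.
impacted = affected_sources(inventory, "sales", "order_total")
print(impacted)  # ['Regional Sales (copy)', 'Sales Overview']
```

Every name in the result is a data source someone has to open, fix, and republish by hand, which is why the cost of a change grows with the number of duplicated data sources.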
Furthermore, data changes like the one described above aren’t a matter of “if”; they’re inevitable, so the question is when and how often they’ll occur. The dbt Semantic Layer is designed to help teams address these inevitabilities head-on. It’s architected as a hub-and-spoke model to avoid the inefficiencies and problems that arise when managing logic at the edge (where the data is being consumed). Instead, we believe this logic should be centralized further upstream and then queried at the consumption layer.
Consistency at scale
The dbt Semantic Layer acts as a middle layer between your data warehouse and Tableau dashboards that manages all the business logic and relationships across your data models through declarative semantic models. By situating the business logic in the dbt project, teams can centralize data change management. Furthermore, because it is an extension of a dbt project, where all of your underlying data modeling logic lives, the dbt Semantic Layer benefits from the automated lineage, documentation, version control, and automated testing inherent in the dbt Cloud platform.
When upstream changes occur, updates are made in one place: dbt Cloud. The changes are then run with your scheduled dbt jobs and propagated downstream, without you needing to change anything in your data sources in Tableau. As a best practice, then, avoid modeling your data in Tableau when it is connected to the dbt Semantic Layer.
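The "one place" idea can be illustrated with a small Python sketch. It is a toy model of the hub-and-spoke pattern, not how dbt Cloud or the Tableau connector is implemented. The metric registry, the metric_sql helper, the dashboard names, and the order_total and gross_amount columns are all hypothetical.

```python
def metric_sql(name, metrics):
    """Render the query for a metric from the central definitions."""
    agg, table, column = metrics[name]
    return f"SELECT {agg}({column}) AS {name} FROM {table}"


# The hub: metric name -> (aggregate, table, column), defined once.
metrics = {"revenue": ("SUM", "sales", "order_total")}

# The spokes: consumers refer to the metric by name only.
dashboards = {"Sales Overview": "revenue", "Executive Summary": "revenue"}

before = {dash: metric_sql(m, metrics) for dash, m in dashboards.items()}

# Upstream change: order_total is renamed to gross_amount.
# A single edit to the central definition absorbs it.
metrics["revenue"] = ("SUM", "sales", "gross_amount")

after = {dash: metric_sql(m, metrics) for dash, m in dashboards.items()}

for dash in dashboards:
    print(dash, "|", before[dash], "->", after[dash])
```

Neither entry in dashboards was touched, yet both now query the renamed column. That is the property you give up when each data source carries its own copy of the logic.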
Key benefits of using the dbt Semantic Layer and Tableau
The benefits of using the dbt Semantic Layer in conjunction with Tableau extend beyond simplified data change management.
By centralizing the business logic in the dbt project, users can:
- Focus more on analysis and less on the mechanics of data management: Your analysts can spend their time on what questions they need to answer for the business rather than spending time on preparing the data.
- Avoid duplicate work: Your teams can work more efficiently since they aren't spending time reconciling redundancies across data sources and resolving consistency issues.
- Foster data trust: Our recommended approach enhances the overall integrity of the data that’s analyzed in Tableau. With dbt managing the transformations and your metrics being defined in code with a CI process and orchestration, analysts can trust that the data in Tableau reflects the most current and accurate information available, which is crucial for making informed business decisions.
- Avoid managing a large number of extracts: The dbt Semantic Layer offers a live connection, so your data is up to date whenever you query, reducing the dependency on extracts.
Final thoughts
The integration of the dbt Semantic Layer with Tableau makes it far easier to manage upstream changes in data (an inevitable part of any organization’s evolution). Additionally, it alleviates the burdensome task of manually updating data sources in response to those changes, simplifies the data workflow, and speeds up time to insights from Tableau. Above all, it delivers that coveted “single source of truth” to your Tableau users and, in fact, to everyone in your organization.
To learn more about our dbt Semantic Layer and Tableau integration, you can access the Tableau Exchange or the documentation.
Last modified on: Oct 15, 2024