One dbt: the biggest features we announced at Coalesce 2024

Coalesce 2024 kicked off this morning in Las Vegas. In front of 1,800+ data practitioners, team leaders, and executives at Resorts World and thousands more watching online, we shared our vision for the future of the analytics workflow: a future in which zero-sum choices about cloud data platforms, infrastructure, and even dbt Core vs. dbt Cloud become unnecessary, because everything works together in support of the analytics development lifecycle. We call this “One dbt.”

One dbt

One dbt isn’t a specific feature. It’s an ethos that’s influencing all aspects of how we operate at dbt Labs. It represents our commitment to an integrated, governed, and scalable approach to data analytics.

We are building towards a future where everyone—from data engineers to analysts to business decision-makers—has a common unifying framework for solving analytics problems. This framework should allow all of these humans to collaborate on any data platform, on any cloud. Their work should be accelerated by AI. And they should be able to work in the tools that make the most sense for them.

You’ll see this reflected in the announcements below and certainly in our investments in the future.

dbt has always been a force for unification in the data industry, bringing together people, platforms, and workflows. But this focus has ramped up over the past year for a few reasons:

  1. Market demand for platform flexibility: Most companies run multiple data platforms. Iceberg is quickly becoming the de facto standard for how companies embrace multiple cloud data platforms and compute engines. This opens up a future where teams have greater flexibility around how and where they access data and reduces dependence on a single data platform.
  2. The need to empower more data collaborators: The rise of self-service BI interfaces and new generative AI tools has brought business users much closer to the analytics development workflow. These users bring unique context about the business that typical data practitioners may not otherwise have. Empowering business people and data teams to collaborate creates more avenues for developing trustworthy and useful data products.
  3. The trust imperative: Our most recent State of Analytics Engineering report found that poor data quality, ownership, and stakeholder literacy are among the top challenges in analytics today. Consider what happens when a business user spots an error in a sales dashboard. Can they troubleshoot it themselves? Would they know where to go for support? This gets at the heart of the trust issue, and a lack of trust between business and data teams can stifle company growth. The Analytics Development Lifecycle (ADLC) is a framework for maturing the analytics workflow and improving data quality and trust, tying together the technology stack with the multiple personas who contribute to generating and disseminating organizational knowledge.
Image of the analytics development lifecycle (ADLC) infinity loop

Our newest features are designed to help our customers take advantage of these trends, helping them improve cross-platform flexibility, empower more people to safely contribute to analytics workflows, and improve organizational trust in data.

New dbt features announced at Coalesce 2024

dbt Cloud is a data control plane that centralizes metadata across the ADLC—orchestration, observability, cataloging, and more. Organizations can move faster with trusted data, and dbt is the linchpin that helps data teams build, deploy, monitor, and discover data assets. Let’s dive into the newest features powering the dbt Cloud data control plane.

Image of the dbt Cloud Data Control Plane architecture

Cross-platform dbt Mesh

dbt Mesh will soon support cross-platform references, giving organizations that rely on multiple data platforms a centralized, governed approach to multi-platform collaboration. This new capability is a direct response to the fact that, more often than not, large enterprises rely on multiple data platforms (for example Snowflake and Databricks, or Redshift and Athena) for different use cases—and they need to maintain a unified workflow regardless of which data platform a particular team uses.

"Cross-platform dbt Mesh makes the promise of data mesh an actual reality for us. Now, it will be possible to work on an organization-wide data model—one that all teams can contribute to and consume from—regardless of what that team's tech stack looks like. Cross-platform dbt Mesh gives the technology diversity of our data ecosystem a common denominator that we can all build around."
- Ulrik Svanborg Møller, Lead Data Engineer, Vestas Wind Systems

With cross-platform dbt Mesh, practitioners will be able to reference and re-use data assets from not only different projects, but also different data platforms, breaking down silos and fostering collaboration across an organization’s full data estate. We are actively working with a small handful of design partners to develop this capability, and we're iterating towards a beta that will include support for Snowflake, Databricks, Redshift, and Athena. Learn more here.
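Cross-project references in dbt Mesh already use the two-argument form of `ref()`, and cross-platform dbt Mesh extends that same pattern across data platforms. Here's a minimal sketch, assuming a downstream project that declares an upstream project named `platform_core` as a dependency and references its public `fct_orders` model (both names are illustrative):

```sql
-- models/finance/agg_revenue.sql (in the downstream project)
-- The two-argument ref() points at a public model owned by the upstream
-- `platform_core` project. With cross-platform dbt Mesh, that upstream
-- project could run on a different data platform; the reference syntax
-- stays the same.
select
    date_trunc('month', order_date) as order_month,
    sum(order_total) as total_revenue
from {{ ref('platform_core', 'fct_orders') }}
group by 1
```

For the reference to resolve, the upstream project is listed in the downstream project's dependencies.yml and the referenced model is marked with `access: public`.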

Architecture image of cross-platform dbt Mesh

Iceberg table support

dbt Cloud now supports the Apache Iceberg table format, a key capability that enables cross-platform dbt Mesh. This allows users to leverage the benefits of Iceberg's table format, including improved query performance and schema evolution. By supporting Iceberg, dbt Cloud enables data teams to work more efficiently with large-scale data lakes while maintaining the familiar dbt workflow. Support for Athena, Spark, Databricks, Starburst/Trino, and Dremio is GA, and Snowflake support is currently in beta.
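Where Iceberg support is available, enabling it is a model-level configuration. Here's a minimal sketch for a Snowflake-managed Iceberg table (Snowflake support is in beta, as noted above); the model name and the `external_volume` value are placeholders for illustration:

```sql
-- models/marts/fct_orders_iceberg.sql
-- Materializes this model as an Iceberg table on Snowflake. The external
-- volume must already exist in your Snowflake account; 'my_iceberg_volume'
-- is a placeholder.
{{
  config(
    materialized='table',
    table_format='iceberg',
    external_volume='my_iceberg_volume'
  )
}}

select * from {{ ref('stg_orders') }}
```

Configuration details vary by adapter, so check the docs for your platform.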

Visual editing experience

We are adding an intuitive, visual drag-and-drop interface (currently in beta) to our suite of data development environments. Why? The short answer is to empower more types of users with governed workflows for data collaboration. The long answer is that the folks within an organization who are closest to the business context typically have two courses of action for getting the data they need: they can lob a ticket over to their (overworked) central data team and hope they get the right data back in a timely manner, or they can go it alone with a CSV pull and a spreadsheet. Neither approach is scalable.

UI of dbt Cloud visual editing experience

With the new visual editing interface, downstream teams have a governed inroad to building data products—even if they don’t know how to write SQL. Under the hood, it’s the same familiar dbt workflow governed by the same principles: it generates SQL in your dbt project, the code is version-controlled, and the code can be tested and documented so models are trustworthy.

This accessible visual editing experience is how data teams can take themselves off the critical path for ad hoc requests, and empower their business colleagues to translate domain knowledge into real analytics code. Even folks well-versed in SQL will find it useful to distill unwieldy lines of code into a visual representation of their models so they can better understand, optimize, and explore their data pipelines. Register your interest in joining the beta here.

dbt Copilot

dbt Copilot (currently in beta) is dbt Cloud’s embedded AI engine, surfaced across the dbt workflow to help users accelerate and automate analytics work. Using dbt Copilot, users can automate tasks that previously required repetitive manual work, significantly improving productivity, data quality, and stakeholder trust. Today, this includes the ability to auto-generate tests, documentation, and semantic models (all in beta), an AI chatbot powered by the dbt Semantic Layer (in beta as part of the dbt native app in Snowflake), and the ability to bring your own OpenAI API key (GA). In the coming months, dbt Copilot will extend to help automate code generation and workflows across all of dbt Cloud.

Compare changes with advanced CI

Continuous integration in dbt just got even smarter. While users have long been able to validate that a pull request wouldn’t break something in production, with the new compare changes feature inside of CI jobs (now generally available for dbt Cloud customers on the Enterprise plan), they’ll be able to ensure that what is being built meets their expectations. When enabled, each CI job will include a breakdown of the columns and rows that are being added, modified, or removed in your underlying data platform as a result of executing your dbt job. Additionally, users can see a summary of these changes inside the actual PR in their Git provider interface. This additional context allows data teams to catch any unexpected behavior before code is deployed into production, improving data quality and increasing trust among all collaborators. Learn more here.
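Conceptually, you can think of the comparison as diffing the CI build of a model against its production counterpart. The query below is purely a hypothetical illustration of that idea (the schema and table names are made up); dbt Cloud computes and summarizes these differences for you, so you never write this yourself:

```sql
-- Illustration only: rows that exist in production but not in the CI
-- build of the same model (run the reverse to see added rows).
select * from analytics.prod.fct_orders
except
select * from analytics.dbt_cloud_pr_123.fct_orders
```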

Advanced CI compare changes feature in dbt Cloud

Auto-exposures with Tableau

Now in Preview, the dbt DAG can be automatically populated with downstream exposures in Tableau (with Power BI to follow). This gives data teams automatic visibility into how and where models are used, so they can prioritize the data work that most improves data quality. Meanwhile, because downstream dashboards are automatically triggered to refresh when new data is available, business stakeholders can be confident that they’re always making decisions from the freshest data. These exposures are automatically accounted for throughout dbt Cloud, including in dbt Explorer, scheduled jobs, and CI jobs.

dbt lineage with auto-exposures for Tableau
“With auto-exposures in dbt Explorer, it’s like going from a treasure hunt to having a treasure map. The native Tableau integration and auto-generated lineage help us see everything clearly, making impact analysis across hundreds of dashboards a breeze!”
- Rahavan Raman, Director - Data Engineering & Analytics, Zscaler

Data health tiles

What good is a dashboard if you can’t trust the freshness and veracity of its supporting data? With dbt Cloud, you can now embed health signals like data quality and freshness within any dashboard, giving your downstream stakeholders at-a-glance confirmation of whether they can trust the data they’re about to use. Users can also navigate back to dbt Explorer with a single click to investigate further.

These trust signals provide users with a quick and easy way to assess the reliability of their data assets. By offering clear indicators of data health and freshness, teams can make more informed decisions and have greater confidence in their analytics processes. This feature aligns with our commitment to enhancing data quality and fostering trust across the entire data lifecycle.

Embed tile with data health signals in downstream dashboards

New integrations

dbt Cloud now integrates with AWS Athena (GA) and Teradata (Preview), enabling more organizations and teams to collaborate on data workflows. Athena will also be one of the first adapters compatible with cross-platform dbt Mesh. Additionally, we are expanding our BI tool integrations with a new dbt Semantic Layer connection to Power BI, coming soon.

dbt Core v1.9

dbt Labs remains committed to advancing our open source offering for data practitioners, and dbt Core v1.9 brings two big improvements, available now in beta. First, you can use the new microbatch incremental strategy to optimize your largest datasets. It lets you process event data in discrete time periods, each with its own SQL query, rather than all at once. Second, we've streamlined snapshot configuration to make dbt snapshots easier to configure, run, and customize.
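Here's a minimal sketch of the microbatch strategy, assuming an illustrative `stg_page_views` staging model with a `page_view_start` timestamp column:

```sql
-- models/fct_page_views_daily.sql
-- Each run processes one day of events per batch (as its own query),
-- rather than rebuilding the whole table at once.
{{
  config(
    materialized='incremental',
    incremental_strategy='microbatch',
    event_time='page_view_start',
    begin='2024-01-01',
    batch_size='day'
  )
}}

select
    page_view_id,
    user_id,
    page_view_start
from {{ ref('stg_page_views') }}
```

dbt filters each batch to its time window automatically; configuring `event_time` on the upstream model lets dbt apply the same filter upstream as well.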

In addition, we are solving for a list of smaller “paper cuts” upvoted by the community that you can read about in full in our docs here.

Embrace analytics best practices at scale

From cross-platform dbt Mesh to the new visual editing experience, the latest platform features exemplify the "One dbt" ethos. We are empowering data teams to work seamlessly across data ecosystems, enabling more users to contribute to analytics workflows, and improving trust in data with a single, integrated set of governance features we call the data control plane.

Want to learn more? Join our upcoming webinar, One dbt: The Control Plane for data collaboration at scale, to see these features in action. You can register here.

And as a reminder, if you miss any of the action this week, you can catch a recording of our opening keynote or other sessions from Coalesce online at any time.
