Delivering data that works: the biggest new dbt Cloud features
May 14, 2024
Today, dbt Labs held our first annual product launch virtual event: the dbt Cloud Launch Showcase. It was a jam-packed 90 minutes with executive keynotes, new product announcements, and demos that all centered around the theme of how dbt Cloud is helping teams deliver Data That Works.
We spent the majority of the time talking through our latest innovations, each of which fell into one of three categories:
- Quality: Introducing new and improved ways for dbt developers to build, test, and deploy high quality code, including our new AI copilot experience dbt Assist, unit testing, and more advanced CI capabilities
- Connections: Building out the context of the dbt DAG to automatically include upstream sources and downstream exposures; additionally, extending our warehouse integrations across the Microsoft ecosystem (Synapse and Fabric)
- Collaboration: Empowering more people to participate in well-governed data development workflows and the insights they drive via a new low-code UI and improvements to dbt Explorer and the dbt Semantic Layer
Supporting all of these innovations is the dbt Cloud platform, which we continue to improve upon in order to make data workloads more reliable, performant, and scalable.
With these new and improved ways to test, build, govern, catalog, and democratize data, we’re looking forward to continuing on our journey to help our customers deliver Data That Works.
Get on-demand access to the replay, or keep reading for all the details!
✅ Quality: deliver reliable data without compromising velocity
For data to be useful, it needs to be reliable and it needs to be delivered downstream as quickly as possible. We’re investing in new and improved ways for data developers to build, document, test, and safely deploy data transformations, so that data pipelines continuously hum and business stakeholders have confidence in the data they’re using to make decisions.
dbt Assist (AI)
AI is happening. You can now use dbt Assist, a new AI-powered co-pilot experience built into dbt Cloud, to boost your productivity and enhance data quality. dbt Assist allows you to quickly generate documentation and tests to augment your dbt models, helping you accomplish more in less time.
If you're interested in early access to try dbt Assist in the Cloud IDE, you can register your interest in joining the private beta here.
Advanced CI
We also announced important enhancements to dbt Cloud’s Continuous Integration (CI) capabilities, so data teams can ensure better data quality at any scale. With the upcoming “compare changes” feature inside of CI jobs, you’ll not only be able to verify that the code in a pull request (PR) will build (which you can do today!) but also ensure that what is being built meets your expectations. When enabled, each CI job will include a breakdown of what’s being added, modified, or removed in your underlying data platform as a result of executing the job.
Additionally, the compare changes feature will show a summary of the outcomes of your quality checks inside the actual PR on your Git provider. All of this amounts to more seamless prevention of unexpected changes, a smoother QA process, and a better experience for downstream data product consumers.
Stay tuned for the beta coming soon!
Unit testing
You can now use unit tests to validate the behavior of model logic before the model is materialized in production. If a test fails, the model won’t build—saving you from unnecessary data platform spend, while improving data product reliability.
To get access to unit testing, as well as other new features in the future, simply select “Keep on latest version” in your dbt Cloud jobs or environments.
Read our docs or check out this blog post to learn more about how to approach data testing in dbt.
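As a rough illustration of what this looks like in practice, the sketch below defines a single unit test in a model properties YAML file. The model and column names (fct_orders, stg_order_items, order_total) are hypothetical, and the sketch assumes fct_orders reads only from stg_order_items and sums amount per order; the docs cover the full set of options.

```yaml
# models/marts/_unit_tests.yml (hypothetical path)
unit_tests:
  - name: test_order_total_sums_line_items
    model: fct_orders            # hypothetical model under test
    given:
      # Mock rows stand in for the real upstream model.
      - input: ref('stg_order_items')
        rows:
          - {order_id: 1, amount: 10}
          - {order_id: 1, amount: 5}
    expect:
      # The test passes only if the model logic yields exactly these rows.
      rows:
        - {order_id: 1, order_total: 15}
```

Because the inputs are mocked, a test like this exercises only the model's logic and does not depend on the state of production data.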
Unit testing is also available in dbt Core v1.8. We take our responsibility as stewards of the dbt open source standard very seriously and are excited about this latest release. dbt Core v1.8 highlights include:
- A new stable, decoupled adapter interface, giving adapter maintainers more control over when and how they ship updates
- Project-level behavior flags: opt into recently introduced changes (which are disabled by default) or opt out of mature changes (which are enabled by default)
- A new --empty flag for building schema-only dry runs, helping ensure your models will build while avoiding expensive reads of input data
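The project-level behavior flags mentioned in the list above are set under a flags key in dbt_project.yml. The fragment below is a minimal sketch: the two flag names are believed to be among those introduced around dbt Core v1.8, but treat them as assumptions and confirm names and defaults against the behavior changes documentation for your version.

```yaml
# dbt_project.yml (fragment)
flags:
  # Opt in to newer behaviors that are assumed to be off by default.
  # Flag names are assumptions; verify them for your dbt version.
  require_resource_names_without_spaces: true
  source_freshness_run_project_hooks: true
```

Setting a flag to false instead is how a project would opt out of a behavior that is enabled by default.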
dbt Cloud CLI
We’re pleased to announce that the dbt Cloud CLI, used by hundreds of organizations, is now generally available (GA). Develop anywhere using your code editor of choice, bolstered by the rich features of dbt Cloud including capabilities like dbt Mesh support, defer to production, and improved performance.
And if you use VS Code, you can now also use the Power User for dbt Core and dbt Cloud extension with the dbt Cloud CLI to bolster your productivity. This blog post from our CEO Tristan Handy dives deeper into the features and benefits of building in the Cloud CLI.
Get started using the dbt Cloud CLI today.
〰️ Connections: plug into everywhere your data is
For data to be useful, it also needs to be complete. That means you need a holistic view of your entire estate, with the ability to trace and orchestrate your workflow from source to metric to consumer. And to do it seamlessly and automatically. You also need your data workflow to be interoperable with the platforms you’ve invested in—whether that’s Snowflake, Azure, Databricks or anywhere else.
Automatic exposures
We announced new end-to-end orchestration features that make your data workflow and dbt DAG more automated, context-rich, up-to-date, and cost effective.
When you configure automatic exposures, your dbt DAG will automatically reflect the downstream dashboards that your models power. You can also trigger downstream dashboards to automatically refresh when new data is available upstream so your stakeholders are always making decisions from the latest data. These exposures are automatically accounted for throughout dbt Cloud, including in dbt Explorer, scheduled jobs, and CI jobs.
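For context, this is roughly what a hand-written exposure looks like in a dbt project today. It is only a sketch of the kind of metadata an exposure carries, not the output of the automatic exposures feature; the dashboard name, URL, model, and owner below are all hypothetical.

```yaml
# models/exposures.yml (hypothetical path)
exposures:
  - name: weekly_revenue_dashboard
    type: dashboard
    url: https://example.com/dashboards/weekly-revenue  # placeholder URL
    description: Revenue trends reviewed by the finance team each week.
    depends_on:
      # The models this dashboard reads from.
      - ref('fct_orders')
    owner:
      name: Finance Analytics
      email: finance-analytics@example.com
```

An entry like this is what ties a dashboard to the models it depends on in the DAG; with automatic exposures, that link should no longer need to be maintained by hand.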
On our near-term roadmap, we’re also incorporating the concept of active sources so that everything downstream and upstream of your dbt workflow is synchronized, automated, and always up-to-date.
Automatic exposures for Tableau is expected to go into beta this summer, with PowerBI exposures to follow.
Microsoft integrations
As dbt has become a standard for data transformation on the data warehouse, we have seen significant demand from the Microsoft community for the collaboration and productivity features of dbt Cloud with Azure Synapse and Fabric. We announced our Microsoft Fabric integration as GA, and also launched our Synapse adapter in Preview.
To get started, create a new dbt project in dbt Cloud and choose Fabric or Synapse as your data platform.
Databricks OAuth
Now generally available, dbt Cloud supports developer OAuth with Databricks, providing an additional layer of security for dbt Enterprise users. When you enable Databricks OAuth for a dbt Cloud project, all dbt Cloud developers must authenticate directly with Databricks in order to use the dbt Cloud IDE, removing the need to manage credentials in dbt Cloud. With this addition, we now support data platform authentication via OAuth on Snowflake, BigQuery, and Databricks, and look forward to adding to the list.
👭 Collaboration: empower more people to access, trust, and use data
Data is a means to an end—and that end is always driven by the business. And so, more stakeholders need the freedom and ability to participate in data development workflows. Whether that means building and testing data models from their preferred development environment, querying consistent metrics from their favorite visualization tool, or building a collaborative data mesh architecture—you need a variety of governed inroads to that data to encourage more people to actually use it.
Low-code development environment
We also unveiled a brand-new development experience we’ve been building in dbt Cloud: a low-code visual editor!
The precision and flexibility afforded by code-based development is inarguable. And, we also understand that many potential contributors to dbt development are precluded from doing so if they don’t have SQL expertise.
But not anymore. Now, less SQL-savvy analysts will be able to create or edit dbt models through a visual, drag-and-drop experience inside of dbt Cloud. These models compile directly to SQL and are indistinguishable from other dbt models in your projects: they are version controlled, can be accessed across projects in a dbt Mesh, and integrate with dbt Explorer and the Cloud IDE. As part of this visual development experience, users can also take advantage of built-in AI for custom code generation where the need arises.
This allows organizations to enjoy the many benefits of code-driven development (such as increased precision, ease of debugging, and ease of validation) while retaining the flexibility to have different contributors develop wherever they are most comfortable: via the dbt Cloud CLI, the dbt Cloud IDE, or now, a low-code visual editor.
Register your interest in joining the private beta here.
dbt Explorer
Initially introduced at Coalesce 2023, the foundational dbt Explorer experience, including column-level lineage, is now generally available! Over the past few months, we’ve launched new dbt Explorer functionality (including improved search and lineage, model performance analysis, and project recommendations) designed to help data teams understand, troubleshoot, and improve their data pipelines while also making it easier for stakeholders to discover and analyze trusted data. Today, over 1,400 organizations rely on dbt Explorer as a critical tool for their data workflows.
Looking forward, our vision is for dbt Explorer to help both data producers and consumers improve the ROI of their data and of their valuable time. That is, for dbt Explorer to be an indispensable command center for driving high quality decisions while keeping costs in check. To that end, we pre-announced a few new features coming to dbt Explorer soon:
- Visualize automatic exposures: Easily (and automatically) enrich your lineage with context into the dashboards, teams, and use cases your models power. We are launching auto-exposures with Tableau, with PowerBI to follow.
- Model query history: Identify the relative popularity of your models based on usage queries so you can prioritize development and improve data trust.
- Tile embedding: Surface data health signals, like freshness and quality, wherever stakeholders consume data to build trust and accelerate quality decision making.
Also, stay tuned for a revamped dbt Explorer landing page that makes it even easier to find trusted, relevant models or identify pipeline issues.
“dbt Explorer has not only helped us pinpoint areas for code enhancement but also significantly improved our documentation practices. We have effectively mitigated data errors in the bronze/silver layer and can ensure a higher standard of data quality for our end consumers. dbt Explorer is an indispensable ally for any data-driven organization aiming for excellence in their analytics workflows.”
– Shravan Banda, Solutions Architect, World Bank
dbt Semantic Layer enterprise readiness features
In planning our roadmap for the dbt Semantic Layer, a top priority has been delivering what we call “enterprise ready” features: the types of things you’ve come to expect from your SaaS providers that give you the confidence to adopt and embrace a service at scale. We announced a number of enterprise features for the dbt Semantic Layer, including:
- Access controls: Get more granularity, control, and precision to semantic layer permissions with group-level and user-level controls. Soon, dbt Cloud admins will be able to create multiple data platform credentials and map them to service tokens for authentication, meaning that various departments (marketing, data, sales, etc.) have curated access to relevant and governed data. With user-level controls, you can reuse existing developer credentials for more fine-tuned permissions. Both are expected to go into preview by July.
- Caching: We shipped result caching a few months back, which is a useful way to improve load times and reduce compute costs for frequently-queried metrics. With the GA of declarative caching, you have even more power and control to configure the cache that is most relevant to your use case. Now, you’ll be able to “pre-warm” the cache using saved queries and significantly improve the performance of key dashboards or common ad-hoc query requests. Any query requests with the same inputs as the saved query will hit the cache and return much faster.
- SSO & PrivateLink: We’re further unifying the semantic layer experience with the underlying dbt Cloud experience: support for SSO and PrivateLink is now generally available. You can now develop against and test your dbt Semantic Layer in the Cloud CLI if your developer credential uses SSO. Additionally, dbt Cloud users who deploy with PrivateLink can now use the dbt Semantic Layer.
- Tableau and Google Sheets integrations GA: With the GA of these popular integrations, you now have the ability to ask better questions, faster. The self-serve GUIs are now on par with the semantic layer API capabilities and you can also save queries to enable collaborative workflows and improve time to insight. The Tableau interface also now supports important constructs like relative dates and parameters.
- MetricFlow improvements: dbt Semantic Layer is powered by MetricFlow—a flexible, SQL query generation tool—and we continue to make improvements to MetricFlow to make it even more powerful in helping teams collaborate around metrics. We announced new enhancements to MetricFlow—including metrics as dimensions, sub-day granularity, timezone support, and complex date joins—designed to give teams more flexibility and power as they build and consume metrics with increased velocity and accuracy.
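To make the declarative caching item in the list above more concrete, here is a hedged sketch of a saved query with caching turned on. The saved query, metric, and dimension names are hypothetical, and the cache block under config follows the shape described in the dbt Semantic Layer docs as best understood here, so verify it against the current documentation.

```yaml
# models/semantic/saved_queries.yml (hypothetical path)
saved_queries:
  - name: weekly_revenue_by_region
    description: Pre-warmed query behind the revenue dashboard.
    config:
      cache:
        enabled: true   # ask the Semantic Layer to cache this query's results
    query_params:
      metrics:
        - revenue                                  # hypothetical metric
      group_by:
        - TimeDimension('metric_time', 'week')
        - Dimension('customer__region')            # hypothetical dimension
```

Requests that use the same metrics and groupings as the saved query are the ones that would be served from the cache.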
“The dbt Semantic Layer gives our data teams a scalable way to provide accurate, governed data that can be accessed in a variety of ways—an API call, a low-code query builder in a spreadsheet, or automatically embedded in a personalized in-app experience. Centralizing our metrics in dbt gives our data teams a ton of control and flexibility to define and disseminate data, and our business users and customers are happy to have the data they need, when and where they need it.”
— Hans Nelsen, Chief Data Officer, Brightside Health
dbt Mesh
Support for dbt Mesh, a pattern for collaboration at scale in dbt Cloud, is now generally available.
dbt Mesh enables data teams to make use of multiple, inter-connected dbt projects, each aligned to a domain team. Central data teams are able to maintain a view of global lineage and implement governance policies. This pattern improves speed, reliability, and governance relative to a single monolithic project, and it's been adopted to connect thousands of real-world dbt projects, including at some of the largest enterprises in the world.
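As a minimal sketch of how two projects might be connected, the fragments below show an upstream project exposing a model and a downstream project declaring a dependency on it. The project names (finance, marketing), the model name, and the file paths are hypothetical, and the two fragments live in separate projects.

```yaml
# Upstream project (hypothetical name: finance), models/marts/_models.yml
# Marking a model public allows other projects to reference it.
models:
  - name: fct_revenue
    access: public
---
# Downstream project (hypothetical name: marketing), dependencies.yml
# Declares the upstream project as a dependency.
projects:
  - name: finance
# A downstream model would then select from the upstream one with a
# two-argument ref: {{ ref('finance', 'fct_revenue') }}
```

Keeping most models non-public and exposing only a small set is one way a domain team can control what other teams build on.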
“dbt Mesh enables us to make data mesh a reality by offering a simple, cohesive way to integrate and manage data pipelines & products across the enterprise using a single platform.”
—Marc Johnson, Data Strategy & Architecture, Fifth Third Bank
We also announced some new capabilities to streamline dbt Mesh adoption, including support for jobs triggering on completion across projects, as well as for staging environments as a canonical environment type for improved data isolation.
Soon, we’ll be rolling out environment-level permissions and warehouse connections, along with support for sharing re-usable macros across projects.
⚙️ Platform
On top of the above three innovation themes, we announced a number of improvements to the underlying dbt Cloud platform to make it more reliable, performant, and flexible for the thousands of customers that depend on it every day.
“Keep on latest version” in dbt Cloud
By providing a fully managed SaaS solution, dbt Cloud allows your team to focus their efforts on building data products instead of maintaining the infrastructure required to run dbt.
We recently introduced the ability to “Keep on latest version” in dbt Cloud, allowing you to receive fully vetted new dbt features and fixes in dbt Cloud continuously—without needing to manually upgrade dbt versions—saving your team valuable time and energy. We’re pleased to announce this is now generally available.
And the best news: we’ve also made some optimizations under the hood to significantly improve parse performance in dbt Cloud, cementing it as the most performant way to run dbt. These are available today to everyone running on “Keep on latest version.”
To get started, simply select “Keep on latest version” for all your environments and jobs in dbt Cloud, and you’ll be off the version treadmill for good: you’ll never have to upgrade dbt versions again.
The engineering team at dbt Labs recently wrote a blog post that delves into the rigorous processes we’ve put in place to ensure a stable, reliable experience for all of our customers that depend on this feature.
Cell-based architecture
Our new cell-based architecture will be the foundation of dbt Cloud going forward. This new architecture offers improved scalability (making dbt Cloud maximally performant regardless of the complexity of a customer’s deployment) and improved reliability (ensuring we can continuously deliver the same great experiences to all dbt Cloud customers across all regions and deployments without risk to product stability).
Our new cell-based architecture will be gradually rolled out to customers over the course of this year.
Microsoft Azure support
We’re striving to bring the same great dbt Cloud experience to all our customers, regardless of which cloud they choose to deploy it on. We announced that dbt Cloud will soon natively support deployment on Microsoft Azure, in addition to the currently available option of deploying on AWS.
Beta is coming soon.
Thank you
We’re really excited about this momentum, and as always, look forward to hearing your feedback! If you want to catch the replay of our launch event (you might even enjoy a Willy Wonka reference or two), you can find it here.
Last modified on: Jul 29, 2024