Unlocking new possibilities with dbt Cloud on Azure Databricks
The rapid pace of AI adoption requires a strong foundation of data accuracy and governance. At a recent summit, Gartner predicted that at least 30% of genAI projects will be abandoned by the end of 2025 as organizations fail to realize value due to poor data quality, inadequate risk controls, and rising costs. The output of an AI application can only be as good as the underlying data. That’s where dbt and Azure Databricks have proven to be a powerful combination for scaling the use of data and building new data products, including genAI applications.
With dbt Cloud on Azure, this combination now extends to Azure data stores and services. dbt Cloud on Azure streamlines data transformation workflows in the Azure ecosystem. Along with Azure Databricks, customers now have access to a simple, open platform to meet their analytics development needs using services native to the Azure cloud. Providing access to data and transformations in the cloud of the customer’s choosing continues to deliver on the promise of One dbt, our strategy of unifying our experiences, audiences, and communities.
Customers especially benefit from:
Manageability
With dbt Cloud on Azure Databricks, data teams can seamlessly manage their data pipelines in a few clicks, all while leveraging Azure’s integrated environment for managed access control and authentication (Microsoft Entra ID) and services (Power BI, Azure Data Factory and Azure OpenAI). This out-of-the-box, first-party integration between Microsoft and Databricks provides a cohesive experience, reducing the complexity of managing separate tools and platforms.
Data access
While building their data pipelines on Azure Databricks, dbt users can take advantage of optimized reads and writes directly from Azure Data Lake Storage (ADLS Gen2) and other Azure storage systems (e.g. Blob storage) for a streamlined data workload.
Governance and security
Azure Databricks offers unified governance and lineage via Unity Catalog, so analysts, data engineers, and data scientists can securely discover, access, and collaborate on trusted data and AI with end-to-end lifecycle visibility. Data teams using dbt on Databricks get the added protection and monitoring capabilities of the Azure Databricks environment with Microsoft Defender for Cloud (formerly Azure Security Center).
Lower cost of compute
Run efficient dbt code cost-effectively by taking advantage of Azure Databricks’ flexible pricing, which charges only for the compute you use. This flexibility helps data teams of all sizes manage budgets and grow efficiently without the risk of unexpected cost spikes.
Performance, power, and simplicity for Analytics Engineers
Databricks SQL warehouse is an optimal engine for building and running dbt projects. Built with DatabricksIQ, the Data Intelligence Engine that understands the uniqueness of your data, Databricks SQL is an intelligent, auto-optimizing platform that helps dbt users work more efficiently and build faster pipelines, while retaining the simplicity, unified governance, and openness of the lakehouse architecture. In 2024, Databricks SQL underwent significant advancements, leveraging AI to automatically improve performance and efficiency, resulting in a 4x improvement in query performance over the past two years. Key enhancements include:
- Intelligent Workload Management: Optimizes resources for high-concurrency BI workloads
- Liquid Clustering: Automatically manages data layout without manual fine-tuning
- Predictive I/O: Delivers index-like performance without the need for index creation or maintenance. This means faster query execution for dbt models, especially large-scale dbt transformations.
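As an illustration of how a dbt model might opt in to Liquid Clustering, here is a minimal sketch. It assumes the dbt-databricks adapter and its `liquid_clustered_by` model config; the model name, the `stg_orders` upstream model, and the column names are hypothetical and not taken from this post.

```sql
-- models/marts/fct_orders.sql (hypothetical model name)
-- Assumption: the project uses the dbt-databricks adapter, which exposes
-- the `liquid_clustered_by` config for Delta tables.
{{
  config(
    materialized='table',
    liquid_clustered_by='order_date'
  )
}}

select
    order_id,
    customer_id,
    order_date,
    amount
from {{ ref('stg_orders') }}  -- hypothetical upstream staging model
```

With a config like this, the clustering key is declared once in the model and Databricks handles the data layout, rather than the team maintaining partitioning or Z-order settings by hand.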
Retool, a joint customer of Databricks and dbt Labs, uses the combination to power most of its analytics workloads. “With dbt Cloud on Databricks SQL, we decreased the actual spend on our daily dbt production jobs by 50% and decreased the runtime by 25% at the same time,” said Samuel Garfield, Analytics Engineer at Retool. “This is truly a no-compromise situation.”
Simplified user experience
By combining Databricks SQL with dbt Cloud’s intuitive modeling layer, business analysts can work more effectively. Expanding who can be successful with dbt and Databricks allows more people to participate in the Analytics Development Lifecycle (ADLC). Databricks SQL offers AI-assisted tools to simplify data analysis for the broader organization. Databricks Assistant is a context-aware tool that helps dbt users create, edit, and debug SQL queries. To prevent ADLC implementations from getting siloed within centralized teams, cross-functional teams can improve collaboration with Databricks AI/BI, a new business intelligence product that allows quick visualizations based on business context, and Genie, a conversational tool that answers business questions using the context of your own data.

Databricks Assistant to create, debug and explain code in SQL Editor

Analysts can generate visualizations using natural language
Core SQL warehouse capabilities
Databricks SQL has several core features including:
- dbt + Materialized Views (MV): Building efficient pipelines becomes easier with dbt, leveraging Databricks' powerful incremental refresh capabilities. Users can use dbt to build and run pipelines backed by MVs, reducing infrastructure costs with efficient, incremental computation.
- dbt + Streaming Tables: Streaming ingestion from any source is now built into dbt projects. Using SQL, analytics engineers can define and ingest cloud/streaming data directly within their dbt pipelines.
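The two capabilities above might look roughly like this in a dbt project. This is a sketch under stated assumptions: it assumes the dbt-databricks adapter, whose `materialized_view` and `streaming_table` materializations target a Databricks SQL warehouse that supports those object types. All model names, column names, and the storage path are hypothetical placeholders.

```sql
-- models/ingest/raw_orders.sql (hypothetical model name)
-- Streaming table: incrementally ingests new files as they land.
-- The ADLS path below is a placeholder, not a real location.
{{ config(materialized='streaming_table') }}

select *
from stream read_files(
    'abfss://<container>@<storage-account>.dfs.core.windows.net/raw/orders/',
    format => 'json'
)
```

```sql
-- models/marts/daily_order_totals.sql (hypothetical model name)
-- Materialized view: Databricks decides how to refresh the result,
-- incrementally where it can.
{{ config(materialized='materialized_view') }}

select
    order_date,
    count(*)    as order_count,
    sum(amount) as total_amount
from {{ ref('raw_orders') }}
group by order_date
```

In this layout, ingestion and aggregation live in the same dbt DAG, so `dbt run` refreshes both objects in dependency order without a separate ingestion job.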
Looking ahead
As we continue to evolve together, expect deeper integration between dbt and Unity Catalog, enhanced support for real-time data transformations using dbt Cloud and Databricks' streaming capabilities, and more seamless workflows between data engineering, data science, and machine learning teams.
By leveraging dbt Cloud on Azure Databricks, organizations can build more robust, scalable, and maintainable data pipelines while empowering a wider range of users to work effectively with data. This powerful combination is set to drive the next wave of innovation in the world of data analytics and engineering. To learn more, check out this guide on setting up your dbt project on Databricks or take it for a spin on the Databricks Platform.
Last modified on: Oct 08, 2024