
How AI will disrupt BI as we know it

This post first appeared in The Analytics Engineering Roundup.

Business intelligence is on a collision course with AI.

The collision itself hasn't happened yet, but it's clearly coming. Its inevitability has been clear roughly since the launch of ChatGPT, but no one has known exactly what shape it would take.

Today I want to propose how that collision will play out and what will happen in its aftermath.

I think it will be a very good thing for data practitioners of all stripes—those who officially have the word ‘data’ in their title but also everyone else who simply uses data in the service of their larger job. So: I’m all for it.

Before getting into the AI part of the story, I need to introduce two specific mental models.

Let’s go.

BI is a portfolio of stuff

We all use the term "BI" but have become inured to what an Orwellian term it is. "Business intelligence" isn't descriptive; it's industry-speak for a bunch of stuff glued together in order to achieve a desired user outcome: knowing facts about a business using tabular data.

For a long time, BI included a bunch of stuff that it no longer does. Like: data processing. Pre-cloud, BI tools processed data locally and often had proprietary processing engines. They competed on being fast.

With the cloud, that evaporated. Local data processing was anathema. BI tools got easier to build but gave up a part of their value proposition.

In today’s post-cloud world, I would suggest that BI tools have three jobs:

  1. Modeling: Define the semantic concepts behind your structured data: metrics, dimensions, joins, etc. Think: LookML. (A minimal sketch of what such a model captures follows this list.)
  2. Exploratory data analysis (EDA): The process of exploring data in search of useful insights: highly iterative, flow-state, and unpredictable. Think: Looker explore window.
  3. Presentation: The aggregation of multiple data artifacts together to present a single cohesive narrative that can be shared out to potentially many others within an organization, all governed by a permission model. Think: Looker dashboard.
The 3 jobs of a BI tool in 2025
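To make the modeling job concrete, here is a minimal sketch, in plain Python rather than any particular tool's syntax, of the kind of semantic definitions a modeling layer captures. Every name below is an illustrative assumption.

```python
# Hypothetical sketch of what a BI "modeling" layer declares, independent of any
# specific tool's syntax (LookML, MetricFlow, etc.). All names are illustrative.
semantic_model = {
    "model": "orders",
    "dimensions": [
        {"name": "order_date", "type": "time", "granularity": "day"},
        {"name": "customer_region", "type": "categorical"},
    ],
    "metrics": [
        {"name": "revenue", "agg": "sum", "expr": "order_total"},
        {"name": "order_count", "agg": "count", "expr": "order_id"},
    ],
    "joins": [
        {"to": "customers", "on": "orders.customer_id = customers.id"},
    ],
}
```

Tools like LookML or a semantic layer express the same ideas in their own DSLs; the point is that metrics, dimensions, and joins are declared once and then reused by everything downstream.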

Some tools skip modeling and just let users do EDA without a model. EDA and presentation are the two most essential jobs of any BI tool, and every BI tool I'm familiar with does both. And it is the fact that BI tools facilitate the EDA process that enables them to govern and share the presentation of that analysis.

Scaling the criticality of an analysis

All credit to my collaborator Dave Connors for this mental model 🦞

Generally speaking, artifacts pass through a few lifecycle stages as they mature into data products supporting production use cases. Think about these stages as the ‘production line’ of BI.

Phase 0: Exploratory analysis

The first thing data practitioners do when faced with a business question is to start developing low-fidelity sketches to try to answer it.

The vast majority of the work generated here will be thrown away, so there are low expectations for code quality and governance. The primary goals of the best EDA experiences are iteration speed, flow state, and flexibility.

Phase 1: Personal reporting

At a certain point, some exploratory analysis will cross over into a true insight; your question is answered, your curiosity sated. The question is important enough that you want to make sure you can return to it later. But it is not yet “ready for prime time”—you’re not ready to share it with others and have it be a part of someone’s operating cadence.

Some BI tools have a separate section for your “personal space”—think about your personal folder in Looker.

Phase 2: Shared reporting

The moment that a report gets shared with another person, the required governance characteristics of a data artifact increase significantly. When you create a report you understand its context; when someone else starts using it they just expect it to be correct.

In phases 0 and 1, there may not be any governance applied—all governance may be applied at the compute layer with grants. But once you share an artifact, it is the governance at the BI layer that determines who gets to see what. This is simply because most data consumers don’t have accounts within the data platform and so the BI tool takes over as the arbiter.

In phases 0 and 1, there is also no auditability requirement. Auditability, change tracking, and general data ops best practices are introduced when artifacts are shared with others in Phase 2.

Phase 3: Production artifact

When shared reporting reaches a very high level of criticality (frequent access by a large number of end users, agreed upon SLAs, supports a critical business process, dynamic features), it’s officially “in production” and needs to be owned and operated like any other production data asset.

===

If you think about these stages as the ‘production line’ of BI, the most important job of a BI tool is to be the conveyor belt through all of these stages. Start with raw materials, end with a production data product. At each phase of maturity, it’s easy to extend the product to support the next set of capabilities: governance, dynamic filters, SSO, etc. You never think about those things during Phase 0, but as your work progresses, the BI tool makes it straightforward to progressively add those capabilities.

But for this all to work, you gotta start the process inside the BI tool all the way back from Phase 0. You can’t do your EDA in Jupyter & Pandas and expect to ship it to users in Tableau…that’s not how that works.

So: you gotta do your EDA in a BI tool to take advantage of the “production line”. But…are BI tools typically the best way to do EDA? We’ll return to that later.

MCP and AI-as-aggregator

The final thing we need to understand is the impact of a context protocol. I wrote about this a few weeks ago:

The easiest thing to do for any technology vendor at the very onset of the AI era was to take all of the domain-specific context that you had and surface it to users in a chat interface. And we did the same thing. It was (and is) quite good—it does a great job of allowing users to ask business questions and answering them with semantic-layer-governed responses.

The problem with this approach is that users don’t actually want to interact with dozens of chat interfaces. They don’t want to remember to go to a given tool to get one type of answer and another tool for another type of answer. There will not be 30 chat experiences all with different context. There will be one…or maybe just a few. But likely a single dominant one.

This is how aggregators work. You likely don’t use a bunch of different search engines—you probably just use one, and it is probably Google. This is how chat will go as well.

The problem is, Google could scrape the web and respond to all queries based on that knowledge. But ChatGPT cannot know all of the information you want to ask it questions about (at least, yet). That lack of business context is the problem.

That's where a context protocol comes in. A context protocol—a somewhat new topic in the public AI conversation—is a standardized way for services to provide additional context to models via an open protocol. The most promising one today is called MCP, but whether or not MCP wins, the awareness/excitement/support for this idea has developed a ton of momentum and I am fairly convinced that something like this will become real and widely supported.

There will be a large number of context providers (every source of valuable enterprise context) and a large number of context consumers (different products with AI capabilities). There is no way to create point-to-point integrations to facilitate this. A protocol will be needed if we are going to see the right type of advancements, and I think it will happen.

Imagine that your license to ChatGPT enterprise or Claude Desktop or whatever already came with a connection to all of the metadata about every piece of structured data you had access to. What was there, how trustworthy it was, how suitable it was for the analysis you were describing, etc.

Well, in the intervening weeks since I wrote this, a couple of things have happened. First, this:

Sam Altman tweet

…then this:

Sundar Pichai tweet

Clearly this thing is going somewhere.

Just as momentously, I have gotten access to an internal-only dbt/MCP-powered experience in Claude Desktop. In it, I can ask every type of metadata question I might want (powered by our Metadata API) and I can also ask questions about all of our business metrics (powered by our Semantic Layer API).

It is incredible. I don’t want to share too much right now, but … having your data and metadata available in the context of a modern reasoning model is incredible.
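For a sense of what this kind of integration looks like mechanically, here is a minimal, hypothetical sketch of an MCP server exposing two warehouse-context tools to a chat client, written with the official MCP Python SDK. The tool names and the stubbed bodies are illustrative assumptions, not dbt's actual implementation.

```python
# A minimal, hypothetical MCP server exposing warehouse context to a chat client,
# using the official `mcp` Python SDK (FastMCP). Tool names and stubbed helpers
# are illustrative assumptions, not dbt's internal implementation.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("warehouse-context")


@mcp.tool()
def list_models(search: str = "") -> str:
    """Describe available models and their documented columns."""
    # Assumption: a real server would call a metadata/catalog API here.
    return f"(stub) models matching {search!r}: orders, customers, payments"


@mcp.tool()
def query_metric(metric: str, group_by: str, start_date: str) -> str:
    """Return governed metric values from a semantic layer."""
    # Assumption: a real server would issue a semantic layer query here.
    return f"(stub) {metric} by {group_by} since {start_date}"


if __name__ == "__main__":
    mcp.run()  # stdio transport by default; a client like Claude Desktop connects here
```

Once a client that supports MCP (Claude Desktop, for example) is pointed at a server like this, the model can call these tools mid-conversation to ground its answers in your metadata and metrics.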

BI in an AI-first world

Ok, we now understand the jobs of a BI tool, the BI conveyor belt, and how to get structured data context into your AI of choice. We're finally in position to tackle the coming collision.

Here it is; plain and simple:

  1. AI is going to be meaningfully better at exploratory data analysis than any BI tool.
  2. If you take away EDA from BI, the ‘conveyor belt’ model breaks down. And the conveyor belt model is the primary reason you use your current BI tool.
  3. It is not yet clear how the BI ecosystem will adapt to this new reality.

That’s it. That’s my entire argument. Let’s see if it holds up.

Artificial intelligence will far outstrip business intelligence for exploratory data analysis

There are a lot of data tasks that AI is good at. I've talked about a lot of these in the context of data engineering here. But the area of data analysis that will benefit most from AI is EDA.

I am confident about that for two reasons. First, I have empirically validated this first-hand. The dbt + MCP + Claude 3.7 combo that I outlined earlier is just dramatically better at EDA than anything I’ve experienced in my life, and it’s getting better fast. But I am not ready to show you that (it’s single-digit weeks away from a public demo!), so you may not believe me. Fair.

The second reason I'm confident about this is that most of the time spent in EDA is spent writing code (whether by hand or via a GUI). And we now know how good leading-edge models are at writing code when supplied with the right context. Whether you want to reference individual developer testimonials or the head of YC or Andrej Karpathy or Google, it all lines up. And it just so happens that the two software engineers whose opinions I trust most in the world—my cofounders Drew and Connor—have gone all in on Cursor over the last 3 months and are not-quite-but-almost religious about the experience.

If you find yourself skeptical of this, here are a few things to keep in mind.

  1. You don't need the LLM to answer ‘why’ questions, or to generate hypotheses, for it to be far superior to your current workflow. Rather, it just makes you a lot faster, because it can write EDA code a whole lot faster than you can (whether you're writing Excel formulas or dataframe operations).
  2. Accuracy is a non-issue as long as you ask a question that can be governed by a semantic layer. The code written tends to be: get data from the SL, manipulate it in Python, generate a chart using some dynamic JavaScript library (a minimal sketch of this pattern follows this list). If you can't get a dataset governed by an SL query, text-to-SQL does continue to improve with sufficient context.
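To make that pattern concrete, here is a minimal sketch of the kind of code an AI assistant might generate during EDA: pull a governed dataset, reshape it with pandas, and render an interactive chart with Plotly (which draws charts via a JavaScript frontend). The get_semantic_layer_data helper is a hypothetical stand-in for whatever governed query interface is available.

```python
# A minimal sketch of the EDA pattern described above: pull a governed dataset,
# reshape it with pandas, and render an interactive chart with Plotly.
# `get_semantic_layer_data` is a hypothetical stand-in for a semantic layer query.
import pandas as pd
import plotly.express as px


def get_semantic_layer_data() -> pd.DataFrame:
    # Assumption: in practice this would be a semantic layer / warehouse query.
    return pd.DataFrame(
        {
            "order_date": pd.date_range("2025-01-01", periods=90, freq="D"),
            "region": ["NA", "EMEA", "APAC"] * 30,
            "revenue": range(90),
        }
    )


df = get_semantic_layer_data()

# Manipulate: weekly revenue by region.
weekly = (
    df.set_index("order_date")
    .groupby("region")
    .resample("W")["revenue"]
    .sum()
    .reset_index()
)

# Present: an interactive chart rendered via Plotly's JavaScript frontend.
fig = px.line(weekly, x="order_date", y="revenue", color="region",
              title="Weekly revenue by region")
fig.show()
```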

Just imagine: an interface that allows you to have your questions answered far faster. You remain the objective function and the creative drive behind the process; AI is simply better and faster than you are at writing analytical code.

IMO that shouldn't feel threatening; it should feel empowering. I seriously lost it the first time I interacted with our internal data in this type of experience. The primary value prop of a data analyst shouldn't be writing code; it should be analytical problem solving and generating action.

The conveyor belt model breaks down

You don't typically use your BI tool because it is the fastest or most delightful EDA experience. You use it because, when you have something to publish to your coworkers, you know exactly how to do that.

But what if another tool were so much better at EDA that you would be handicapping yourself if you didn’t use it? What would you do?

There are likely three answers.

First, you could go back to publishing one-off assets. Ask just about any AI experience to "give me that in an Excel file" and it will have no problem doing so. So maybe you just go back to shipping attachments. But that doesn't feel like progress.

Second, having iterated and found the insight you were looking for, you now have to reconstitute that analysis inside of your BI tool of choice. In practice this will likely only happen rarely; it is not a stable equilibrium because every human hates double work.

Third, and hopefully preferable, is that we find some way to pull the results of an exploration back into the governed framework of the BI tool. Imagine asking "make a Power BI worksheet out of this analysis." We will need to get deeper into the MCP era to see exactly how this will play out, but I'm optimistic that it will be possible.

The third option still sees the BI tool as an important governance and presentation layer but pulls out the most strategic responsibility (EDA) from its portfolio.

A very different BI tool

BI tools used to ship with compute engines. Today they do not.

What if BI tools were no longer the primary way EDA was done?

What if their primary job were to render data artifacts in a governed, interactive environment?

That is still an incredibly valuable thing, and needed as long as humans are going to continue to interact with structured data (IMO: a long time). But it’s not what BI tools look like today.

Most BI tool vendors want to pull this new EDA experience inside their own chrome—exposing AI-powered interfaces inside their products. I don't believe this will be how most users do EDA, for three reasons:

  1. User behavior: Aggregation theory will dominate. Every knowledge worker inside a company needs access to this functionality, and they're not all going to think to go to a specific tool first; they're going to prefer to simply ask data questions in the same place they ask all of their other questions: Claude, ChatGPT Enterprise, whatever.
  2. Tool combinations: MCP is powerful not only because it lets you use a single tool, but because it is a pluggable framework to pull in all kinds of tools for the model to use. You'll be able to ask a BI question ("Show me our most important renewals for the coming quarter") and then immediately act on it in another tool ("Email the main point of contact on the account to set up a check-in meeting"), as sketched after this list. Having all of these tools interact together inside of a single interface is combinatorially powerful. There is already a large ecosystem of tooling available, and community-driven innovation is happening fast.
  3. Tech: Except for MSFT, current BI vendors are not AI research labs. They are just not going to create better models or be the primary destination for all AI interactions within a company.
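Here is the tool-combination idea from item 2 as a hedged sketch: a host application chaining tools from two different MCP servers in a single conversation turn. The server and tool names are hypothetical, and in practice the model, not hand-written code, decides which tools to call; this just shows the shape of the interaction.

```python
# A hypothetical sketch of chaining tools from two different MCP servers in one
# assistant turn: a governed BI/semantic-layer tool, then an email tool.
# Server and tool names are illustrative assumptions, not real products.
from dataclasses import dataclass


@dataclass
class ToolCall:
    server: str
    tool: str
    args: dict


def dispatch(call: ToolCall) -> dict:
    # Stand-in for the MCP host routing a tool call to the right server.
    print(f"[{call.server}] {call.tool}({call.args})")
    return {"rows": [{"account": "Acme Co", "contact": "pat@acme.test"}]}


# Step 1: the BI question, answered by a governed data tool.
renewals = dispatch(ToolCall("warehouse-context", "query_metric",
                             {"metric": "renewal_value", "group_by": "account",
                              "start_date": "2025-07-01"}))

# Step 2: act on the answer with an entirely different tool.
for row in renewals["rows"]:
    dispatch(ToolCall("email", "send_message",
                      {"to": row["contact"],
                       "subject": "Renewal check-in",
                       "body": f"Scheduling time to discuss {row['account']}'s renewal."}))
```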

My predictions

I think that the BI workflow that has dominated for the past ~15 years is going to change significantly over the next 2. EDA will significantly migrate over to AI interfaces, enabled by MCP.

I think this will be incredibly positive for all knowledge workers throughout a company. It will enable more users to create sophisticated analytics and will enable existing data practitioners to move significantly faster.

I think this will be a headwind to many current BI vendors. BI is extremely sticky and this change isn’t going to happen overnight, but it will be a headwind.

I think there is likely space for new players to innovate: to be the best place to aggregate and govern all of the artifacts built in this new workflow.

I’ll return to this post after six months and see how my predictions are faring!

