
M1 Finance powers AI self-service with Claude and dbt structured data

You can imagine that, as a fintech company, M1 deals with a lot of complex, highly regulated data. For its analysts and business users, fast access to accurate data is critical, both for strategic decision-making and for strict regulatory compliance. Relying on the wrong figures could directly affect customer portfolios and misinform executive planning. To meet this need, the data team rolled out a self-service AI layer powered by Anthropic's Claude LLM, aiming to let anyone pose questions in natural language.

But going from a simple business question to reliable insight wasn’t so simple.

Bottlenecks to accessing data

To provide self-service analytics for its business users, M1 uses Superset. It's a powerful tool for querying the data warehouse directly, but it requires a level of SQL proficiency that most business stakeholders aren't comfortable with.

As a result, users relied heavily on the data team to help them write SQL queries and find answers.

“The data team was becoming a bottleneck to the business,” says Brady Dauzat, Machine Learning Engineer at M1. “Instead of enabling people to make data-driven decisions, we were actually slowing them down because they needed to wait for us to do their jobs.”

To truly unlock self-service for business users, the data team decided to build an LLM-powered conversational interface (i.e., a chatbot) to enable business users to directly access and analyze data. Let's walk through how they built a hallucination-resistant SQL AI using dbt Semantic Layer.

The problem: frequent LLM hallucinations

For their first iteration, the data team built an LLM chatbot using a “direct catalog” approach.

When a user asked a question in natural language, the LLM (Anthropic's Claude) generated a SQL query for them to run. To write that query, the LLM was given access to the entire M1 data catalog: schemas, column names, and descriptions.
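As a rough sketch of this "direct catalog" pattern (the model name, prompt wording, and function names below are illustrative assumptions, not M1's actual implementation), the whole catalog is pasted into the prompt and Claude is asked to write raw SQL:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def sql_from_question(question: str, catalog: str) -> str:
    """Ask Claude to write raw SQL, given the entire warehouse catalog as context."""
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model name
        max_tokens=1024,
        system=(
            "You are a SQL assistant. Using only the catalog below, write one "
            "SQL query that answers the user's question.\n\n" + catalog
        ),
        messages=[{"role": "user", "content": question}],
    )
    # With thousands of tables and columns in `catalog`, nothing prevents the
    # model from inventing a plausible-sounding column that doesn't exist.
    return response.content[0].text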

Unfortunately, the LLM frequently responded with hallucinations. While the SQL queries could run in the data warehouse and return data, they often retrieved the wrong data. This is a well-known challenge with LLMs when given overly broad contexts.

For example, the LLM might invent a column name that sounded correct but in fact didn’t exist. Unless the user deeply understood SQL and could identify errors in the query, they would never know there was an issue in the first place.

“To understand why this was challenging for the LLM, think of it this way. Say you provided your full data catalog to a smart person who had never seen it before,” explains Dauzat. “If you asked them to produce a valid query for key business metrics, would you trust them to do it?

“We were essentially asking the LLM to sort through all kinds of complex contexts around our entire data warehouse just to answer a question,” he continues. “That's very difficult to do and still produce a correct answer.”

Essentially, providing the LLM with the entire data catalog without constraints meant it had to "understand" a vast amount of context. This method was prone to errors, leading to incorrect queries or invented column names.

The solution: a “Mad Libs” approach to ensure high-quality AI outputs using dbt Semantic Layer

Clearly, this was a big problem. To overcome hallucinations, the data team tried a different approach—making the LLM serve as an interface between the user and dbt Semantic Layer.

“Going back to our earlier example, let’s say you gave a smart person a list of key business metrics in a fill-in-the-blank template for how to answer your question,” says Kelly Wolinetz, Senior Data Engineer at M1. “My trust in them generating the correct answer goes way up.”

In other words, the LLM no longer has to understand the full data warehouse. It just needs to understand how to fill out a form.

With this approach, the LLM no longer generates the SQL statements on its own. Instead, the data team asked the LLM to write a dbt Semantic Layer command, where key business metrics are predefined. To do that, the data team provided it with a system prompt and the user's question: basically a "Mad Libs" approach to finding and providing the answers.
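Here is a minimal sketch of that "form", assuming a simple JSON slot-filling contract (the metric names, dimension names, and prompt wording are illustrative assumptions, not M1's actual definitions):

import json
import anthropic

client = anthropic.Anthropic()

# The "form": the only choices the model can make are metrics and dimensions
# that already exist in the dbt Semantic Layer. These names are illustrative.
AVAILABLE_METRICS = ["net_deposits", "active_accounts", "trade_volume"]
AVAILABLE_DIMENSIONS = ["metric_time", "account_type", "state"]

SYSTEM_PROMPT = f"""You translate business questions into a query request.
Respond with JSON only, shaped like:
  {{"metrics": [...], "group_by": [...], "where": null or a filter string}}
You may only use these metrics: {AVAILABLE_METRICS}
and these dimensions: {AVAILABLE_DIMENSIONS}"""

def fill_in_the_form(question: str) -> dict:
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model name
        max_tokens=512,
        system=SYSTEM_PROMPT,
        messages=[{"role": "user", "content": question}],
    )
    # The model fills in the blanks; it never writes SQL or table names.
    # (Assumes the model complies with the JSON-only instruction.)
    return json.loads(response.content[0].text)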

Now when the user asks a natural-language question, the dbt Semantic Layer command runs in the background to produce the actual SQL, and the LLM returns the SQL query to the user, ensuring it conforms to vetted metrics and standard query templates.
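The filled-in form then maps directly onto a Semantic Layer request. One hedged sketch of that last step, assuming the dbt Cloud CLI's Semantic Layer commands ("dbt sl query") are available in the environment, with example values carried over from the sketch above:

import subprocess

def run_semantic_layer_query(form: dict) -> str:
    """Turn the filled-in form into a dbt Semantic Layer query.

    The Semantic Layer compiles the request into vetted SQL against
    predefined metrics, so the model never hand-writes SQL itself.
    """
    cmd = [
        "dbt", "sl", "query",
        "--metrics", ",".join(form["metrics"]),
        "--group-by", ",".join(form["group_by"]),
    ]
    if form.get("where"):
        cmd += ["--where", form["where"]]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# For example, {"metrics": ["net_deposits"], "group_by": ["metric_time"]}
# becomes: dbt sl query --metrics net_deposits --group-by metric_time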

Impact: increased user engagement and reduced hallucinations

For the data team, dbt Semantic Layer has been pivotal for ensuring data accuracy, quality, and consistency.

Hallucinations are almost completely eliminated, with SQL syntax errors and misinterpretations virtually nonexistent. The SQL output always conforms to how the data team wants the data queried, so business users only ever submit queries for predefined, vetted metrics. This provides a robust check against the LLM's tendency to invent or misinterpret data points.

As a result, the process of pulling data is dramatically simpler for business users. They’re engaging with the tool, and the data team has received positive feedback about their experience so far.

Anecdotally, ad-hoc Slack requests to the data team have decreased. Meanwhile, the team is receiving new requests for metrics—showing that users are engaging with the tool and learning which questions they need answered.

“When we’ve held information sessions for our engineering and non-engineering teams, the feedback has been really encouraging,” says Wolinetz. “We love it when people use our tools, and it’s our non-technical stakeholders who are using it the most.”

Advancing AI with dbt-based architecture

As for what’s next, the data team is exploring the following:

  • Adding more metrics to the tool.
  • Integrating the LLM directly into Superset, so that users can interact with the chatbot right where they work.
  • Experimenting with different error-handling approaches.
  • Restructuring the LLM to answer general questions about the data warehouse (e.g., “What dimensions are available for metric X?”).
  • Leveraging the dbt Model Context Protocol (MCP) server to directly integrate their dbt Semantic Layer and structured data with their LLM.

“For our business needs, our investment in AI and dbt Semantic Layer has been worth it,” concludes Wolinetz.

Watch M1 Finance's Coalesce session here:

If you’re thinking about how to add AI to your data workflows, we’d love to chat. Reach out to book a demo, or sign up for dbt Cloud to connect your data warehouse and start building.


Last modified on: Apr 17, 2025
