
Data leadership in the age of AI

Aug 26, 2024

Insights

According to a 2024 CDOIQ conference report, only 6% of planned AI applications have made it into production. It’s a sign that many companies are still struggling with fundamental issues around Gen AI.

Successful Gen AI applications require a strong foundation that enables high data quality and strong governance. Fortunately, we already have many of the tools and processes to enable this.

Recently, I participated in a panel for CDO Magazine with Andrew Foster, Chief Data Officer of M&T Bank, and Karthik Ravindran, who leads Data Governance and Enterprise Data at Microsoft. They asked us how our businesses are crafting their approaches to AI, including how we think about governance, skills and team structure, and driving business value.

The first part of our discussion focused on our respective companies’ overall approaches and strategies. Here’s what I said about how dbt Labs is taking a human-centric approach to AI adoption while keeping business value as well as compliance front-and-center. I’ve also included (my own) summaries of Andrew and Karthik’s insightful takes on how their companies are facing the same challenges.

Human-centric AI

Q: Some of the top challenges companies have cited with AI are data availability and data quality, business outcomes, culture and adoption, and change management. Where’s your organization currently with rolling out AI-powered solutions?

A: I spoke to a Chief Data Officer [CDO] recently who called himself the CSNO, or Chief Saying No Officer. When someone told him they needed Gen AI, he’d respond, “I don’t think you do, but tell me: what business problem do you want to solve with it?”

Finding the right applications for a new technology is key. At dbt Labs, we think of Large Language Models (LLMs) as great co-pilots for development. If they can put context front-and-center for developers, that’s helpful in tightening development loops.

Andrew Foster: Every vendor is embedding Gen AI into their capabilities—whether the tools need it or not. For our part, we have to make sure we’re approaching it with the right discipline, taking a “safety first” approach. So our focus is boosting internal productivity and vetting tools through internal testing, incorporating all the existing controls we have in place.

Karthik Ravindran: I think that, as technologists, we at times lose sight of the human component. With Gen AI, we need to make the story of the why really resonate.

Why is AI going to help humans be more impactful and scale better? What change management processes do organizations need in order to embrace the shift that will be foundational to AI success? We can’t ignore the human component if we want to transcend the hype and get to true value outcomes.

AI use cases, known and unknown

Q: What are some of the most compelling use cases for AI today?

A: I try to take a longer-term view, thinking about what the next five years or so will look like.

Obviously, we don’t know what’s going to happen with tomorrow’s OpenAI models, let alone five years from now. Take the original iPhone apps—they felt more like web pages. As the industry evolved and people became more comfortable with them, the apps did too.

We see a similar thing with Gen AI apps now. They’ve started as chat interfaces to LLMs—which is a natural progression. But I think, over the next few years, we’ll see different modes of interaction.

Agents will be a part of that. But I think we’ll also see the use cases we discuss today obviated by new approaches that free up humans to do more high-leverage work. I also expect we’ll see things like interrogating dashboards with natural language, or even bypassing parts of the traditional BI workflow entirely.

Andrew: When email came along, lots of people got fired. They got fired because they didn’t know what to say or not to say in corporate email. You saw something similar with social media. People had to ingrain newly learned behaviors.

I’m less concerned with how a tool works and whether you should adopt it. I’m more concerned about how we ensure people understand and use the tool in ways that are additive—to themselves and the organization—instead of detrimental.

As an example, say you’re using something like Microsoft Teams and recording a call. It’s recorded and summarized by AI. Are you comfortable with what everyone in the company might say on that call? What’s your retention period for that data?

This is less a question about picking the perfect tools and more about building an organization that’s ready for an era where these new capabilities will exist everywhere.

Karthik: I think we’re all familiar with the use cases for Machine Learning and traditional AI. Generative AI brings about a bunch of new opportunities.

At Microsoft, we think of it as a layer cake. At the bottom, you have things like Q&A bots and natural language agents that can provide basic information and assistance. The next layer would be productivity use cases like content summarization or content tuning. And from there, it goes further north: more advanced use cases around bootstrapping content, and then agents that can help take over and automate more of our manual tasks.

As you go further up these layers, however, the level of change management and risk also rises. At Microsoft, we use this framework to make conscious decisions about a team’s overall maturity. It gives us a graduated way to introduce people to these opportunities, letting humans turn the knob on how sophisticated they want to get.

Balancing AI with data accuracy

Q: How should organizations balance the rapid pace of AI adoption with the need to ensure data accuracy and integrity? And what are some of the techniques companies can use to build trust in AI-powered insights?

A: I don’t think we need to reinvent the wheel here. We know that if you put garbage into an AI or ML system, you’ll get garbage out. We also know how to govern data and manage access to different classes of data for different groups.

That’s a lot of what we build dbt for. It enables you to model your data, making sure it’s in the right format for analysis or operational use cases, like feeding it to an LLM. You can test data to ensure it conforms to your requirements, and even detect anomalies and issue alerts before they impact downstream data consumers.
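To make that concrete, here’s a minimal sketch of what one of those checks might look like as a dbt singular test: a SQL file in the project’s tests/ directory that selects the rows violating an expectation, so the test fails whenever any rows come back. The model and column names (stg_accounts, account_id, plan_tier) are hypothetical.

```sql
-- tests/assert_accounts_conform.sql
-- A singular dbt test: it passes only if this query returns zero rows.
-- stg_accounts, account_id, and plan_tier are hypothetical names used for illustration.

select
    account_id,
    plan_tier
from {{ ref('stg_accounts') }}
where account_id is null                                  -- required key is missing
   or plan_tier not in ('free', 'team', 'enterprise')     -- value outside the allowed set
```

The same expectations can also be declared as generic tests (not_null, unique, accepted_values) in a model’s YAML; either way, the quality contract lives alongside the models and runs on every build.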

That requires, of course, understanding what level of data quality is “good.” Data is never perfect. But if someone can aggregate and zoom in on the data and say, yes, this is useful, this maps to my understanding of how the business works, then that builds trust in the data.

Data quality also requires gathering feedback and staying attuned to the needs of our users. It’s sometimes hard for the data team to tell whether a given set of data looks off. But it’s easy for members of our field teams to look and say, “Hey, this user is on 10 different accounts and they’re the admin everywhere—what’s going on?” There’s a smell test where the people closest to the data can tell if it’s suspect.

All this is to say that I don’t think we need to reinvent the wheel here. We need to take these tried-and-true principles and apply them in this new domain.

Andrew: I see data quality as an opportunity to strengthen the effectiveness of my team and our federated roles within the organization. Typically with data quality, a technologist and an analyst sit in a room somewhere and hammer out some rules. It’s not particularly efficient. And it doesn’t scale.

With AI, we can use things like anomaly detection to compare data across time and detect things such as a spike or reduction in volume. I see that as a massive enabler for ongoing day-to-day improvements, particularly because it still keeps humans in the loop. It can be self-learning, taking the inputs from humans, and learning how to point the human decision-maker to anomalous activity in a more effective manner.
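As a rough, tool-agnostic illustration of the kind of check Andrew is describing, the sketch below flags days whose row volume deviates sharply from a trailing 28-day average. The table and column names (fct_orders, order_date) are hypothetical, and the three-sigma threshold is only a starting point; the AI-assisted approaches he mentions layer learning and human feedback on top of baselines like this.

```sql
-- Flag days whose volume spikes or drops relative to the trailing 28 days.
-- fct_orders and order_date are hypothetical; thresholds are illustrative only.

with daily_volume as (
    select
        order_date,
        count(*) as row_count
    from fct_orders
    group by order_date
),

with_baseline as (
    select
        order_date,
        row_count,
        avg(row_count) over (
            order by order_date
            rows between 28 preceding and 1 preceding
        ) as trailing_avg,
        stddev(row_count) over (
            order by order_date
            rows between 28 preceding and 1 preceding
        ) as trailing_stddev
    from daily_volume
)

select
    order_date,
    row_count,
    trailing_avg
from with_baseline
where trailing_stddev > 0
  and abs(row_count - trailing_avg) > 3 * trailing_stddev  -- spike or drop beyond 3 sigma
```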

Last year, there was a very technology-centric, “shiny new toy” view of Generative AI. Now, based on conversations I’m having, companies are looking more at how adoption is led through human behavior.

Karthik: We faced this challenge at Microsoft, where we had to tackle quality across our internal data estate. At first, we did what Andrew described: we set up a bunch of rules and dashboards with green/yellow/red statuses. That posed an interesting scalability challenge. But what we lacked was a way to tie those dashboard statuses to business outcomes and explain why teams should care.

We realized it’s hard to excite your stakeholders when you cast everything through a purely technical IT lens. So instead, we reframed the problem around business outcomes and aligned them to OKRs. This eventually became the mainstream way of approaching data quality—top-down versus bottom-up.

Generative AI brings a whole set of different quality dimensions—bias detection, ethical compliance, infringement protection, etc. But at the end of the day, it’s all anchored in data. And not just in data coming in, but in data going out.

You can’t just unleash a foundation model in its raw form. There are tougher processes to go through first. Are you doing continuous training? Are you domain-contextualizing and training the model with inputs that are specific to your context? Are you fine-tuning? Are you prompt engineering?

All of this eventually accrues to the quality of the user experience and the output the models can generate. There are a lot of thoughtful investments needed on both sides of the fence.

Conclusion

The use cases for AI are evolving and will continue to change. However, some basic principles will remain unchanged. Data quality, strong governance, and a focus on people and processes over tools will continue to be the firm bedrock upon which production-ready AI applications are built.

In the second part of our discussion, we touch on the more practical aspects of managing AI initiatives, such as dealing with multi-modal data and measuring success.


