How to structure your data team

Apr 02, 2024

There is no one-size-fits-all way to structure your analytics function. Given the pace of technological change in our industry, it’s fair to assume that your data team structure will need to iterate and evolve over time. The data team is a brand new thing: it’s not “IT”, it’s not finance, it’s not any of the typical business functions within an operating business. So…who does it report to? How does it interact with the rest of the organization? How big is it?

These are all questions that are getting answered in real time throughout the industry. And they’re likely questions that you have as you go about constructing, or re-architecting, your data team. As of today, there are no clear answers. Companies are answering these questions in a bunch of different ways, all customized to their particular businesses. This is why this question is such a hard one.

Overview

So rather than opine about what we think the best answers are (and we do have our own opinions!), we figured it would be most useful to collect a bunch of “reference architectures” from amazing companies. Peruse them and see if any of them resonate with the team you’re trying to build. The core themes we picked out that seem to drive the design of a modern data team are:

  • How centralized / how distributed? Some teams are highly centralized, some teams distribute members to sit and work with organizational units.
  • Where should data engineering sit? Some teams put data engineers on the data team, some draw a dotted line with the engineering organization.
  • What is the role of the data team? Some teams embrace data as a product, and some teams operationalize data as a service.
  • What executive does the data team live under? Some teams have VP- or C-level executives leading them who report directly to the CEO. Other times data rolls up to a functional head.

Why is data team structure so difficult?

This topic clearly resonated with a lot of folks, and I think it’s worth considering why that is. It’s something we’re all thinking about right now. My take? Data team structure is difficult because data technology has changed so rapidly over the past five years and this has had a cascading effect on what data people do.

Ten years ago, the most challenging problem a data team faced was managing compute and storage resources. At that time, data analysts had no choice but to request changes to the data warehouse and patiently wait for the data engineers to deliver. Modern cloud warehousing completely upended this relationship.

The challenging problems of managing compute and storage resources have largely been solved. The biggest challenges today are around speed: How can we help data engineers and analysts collaborate more effectively? How can we empower analysts to move quickly without sacrificing data quality? How can we empower analysts, engineers, and business users to make sense of the data in our warehouse? These aren’t questions about technology, they’re questions about humans and how we can all work better together.

Centralized vs decentralized data team structure

What I’ve seen working with companies of varying sizes, and what I’ve learned from folks in the dbt Community, is that the spectrum of centralized to decentralized, also referred to as “embedded” or “distributed”, is one of the key decisions to make about data team org structure. Here is how David, from Snaptravel, defined the two ends of this spectrum:

diagram depicting centralized vs decentralized data teams

Centralized data team structure

In the fully centralized data team model, all data resources – people (data analysts, analytics engineers, data engineers, data scientists, etc.) and technology (data warehouse, transform, ingest, BI tools) – are owned by one central data team. If someone from product or finance has a data-related request, they submit it to the data team for prioritization.

diagram of a centralized data team

Advantages and disadvantages of a centralized data team structure

A few benefits of this model…

  • Alignment of data resources to company need: When you are a small data team, like Snaptravel was, and are growing, company alignment is particularly important. A small company doesn’t have the bandwidth to do all the things. It’s important to focus data resources on the highest-impact areas of the business.
  • Knowledge-sharing: By placing analysts and engineers in close alignment, the centralized model prioritizes knowledge-sharing. This makes it easier to build cultural data norms together like naming conventions, syntax, or even how to write and review pull requests.
  • Mentorship: In the centralized model, analysts get to learn from more senior analysts as well as data engineers. This is incredibly valuable for analysts new to the analytics engineering workflow.

The biggest issue with a centralized model is speed. If marketing needs support adjusting their attribution model, it’s likely going to have to wait until the end-of-month reporting is wrapped for the finance team.

Decentralized data team structure

In the decentralized model, you’ll typically see a central core group of data engineers who own the data warehouse with analysts being decentralized, or embedded, within a business function such as finance or product.

diagram of a decentralized, or embedded, data team

Advantages and disadvantages of a decentralized data team structure

The biggest advantage of the embedded model is speed. Data resources are aligned with department needs (instead of company needs). So if a business user has a request, they don’t need to wait for that request to be prioritized against all of the other needs of the business. Faster time to insights!

Speed also comes from having greater context. In a centralized model, work tends to be assigned in a more “round-robin” fashion. In a decentralized model, the marketing analyst owns all marketing requests. They understand the function’s KPIs, know the metric definitions, and are familiar with the quirks of the data. This is often a benefit to both business users (who spend less time explaining themselves) and analysts (who get to go deep into a given function).

One of the biggest downsides that we see with the decentralized model is how challenging it can be to keep analysts working closely together and improving their shared knowledge of data analytics. Let’s say your head of finance hires a finance analyst. It’s very possible (likely!) that person will continue to work in the spreadsheets that finance teams are traditionally accustomed to rather than adopt the modern data stack used by your centralized team.

Data team structure examples

This section contains a handful of stories of how data teams have walked this path over the past few years.

Thanks to the many teams who shared their stories in this section.

Snaptravel

Snaptravel trialed five data team structures over nine months. Five data team structures in nine months is a lot, but the potential efficiency gains for their team felt important enough to make these efforts worthwhile.

Each of the five structures that Snaptravel tried was a different mix of centralized vs. decentralized. Ultimately they landed where we see more and more companies land – a hybrid version. The question for data teams is no longer “centralized vs. decentralized?” The question is “What, exactly, should be centralized, and what should be decentralized?”

Here are the five structures Snaptravel’s data team used:

  • Growth Team: When Snaptravel received Series A funding, they launched their growth team and began to embed their data analysts to better serve other departments.
  • Agile: While vacationing in London, England, Nehil discovered dbt, which allowed Snaptravel to keep track of all their data models. To allow the analysts to work together, they quickly centralized analysts onto one team, switching to an agile approach.
  • Full-Stack: Snaptravel’s agile approach led to a ton of problems within the organization. Data engineers and analysts were not aligned on company-level priorities, and that needed to change. Snaptravel quickly changed this approach and merged four data engineers with four analysts to form a full-stack team. They were finally able to prioritize tasks at the company level while improving knowledge-sharing between both roles.
  • Pod: Their full-stack team quickly grew from eight team members to 12 in March 2020, and team meetings became a waste of time for most members, because only one or two people were needed to make a decision. Their solution to this problem was to create multiple pods that specifically owned a full-stack problem in a given area of the business.
  • Domain Structure: While their pod solution solved an initial problem, it eventually led to a bigger problem that slowed down their team’s progress. The full-stack pod structure lacked ownership over objectives and, at times, there were four to six people all trying to come to a decision. The final change they made to their structure is referred to as a Domain structure.

Data team structure

Finally, after nine months of constant change, Snaptravel landed on a hybrid setup that they call a “domain-based” team structure. In this structure, a senior member of the team is labeled “domain lead” for a specific business area. They are then responsible for assigning work to other data engineers and analysts on an individual basis to support business priorities.

diagram depicting a "domain-based" team structure

This filled some critical gaps for them:

  • Ownership: “One of the reasons domain leaders really really really like this structure is because they have ownership over all the outcomes of a given area of the business,” David said. Data team members aren’t just order takers, they get to see the way their work impacts the results of a given team.
  • Domain Expertise: David pointed out that this ownership creates something valuable for business users as well – domain expertise. When business users have a data need, they’re always working with the same people and have confidence that this person already knows how their core data sets work and understands the unique nuances of their function.
  • Collaboration: Data analysts and engineers are able to work on tasks that fit their skill set while sharing best practices with one another. With every analyst and engineer having their own responsibility, they are held accountable to complete tasks in a reasonable amount of time.

While this process currently works for Snaptravel, they recognize that it will evolve as their data team and organization grow. “One of the things that I’ve heard from people is that it won’t scale,” David said. “And to be honest, I don’t know. I’ve never worked at a large data organization. What we know is that this works for 10 people, and we think it could probably work for 20 people. Beyond that, we don’t know.”

Away Travel

Mike Berardo

Director Data & Strategy

Industry: E-commerce

Company Size: 250 employees

The numbers

⚡️3% of the organization focused on analytics

🔎20% of the organization is comfortable using a BI tool

🔬1 data scientist (provisions and analyzes data)

⚙️1 data engineer (provisions data)

📊5 analysts (mostly analyze)
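
As a rough sanity check, the 3% headline figure can be reproduced from the headcounts listed above. This is only a sketch: it assumes the 3% counts all seven data roles, including the data engineer who reports to engineering, which the post does not state explicitly.

```python
# Headcounts as listed for Away in this post.
company_size = 250
data_scientists = 1
data_engineers = 1  # assumption: counted in the 3% despite reporting to engineering
analysts = 5

# Total people in data roles, and their share of the whole company.
analytics_headcount = data_scientists + data_engineers + analysts
analytics_share = analytics_headcount / company_size

print(f"{analytics_headcount} of {company_size} people = {analytics_share:.1%}")
```

Under that assumption the share comes out just under 3%, which is consistent with the rounded figure.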

Data team structure

Away’s data needs are supported by five people on the analytics team and one person on the data science team; both teams report to the Director of Data & Strategy. The one-person data engineering team works closely with the Data & Strategy team but reports to engineering. Data & Strategy reports to the CEO, though Mike points out that this is an interim setup: long-term, data will report to the CFO.

diagram depicting Away's data team structure

The expectation at Away is that business stakeholders can do their own analysis, though customer experience, legal, and people operations all have dedicated analyst support.

HubSpot

Gordon Wong

VP of Business Data

Industry: SaaS

Company Size: ~3000

The numbers

⚡️5% of the organization focused on analytics

🔎10% of the organization is comfortable using a BI tool

⚙️10 data provisioners

📊120 data analyzers
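
The same headcounts give a sense of how far a small provisioning team is stretched. The sketch below takes the "~3000" company size as exactly 3000, which is an assumption. The post does not say how the 5% figure was derived, so the computed share is expected to match it only roughly.

```python
# Headcounts as listed for HubSpot in this post.
provisioners = 10
analyzers = 120
company_size = 3000  # assumption: the post gives "~3000", so this is approximate

# How many analyzers each provisioner supports, and the combined share of the company.
analyzers_per_provisioner = analyzers / provisioners
data_share = (provisioners + analyzers) / company_size

print(f"{analyzers_per_provisioner:.0f} analyzers per provisioner")
print(f"{data_share:.1%} of roughly {company_size} employees")
```

The ratio is the more informative number here: each provisioner supports about a dozen analyzers spread across business functions.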

Data team structure

As the VP of Business Data, Gordon manages data engineering, data warehousing, and analytics enablement. Like many SaaS businesses, HubSpot’s software creates and moves an enormous amount of customer data, which can be challenging to understand. At the same time, data engineering is seen as a skill distinct from full-stack engineering. Because of this, the data engineering team is staffed by skilled software engineers but reports to Business Intelligence. The data engineering cluster on Gordon’s team is focused on building data infrastructure that supports business and product analytics.

diagram depicting HubSpot's data team structure

Gordon describes his team today as “engineer-heavy.” They are focused exclusively on enabling analytics, not doing analytics. The analytics function is fully decentralized with each business function hiring its own analysts and data scientists.

How dbt Cloud helps data teams


Structuring a data team is an evolving process that depends on your organization’s size, goals, and data maturity. Whether you choose a centralized, decentralized, or hybrid approach, the key to success is ensuring collaboration, data consistency, and scalable processes.

dbt Cloud helps data teams — no matter their structure — work more efficiently by enabling analytics engineers, data analysts, and data engineers to build, test, and document their transformations in a well-governed, version-controlled environment. With built-in collaboration tools, automated testing, and an intuitive development experience, dbt Cloud supports teams in delivering high-quality, trusted data to the business.

Get started with a free 14-day trial of dbt Cloud for teams and find out how.

Last modified on: Feb 11, 2025
