Data that informs internal and external stakeholders, as well as customers
Deputy makes it easy for employers and their staff to schedule and manage shifts. Founded in Sydney in 2008, it’s currently used by one in 10 Australian shift workers.
Data is crucial to Deputy, and it starts with the product itself. Customers use Deputy for insights into how employees are working and what they choose to do with their time.
Deputy’s data is so comprehensive that during Covid, the Big Four consulting firms reached out for access to its raw workforce data, looking for insights into how work patterns were changing with the pandemic.
At the same time, data helps Deputy understand and improve their own business. It gives them an overview of whether they’re investing in the right places – from marketing to product to sales.
Data becomes a bottleneck
However, their large product usage dataset proved challenging for the data team to leverage internally. As Deputy grew, its data architecture became so convoluted that the team could no longer efficiently use it.
It was a complicated setup in which a series of Snowflake tasks moved raw data into an analytics layer. The team tried to tame the complexity by switching to Airflow, but the data was no easier to access or model.
Messy data + no documentation = no deliverables
This complicated architecture was compounded by a lack of documentation. With messy data and no clarity on how it worked, analysts couldn’t do their jobs.
Dashboards at Deputy would break, and the team couldn’t identify why. Debugging was extremely difficult: a broken table routinely took half a day of work to fix.
With no documentation, analysts had to manually trace everything back to the source to understand how tables were built.
With such a complex, time-consuming workflow, the data team struggled to fulfill "simple" requests.
“When I arrived at Deputy, I was all excited,” said Sara Young, Senior Manager of Data Analytics. “Stakeholders would make requests that didn’t seem complicated, but it would take me so long. I couldn't explain properly to the stakeholders that the data was such a mess. It was making me and the data team look bad.”
Trust in the data team hit rock bottom
All of the Deputy data team’s stakeholders—finance, customer success, marketing, product, sales—were unhappy with the team. Confidence in the team sank so low that when stakeholders needed data, they would circumvent the team entirely: they trusted neither the data the team delivered nor its ability to deliver on time.
“All leadership had nothing but horror stories about working with the data team,” admitted Huss Afzal, Data Director. “The VP of Customer Success shared that she asked for a dashboard that the data team said they couldn’t deliver for nine months, because that’s how long it would take them to build it.”
“Stakeholders knew the data team was not a good team. They weren’t surprised to be disappointed,” shared Sara.
High employee churn
The challenges also hurt the team’s morale. They knew they had the skills to deliver the requested data products but, blocked by their tech stack, they faced their stakeholders’ disappointment every day. They were stressed, and team members were quitting under the pressure.
“The team got to a breaking point,” said Sara. “A lot of people started leaving. First, the director of data left. Then more things started breaking. Then we lost one of the lead data engineers, who left little documentation behind… things just turned off.”
Starting over from scratch, with dbt Cloud
When Huss joined the team, he worried that even more people would leave. He suggested rebuilding Deputy’s data infrastructure from scratch to remove the technological blockers and let the team do the work they knew they could do well for stakeholders.
Both a new data engineer hire and Altis—the data consultancy Deputy worked with—suggested implementing dbt at the center of their new data infrastructure.
After ingestion via Stitch and Workato, all of their different sources—payments, CRM, Segment, Google Analytics—now land in a raw layer in Snowflake. dbt sits on top for transformation. Once the models are built in dbt, they feed Tableau, where they’re exposed as reports.
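In dbt terms, each of those raw sources gets a light staging model before anything downstream touches it. Here’s a minimal sketch of what one such model might look like; the source, table, and column names are hypothetical, not Deputy’s actual schema:

```sql
-- models/staging/stg_segment_events.sql
-- Hypothetical staging model: lightly renames and types raw
-- Segment data landed in Snowflake, before any business logic.
select
    event_id,
    user_id,
    event_name,
    cast(event_timestamp as timestamp_ntz) as event_at
from {{ source('segment', 'raw_events') }}
```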
“Learning dbt was a small learning curve,” said Sara. “Analysts proficient in SQL can pick it up once they see how it works. Once you get your head around it, the rest falls into place.”
Building confidence in the data
One source of truth
dbt unified Deputy’s data and created a single source of truth within their new data stack. Instead of different databases, different views, and different logic, they have one model.
“You have one fact table and everyone is feeding off that table using the same metrics and getting consistent numbers,” said Andy Kwier, Lead Data Engineer. “Having that single source of truth grew confidence in the data. Today, if I run a report and you run a report, we get the same metric.”
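In dbt, that shared fact table is simply a model that every report selects from via `ref()`. A hedged sketch, with illustrative names standing in for Deputy’s actual models:

```sql
-- models/marts/fct_shifts.sql
-- Illustrative fact table: one row per shift, assembled from
-- staging models via ref() so every dashboard reads the same logic.
select
    s.shift_id,
    s.user_id,
    s.scheduled_hours,
    s.worked_hours,
    p.payment_status
from {{ ref('stg_shifts') }} as s
left join {{ ref('stg_payments') }} as p
    on s.shift_id = p.shift_id
```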
Using data freshness to identify issues earlier
Another step in building trust in the data was to spot issues before business users did. With freshness testing, whenever a field or table stops updating, the data team receives a notification.
“Freshness testing has helped a great deal to pick up on things straight away and get them fixed,” said Andy. “We can now proactively flag and solve issues before anyone notices. This happened the other day with a table that was failing because of a null value that wasn’t allowed in the field.”
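In dbt, source freshness is declared in YAML, and `dbt source freshness` warns or errors when a table hasn’t loaded new data within the configured window. A sketch, reusing the hypothetical source names from above:

```yaml
# models/staging/sources.yml
# Hypothetical freshness config: warn after 12 hours without new
# rows, error after 24, based on a load-timestamp column.
version: 2

sources:
  - name: segment
    loaded_at_field: _loaded_at
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: raw_events
```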
Greater visibility: reduced costs and easier debugging
dbt’s documentation features—such as data lineage—finally enabled the Deputy data team to see what was running under the hood of their reporting.
“Before, we had over 400 tasks running: streams and pipes going everywhere. We didn't even know which ones were still being used for dashboards or reporting,” shared Andy. “We saved 20% computing costs on Snowflake by cutting what we didn’t need.”
Their newfound ability to trace reports back to their source tables and corresponding data sources also made debugging easier than ever.
“Since we migrated to dbt, I haven’t had a problem that's taken more than a couple of hours to fix. Before, it’d take a minimum of half a day,” said Andy. “We had no lineage so you’d need to look into multiple pipes and streams. Now I can see what’s downstream or upstream.”
Faster turnaround times for new dashboards and metrics
The improvement in governance, paired with a clear view of their data lineage, has enabled the Deputy data team to work better with other teams.
“The pace of delivery has been phenomenal. Analysts are collaborating directly with the business,” said Huss. “When you can answer questions almost immediately, then you’re invited by business stakeholders to join meetings and provide valuable insights on the spot.”
The Deputy team has launched 26 data products in the last 10 months, most of them in the three to four months since they got dbt up and running.
“On average, I’d say dashboards that’d take months now take a week,” smiled Andy.
Following the implementation of dbt, trust in the data team has done a full 180. In just a few months, the team has gone from “the usual disappointment” to total delight. Last quarter, they polled their internal partners (CS, marketing, product, finance) on how they’d rate the data team’s work and received a perfect NPS of 100.
“The business feels very comfortable asking us quirky ad hoc questions, and because we have a model in place, it's easy for us to either build it or answer that query,” shared Huss. “Productivity is skyrocketing.”
Moving forward: scaling dbt and creating new revenue opportunities from data
Data science-led predictive metrics
With the data team moving faster than ever, Deputy is now looking toward predictive analytics, such as calculating and predicting how likely a customer is to churn.
“Outside of that we also want to introduce some data science to some of our metrics,” said Huss. “dbt’s integration with Python will come in handy there.”
By adding data science to churn prediction and net user expansion metrics, for example, Deputy will be able to help its customer success team focus on the customers most likely to expand or churn, and serve their needs better.
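dbt’s Python models make that possible without leaving the project: a model is a Python function that receives upstream tables as DataFrames and returns one. The sketch below assumes a hypothetical upstream model and placeholder feature columns; it is not Deputy’s actual implementation:

```python
# models/marts/churn_scores.py
# Hypothetical dbt Python model scoring accounts for churn risk.
# On Snowflake, dbt runs this via Snowpark; dbt.ref() returns a
# Snowpark DataFrame, converted to pandas for scikit-learn.
from sklearn.linear_model import LogisticRegression


def model(dbt, session):
    dbt.config(materialized="table", packages=["scikit-learn"])

    # Assumed upstream model with one row per account.
    usage = dbt.ref("fct_account_usage").to_pandas()

    features = ["ACTIVE_USERS", "SHIFTS_SCHEDULED", "LOGINS_LAST_30D"]
    labeled = usage[usage["CHURNED"].notna()]

    clf = LogisticRegression().fit(labeled[features], labeled["CHURNED"])
    usage["CHURN_LIKELIHOOD"] = clf.predict_proba(usage[features])[:, 1]

    return usage[["ACCOUNT_ID", "CHURN_LIKELIHOOD"]]
```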
Expanding into other dbt features
Now fully onboarded to dbt, Deputy is exploring more dbt features, such as the semantic layer and the metadata API. Both features will continue Deputy’s journey into building trust and improving governance.
“With the metadata API, we’ll be able to understand which sources and which tests generally fail,” said Andy. “This will give us more visibility into where our problems are.”
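On the semantic layer side, the idea is to declare a metric once in the dbt project so every downstream tool computes it the same way. A rough sketch of what a metric definition might look like; the names are illustrative, and the exact spec depends on the dbt version:

```yaml
# models/marts/metrics.yml
# Hypothetical metric definition consumed through the semantic layer.
version: 2

metrics:
  - name: weekly_active_users
    label: Weekly Active Users
    model: ref('fct_usage')
    calculation_method: count_distinct
    expression: user_id
    timestamp: event_at
    time_grains: [day, week, month]
```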
Data as a service
During the pandemic, Deputy saw the value of its aggregated data in spotting trends and telling a story. They’re now doubling down on this by creating a revenue stream from the internal data they capture.
“This is something we’re building next year. We’re super excited about it,” said Huss. “And dbt will play a big role in modeling that data and serving our new product.”