How Virgin Media O2 streamlines data operations with dbt Cloud
Nov 26, 2024
Virgin Media O2 faced several data challenges all at once, including shifting its focus from products to customers and pulling off a migration from on-premises infrastructure to the cloud.
The data team achieved some initial successes using dbt to migrate to Google Cloud Platform (GCP). However, a ride-along one day with customer support revealed that support technicians weren’t getting the data they needed when they needed it. Customers paid the price in terms of long wait times.
Fortunately, after some experimentation, the company found that dbt Cloud made it easy to lower data processing times. The result was more loyal customers—and at a lower cost to boot. Here’s how they did it.
Long customer support times
Virgin Media is a broadband, television, and home phone services supplier. O2 UK was the largest mobile network provider in the UK. In 2021, these companies joined forces in the largest UK telecom merger to date to create a new venture, Virgin Media O2.
Over time, Virgin Media O2 noticed a shift in its customers' desires. Years ago, new products drove consumers to purchase new service contracts. But once technology had caught up with consumers' needs, those consumers had less incentive to buy new products. This led Virgin Media O2 to shift its focus from products to customers.
This was only one of the transitions the company embarked on to adapt to its customers' needs. Another was a digital transformation of its e-commerce platform. An initiative to move the on-premises solution to Google Cloud Platform (GCP) was the perfect opportunity to prove out this customer-focused direction by improving the company's Net Promoter Score (NPS), a standard measure of customer loyalty.
The company decided to take a dual approach. A group of hundreds of contractors would focus on a lift-and-shift of the existing platform into the cloud. Meanwhile, an internal team of four people would rewrite the solution from the ground up to be cloud-native.
The internal, cloud-native team achieved their goal within a year. Throughout the year, they worked alongside customer support to identify improvements.
One key improvement came from a “go and see day,” where data engineers rode along with support. The engineers observed that, since the merger, a typical support call required pulling information from multiple systems and pages. This, in turn, led to longer support calls and longer wait times for customers.
The engineering team identified an opportunity to cut the time it took to support a customer by transforming the data and giving support a single interface that included all of the data the team needed. To provide even more value to the customer, the interface would also surface data-backed recommendations for upgrades and services that agents could offer customers.
Reducing five-hour data pipeline run times
The team envisioned their work as building modular, data-backed products and models that serve artificial intelligence (AI), machine learning (ML), and MLOps. They achieved this by consolidating data from various sources into Google BigQuery, then running data transformation jobs to produce purpose-built data sets. One of these data-backed products supported the needs of the customer support team.
The transformation job that produced this data began as a daily run that took five hours to complete, with large spikes in BigQuery usage around the time the job ran.
That was expected. However, it also represented a lot of waste in the form of reprocessing and waiting. The team aimed to reduce the amount of data being processed to only what they needed and also to reduce the amount of idle time.
The team determined that they should move to an incremental data processing system designed to handle only changed data, such as new or modified rows. They evaluated two approaches: push and pull.
Push system
A push system would kick off a job every hour to process the work in increments. The team found several issues with this approach:
- Timing: If the hourly run finishes in under an hour, the system sits idle until the next trigger, exactly the kind of waste the team wanted to reduce. If a run takes longer than an hour, it delays subsequent processing and risks re-processing data, which again adds waste.
- Manual intervention: When a processing issue does arise, there’s a risk of losing data and of having to make code changes to recover to a point in time.
Worse, any improvement that made processing faster would only increase the idle time between hourly runs, while scheduling the runs closer together would exacerbate the issues caused by failures.
Pull system
A pull system continually pulls data to be processed. When a job has finished its processing, it triggers another job to start.
This solved the timing issue the team saw in the push system: data flowed continuously, with no idle processing time. It also eliminated the need for code changes to recover lost data, since a failed run simply wouldn’t trigger the next job.
How dbt Cloud enabled the new pull system
To pull off efficient incremental processing, the team turned to incremental models in dbt Cloud.
In dbt Cloud, an incremental model is a table in the data warehouse. Upon the first model run, it transforms the entirety of the source data. On subsequent model runs, it only processes rows from the source data that have been created or updated since the prior run. This solves the problem of over-processing by only transforming changes.
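As a rough sketch of the pattern (the model, source, and column names here are hypothetical, not Virgin Media O2's actual code), an incremental model is ordinary SQL plus an is_incremental() guard that only applies on runs after the first:

```sql
-- models/support_view.sql (hypothetical): builds the full table on the
-- first run, then only transforms recently changed rows afterwards
{{ config(
    materialized='incremental',
    unique_key='customer_id'
) }}

select
    customer_id,
    account_status,
    last_interaction_at,
    updated_at
from {{ source('crm', 'customer_events') }}  -- hypothetical source

{% if is_incremental() %}
  -- skipped on the first run and on full refreshes; here a simple fixed
  -- lookback window serves as the change filter (BigQuery syntax)
  where updated_at >= timestamp_sub(current_timestamp(), interval 1 day)
{% endif %}
```

A fixed lookback window like this is the simplest possible change filter; the self-referencing variant described next is what lets the pipeline track its own progress.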
Circular referencing of the data model allows data to flow continuously: the pipeline batches the processing of the incremental data and then retriggers another run. The team achieved this by using the {{ this }} Jinja variable in their dbt models, as sketched after the list below. This enabled:
- Using the latest timestamp of already-processed data as the starting point for a new job.
- Letting the data pipeline itself keep track of what has been processed and where the next job should begin.
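A minimal sketch of that high-water-mark pattern, again with hypothetical names, might look like the following. In dbt, {{ this }} resolves to the model's own already-built table, so each run can ask it for the latest timestamp it has processed:

```sql
{{ config(
    materialized='incremental',
    unique_key='customer_id'
) }}

select
    customer_id,
    account_status,
    updated_at
from {{ source('crm', 'customer_events') }}  -- hypothetical source

{% if is_incremental() %}
  -- the pipeline tracks its own progress: the maximum timestamp already
  -- in the target table tells this run exactly where to begin
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

Because the filter reads from the target table itself, a failed run leaves the high-water mark untouched, and the next run simply picks up the same window again.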
This solved the problem of wasted processing time: data now flowed continuously.
The team used data segmentation so that some sections of the data could follow a batch processing model while the data needed for just-in-time processing flowed continuously.
Key stages of the data pipeline always receive full data refreshes. However, models that don’t require a full refresh can use the incremental delta tracking process to reduce both the amount of data processed and the time it takes to deliver results.
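One way to express that split in dbt, again using hypothetical model names, is to leave the key staging models as fully refreshed tables while marking the delta-tracked models as incremental and opting them out of full refreshes via the full_refresh config:

```sql
-- models/marts/support_agent_view.sql (hypothetical): stays incremental
-- even when the project is run with --full-refresh, while upstream
-- staging models continue to rebuild in full
{{ config(
    materialized='incremental',
    unique_key='customer_id',
    full_refresh=false
) }}

select
    customer_id,
    recommended_upgrade,
    updated_at
from {{ ref('stg_customer_events') }}  -- hypothetical upstream model

{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```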
More customer loyalty at less cost
The team quantified their success in stages:
- Initially, when the entire data set was processed daily as a batch job, the section of data the team was optimizing ran through 8TB per run, with each run taking 47 minutes to complete.
- In the next phase, using incremental models in a pull system, a run that processed 130GB of data took 140 seconds to complete.
- Finally, implementing delta tracking in conjunction with the incremental pull system, runs could process 32GB of data in 50 seconds.
These changes equated to savings of £100 per run, or £36.5K per year (the equivalent of £100 across 365 daily runs). The team also fulfilled its goal of increasing its Net Promoter Score, which rose by 26%.
Transforming data to transform business
The Virgin Media and O2 merger led the data engineering team to take a huge risk. Rather than reproducing the existing infrastructure in the cloud, they aimed to rebuild the system from scratch for better performance. The team hoped this would open up support for AI, ML, MLOps, and large language models (LLMs), all of which require low-latency data model production.
The goal of re-platforming the data infrastructure was to provide stability and standardization. That allowed the team to innovate quickly and focus on building in quality over the course of the journey.
This journey started with a traditional approach to data transformation: batch processing the data daily. That gave the team stability and standardized their processes.
Once stable and standardized, the team could innovate rapidly, producing specific models that served as products for their consumers. When an internal team wanted a portion of the data model, as in the customer support example above, the data team could focus on reducing latency by employing just-in-time data processing.
In the end, using features built into dbt and dbt Cloud, Virgin Media O2 accelerated its time to market while also improving customer satisfaction and overall data quality.
Find out how dbt Cloud can accelerate your digital transformation and streamline data operations—contact us for a demo today.
Watch Virgin Media O2's session at Coalesce to learn more about how they streamlined operations with dbt Cloud.