Understanding the difference: analytics engineer vs. data analyst
Feb 05, 2024
Analytics engineering is a relatively recent data team role. Since it emerged, the field has grown across industries. According to Business Insider, top companies like Apple, Amazon, and Netflix are all hiring analytics engineers as part of their data teams.
An analytics engineer is a valuable addition to a data team. However, because the field is still somewhat new, many aren’t aware of what analytics engineers can offer and how to scope the role. What makes an analytics engineer different from a data analyst? How does analytics engineering help companies be successful in achieving their data goals?
Let’s look at the origins of analytics engineering, how the practice of analytics engineering has evolved and the tooling that supports it, and how to get started in this new field.
The data analyst/data engineer cycle
To prepare data for a project, it has to be sourced, transformed, and stored in an easily accessible and efficient format. This involves creating one or more data pipelines that ensure the source data is available, clean, and as free from anomalies as possible.
Traditionally, data development projects follow a set outline:
- A data analyst talks with business stakeholders about the needs of the project
- The analyst develops a plan for models based on those discussions
- Then, the analyst presents the data needs of those models to a data engineer
- The data engineer handles the data infrastructure setup and maintenance - ingesting source data, defining data models, creating the data pipeline, etc.
Unfortunately, this approach is riddled with inefficiencies that lead to bottlenecks, lack of context, and hindered trust in data.
Long projects have multiple back-and-forth cycles of feedback and adjustment. Analysts communicate with business stakeholders, adjusting the project to organizational needs. However, they have to bring their data requests (sourcing, modeling, and analytics needs) to data engineers whenever an adjustment is needed.
These feedback cycles slow down as projects wait for engineers and analysts to communicate, not to mention actually deliver on the data requests. Engineers end up with long queues of requests for data pipeline updates. That bogs down their schedules and keeps them from working on improving infrastructure efficiency, regular maintenance, etc. The result is that everyone in the cycle becomes frustrated as project timelines grow longer, and business trust in data is compromised.
This is why the analytics engineer emerged. Analytics engineers provide clean data sets to users. They focus on cleaning, transforming, testing, deploying, and documenting data. They do this by employing industry best practices - such as data modeling, source control, and continuous integration/continuous deployment (CI/CD) - to manage multiple versions of high-quality data sets.
Using new tooling that made the work done by data analysts easier, analytics engineers helped extricate their companies from the inefficiencies of the old approach to data development. The result is an approach that treats data more like software, ensuring that it’s modular, documented, tested, and automated.
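To make "treating data like software" concrete, here is a minimal sketch in plain Python. It is not taken from any particular tool: the function names (`clean_orders`, `check_not_null`, `check_unique`) and the sample rows are hypothetical, chosen only to show a small, documented cleaning step paired with the kind of automated data checks an analytics engineer might run on every change.

```python
from typing import Iterable

Row = dict[str, object]


def clean_orders(rows: Iterable[Row]) -> list[Row]:
    """Normalize raw order rows: drop rows without an order_id,
    trim and lowercase the status, and coerce the amount to a float."""
    cleaned = []
    for row in rows:
        if row.get("order_id") is None:
            continue  # rows without a key cannot be used downstream
        cleaned.append({
            "order_id": row["order_id"],
            "status": str(row.get("status") or "").strip().lower(),
            "amount": float(row.get("amount") or 0),
        })
    return cleaned


def check_not_null(rows: list[Row], column: str) -> list[Row]:
    """Return the rows where `column` is missing or empty (empty list = pass)."""
    return [r for r in rows if r.get(column) in (None, "")]


def check_unique(rows: list[Row], column: str) -> list[object]:
    """Return values of `column` that appear more than once (empty list = pass)."""
    seen, duplicates = set(), []
    for r in rows:
        value = r[column]
        if value in seen:
            duplicates.append(value)
        seen.add(value)
    return duplicates


# Hypothetical raw data, invented for illustration only.
raw_orders = [
    {"order_id": 1, "status": " Shipped ", "amount": "19.99"},
    {"order_id": 2, "status": "PENDING", "amount": None},
    {"order_id": None, "status": "shipped", "amount": "5.00"},
    {"order_id": 2, "status": "pending", "amount": "7.50"},
]

cleaned = clean_orders(raw_orders)
null_failures = check_not_null(cleaned, "order_id")
duplicate_ids = check_unique(cleaned, "order_id")
```

In this sample, the not-null check passes after cleaning, while the uniqueness check flags the repeated `order_id`, which is the kind of anomaly such checks are meant to surface before the data reaches stakeholders.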
The emergence of analytics engineers
The roles, responsibilities, and efficiencies afforded by analytics engineering have long been a pipe dream among industry professionals. Before it became a reality, a number of shifts needed to happen in the field of data engineering and data management to make data more accessible and open.
First, there was the shift to cloud computing. Cloud-based technologies have now become the default method for storing data. The utility model of cloud computing, combined with data warehouse technologies like Snowflake and Amazon Redshift, has made it easier and more cost-effective to spin up data warehousing clusters and expand storage on demand. That, in turn, has made it possible for any organization to have a data warehousing strategy and laid the groundwork to democratize the other categories in the data workflow (data integration, data transformation, orchestration, and more).
Other cloud-based services have helped pave the way for easier management of data pipelines. These include data ingestion services like Stitch and Fivetran and business intelligence tools like Tableau, Power BI, Sigma, ThoughtSpot, and Hex. These tools lower the barriers to entry around data extraction and visualization.
Second, there’s the emergence of data modeling and transformation platforms. A data transformation platform adds a transformation layer to data development, enabling engineers to develop models using familiar SQL syntax. Engineers can build transformations for their projects and develop clean tables ready for analysis in a way that’s modular, scalable, discoverable, and automated.
These data objects have value beyond a single project. As a result, analysts can focus on building data products that multiple teams can use and reuse. Instead of building one-time solutions, they focus on multi-purpose, well-documented data objects.
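As a rough illustration of a SQL-based transformation layer, the sketch below uses Python's built-in `sqlite3` module as a stand-in for a cloud data warehouse. It is not how any specific transformation platform works internally; the table and view names (`raw_orders`, `stg_orders`, `customer_revenue`) and the sample rows are assumptions made up for this example. The point is the layering: a staging model cleans the raw table once, and a reusable downstream model builds on it.

```python
import sqlite3


def build_models(conn: sqlite3.Connection) -> None:
    """Create a hypothetical raw table plus two layered SQL models on top of it."""
    conn.executescript("""
        CREATE TABLE raw_orders (
            order_id INTEGER, customer TEXT, status TEXT, amount REAL
        );
        INSERT INTO raw_orders VALUES
            (1, 'acme', ' Shipped ', 20.0),
            (2, 'acme', 'PENDING', 5.0),
            (3, 'globex', 'shipped', 12.5),
            (NULL, 'globex', 'shipped', 99.0);

        -- Staging model: clean the raw data once, in one place.
        CREATE VIEW stg_orders AS
        SELECT order_id, customer, lower(trim(status)) AS status, amount
        FROM raw_orders
        WHERE order_id IS NOT NULL;

        -- Downstream model: reusable by several teams, built only on staging.
        CREATE VIEW customer_revenue AS
        SELECT customer,
               count(*)    AS order_count,
               sum(amount) AS shipped_revenue
        FROM stg_orders
        WHERE status = 'shipped'
        GROUP BY customer;
    """)


conn = sqlite3.connect(":memory:")
build_models(conn)
revenue = conn.execute(
    "SELECT customer, order_count, shipped_revenue "
    "FROM customer_revenue ORDER BY customer"
).fetchall()
```

Because `customer_revenue` reads only from `stg_orders`, a fix to the cleaning logic in the staging model flows through to every downstream object without further changes, which is what makes such objects reusable across projects.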
These changes paved the way for the birth of the analytics engineer. With easy access to compute, plus readily available SQL-based tools for ingesting and transforming data, analytics engineers can focus more on the data rather than on the technology required to maintain the data. In other words, they can spend less time on data architecture and more time solving problems within their business domain.
The analytics engineering role
Over time, analytics engineering has developed into a full-fledged data team position. Analytics engineers play several roles on data teams:
- Exploration: Exploring data already ingested into data platforms in response to stakeholder questions and needs.
- Preparation: Cleaning and preparing datasets for analytics use cases.
- Transformation: Transforming prepared datasets into objects that can serve organizational objectives, such as a super-table that can serve as a base for multiple applications.
- Documentation: Documenting the objects they find and create in the data warehouse, ensuring that other users can also see, understand, and use them.
Analytics engineering provides value in several ways:
- Having someone focused on developing and documenting data objects sets up analytics teams for data self-service with an accessible and understandable catalog of data products.
- Improved data discoverability helps prevent dark data, which sits in storage unused or unknown, from dragging on storage costs and potentially becoming a compliance liability. Analytics engineers stay engaged with the data warehouse, keeping data from getting overlooked and bringing unused but valuable data to the forefront.
- Having an analytics engineer responding to stakeholders relieves long queues from data engineering teams, opening up time for critical maintenance, infrastructure updates, etc. It also allows for faster feedback loops since analysts with stakeholder context can work directly on maintaining and changing data pipelines.
Transferring skills from data analyst to analytics engineer
Analytics engineering emerged from problem-solving in data analysis projects, so analysts already have many of the required skills. Importantly, they have relevant business knowledge of data use cases: awareness of the data’s meaning and potential metrics, an understanding of data presentation and documentation, and the ability to communicate and coordinate with stakeholders.
Analysts who want to get into analytics engineering should look to develop several critical technical skills:
- SQL, Python, R, and Excel
- Data transformation workflows like dbt
- Version control systems (primarily Git)
- Knowledge of CI/CD for managing data pipelines and data production. (Data transformation tools like dbt Cloud can help start this transition, offering an approachable UI and guardrails for Git and CI/CD.)
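For analysts new to CI/CD, the core idea can be sketched in a few lines: every proposed change triggers automated checks, and a failing check blocks the change. The example below is a simplified, tool-agnostic sketch in Python. The `run_checks` helper, the check functions, and the sample data are all hypothetical; a real pipeline would typically run a transformation tool's own test command from the CI system instead.

```python
from typing import Callable

# A check returns a list of problem descriptions; an empty list means it passed.
Check = Callable[[], list[str]]


def run_checks(checks: dict[str, Check]) -> int:
    """Run every check and return a process exit code:
    0 if all pass, 1 if any fail (CI systems treat nonzero as a failed build)."""
    failed = False
    for name, check in checks.items():
        problems = check()
        if problems:
            failed = True
            print(f"FAIL {name}: {'; '.join(problems)}")
        else:
            print(f"PASS {name}")
    return 1 if failed else 0


# Hypothetical sample data standing in for a freshly built table.
orders = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": -5.0},
]


def ids_present() -> list[str]:
    return [
        f"row {i} is missing order_id"
        for i, o in enumerate(orders)
        if o.get("order_id") is None
    ]


def amounts_non_negative() -> list[str]:
    return [
        f"order {o['order_id']} has a negative amount"
        for o in orders
        if o["amount"] < 0
    ]


exit_code = run_checks({
    "ids_present": ids_present,
    "amounts_non_negative": amounts_non_negative,
})
# In a real CI script, this would end with: sys.exit(exit_code)
```

Here the negative amount makes one check fail, so the exit code is nonzero and a CI system would stop the change from being deployed until the data or the logic is fixed.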
Analysts who shift into analytics engineering can not only boost the efficiency of their organization’s data pipeline and perhaps even ease some of their frustrations with development cycles, but also uplevel their career and skillset in the process. For more information on transferring into analytics engineering, check out this dbt developer blog about making the switch.
Conclusion
Analytics engineering is an established field thanks to cloud computing and the emergence of dbt. It bridges the gap between raw data and analytics-ready datasets, accelerating development cycles and improving trust in data and data teams. Having an analytics engineer on a team improves data quality, discoverability, and security and enables data self-service.
With some technical training, data analysts can transfer their understanding of data into a role supporting their organization’s data infrastructure.
Give your data teams the tools they need to bring analytics engineering to your data operations. Try dbt today.
Last modified on: Oct 15, 2024