Much has been written about the game-changing potential of big data, machine learning tools, and predictive, or even ‘prescriptive’, analytics. The expression ‘data is the new oil’ (coined by Tesco Clubcard mastermind Clive Humby in 2006) has become common parlance. Like oil, though, data is of little use in its raw form and needs to be carefully refined, managed and secured.
As recently as 2020, data scientists were still spending more than 40% of their time cleansing data.1 While the situation is improving (down from over 60% in 2016), this still represents an enormous waste of expensive and highly qualified resources.2
THE HIDDEN COSTS OF DATA
The business value to be gained from effective data management goes far beyond data scientists’ time. IBM estimates that handling data-quality issues costs the US economy over US$3.1 trillion per year. Some of this is made up of easily measurable costs like fines, but the vast majority comes from the ‘hidden data factory’, where knowledge workers spend up to 50% of their time searching for data, finding and correcting errors, and searching for confirmatory sources for data they don’t trust.3 Of course, this figure fails to reflect the benefits that high-quality data brings, including reduced risk, improved customer and employee experience, and the ability to spot and act on opportunities.
If a proliferation of manual processes, confusing architecture and general lack of trust in data sounds familiar, you’re not alone.
SO, HOW DO YOU FIX IT?
While improving data management maturity isn’t easy, the good news is that the root causes of issues are often the same across sectors and functions. The recommended approach balances best practice with pragmatism, allowing you to cut through the technical jargon and bring your people along on the journey. At a high level, there are three key areas of focus.
1. Create a vision and data strategy
Articulate where you want to go with your data capabilities and why. To be effective, a vision statement for data needs to be compelling and understandable, and it must make clear how it supports the wider business strategy. Amazon’s data vision demonstrates this perfectly – ‘to figure out what customers want and what is important to them’.4
The vision needs to be underpinned by a data strategy. At a minimum, this should include: what data the organisation needs in order to achieve its goals; how it will get this data; how it will manage the data; and how the data will be used.
A compelling vision coupled with a clear strategy quickly forms a powerful mix of purpose, direction and rationale.
2. Understand your data landscape
Once your strategy has been created, it is important to build a common understanding of your data landscape and where changes are needed.
Data architecture and modelling are the starting point for this. They create a shared vocabulary around your data by describing its key characteristics: where it comes from; what it is used for; how it interlinks; where and how it is stored and processed; and when it should be disposed of.
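As an illustration only, the characteristics listed above could be captured in a simple catalogue record. The sketch below is a minimal example under stated assumptions: the `DatasetRecord` class, its field names and the sample values are all hypothetical, and a real organisation would typically rely on a dedicated data catalogue tool instead.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DatasetRecord:
    """One entry in a hypothetical data catalogue (illustrative only)."""
    name: str
    source: str            # where the data comes from
    purposes: list[str]    # what it is used for
    linked_to: list[str]   # how it interlinks with other datasets
    storage: str           # where and how it is stored and processed
    created: date
    retention_days: int    # how long to keep it before disposal

    def disposal_date(self) -> date:
        """Date on which the dataset should be disposed of."""
        return self.created + timedelta(days=self.retention_days)

    def is_due_for_disposal(self, today: date) -> bool:
        """True once the retention period has elapsed."""
        return today >= self.disposal_date()


# Sample entry; every value here is made up for the example.
orders = DatasetRecord(
    name="customer_orders",
    source="e-commerce checkout system",
    purposes=["billing", "demand forecasting"],
    linked_to=["customer_profiles"],
    storage="cloud data warehouse, encrypted at rest",
    created=date(2022, 1, 1),
    retention_days=365,
)

print(orders.disposal_date())
print(orders.is_due_for_disposal(date(2022, 6, 1)))
```

Even a record this simple gives people a shared vocabulary: anyone can see who supplies a dataset, what it may be used for and when it should be deleted.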
Unfortunately, there are many examples of organisations getting this wrong, even those considered to be at the forefront of this space. It was recently revealed that Facebook’s parent company Meta has major shortcomings in its ‘data lineage’, i.e. knowledge of where data goes in the organisation and what it’s used for.5
A leaked document stated: “We [Facebook] do not have an adequate level of control and explainability over how our systems use data and thus we can’t confidently make controlled policy changes or external commitments such as ‘we will not use X data for Y purpose’.”5
Early and deliberate efforts to map your data landscape will help you anticipate and mitigate issues like these. They will also form a great foundation for strategic improvements in how you use data to your advantage.
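To make the idea of data lineage concrete, the sketch below shows one way a commitment such as ‘we will not use X data for Y purpose’ could be checked once the landscape has been mapped. It is an assumption-laden illustration: the `LINEAGE` map, the dataset names and the `reaches` function are hypothetical, and it does not describe how any particular organisation works in practice.

```python
from collections import deque

# Hypothetical lineage map: each dataset or process lists where its data flows next.
LINEAGE = {
    "location_data": ["analytics_store"],
    "analytics_store": ["ad_targeting", "capacity_planning"],
    "contact_details": ["billing"],
    "ad_targeting": [],
    "capacity_planning": [],
    "billing": [],
}


def reaches(lineage, source, purpose):
    """Return True if data from `source` can flow, directly or indirectly, to `purpose`."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == purpose:
            return True
        # Follow every downstream flow that has not been visited yet.
        for nxt in lineage.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Can the commitment "we will not use location data for ad targeting" be made?
print(reaches(LINEAGE, "location_data", "ad_targeting"))
print(reaches(LINEAGE, "contact_details", "ad_targeting"))
```

The check is only as good as the map behind it, which is why building and maintaining that map deserves early attention.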
3. Educate your stakeholders
According to 92% of chief data officers and other C-suite leaders, organisational culture is the main barrier to becoming data-driven.6 It is critically important that your people understand the value of quality data and the costs of poor data. Anyone with responsibility for data needs to know why their job is important, how to assess data quality, and what to do if they spot a problem.
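Assessing data quality does not have to be complicated. As a hedged example, the sketch below counts two common problems, missing values and duplicate keys, in a small list of records. The `quality_report` function, the field names and the sample customers are all hypothetical and chosen purely for illustration.

```python
def quality_report(records, required_fields, key_field):
    """Count missing required values and repeated keys in a list of dict records."""
    missing = {field: 0 for field in required_fields}
    seen_keys = set()
    duplicates = 0
    for rec in records:
        # Completeness: treat None and empty strings as missing.
        for field in required_fields:
            if rec.get(field) in (None, ""):
                missing[field] += 1
        # Uniqueness: count every repeat of a key that was already seen.
        key = rec.get(key_field)
        if key in seen_keys:
            duplicates += 1
        else:
            seen_keys.add(key)
    return {"total": len(records), "missing": missing, "duplicate_keys": duplicates}


# Made-up sample data for the example.
customers = [
    {"id": 1, "email": "a@example.com", "postcode": "AB1 2CD"},
    {"id": 2, "email": "", "postcode": "EF3 4GH"},
    {"id": 2, "email": "b@example.com", "postcode": None},
]

report = quality_report(customers, ["email", "postcode"], "id")
print(report)
```

A short report like this gives non-specialists something tangible to act on: which fields are incomplete, how often, and where duplicates need resolving.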
Avoid jargon – clarity is key. Using terms such as ‘metadata’ without context will immediately create a barrier with your stakeholders.
WORTH THE EFFORT
Building data management maturity is a continuous, incremental, and sometimes arduous process, but high-quality, secure data is fundamental to maintaining the trust of your customers and enabling leading-edge analytics to inform decisions.
1 The State of Data Science 2020 | Anaconda
2 Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task | Forbes
3 Bad Data Costs the U.S. $3 Trillion Per Year | HBR
4 Amazon Annual Letter to Shareholders 2021
5 Facebook Doesn’t Know What It Does With Your Data, Or Where It Goes: Leaked Document | Vice
6 AWS For Data