As change accelerates and AI becomes the norm, organisations need to rethink and reengineer their relationship with data. F-TAG’s Christine Ashton FBCS and Sue E Forder FBCS believe that, rather than oil, uranium yellowcake might be the better analogy.

The increasing frequency, magnitude, and far-reaching consequences of data breaches and cyberattacks are alarming.

Against this backdrop, F-TAG believes that it is appropriate to remind all organisations of their critical role in securely storing, managing, and allowing access to personal data.

We can’t stress enough the responsibility all organisations have to protect their critical employee and customer data from unauthorised access, publication, and harmful misuse.

Data is central to almost everything we do. The trend toward more automation driven by AI means we will use more of it. This means security considerations will continue to increase.

In this paper, we emphasise the need for every organisation, big or small, to proactively take responsibility for data and its management.

We offer 12 accessible practices that every organisation can apply, beginning with a clear data purpose and strategy.

Does an oil analogy still serve data well in today’s complex world?

Seventeen years ago, Clive Humby declared that ‘data is the new oil’. Thanks to its succinctness, this quote has been borrowed by countless communicators seeking to convey data’s huge strategic value.

However, since Humby’s oil analogy was coined, the technology landscape and our ability to innovate using data science and analytics have evolved significantly.

Today we benefit from the work done to define open data and interchange standards and to build easy-to-use, low-code, cloud-based analysis software. These widely accessible tools and techniques have helped us curate and spot patterns from many data sources, allowing us to create hugely valuable single sources of truth.

Unlike oil, data is not a finite resource: it can be enriched and reused without being depleted, and it can grow in volume and value exponentially as it is processed, analysed and shared. As recent data breaches have shown, it can also spread just as easily without authorisation.

Data no longer flows around your organisation like oil in an engine. It has many ports of entry and flows in, out and around customers, employees, endpoints, homes, suppliers and regulators. Data’s value is not exclusively in making an individual organisation work better; as a report by Thinkers50 points out, its role is increasingly to power digital ecosystems.

An oil comparison oversimplifies data’s pervasive nature. The more data is shared between organisations, service providers, supply chains and payment platforms – especially in real time – the more valuable it becomes. That is, until its quality is compromised.

Furthermore, an oil analogy downplays data’s central role in enabling artificial intelligence and the increasing number of bias, ethical and fairness considerations AI exposes.

Will data’s next two decades be more Uranium Yellowcake than Black Gold?

So, rather than as oil, F-TAG believes organisations need to start thinking about data in terms of its potential to create positive value as well as harm or damage.

Yellowcake is uranium in its rawest form, before it undergoes enrichment; like data, it is mined and brought together from many sources. Like yellowcake, data is a raw material with significant potential energy that needs careful processing, refinement and handling to unlock its value.

The many databases and lakes organisations fill with single-source-of-truth data have parallels with yellowcake. Before it is fully refined, uranium yellowcake isn’t intensely dangerous. Like our corporate and personal data stores, it is safe to handle – but only if it is managed responsibly.

Once enriched, however, uranium – like data – can impact individuals, industries, economies and society globally. That enrichment, like data processing, can happen without any consideration of the consequences.

Yellowcake uranium is subject to regulations, guidelines, segregation of duties, access controls and ethical considerations to ensure responsible and safe use. Like leaks of uranium, data leaks can cause extensive reputational damage, the implications of which might not be known for years to come. And while oil spills may be easier to spot, data is far more challenging to contain, especially across virtual infrastructures.

Comparing data with yellowcake highlights the transformative power of data when accessed, processed, and utilised responsibly. It also makes us think more deeply about the negative consequences of uncontrolled, fake and unregulated data – particularly in an AI-driven world.

The responsibility of organisations

The rate at which we acquire and store data is increasing exponentially: analysts predict it will have grown worldwide to almost 200 zettabytes by 2025. At a minimum, as set out in the Data Protection Act 2018, all organisations have a regulatory responsibility to protect customer and employee data from being changed or misused. Data breaches don’t happen solely by accident, so assessing and understanding the risks and liabilities is crucial.

In recent public disclosures about leakage incidents, some organisations have been quick to blame junior staff members for the exposures. Yet according to Harvard Business Review’s ‘What’s Your Data Strategy?’ (2017), 70% of employees had access to data they should not have, and more than 80% of analysts’ time was spent simply discovering and preparing corporate data.

To deliver customer service and stay competitive during COVID-19, businesses needed to give workers access to data, many of them working remotely. After such a heightened period, many companies have likely lost track of who was given access to what and are unlikely to have revoked unnecessary permissions.

This, together with increased use of cloud, hyperscalers, and SaaS applications, means we can no longer point to where our data is physically stored, nor are we absolutely certain who can see it. We cannot rely solely on perimeter-style security to protect data stored in large spreadsheets or databases.

In this form, without segregated access controls, our data is the perfect yellowcake for enrichment using advanced analytical tools or for training large language models (LLMs), for good or ill. It is therefore easy to appreciate why the number of monthly incidents and records breached keeps rising.
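As a concrete illustration of moving beyond perimeter thinking, the sketch below filters records field by field according to role before anyone sees them. It is a minimal sketch only: the roles, field names and sample record are hypothetical assumptions, not a prescribed schema.

```python
# A minimal sketch of field-level access control, one way to avoid relying
# on perimeter security alone. Roles, field names and the sample record
# are hypothetical illustrations.

ROLE_VISIBLE_FIELDS = {
    "support_agent": {"customer_id", "name", "open_tickets"},
    "payroll": {"employee_id", "name", "salary", "bank_sort_code"},
    "analyst": {"customer_id", "region", "order_total"},  # no direct identifiers
}

def redact_for_role(record: dict, role: str) -> dict:
    """Return only the fields this role is authorised to see."""
    allowed = ROLE_VISIBLE_FIELDS.get(role, set())
    return {k: v for k, v in record.items() if k in allowed}

record = {"customer_id": "C123", "name": "A. Person", "region": "NI",
          "order_total": 42.0, "bank_sort_code": "00-00-00"}
print(redact_for_role(record, "analyst"))
# {'customer_id': 'C123', 'region': 'NI', 'order_total': 42.0}
```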

Making privacy by design your default

Adhering to the largely common-sense principles of privacy by design and by default, as outlined in GDPR, is critical if organisations are to become confident they know who is accessing or processing their data. Or, to use our analogy, consuming hard-earned yellowcake.

When designing processes and systems, organisations must make provisions to ensure privacy. They should also assess whether bias is present in data sets. Ideally, they should do these things by default.

Every certified data protection officer, data owner and system owner must be held accountable for data quality standards and risk assessment. They must also assess the balance between privacy and access, ensuring that data management practices remain fit for purpose within their organisations and their partners. The design of data governance processes, use of technology, ongoing training, culture change and employee support are vital to achieving this.

Organisations must know who is authorised to access, augment and share their data. They must protect themselves from overly broad access, incorrect sign-offs and weak segregation of duties, all of which can lead to disclosures of personally identifiable data. The Police Service of Northern Ireland is a powerful example of how not practising ‘data minimisation’ – giving staff too much access – can turn data from corporate yellowcake into digital uranium in the blink of an eye.
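One simple way to start reclaiming that lost ground is a periodic access review. The sketch below flags grants that have never been used, or not used recently, so they can be re-justified or revoked; the grant records and the 90-day threshold are assumptions for the example, not a recommended standard.

```python
# A minimal sketch of a periodic access review: flag permissions that have
# not been exercised recently so they can be re-justified or revoked.
# The grant log and the 90-day staleness threshold are hypothetical.
from datetime import date, timedelta

STALE_AFTER = timedelta(days=90)
TODAY = date(2024, 1, 31)

grants = [
    {"user": "jsmith", "dataset": "hr_payroll", "last_used": date(2023, 4, 2)},
    {"user": "apatel", "dataset": "crm_customers", "last_used": date(2024, 1, 20)},
    {"user": "jsmith", "dataset": "crm_customers", "last_used": None},  # never used
]

def stale_grants(grants, today=TODAY, threshold=STALE_AFTER):
    """Yield grants that were never used, or not used within the threshold."""
    for g in grants:
        if g["last_used"] is None or today - g["last_used"] > threshold:
            yield g

for g in stale_grants(grants):
    print(f"Review: revoke {g['user']} on {g['dataset']}?")
```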

Have we taken our foot off the GDPR pedal?

Five years ago, when GDPR became law, many organisations had a duty to appoint a chief data protection officer (CDPO), and many others did so as a matter of good practice. CDPOs were there to help executives understand data and how their organisations should collect and share information. The CDPO also has responsibility for responding to freedom of information (FOI) requests and subject access requests (SARs).

Recent enforcement action taken by the Information Commissioner’s Office (ICO) against organisations indicates that data is being collected and shared without the proper controls, with some, such as TikTok, receiving multi-million pound fines for misusing personal data. As technology and its application advance, so does the attention of the ICO, which in October 2023 published its intention to take action over Snap’s failure to properly assess the data protection risks that the generative AI chatbot embedded in its app, ‘My AI’, posed to children and other users.

But despite the work done by CDPOs over the years, and the risk of fines from the ICO, the statistics show a worrying trend: we might be in danger of relaxing our data protection efforts. As we have said, how we use data continues to evolve. As we accumulate more and more of it, there is mounting tension between data sharing and data protection as we strive to digitise services to fuel business efficiency and growth. It is time to reflect on whether collecting too much data is becoming a liability.

Balancing the needs of organisations, shareholders, and regulators in a continually evolving context takes time and effort. For this reason, it is vital that organisations are clear about the purpose their data serves and regularly review how, when, and by whom it can be consumed, shared, safely archived, and disposed of.

The evolving role of CDPOs

Many executives and data functions have been instrumental in establishing analytical capabilities within organisations. However, as we increasingly rely on digital automation, AI, and data exchange across partners and ecosystems, CDPOs must champion data minimisation practices and data access change management. They are well positioned to understand where trade-offs are needed between standardisation, control and flexible access, and to collaborate with senior data owners. They also play an essential role in risk assessing data governance processes and ensuring they remain fit for purpose given an organisation’s appetite for exposure to data risks and liabilities.

In turn, CEOs and boards must listen to them and review status updates on data strategy implementation. They should also consider introducing an information dashboard to regularly review and discuss data leakage near-miss events, control vulnerabilities, governance issues, and security violations. Companies must take the time to train boards and executives in data loss scenarios and what to do if the inevitable happens and there is a breach, cyber-attack or ransomware event.
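Such a dashboard need not be elaborate to be useful. As an illustration, the sketch below aggregates a hypothetical incident log into the monthly counts a board pack might show; the event categories and entries are assumptions, not a prescribed taxonomy.

```python
# A minimal sketch of the figures a board data dashboard might aggregate
# from an incident log: near misses, control vulnerabilities and security
# violations per month. The categories and log entries are hypothetical.
from collections import Counter

events = [
    {"month": "2024-01", "type": "near_miss"},
    {"month": "2024-01", "type": "security_violation"},
    {"month": "2024-01", "type": "near_miss"},
    {"month": "2024-02", "type": "control_vulnerability"},
]

by_month = Counter((e["month"], e["type"]) for e in events)
for (month, etype), count in sorted(by_month.items()):
    print(f"{month}  {etype:<25} {count}")
```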

Recommendations

BCS F-TAG believes a timely reminder of the risks of poor data practices is overdue. Despite everyone’s efforts, a lack of clarity regarding data access and usage persists among employees and executives. While fines have their place, regulators, industry bodies and government agencies must also reinforce organisations’ responsibilities to protect customer and employee data. The ICO recently published the lessons learned from its reprimands, and those conclusions point to actions all organisations can reasonably take. In a similar vein, we recommend the following 12 data management practices:

  1. Communicate your data purpose: An explicit, concise declaration outlining the intended use and objectives of collecting, processing, or storing data. It can be part of your privacy policy or data protection framework. This statement will inform all stakeholders and individuals what you are collecting, under what legal or ethical grounds, why it is collected and how it will be used.
  2. Publish a data strategy: Show how your data principles, governance and initiatives align with business goals. Describe how you will use data to inform decision-making, reporting, etc., to deliver your obligations and objectives.
  3. Data ownership: Assign accountability and responsibility for data to specific individuals or groups within the organisation. This helps ensure that data is accurate, reliable and secure, and that measurable value can be derived from its use. Assign data owners for critical master and business data classes, starting, for example, with customers, employees and finance.
  4. Data governance: Document your processes for managing the availability, usability, integrity and security of the data used in your organisation. Establish policies and procedures for data management, assigning access, and deciding quality, architecture and security standards.
  5. Transparent data classification and handling: Create a formal policy document by data class that sets out collection methods, retention periods, quality standards and usage. Reinforce this by communicating general and role-based access controls and information security classifications. Regularly review access rights to repositories, systems, and data. Train employees, with regular refreshers, to recognise and question risky practices or unauthorised data access.
  6. Data minimisation: Define, collect, process and share only the data necessary for a specific purpose, to reduce aggregation risk and security and privacy concerns. Consider minimising access using segregation of duties to prevent oversharing, or create sign-off points to minimise risk, for example before releasing large amounts of confidential or critical data (the short sketch after this list shows minimisation working alongside the de-identification in practice 7).
  7. Data sharing policies: Always consider the impact on individuals and society, and implement anonymisation and de-identification strategies, internally and externally to your organisation, to ensure the privacy and confidentiality of individuals; this includes test data.
  8. Ethical and unbiased data handling: Prioritise equity, inclusivity and fairness. Ensure you have fair data practices that avoid bias and discrimination in data collection, handling and system testing across customer and employee journeys, marketing and product lifecycles.
  9. Responses to requests: For subject access requests (SARs), be clear and transparent about data collection needs and usage, particularly informed consent and respect for individuals’ privacy rights. Be clear on selecting, signing off and sharing the appropriate data within the required period. Train the staff expected to respond to SARs and follow-up requests, such as ‘the right to be forgotten’, to act within the required timescales, and monitor and track the number of requests and your progress with responses. For freedom of information (FOI) requests, ensure a systematic process that delivers compliance and transparency at all stages, from acknowledging receipt to providing appeal information. If any part of a request is withheld due to exemptions, clearly explain the reasons for non-disclosure, referencing the relevant FOI exemption.
  10. Compliance and security: Always ensure compliance with the latest data protection regulations. Encourage a habit of ongoing review and improvement in data management practices, points of entry and exit, and access controls (including virtual infrastructures), and continue to refine joiners, movers, leavers (JML) processes and permissions. Evolve technology and security to adapt to changing threats, near misses and lessons learned. Stay abreast of the latest security measures and practices, whether manual or automated, such as multi-factor authentication (MFA), and use them to control access and to ensure you know who you are transferring your yellowcake, and data, to.
  11. Data literacy and education: Promote data literacy appropriately to all stakeholders, and provide support channels where employees and customers can speak out if they witness data security risks and issues.
  12. Data for positive impact: Use data and technology to create a positive, inclusive impact that enriches society.
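As promised in practice 6, here is a minimal sketch of minimisation and de-identification working together: only the fields a stated purpose requires are shared, and direct identifiers are replaced with salted pseudonyms before release. The field allow-list, salt handling and sample record are illustrative assumptions; real pseudonymisation needs a properly managed secret and a documented re-identification risk assessment.

```python
# A minimal sketch of practices 6 and 7 combined: share only purpose-
# required fields (minimisation) and replace direct identifiers with
# salted pseudonyms (de-identification). All names and values here are
# hypothetical.
import hashlib

SHARE_FIELDS = {"customer_id", "region", "signup_year"}  # purpose-specific allow-list
SALT = b"replace-with-a-managed-secret"  # assumption: keep in a secrets manager

def pseudonymise(value: str) -> str:
    """Derive a stable, non-reversible token from an identifier."""
    return hashlib.sha256(SALT + value.encode()).hexdigest()[:12]

def minimise_and_deidentify(record: dict) -> dict:
    """Drop unneeded fields, then pseudonymise the remaining identifier."""
    out = {k: v for k, v in record.items() if k in SHARE_FIELDS}
    out["customer_id"] = pseudonymise(out["customer_id"])
    return out

record = {"customer_id": "C123", "name": "A. Person", "email": "a@example.com",
          "region": "NI", "signup_year": 2021}
print(minimise_and_deidentify(record))  # name and email never leave the estate
```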

We recommend adopting these 12 data practices and proactively assessing your progress with them as you refine and adapt your approach.

In closing

We hope this paper reminds organisations, individuals, data and IT professionals of the data management lifecycle, security principles and classifications that must be adopted and maintained.

Progressive, robust and ongoing training on data use, policies, and governance will help prevent data breaches and unauthorised access and, to continue with the yellowcake analogy, prevent your data from becoming a ‘Destroyer of Worlds’ in someone else’s hands.

Data remains an organisation’s most valuable asset; that hasn’t changed. Any time spent understanding it, and how it delivers your business strategy, will be time well spent.

Data management activities should focus on creating a data leadership culture with role clarity, incentivised governance, optimised enterprise architecture and continued education in data protection and use. Organisations must keep the balance between data sharing and security under continual review, ensuring it stays within their risk appetite and that their desire to maximise data’s value is consistent with regulations.


We all need to take on board the unintended consequences of publishing data. Lean data management practices, emphasising data minimisation, can help mitigate aggregation risks and prevent ungoverned data sharing.

Consider regular reports or KPIs that help you understand how your data flows in, out, and around your organisation and provide insights into how your yellowcake is consumed at all levels.
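Such a report can start very small. The sketch below simply tallies records entering and leaving by channel from a hypothetical transfer log, so unusual egress stands out; the channels and figures are illustrative assumptions.

```python
# A minimal sketch of a data flow report: tally records entering and
# leaving the organisation by channel. The transfer log is hypothetical.
from collections import defaultdict

transfers = [
    {"direction": "in",  "channel": "web_forms",    "records": 1200},
    {"direction": "out", "channel": "payroll_saas", "records": 450},
    {"direction": "out", "channel": "partner_sftp", "records": 98000},  # worth a look?
]

totals = defaultdict(int)
for t in transfers:
    totals[(t["direction"], t["channel"])] += t["records"]

for (direction, channel), n in sorted(totals.items()):
    print(f"{direction:<4} {channel:<15} {n:>8,}")
```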

To summarise: if ever there was a time to get the data management basics in good order, it is now. This work will help organisations build strong data foundations as they invest to accelerate the benefits of automation and AI. Failing to do it now will lead to impacts further down the line through biased data: rather than garbage in, garbage out (GIGO), you will have faster garbage in, garbage out (FGIGO), causing significant operational risk as data is used in more sophisticated modelling and automated processing.

Case studies in failure

In 2023, according to IT Governance, the 2,814 most significant data incidents resulted in the breach of 8,214,886,660 records. In December 2023 alone, significant breaches and cyber-attacks exposed 2,241,916,765 records, the consequences of which might not be known for some time.

23andMe’s data breach, which began in April 2023, is one of the most significant and severe incidents of biometric data theft in history. The records were stolen over several months, and the breach potentially exposes almost 7 million people to identity theft, discrimination and other risks. The data, which included users’ profiles and health and ancestry information, raises questions about the regulations used to ensure the privacy and security of personal testing data.

The August 2023 Police Service of Northern Ireland data breach was also a salutary reminder of the unintended and far-reaching consequences that breaches of personal information can have. The recently published review of that data loss event (2023) made it clear ‘that a failure of the Police Service of Northern Ireland to recognise data as both a corporate asset and liability together with a siloed approach to information management were strong contributory factors to the breach’.