The risks of centralising data

As federated, centralised identity systems have proliferated, identity and personal data have become one and the same, placing both consumers and organisations at risk.

May 2, 2024

The modern digital economy has driven an unprecedented centralisation of data, creating vast, internet-connected databases that act as both irresistible targets for malicious actors and systemic points of failure.⁴ When these repositories contain Personally Identifiable Information (PII), their compromise can lead to immediate and devastating consequences, including identity theft, financial fraud, and personal endangerment.⁵ The very architecture of these centralised systems—storing millions of records in a single logical location—means that a single successful intrusion can yield a disproportionately massive reward for attackers, a reality proven by the relentless cadence of large-scale data breaches affecting corporations and government agencies alike.¹ The security measures protecting these "honeypots" must be flawless, yet they are pitted against a determined and ever-evolving threat landscape, making a breach not a matter of if, but when.⁶

Beyond the immediate risk of a direct breach, centralisation creates a more subtle but equally pernicious threat known as correlation risk. While organisations may diligently protect overtly sensitive PII such as passport numbers or financial details, they often collect a wide array of seemingly innocuous data points: location check-ins, purchase histories, website browsing habits, or even smartwatch heart rate data. Individually, these data points may appear anonymous. However, when aggregated within a single, vast database, they can be cross-referenced and correlated (by malicious insiders, by external attackers who have gained access, or even by the data controller itself) to de-anonymise individuals with alarming accuracy.² A landmark study showed how supposedly anonymous Netflix movie rating data could be correlated with public IMDb ratings to re-identify specific users.⁷
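The mechanics of this kind of correlation can be illustrated with a minimal sketch. It is not the method used in the cited study, which relies on a more robust statistical scoring approach; it simply counts exact overlaps between an "anonymised" release and public reviews. All names, titles, ratings and the `link_records` function are hypothetical and exist only for illustration.

```python
# Hypothetical "anonymised" release: real user IDs replaced by opaque tokens.
anonymised_ratings = {
    "user_a91": {("Film A", 5), ("Film B", 2), ("Film C", 4), ("Film D", 1)},
    "user_c07": {("Film A", 3), ("Film E", 5), ("Film F", 4)},
}

# Hypothetical public reviews posted elsewhere under real names.
public_reviews = {
    "Alice Example": {("Film B", 2), ("Film C", 4), ("Film D", 1)},
    "Bob Example": {("Film E", 5), ("Film F", 4), ("Film G", 2)},
}


def link_records(anonymised, public, min_overlap=2):
    """Map each token to a public name when exactly one public profile
    shares at least `min_overlap` (title, rating) pairs with it."""
    links = {}
    for token, ratings in anonymised.items():
        # Count identical (title, rating) pairs for every public profile.
        scores = {name: len(ratings & reviews) for name, reviews in public.items()}
        best = max(scores.values(), default=0)
        winners = [name for name, score in scores.items() if score == best]
        # Only link when the best match is strong enough and unambiguous.
        if best >= min_overlap and len(winners) == 1:
            links[token] = winners[0]
    return links


if __name__ == "__main__":
    print(link_records(anonymised_ratings, public_reviews))
```

In this toy data, neither dataset contains anything overtly sensitive, yet two or three matching ratings are enough to attach a real name to each token, and with it every other rating held under that token.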

This risk is magnified because data is rarely static. Datasets from different sources, often acquired through mergers, third-party agreements, or data brokers, can be combined.⁸ A user's seemingly anonymous activity on one platform can be linked to their real-world identity from another, creating a composite "super-profile" without their explicit knowledge or consent. This process of re-identification can reveal intimate details about a person's life, beliefs, and vulnerabilities, transforming disparate, non-sensitive data points into a highly invasive and detailed personal dossier.³ The fundamental problem is that as datasets grow and are linked, the possibility of identifying an individual from a small number of unique data points approaches certainty, making the very concept of "anonymous data" in a centralised system a dangerously flawed assumption.
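The effect of combining attributes can be sketched with a small, entirely hypothetical dataset. The records, attribute names and the `unique_fraction` helper below are assumptions made for illustration; the point is only that the share of people who are unique in the data tends to rise as more "non-sensitive" attributes are linked together.

```python
from collections import Counter

# Hypothetical records holding only seemingly non-sensitive attributes.
records = [
    {"postcode": "AB1", "birth_year": 1980, "gender": "F"},
    {"postcode": "AB1", "birth_year": 1980, "gender": "M"},
    {"postcode": "AB1", "birth_year": 1975, "gender": "F"},
    {"postcode": "CD2", "birth_year": 1980, "gender": "F"},
    {"postcode": "CD2", "birth_year": 1980, "gender": "F"},
    {"postcode": "CD2", "birth_year": 1991, "gender": "M"},
]


def unique_fraction(rows, attributes):
    """Fraction of rows whose combination of `attributes` matches no other row."""
    if not rows:
        return 0.0
    keys = [tuple(row[a] for a in attributes) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


if __name__ == "__main__":
    for attrs in (
        ["postcode"],
        ["postcode", "birth_year"],
        ["postcode", "birth_year", "gender"],
    ):
        print(attrs, round(unique_fraction(records, attrs), 2))
```

With this sample, no record is unique on postcode alone, a third are unique once birth year is added, and two thirds are unique once gender is added as well. Real datasets behave differently in detail, but the direction of travel is the same.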

This core problem is at the heart of our research into personal data, especially as it relates to identity: how can data be made useful to a relying party without placing either the relying party or the subject at risk? Leaving this problem unsolved has made delivering on the seven Laws of Identity almost impossible, because organisations could not benefit from implementing them.

¹ Identity Theft Resource Center. (2025). 2024 Annual Data Breach Report.

² Ohm, P. (2010). Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization. UCLA Law Review, Vol. 57.

³ Information Commissioner's Office (ICO). Anonymisation, pseudonymisation and privacy enhancing technologies guidance.

⁴ World Economic Forum. (2024). The Global Risks Report 2024.

⁵ UK National Cyber Security Centre (NCSC). The cyber threat to UK business.

⁶ Spitzner, L. (2003). Honeypots: Tracking Hackers. Addison-Wesley Professional.

⁷ Narayanan, A. & Shmatikov, V. (2008). Robust De-anonymization of Large Sparse Datasets. Proceedings of the 2008 IEEE Symposium on Security and Privacy.

⁸ Federal Trade Commission (FTC). (2014). Data Brokers: A Call for Transparency and Accountability.
