By Julian Weinberger, CISSP, Director of Systems Engineering at NCP engineering
Regulatory initiatives such as the EU General Data Protection Regulation (GDPR) have granted consumers powerful rights to determine how organizations collect and use personally identifiable information. Companies that hold on to personal data without consent, or that fail to employ adequate measures to protect it, may face stringent penalties.
Yet, there is one important exception. Anonymized data – information held without key details to prevent identification – is exempt from the rules.
Data in anonymized form is meant to reduce the chance of a breach or damage from its loss because it cannot be used to identify specific individuals. Received wisdom holds that with no threat to personal privacy there is no risk of punitive fines.
Anonymized data is ideal for medical trials and market research. Healthcare organizations, for example, can take patient names, addresses, and dates of birth out of digitally stored medical records to use the information for research purposes without the risk of disclosing individual identities.
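As a rough illustration of the step described above, the following Python sketch removes direct identifiers from a single record before it is handed to researchers. The field names and the sample patient are hypothetical and are not taken from any real system.

```python
# Hypothetical field names; real medical records will use different ones.
DIRECT_IDENTIFIERS = {"name", "address", "date_of_birth"}


def strip_direct_identifiers(record: dict) -> dict:
    """Return a copy of the record without the direct identifiers."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


# Made-up patient record for illustration only.
patient = {
    "name": "Jane Example",
    "address": "1 Sample Street",
    "date_of_birth": "1980-02-29",
    "postcode_area": "SW1",
    "sex": "F",
    "diagnosis": "asthma",
}

research_record = strip_direct_identifiers(patient)
print(research_record)
```

The original record is left untouched; only the copy used for research loses the identifying fields.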
It’s not just medical research that benefits from anonymized data. Transport for London recently mined anonymized mobile phone data of passengers to gather information that enabled it to create more accurate travel times and arrival estimates.
While anonymized data undoubtedly has its uses, it is far from perfect.
Deciphering the Datasets
On its own, anonymized data may reveal nothing about the individuals behind it. That changes once someone starts to cross-reference it against publicly available datasets such as an electoral roll or a national census.
Researchers at Belgium’s Université Catholique de Louvain (UCLouvain) and Imperial College London discovered this can be achieved with alarming accuracy. Their study found that an anonymized dataset containing 15 demographic attributes could be used to identify individuals in the state of Massachusetts with 99.98 percent accuracy. Considering the state population is close to seven million people, the findings are remarkable.
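To see why a handful of demographic attributes can single people out, consider the minimal Python sketch below. It is not the researchers' statistical model. It simply counts how many records in a made-up table become unique as more attributes are combined. The attribute names and values are assumptions chosen for illustration.

```python
from collections import Counter


def unique_fraction(records, attributes):
    """Fraction of records whose combination of `attributes` appears exactly once."""
    keys = [tuple(r[a] for a in attributes) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)


# Made-up records for illustration only.
records = [
    {"zip": "02139", "birth_year": 1980, "sex": "F"},
    {"zip": "02139", "birth_year": 1980, "sex": "M"},
    {"zip": "02139", "birth_year": 1975, "sex": "F"},
    {"zip": "02140", "birth_year": 1980, "sex": "F"},
    {"zip": "02139", "birth_year": 1980, "sex": "F"},
]

for attrs in (["zip"], ["zip", "birth_year"], ["zip", "birth_year", "sex"]):
    print(attrs, unique_fraction(records, attrs))
```

In this toy table the share of unique records rises from one in five with ZIP code alone to three in five once birth year and sex are added. Each extra attribute splits the population into smaller groups, which is the effect the study quantifies at far larger scale.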
In another prominent example, researchers found that publicly available anonymous data about routes taken by New York City cab drivers could be used to reveal their home addresses. The de-anonymizing process seems to be more accurate with smaller datasets – especially when cross-referenced against the right database.
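The cross-referencing described in this section can be sketched in a few lines of Python. The example below is a simplified illustration under stated assumptions, not a reconstruction of either study: the "anonymized" records, the public register, and the shared attributes (ZIP code, birth year, sex) are all invented, and the function name `link` is hypothetical.

```python
def link(anonymized, public, keys):
    """Return (name, anonymized record) pairs where exactly one public entry matches on `keys`."""
    # Index the public register by the shared attributes.
    index = {}
    for person in public:
        index.setdefault(tuple(person[k] for k in keys), []).append(person)

    matches = []
    for rec in anonymized:
        candidates = index.get(tuple(rec[k] for k in keys), [])
        if len(candidates) == 1:  # unambiguous match: the record is re-identified
            matches.append((candidates[0]["name"], rec))
    return matches


# Made-up "anonymized" research data: no names, but demographic attributes remain.
anonymized = [
    {"zip": "02139", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02140", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
]

# Made-up public register, standing in for an electoral roll.
public = [
    {"name": "A. Example", "zip": "02139", "birth_year": 1980, "sex": "F"},
    {"name": "B. Example", "zip": "02140", "birth_year": 1975, "sex": "M"},
    {"name": "C. Example", "zip": "02140", "birth_year": 1975, "sex": "M"},
]

for name, rec in link(anonymized, public, ["zip", "birth_year", "sex"]):
    print(name, "->", rec["diagnosis"])
```

Here the first record is tied to a single named person, while the second stays ambiguous because two people in the register share the same attributes. The more attributes the two datasets have in common, the fewer such ambiguous cases remain.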
European regulators have shown they are ready to issue stiff penalties to organizations that do not take proper precautions with anonymized data. Most recently, Denmark’s data protection agency fined a taxi company approximately $180,000 for failing to anonymize data properly.
Clearly, organizations cannot expect anonymization alone to protect sensitive customer information. Firms must be proactive and implement the proper security measures and technology to ensure customer privacy is safeguarded.
Encryption is one of the most reliable strategies for protecting the privacy of digital assets, especially if the organization needs to send or share them over the public Internet. Encrypted data is encoded so that it can only be read with the correct key, typically using symmetric or public-key encryption. Without that key, data protected this way is practically impossible to decipher, rendering it unintelligible to outside observers.
Encryption is essential to protect data not only in storage but also in transit. A professional, enterprise-quality virtual private network (VPN) is an extremely effective way to secure digital communications.
In summary, database anonymization is useful for storing personal information that is collected in the course of research. However, researchers cannot trust anonymization alone to keep personal data protected from third parties. Implementing a robust, enterprise-standard VPN is the best way to guarantee customers’ personal information remains fully protected at all times.
About the Author
Julian Weinberger, CISSP, is Director of Systems Engineering for NCP engineering. He has over 10 years of experience in the networking and security industry, as well as expertise in SSL-VPN, IPsec, PKI, and firewalls. Based in Mountain View, CA, Julian is responsible for developing IT network security solutions and business strategies for NCP.