Facebook 533 Million Records: Published in 2021, Breached in 2019, "Old Data"

Personal data of 533 million Facebook users was published for free on a hacker forum in April 2021. Facebook said it was "old data" from a 2019 scraping incident that it had already addressed but never publicly disclosed.

Facebook / Meta · 2021

Background

In April 2021, security researcher Alon Gal, co-founder and CTO of the cybercrime intelligence firm Hudson Rock, drew attention to the personal data of 533 million Facebook users that was freely available on a hacker forum. The data included phone numbers, email addresses, birthdays, locations, and relationship statuses. Facebook had never publicly disclosed the 2019 scraping incident.

The Attack

The data was collected in 2019 by abusing Facebook's "contact importer" feature, which let users find friends by uploading phone numbers. Attackers automated the process, submitting millions of phone numbers and harvesting the profile data associated with each match. Facebook said it had closed the loophole by September 2019. By then the data had already been harvested, and it was sold privately for months before being published for free in April 2021. From that point on, it was available to anyone with a Telegram account or access to the hacker forum.
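The pattern described above (one client uploading very large or near-sequential batches of phone numbers) is the kind of signal that a simple heuristic can flag. The following is a minimal sketch of such a check, not a description of Facebook's actual defences: the function names `sequential_fraction` and `looks_like_enumeration` and both thresholds are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List


def _to_int(number: str) -> int:
    """Keep only the digits so '+1 555-0100' and '15550100' compare equal."""
    return int("".join(ch for ch in number if ch.isdigit()))


def sequential_fraction(numbers: Iterable[str]) -> float:
    """Fraction of adjacent numbers (after sorting) that differ by exactly 1."""
    values: List[int] = sorted(
        {_to_int(n) for n in numbers if any(ch.isdigit() for ch in n)}
    )
    if len(values) < 2:
        return 0.0
    consecutive = sum(1 for a, b in zip(values, values[1:]) if b - a == 1)
    return consecutive / (len(values) - 1)


def looks_like_enumeration(
    numbers: Iterable[str],
    max_contacts: int = 5000,      # hypothetical cap on a plausible address book
    max_sequential: float = 0.5,   # hypothetical share of back-to-back numbers
) -> bool:
    """Flag uploads that are implausibly large or mostly sequential."""
    batch = list(numbers)
    if len(batch) > max_contacts:
        return True
    return sequential_fraction(batch) > max_sequential
```

A genuine address book rarely contains long runs of consecutive numbers, so a batch such as `+15550100000` through `+15550100099` would be flagged while a handful of unrelated contacts would not. A real system would combine several signals (account age, match rate, request volume) rather than rely on one heuristic.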

Response

Facebook initially described the leak as "old data" from a previously resolved issue and declined to notify affected users. Ireland's Data Protection Commission (DPC) opened an investigation, and multiple other EU regulators contacted Facebook. In November 2022 the DPC fined Meta €265 million for breaching the GDPR's data protection by design and by default requirements (Article 25) in connection with the scraping incident.

Outcome

The "old data" response was widely criticised: the data remained fully usable for identity fraud, phishing, and SIM swapping regardless of when it was collected. The case showed that GDPR enforcement can follow the public exposure of historic scraped data, not just newly breached data.

Key Takeaways

  1. Data collected years ago retains its value for fraud — "old data" does not mean harmless data
  2. API endpoints that allow bulk enumeration (contact importer, user lookup) must have rate limiting and anomaly detection
  3. GDPR scrutiny can be triggered by the public exposure of data, not just by the initial collection incident
  4. Phone number exposure enables SIM swapping attacks years after the data was collected
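To make the rate-limiting takeaway concrete, here is a minimal sketch of a per-client sliding-window limiter that a lookup endpoint could consult before answering. It is an illustration under stated assumptions, not any platform's real implementation: the class name `SlidingWindowLimiter`, the `client_id` key, and the limits in the usage note are all hypothetical, and the state is held in memory for a single process only.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within any `window_seconds` span."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,  # injectable for testing
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        """Record the request and return True, or return False if over the limit."""
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

For example, `SlidingWindowLimiter(max_requests=100, window_seconds=3600)` would let each account perform 100 lookups per hour and reject the rest. Limits keyed only on an account or IP address can be evaded by spreading requests across many identities, which is why the takeaway pairs rate limiting with anomaly detection.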