The scraped data of 2.6 million DuoLingo users was leaked on a hacking forum, allowing threat actors to conduct targeted phishing attacks using the exposed information.
“Scraped” data suggests that it’s data available on public profile pages. However, the article also says the dump is a mix of public and non-public info. So which is it, scraped or not? It’s an important distinction, because data collection by scraping is technically not a breach.
Take this with a pinch of salt, but what I’m gathering is that it’s essentially just people’s public profiles, except that the Duolingo API also exposes users’ email addresses (and possibly other info) that aren’t normally displayed as part of the user’s public profile in the app.
In essence, they’re exposing more data than they probably should be and users were not really aware that data was being made public - that’s why people are upset about it.
Ok, this makes sense – in which case the API should not be exposing data that isn’t otherwise available on the public profile, so that is significant.
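To make that concrete, here is a rough sketch of what "only expose what the public profile shows" could look like on the server side. This is not Duolingo's actual code or schema: the field names and the `to_public_profile` helper are made up for illustration. The idea is an allow-list, so anything not explicitly marked public (like an email address) never leaves the API.

```python
# Hypothetical field names for illustration only; not Duolingo's real schema.
PUBLIC_FIELDS = {"username", "name", "streak", "learning_language"}


def to_public_profile(user_record: dict) -> dict:
    """Return only the fields that the public profile page is meant to show."""
    # Allow-list rather than block-list: a newly added private field
    # stays hidden unless someone deliberately adds it to PUBLIC_FIELDS.
    return {key: value for key, value in user_record.items() if key in PUBLIC_FIELDS}


# Example internal record (made-up data).
record = {
    "username": "lingo_fan",
    "name": "Alex",
    "streak": 42,
    "learning_language": "es",
    "email": "alex@example.com",
}

public_view = to_public_profile(record)
```

With this approach, `public_view` contains the username, name, streak and language, but no `email` key, so a scraper hitting the endpoint gets nothing beyond what the profile page already shows.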