235 Million Instagram, TikTok and YouTube User Profiles Exposed In Massive Data Leak (forbes.com) 19
An anonymous reader quotes a report from Forbes: The security research team at Comparitech today disclosed how an unsecured database left almost 235 million Instagram, TikTok and YouTube user profiles exposed online in what can only be described as a massive data leak. The data was spread across several datasets; the most significant being two coming in at just under 100 million each and containing profile records apparently scraped from Instagram. The third-largest was a dataset of some 42 million TikTok users, followed by just under 4 million YouTube user profiles.
Comparitech says that, based on the samples it collected, one in five records contained either a telephone number or email address. Every record also included at least some, sometimes all, of the following information: Profile name; Full real name; Profile photo; and Account description. Statistics about follower engagement, including: Number of followers; Engagement rate; Follower growth rate; Audience gender; Audience age; Audience location; Likes; Last post timestamp; Age; and Gender. "The information would probably be most valuable to spammers and cybercriminals running phishing campaigns," Paul Bischoff, Comparitech editor, says. "Even though the data is publicly accessible, the fact that it was leaked in aggregate as a well-structured database makes it much more valuable than each profile would be in isolation," Bischoff adds. Indeed, Bischoff told me that it would be easy for a bot to use the database to post targeted spam comments on any Instagram profile matching criteria such as gender, age or number of followers. The data appeared to have originated from a company called Deep Social, which was banned by both Facebook and Instagram in 2018 after scraping user profile data. The company was wound down sometime after this.
The researchers reached out to Deep Social, which then forwarded the disclosure to a Hong Kong-registered social media influencer data-marketing company called Social Data. Social Data shut down the database about three hours after the researchers' initial email. "Social Data has denied any connection between itself and Deep Social," reports Forbes, citing Comparitech.
Paint me surprised (Score:1)
not
Ban Organized Surveillance (Score:2)
This shit needs to stop. Organized surveillance and the creation of databases of PII belonging to other people need to be illegal.
Nobody else should be selling your personally identifiable information, and no business, government, organization, or individual should have the right to create such massive stalker databases.
That ship has sailed. (Score:3)
I agree with you in principle, but I think the reality is that there is simply too much incentive to do this sort of tracking for it to stop. There are too many very wealthy and well-connected people who have a vested interest in such practices continuing for any grassroots political movement against it to succeed.
We MIGHT see some more regulation around data handling if enough rich-and-powerful people get harmed by the leaks, but that will be about it.
For our part, the smartest thing we can do is adopt li
Re: (Score:3)
You forgot "Never use actual personal information in response to security questions."
Q: What high school did you attend?
A: Beta Nebulon Colony Preparatory School, in the Sombrero Galaxy
Re: (Score:2)
Anyone who puts those as security questions needs to spend a year in jail and be hit with a massive fine for being a complete idiot. I could glean huge amounts of info about someone from a casual conversation and from looking at the archives of the local newspaper to find his mother's maiden name, and that would be enough to get me into a few of his accounts.
Re: Ban Organized Surveillance (Score:4, Insightful)
If you don't want your information to be public and aggregated into massive databases of public information, then don't post it in public.
This is really not hard.
I remember a time when I could trivially look up the address and phone number of almost anyone in America. It was compiled in a database that was even conveniently shipped to every household and left on their doorstep!
Re: (Score:1)
Re: (Score:2)
Not old enough to have heard of unlisted numbers?
The current data harvesting business model is fundamentally broken and puts it all at risk - especially when you're also required to use 2FA tying you to a physical device.
What we need is a legal requirement for ZERO KNOWLEDGE encrypted storage of all our data (such as spideroak or mega) to prevent these sorts of leaks.
Re: (Score:2)
Incorrect.
There have been many, many times when Facebook has discovered 'bugs' where people other than the intended recipients have been able to access and/or scrape data that is, by definition, NOT public.
The theoretical world that the 'we don't need privacy protections' crowd lives in doesn't have mistakes, backdoors, greed and stupidity.
What will the next generation think about all this (Score:4, Interesting)
When I first heard about Facebook, I was baffled why anyone would want to be part of it. That's me, and I don't claim any moral superiority for having a distaste for such things. But, setting aside my prejudices and biases (Or at least trying to; it's not easy), I wonder what future generations will think of Facebook, Twitter, and stuff of that ilk. Apparently the Facebook phenom is not going to be a passing fad. But will people in the future who grew up with this stuff be more savvy and hardened and less prone to the addictions and manipulations? I sure hope so.
Re: (Score:1)
What boggles my mind is that those of us who started using this tech when it was brand new had a better personal edict. The same rules taught to us in the real world as children carried over to cyberspace. Don't talk to strangers, don't share your personal info and such.
What's even funnier is that the generation who taught us Gen Xers these things are now possibly the worst offenders of sharing too much online.
I really don't know why everyone succumbed to Facebook's real name policy. I could semi tolerate its existence
Hmm - logic failure somewhere (Score:1)
And yet they have the data set and, clearly, the ability to shut it down. Just found it in an envelope in a railway carriage, did we, sir?
Social Media Companies are responsible (Score:3)
This was NOT an accident. Instagram, TikTok and YouTube purposely configured their sites in such a way that made account scraping possible. They should be fined $10/account(?) that was scraped.
It's not about cyber crime (Score:2)
> The information would probably be most valuable to spammers and cybercriminals running phishing campaigns
No, as always it's most valuable to databrokers.
There are two kinds of issues with technology:
- Incidental. These are exceptions to the rule, like leaks and hacks. Most consumers acknowledge these, but also don't give these dangers much weight since they think "It won't happen to me". Which is understandable.
- Structural. These are problems baked into the very nature of the technology itself, and of
"Exposed"? (Score:2)
From https://www.comparitech.com/bl... [comparitech.com], linked from the linked article:
> The profiles were taken from publicly viewable social media pages on Youtube, TikTok, and Instagram.
Are the profiles really "exposed" if they were public anyway?
Re: (Score:2)
I can't find any indication in the articles that they had access to any non-public data.