IBM, and Some Other Companies Did Not Inform People When Using Their Photos From Flickr To Train Facial Recognition Systems (nbcnews.com) 105
IBM and some other firms are using at least a million images they have gleaned from Flickr to help train a facial recognition system. Although the photos in question were shared under a Creative Commons license, many users say they never imagined their images would be used in this way. Furthermore, the people shown in the images didn't consent to anything. From a report: "This is the dirty little secret of AI training sets. Researchers often just grab whatever images are available in the wild," said NYU School of Law professor Jason Schultz. The latest company to enter this territory was IBM, which in January released a collection of nearly a million photos that were taken from the photo hosting site Flickr and coded to describe the subjects' appearance. IBM promoted the collection to researchers as a progressive step toward reducing bias in facial recognition. But some of the photographers whose images were included in IBM's dataset were surprised and disconcerted when NBC News told them that their photographs had been annotated with details including facial geometry and skin tone and may be used to develop facial recognition algorithms. (NBC News obtained IBM's dataset from a source after the company declined to share it, saying it could be used only by academic or corporate research groups.)
"None of the people I photographed had any idea their images were being used in this way," said Greg Peverill-Conti, a Boston-based public relations executive who has more than 700 photos in IBM's collection, known as a "training dataset." "It seems a little sketchy that IBM can use these pictures without saying anything to anybody," he said. John Smith, who oversees AI research at IBM, said that the company was committed to "protecting the privacy of individuals" and "will work with anyone who requests a URL to be removed from the dataset." Despite IBM's assurances that Flickr users can opt out of the database, NBC News discovered that it's almost impossible to get photos removed. IBM requires photographers to email links to photos they want removed, but the company has not publicly shared the list of Flickr users and photos included in the dataset, so there is no easy way of finding out whose photos are included. IBM did not respond to questions about this process.
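The summary's mention of photos "coded to describe the subjects' appearance" implies each image carried a machine-readable annotation record. The sketch below is hypothetical: every field name and value is invented for illustration, since IBM's actual schema was not made public, but it shows the general shape of such metadata, facial landmarks and a skin-tone estimate keyed to a photo URL.

```python
# Hypothetical annotation record; all field names and values are invented
# for illustration and do not reflect IBM's actual (unpublished) schema.
annotation = {
    "photo_url": "https://flickr.com/photos/example/12345",  # placeholder
    "license": "CC BY 2.0",
    # (x, y) pixel coordinates of facial landmarks: left eye, right eye, nose
    "face_landmarks": [(102, 88), (141, 90), (120, 115)],
    "skin_tone_index": 4,  # an index on some assumed tone scale
    "head_pose_degrees": {"yaw": 12.0, "pitch": -3.5},
}

# A downstream training pipeline would consume records like this one.
print(len(annotation["face_landmarks"]))  # 3
```

The privacy complaint in the story is less about the pixels themselves than about records like this: measurements of a person's face, attached to a findable URL, compiled without the subject's knowledge.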
Which brand of Creative Commons license? (Score:5, Interesting)
There's no implication IBM did anything wrong. This is what the Creative Commons licenses are for. What's the story?
Re:Which brand of Creative Commons license? (Score:5, Insightful)
Exactly! What the fuck? "I shared something and someone viewed it, it's not supposed to happen!" YES, IT IS, DERP.
Re: Which brand of Creative Commons license? (Score:4, Insightful)
"The photos were shared but not with this use in mind." - Too bad! "Granted it is splitting hairs" - Sure is! "but it does bother some people" - Who DID IT TO THEMSELVES, sure. Who cares about them?
" Imagine if you were one of the people in the photo's background." - I don't pose for group pictures. I value my privacy to the extent that I do stuff specifically to keep it, like not sharing or appearing in photos with insta-bombers.
"Those people don't get much say when you share the photo." - Damn straight, welcome to the world. This is just a fact of living in a society where taking pictures and sharing them is generally not illegal. People are upset? Aww.
Poor little things have to try harder if they want to stay out of photos on the internet. So sorry, but the internet isn't going to forget what you tell it just because you didn't expect it to remember, or think of the outcomes ahead of time.
Re: (Score:2)
Re: (Score:1)
I bet there is some technical violation there, but to get to the point of suing them, in a reality where lawyers cost money and nitpicking is either heavily subsidized or waved aside, who's going to prove they were damaged by this?
Against alphabet, IBM, etc?
Nobody. Nobody is. There's no white knight of the law that's going to pick that scab unless there's millions of dollars in it.
Re: (Score:2)
100% this. If your photos are "public" (viewable without having to authenticate to whatever $social_media site you put them on), you should have no expectation of privacy. Like the other posters below said, I try very hard to guard where my photos are taken, and if they are, I specifically ask what is going to be done with them.
-Miser
Re: (Score:3)
If this isn't the use they wanted, the licence shouldn't be allowing it. But it does, so too bad.
Re: (Score:3)
When I shared the photo I imagined someone might use it for a hamburger advertisement.
But OMG!!! Someone used it for a hot dog advertisement! I never had this use in mind when I shared it!
Re: (Score:2)
Hamburger is perfectly fine food, but hot dog is an abomination. THOU SHALT NOT MAKE FOOD IN THE IMAGE OF DOG, thus says OC Bible.
Re: (Score:2)
Re: Which brand of Creative Commons license? (Score:2)
Oh mighty Shai-hulud
Keeper of balance
Bless the Maker and His water
Bless the coming and going of Him
May His passage cleanse the world
Re: (Score:2, Insightful)
It's the same story as yesterday. Idiots who have no idea what they're doing are outraged again. They'll go nuts again tomorrow.
Re:Which brand of Creative Commons license? (Score:4)
Uh, No.
They stay nuts continuously. You just notice it again tomorrow. But they were nuts the entire time. So they don't go nuts again. They are nuts still.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
There is no problem so great that it cannot be solved by adding more government regulation and taxes.
Re: (Score:1)
I DIDN'T UNDERSTAND WHAT I AGREED TO AND IT'S SOMEONE ELSE'S FAULT!!!11!!!11
My boyfriend took those pictures. I was drunk. I needed the money.
Re: (Score:3)
Someone shared photos on CC, and someone else used it to train an AI.
So what!
Nothing is wrong with that.
But I would be curious to know what is wrong with that in someone's imagination where that is wrong.
The trained neural network consists of matrices of connection weights. No images are in the trained network. Just 'trained' interconnection weights.
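The claim above can be made concrete. This is a minimal sketch, not any company's actual pipeline: train a tiny linear classifier on random fake 8x8 "images" with plain NumPy and inspect what the trained model actually stores.

```python
import numpy as np

# Minimal sketch: logistic regression on fake "images" via gradient descent.
rng = np.random.default_rng(0)
X = rng.random((100, 64))           # 100 fake flattened 8x8 images
y = (X[:, 0] > 0.5).astype(float)   # arbitrary binary labels

w = np.zeros(64)   # connection weights
b = 0.0            # bias
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid predictions
    w -= 0.5 * (X.T @ (p - y)) / len(y)      # weight update
    b -= 0.5 * float(np.mean(p - y))         # bias update

# The "trained network" is just these numbers; no pixels survive in it.
print(w.shape)       # (64,)
print(w.size + 1)    # 65 parameters total, vs. 6,400 training pixels
```

The point carries over to large networks: the stored artifact is the weight tensors, not the images. One caveat worth noting is that research on model inversion has shown training data can sometimes be partially reconstructed from a model's weights, so "no images in the network" is not airtight.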
Re: CC BY SA? CC BY NC? Other? (Score:2)
There's no need, because they are not republishing the photos or even making legitimate derivative works. What they used from the photos is objective data, not an artistic work built on the photos. You can't copyright objective data, only a particular representation of it. In this case, the photographer's representation is a photo, and IBM's is a neural net.
why is this a surprise now? (Score:5, Insightful)
Your photos are public. What the hell do you expect?
Someone tell these people how search engines work.
Re: (Score:3)
Re: (Score:2)
Ok, but what about pictures that you are in, taken and uploaded by someone else without your consent?
Re: (Score:3)
Ok, but what about pictures that you are in, taken and uploaded by someone else without your consent?
Standard /. IANAL response: like it or not, our current copyright system has no protections for you in that situation. Only the person who photographed you. Now, if they were invading your privacy doing so, you have grounds for recompense, but if you were in a public place and/or gave them permission to take your photo, legally, they had the right to take the photo and do what they want with it.
Re: (Score:2)
Ok, but what about pictures that you are in, taken and uploaded by someone else without your consent?
Standard /. IANAL response: like it or not, our current copyright system has no protections for you in that situation. Only the person who photographed you. Now, if they were invading your privacy doing so, you have grounds for recompense, but if you were in a public place and/or gave them permission to take your photo, legally, they had the right to take the photo and do what they want with it.
Which is all perfectly understandable. The only point I was trying to make is that the hard line message of "Well, you uploaded the photo so you are responsible" is not 100% inclusive of all situations.
Re: (Score:3)
A photo on a publicly accessible web site does not make the photo publicly available for any use, or public domain. ... moron.
Try copying a photo from the New York Times or similar and let's see how far that gets you.
Re: (Score:2)
Photography sent to a newspaper could be sold to the newspaper. The newspaper would then own that image.
The newspaper could use the image as they wanted.
The newspaper could be allowed to use the image with the rights staying with the photographer.
The photographer could then use the image in books, magazines as they wanted as it was still their image to use.
Re: (Score:2)
Exactly, and hence using the photo for AI research would be a breach of copyright ... for funk sake, work on your reading comprehension ...
Re: (Score:2)
Re: (Score:2)
"using the photo for AI research would be a breach of copyright"
No. Republishing the photos would be breach of copyright -- that's what the "copy" in the word means. Analysing the photos is no more infringing than watching a movie and publishing a review is.
Re: (Score:2)
Not sure I agree but IANAL. I believe you're not allowed to profit off of someone else's copyright, and if you're feeding it to your AI, you're likely doing so for profit.
Re: (Score:2)
I believe you're not allowed to profit off of someone else's copyright
You believe incorrectly. The rule is that you aren't allowed to profit from copyright infringement. (Of course infringement without profit is also prohibited, but that's considered a lesser offense.) Training an AI on an image is not one of the things copyright law reserves for the copyright holder, and so does not constitute copyright infringement, whether or not it's done for profit.
Ignoring the "artificial" aspect for a moment, all they're really doing here is looking at the images other people have publ
Shocked, I tell you (Score:2, Insightful)
People place images on the public internet, available to world+dog, and then express surprise and dismay that world+dog has access to the images? What's next, shock and dismay upon learning that Zuckerberg knows more about them than the NSA does?
Re: (Score:3, Funny)
Kobayashi Maru/Sexbot scenario (Score:1)
Maybe they could use my image to train sexbot AI's...
Then I would just tell them that I am Captain James T Kirk, and their synthetic panties all drop for me...
An incel can dream
Not even a harvard comma (Score:2)
Whether you love or hate the harvard comma, it is generally agreed you don't use it on two-item lists.
Re: (Score:1)
oxford comma?
Re: (Score:2)
I think what the other AC was getting at is that the construct is more widely known as an Oxford comma, not a Harvard comma.
Re: (Score:2)
An appositive that modifies (restrictive appositive) shouldn't use commas. FWIW.
PhotoSynth (Score:5, Informative)
Flickr photo sets have been used for computational workloads and data mining for well over a decade; this is hardly news.
https://www.ted.com/talks/blai... [ted.com]
It's not the "wild" (Score:5, Insightful)
It's pictures available for public conniption ("conniption" was an autocorrect error too funny to correct).
Consumption is just what model training is doing; they are not republishing the pictures in any way, just using them to train models - which do not contain any element of images they train from.
If you put your image in public, how can you be aghast someone has viewed it?
Re: (Score:2)
they are not republishing the pictures in any way, just using them to train models
Some of these companies are. I think the nVidia team might be using the same dataset for their work related to deep fakes/facial morphing/generating fake faces, and I recall scrolling last month through a Google Drive folder they had shared with 1 million photos in it that they used as their source material. Even so, the original license allowed them to do so, so these people don't really have much in the way of legal recourse.
Re: (Score:2)
I recall scrolling last month through a Google Drive folder they had shared with 1 million photos in it that they used as their source material.
Interesting, I hadn't thought they would actually share the dataset used for training so many faces as it is so large...
In that case you are probably right that the license ends up mattering, but I wonder if there is not some kind of fair-use argument to be made here since the image is used in an educational context.
Re: (Score:2, Funny)
It's pictures available for public conniption ("conniption" was an autocorrect error too funny to correct).
Consumption is just what model training is doing; they are not republishing the pictures in any way, just using them to train models - which do not contain any element of images they train from.
If you put your image in public, how can you be aghast someone has viewed it?
Have you ever tried to train a model? They can generally handle instructions like "stand still and look pretty", but more complex training can be difficult.
Re: (Score:2)
But aren't they profiting off of the use of the pictures? If so, and the pictures happen to be copyrighted, isn't that a violation?
Re: (Score:3)
Many of us are not narcissistic attention seeking whores and do not want to be in the spotlight for any reason.
(Not sure why you were down-modded?)
If that is true, you wouldn't have any photos up they could use to train, right?
Also any photos used for training, are never in the spotlight as it were.
User issue, not company issue. (Score:5, Insightful)
Although the photos in question were shared under a Creative Commons license, many users say they never imagined their images would be used in this way
Just because you lacked the creativity to consider what was possible with your data doesn't mean anything improper has happened when they do use it in such a way.
Also, if you have given away your data thinking that somehow corporations would respect you then you don't really understand what drives corporations.
The reality is that if it's profitable then a corporation will do it. It doesn't matter if it's morally repugnant, illegal or downright evil because if it's possible to make a profit then there will be a corporation that will do it. Note that being illegal typically means they will be fined which they consider a business expense.
Re: (Score:1)
Well, it DOES matter to some extent or we'd be just as bad as failed states like Venezuela, but the truth is that there was no law that anyone has identified which forbids this and the license specifically allowed that. If you don't like this, then don't license images of your face under a license that lets them do anything with it. Nobody made them put their face under a CC license.
er? (Score:5, Interesting)
Although the photos in question were shared under a Creative Commons license, many users say they never imagined their images would be used in this way.
Since when is licensing about what you "imagine"?
Re: (Score:2)
Depending on the CC license. When the license says: "Licensees may copy, distribute, display and perform the work and make derivative works and remixes based ..."
One can argue they are creating a work derived from these images, and are complying fully with the terms of the license.
On a more abstract level, training on images is simply looking at an image, pixel by pixel (its colour, and its relation to other colours).
There are no laws preventing people or companies from "Looking at pictures, and analysing t
CC-as-Foreseen license (Score:4, Funny)
Funny that I've not yet heard about the CC-as-foreseen license, which apparently billions of people have been using, in earnest, all along.
That's not covered by copyright (Score:3)
It might be governed by personality rights [wikipedia.org] - your right to control how your image is used. You could argue the model's consent is needed before using their facial geometry. But personality rights are generally concerned with control over how others perceive your image. Since there's no public perception or exploitation here, it would be an uphill argument.
AFAIK, there is no basis for prohibiting people from using things you make publicly available (your face every time you walk out in public, unless you wear a burka) to train computer algorithms. Photographers and the press have worked pretty hard to enshrine their right to record images of people in public places [aclu.org]. If we want there to be restrictions of using images of people in public places, it'll need to be a new law.
Re: (Score:1)
AFAIK, there is no basis for prohibiting people from using things you make publicly available
What if you didn't make it publicly available? What if someone else did without your consent?
Re: (Score:2)
How does 'someone else' make you appear in public without your consent?
No shit (Score:4, Insightful)
Let me guess, the whole quote should have been something like:
"None of the people I photographed had any idea their images were being used in this way, but it's all because I decided to put it on the internet with a licence that allows anyone to do anything with it, without explaining to them what I was going to do."
A prime case for Solid (Score:1)
Sir Tim Berners-Lee's new proposal, Solid [https://solid.inrupt.com/], would be a prime choice for managing the access that companies like IBM have to your data for this kind of scanning: fine-grained sharing and auditing of data in a decentralised, distributed network.
As a photographer (Score:2)
Re: (Score:2)
Just wait till you get targeted with ads because of something about your appearance. And later on, maybe some government might find and target say people in photos having semite shnozzolas for a little extra attention, or [other demographic group with obvious physical features]
Re: (Score:2)
Wait until the military version mis-identifies some target and your face was part of the training. Some lawyer is probably already drooling.
Idiot. (Score:4, Insightful)
“It seems a little sketchy that IBM can use these pictures without saying anything to anybody,” he said.
It seems a little sketchy that this photographer didn't explain to the subjects that he was going to post their image online with a licence that allows anyone to do anything with it for any reason.
Re: (Score:3)
It's up to the photographer to get a model release from the persons depicted before distributing or publishing.
But on a more fundamental level:
Can you stop a person from looking at many pictures of you and people similar to you that are posted online?
Can you stop a person from looking at many pictures of you in order to learn to recognize your face from any picture as "Anonymous Person 1"? Not really.
Can you stop a person from looking at many pictures of you, and people similar to you, in order to recognize
Re: (Score:2)
And in both cases, sketchy != illegal, and that's all that really matters.
Why creative commons? (Score:3)
The fact that they only used creative commons images suggests there's an actual legal issue with proprietary images, but why? If I save an image from a website to my hard drive, without sharing it, does that make me a criminal? I've been training my brain on face recognition with proprietary images for decades. I've even occasionally indirectly made money from the viewing of proprietary images, as has everyone else.
Should I pay a royalty every time I imagine a proprietary image I've previously seen?
Re: (Score:3)
If you save an image from a website to your hard-drive for later retrieval -- technically yes, that is copyright infringement.
Which is one reason many websites require a "right to copy, distribute, and transform" anything you upload: computers fundamentally require copying (to memory, to hard disk, to CPU) and transforming (different formats for storage and display).
Copyright law needs fair-use exemptions to allow for this to work.
Unfortunately, many lawsuits have been won in the US for computers mere
Re: (Score:2)
The fact that they only used creative commons images suggests there's an actual legal issue with proprietary images, but why? If I save an image from a website to my hard drive, without sharing it, does that make me a criminal? I've been training my brain on face recognition with proprietary images for decades. I've even occasionally indirectly made money from the viewing of proprietary images, as has everyone else.
Should I pay a royalty every time I imagine a proprietary image I've previously seen?
I've asked this question of at least three different IP lawyers. And for once I got exactly the same damned answer. If you aren't redistributing the images, are just using them as training data, and the images can't somehow be reproduced from the result of the training, you did not infringe copyright.
So as far as I can tell it does not really matter what the copyright status of the images are with respect to using them for machine or deep learning. It seems reasonable to me that you could rip all the _S
Violation of Canadian and WA St Constitutions (Score:2)
This activity is an express violation of the Privacy rights embedded and explicitly described in both the Canadian and Washington State Constitutions.
Period.
Re: (Score:2)
I stand by my statement.
No sympathy from me (Score:3)
""None of the people I photographed had any idea their images were being used in this way," said Greg Peverill-Conti, a Boston-based public relations executive who has more than 700 photos in IBM's collection, known as a "training dataset." "
Why are you whining? YOU explicitly made that possible. YOU had to elect for each image to be licensed under CC. If the people you photographed are upset by this, they should sue YOU.
It's not that simple (Score:3, Insightful)
Almost all responses here are along the lines of "what did you expect". But it's not that simple.
If I go up to a window in your house and photograph the inside, you don't say "well, I have no problem with that, the windows are transparent after all".
Saying "it's technically possible, so of course someone did it" makes you no better than data brokers like Cambridge Analytica who create psychological profiles based on your Facebook likes and then sell them to, well, anyone really.
Is it technically possible? Yes. Was it something the average user could have anticipated when they pressed the "I agree" button? No.
This is about norms and values. Privacy is a form of "contextual integrity". We have expectations of how much privacy we will get in different situations. People have similar expectations online.