YouTube Video-Fingerprinting Due in September
Tech.Luver writes "The Register is reporting on Google's statement to a presiding judge that video-fingerprinting of YouTube material will be ready in September. The development is required to head off a three-headed suit against the company, currently being debated in a New York City courthouse. The system will, according to Google, 'be as sophisticated as fingerprinting technology used by the Federal Bureau of Investigation.' From the article: 'As Google told El Reg in an earlier conversation, the company already has two systems in place for policing infringing content - but neither are ideal. One system allows copyright holders to notify Google when they spot their videos on the company's sites. When notified, the company removes the offending videos, in compliance with the American Digital Millennium Copyright Act. A second system uses "hash" technology to automatically block repeated uploads of infringing material.'"
Hard AI ftw (Score:5, Interesting)
Others pointed out that, no, it's not a hard AI problem to just compare some kind of checksum of the video against a set of banned checksums. That's true. But what about once people know they're using this system? They can just trivially re-encode. Perhaps add a scene break here or there, and totally mess up the fingerprint. To prevent that, it seems, you would need to solve a hard-AI problem: that is, be able to determine if an arbitrarily-encoded video appears to a human to match some copyrighted work. It would have to be robust against minor scene shortenings and lengthenings, scene breakups, color gradients laid over the video, etc.
Anyone know how difficult this program is to circumvent? (Just hypothetically -- not advocating criminal activity here.)
Amazon Mechanical Turk (Score:5, Funny)
Re: (Score:2, Interesting)
Nobody would know what the keyframes are, so it would be hard/impossible to black out that specific frame.
Re:Hard AI ftw (Score:5, Interesting)
Of course, with a little bit of coding you have a program that takes that 10-minute video, splits it into ten 1-minute videos, and uploads them. The ones that get rejected it splits into ten 6-second videos and uploads those. Rinse and repeat until you have however small a set of rejections you asked for. Then it cuts out just the necessary fragments of video (replacing them with the last good frame or something?).
Of course, that can be countered at Google's end by adding a delay to the rejection-report step, and by banning those who get lots of rejects.
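A rough simulation of that split-and-retry idea, mostly to show how many rejections it racks up (which is what a ban policy would count). Everything here is made up for illustration: `locate_rejected`, the `is_rejected` oracle, and the assumption that the flagged material sits at seconds 125 to 128 of a 600-second clip. Nothing is uploaded anywhere.

```python
from typing import Callable, List, Tuple

Span = Tuple[float, float]  # (start, end) in seconds


def locate_rejected(span: Span,
                    is_rejected: Callable[[Span], bool],
                    min_len: float,
                    parts: int = 10) -> Tuple[List[Span], int]:
    """Recursively split `span` into `parts` pieces, following rejected ones.

    Returns the rejected spans that are at most `min_len` long, plus the
    total number of rejections seen along the way.
    """
    if not is_rejected(span):
        return [], 0
    start, end = span
    rejections = 1
    if end - start <= min_len:
        return [span], rejections
    step = (end - start) / parts
    found: List[Span] = []
    for i in range(parts):
        piece = (start + i * step, start + (i + 1) * step)
        sub_found, sub_rejections = locate_rejected(piece, is_rejected, min_len, parts)
        found.extend(sub_found)
        rejections += sub_rejections
    return found, rejections


def simulated_filter(span: Span) -> bool:
    """Hypothetical oracle: rejects any span overlapping seconds 125-128."""
    return span[0] < 128 and span[1] > 125


spans, rejections = locate_rejected((0, 600), simulated_filter, min_len=6)
```

In this toy run the search needs only a handful of rejections to narrow things down, so a threshold on rejects per account would have to be fairly low to catch it, and delaying the rejection report slows each round regardless.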
Re: (Score:2)
That'll also help protect against mismatches. If my yet-another-guy-getting-hurt home vid matches The Matrix in one frame (according to whatever algorithm they use), I'm fine.
There's no way it'll match a dozen frames.
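Back-of-the-envelope version of that, assuming (unrealistically, since neighbouring frames are correlated) that each frame comparison is an independent event. The function name and the 1-in-1000 per-frame rate are placeholders, not numbers from any real system.

```python
def chance_all_frames_match(per_frame_rate: float, frames: int) -> float:
    """Probability that `frames` independent comparisons all falsely match."""
    return per_frame_rate ** frames


one_frame = chance_all_frames_match(0.001, 1)
dozen_frames = chance_all_frames_match(0.001, 12)
```

Even with a fairly sloppy per-frame matcher, requiring a dozen matches pushes the false-positive chance down by many orders of magnitude under this independence assumption.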
Re:Hard AI ftw (Score:5, Interesting)
Re: (Score:2)
"those that claim it is impossible should stay out of the way of those that are doing it"
Re: (Score:2)
Re: (Score:1)
Did you even read the summary? That's mentioned as one of the two already in place. YouTube is adding a different system.
fingerprinting video is trivial (Score:2, Informative)
The system is robust against severe degradations like low bit rate video compression, scaling, rotation, cropping, noise addition, median filter and noise removal. [...]
A 5 second video fingerprint on any segment of video content is sufficient to uniquely identify that segment.
You obviously need more than a simple re-encode to get around that, and I'm sure Google's system won't be fooled by simple tricks either.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:1)
Fingerprinting a movie is much simpler. For example, to create a useful hash for a series of frames, just throw away the color information and work on the grayscale channel only. Then reduce the size to something like 8x8 pixels, and compare these to the previous frame. Encode the difference (e.g., set a bit to one if the level goes up, and to 0 if it goes down, this gives a 256-bit number for those two frame
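A minimal sketch of the scheme described above, assuming frames arrive as plain lists of grayscale rows (values 0 to 255) at least 8 pixels on each side. The names `shrink`, `diff_bits`, and `hamming` are invented here, and treating "unchanged" as a 0 bit is my assumption. Note that an 8x8 grid yields 64 bits per pair of frames; a 16x16 grid would give 256.

```python
from typing import List

Frame = List[List[int]]  # grayscale rows, values 0..255


def shrink(frame: Frame, size: int = 8) -> List[float]:
    """Average the frame down to size x size cells, flattened row by row."""
    h, w = len(frame), len(frame[0])
    cells = []
    for cy in range(size):
        y0, y1 = cy * h // size, (cy + 1) * h // size
        for cx in range(size):
            x0, x1 = cx * w // size, (cx + 1) * w // size
            block = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    return cells


def diff_bits(prev: Frame, cur: Frame, size: int = 8) -> int:
    """One bit per cell: 1 if the level went up since `prev`, else 0."""
    bits = 0
    for before, after in zip(shrink(prev, size), shrink(cur, size)):
        bits = (bits << 1) | (1 if after > before else 0)
    return bits


def hamming(x: int, y: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(x ^ y).count("1")
```

Two clips would then be compared by the Hamming distance between their bit strings rather than by equality, so a few flipped bits from re-encoding don't break the match.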
Re: (Score:2)
Say an encoder inserts a unique watermark that can't be seen by eye but is part of the data stream. Google isn't looking for it and doesn't recognize it when the video's uploaded, so it allows the video. Somebody would then have to complain, and Google would take down the video and add it to their "banned" database. The problem is, they would then have added basically a garbage entry into their database because it only
The problem with keyframe detection (Score:2)
Pity the MPAA doesn't believe in fair use....
Re:Hard AI ftw (Score:5, Insightful)
1) Is there a way around the system? Yes.
2) Does that matter? No.
3) Why is that? This solution shows that Google is making reasonable efforts to comply with the legal issues.
The majority of folks aren't going to take the effort to circumvent these controls. Rates will drop significantly. Google can honestly say they are making every effort to comply with copyright protection. Lawsuits will go away.
Re: (Score:1)
Re:Hard AI ftw (Score:5, Insightful)
Note, this can also be applied to "kitchen knives can kill so we should ban kitchen knives." and "people can die in cars so we should ban motor vehicles"
and uh... "People who have killed a lot of people have played video games, so we should ban video games." The States need to get over the damn prohibitionist culture that's removing any sense of personal responsibility from our great nation.
Re: (Score:1)
Re: (Score:1)
Not that I wish to imply that corporations estimate that human life is worth more than money
Re: (Score:2)
Re: (Score:2)
IANAL or law student, but I'd think the "above and beyond" would work in Google's favor in a court case. They can tell a copyright holder that is suing them: 1) You never bothered to use the existing laws and just ask us to take the offending material down and 2) We're making every practical effor
Not so fast, Slick. (Score:2)
It WILL bite them in the ass for the very reason laid out in the DMCA itself:
Rates will drop significantly. (Score:2)
Re:Hard AI ftw (Score:4, Insightful)
Re: (Score:2)
Re: (Score:2)
As for image recognition, usually this is done using Fourier transforms as an edge finding algorithm. Basically, you can use a computer filter to throw out the bulk of the information, and keep only the most visually identifying features in the video. Changes in tint or timing won't affect the shapes and movements of these predominant edges.
Umm. To visualize this, imagine you take a full colour photo, and you trace it as best you
Re: (Score:2)
Re: (Score:2)
Wouldn't some sort of soft AI (expert-system, neural-net) do just as well?
I could be wrong, but doesn't "hard" AI refer to a system that is conscious?
Why would you have to be conscious to recognize movie clips?
Re:Hard AI ftw (Score:4, Insightful)
Re: (Score:3, Informative)
Re: (Score:1)
Re: (Score:2)
Off the top of my head... I would throw out the whole notion of "checksums". They aren't really applicable because their purpose is to compare exactly. Even if doing key frames, all one would have to do is lighten/darken the video and the whole che
Re: (Score:1)
E.g. if it contains copyrighted material it should get posted to an indexing site. That should bring people from all over the world. Then you tag it and get a human to watch it and check if it is copyrighted. Whereas original material is probably viewed by a small circle of friends.
Just dumbly checking the popular stuff would help a bit, but I think you need to look at referrer information, or the location of
Re: (Score:2)
instant circumvention - just add random noise! (Score:1)
But yes, it will reduce it. By how much?
I don't know. Maybe a lot.
Re: (Score:2)
Re:instant circumvention - just add noise! (Score:1)
However, I think only a little noise would be necessary. E.g., +1 or -1 brightness on one pixel per screen.
Hmmm... or maybe just different compression parameters? This changes the artifacts introduced into the video, and so the uncompressed video would look slightly different.
It's not hard AI (Score:1)
Re: (Score:2)
Re: (Score:2)
This is trivially easy to work around. Just put a description of the video at the beginning or end, or use freely available video editing software to make a scrolling image across the bottom. Heck, you could just insert a couple black frames randomly across the film. Or scale it down and put it on a background of white noise.
Checksums are horrible ways of checking non-text data.
Re: (Score:1)
Re: (Score:2)
As for the later...
"okay, okay, you *technically* passed a Turing Test, but only by having it basically ignore me and ridicule everything I did wrong."
"Sir, you were talking to your wife the whole time."
Re: (Score:1)
Google could just re-encode uploaded videos with a very low resolution (say 8*8 pixels) and use the result as a fingerprint. This is trivial to implement and makes re-encoding useless. I guess that cropping, stretching and many other modifications are detectable as well without tackling any AI problems at all.
Google is certainly able to make uploading of banned videos at least very inconvenient.
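To make the difference concrete, here is a small sketch comparing an exact hash with a coarse 8x8 fingerprint when a single pixel changes by one brightness level. It assumes a frame is a list of grayscale rows (0 to 255) at least 8 pixels on each side; `exact_digest` and `coarse_fingerprint` are names invented for this example, and the 16 gray levels are an arbitrary choice.

```python
import hashlib
from typing import List

Frame = List[List[int]]  # grayscale rows, values 0..255


def exact_digest(frame: Frame) -> str:
    """Cryptographic hash of the raw pixels: any change gives a new digest."""
    return hashlib.sha256(bytes(v for row in frame for v in row)).hexdigest()


def coarse_fingerprint(frame: Frame, size: int = 8, levels: int = 16) -> List[int]:
    """Average down to size x size cells, then quantize each cell to `levels` grays."""
    h, w = len(frame), len(frame[0])
    cells = []
    for cy in range(size):
        y0, y1 = cy * h // size, (cy + 1) * h // size
        for cx in range(size):
            x0, x1 = cx * w // size, (cx + 1) * w // size
            block = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            mean = sum(block) / len(block)
            cells.append(int(mean) * levels // 256)
    return cells


# A 32x32 test frame with a horizontal gradient, and a copy with one pixel nudged.
frame = [[8 * (x // 4) + 4 for x in range(32)] for _ in range(32)]
noisy = [row[:] for row in frame]
noisy[0][0] += 1
```

Here the SHA-256 digests differ while the coarse fingerprints come out identical. A cell sitting right on a quantization boundary could still flip, so a real comparison would presumably use a distance threshold instead of strict equality.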
Re: (Score:1)
>They can just trivially re-encode.
No. You're thinking of cryptographic hashes where a one-bit change in the input leads to a totally different signature. This wouldn't be that kind of hash. It would most likely be a collection of a lot of hashes for each video, amalgamated into one or more signatures for each video.
If I explained the basics of the problem to my eight year old and poi
Re: (Score:1)
Re: (Score:2)
No, I don't, and I don't think my statement implies that. As a matter of fact, I recall this post [slashdot.org], where I was highly critical of those who assumed Google used a crude quickfix to solve their Googlebomb problem.
I do, however, think Google's *public statements* *imply* a belief in a higher level of robustness against circumvention than they can provide, even if they are aware of the actu
You confused yourself by misstating the problem (Score:2)
(1) that the frame will compress well (making it a better match), and
(2) that you only have to look at one out of three frames in uploaded material - and even then you can ignore high-motion scenes entirely.
Combined the
Re: (Score:2)
A low-resolution DCT comparison would be able to flag frames of a video with some ease, thus fuzzily* filtering out a large amount of negatives. Then, the paid turks sort out what's a positive from what remains.
Mind you, when I say 'DCT', I'm not talking the 8x8 quanta in an MPEG4 stream, I'm talking a single DCT of the whole frame. More computationally intensive, sure, but you get a somewhat scalable data picture that, given a hash that can retain meaningful proxim
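For anyone curious what a whole-frame DCT comparison might look like, a toy sketch follows. It computes an unnormalized 2-D DCT-II directly from the definition (slow, but fine for tiny frames), keeps only the lowest few coefficients, and compares them while ignoring the DC term so a uniform brightness shift doesn't matter. The names `low_freq_dct` and `ac_distance` and the choice of a 4x4 coefficient block are assumptions made for illustration.

```python
import math
from typing import List

Frame = List[List[float]]  # grayscale rows


def low_freq_dct(frame: Frame, keep: int = 4) -> List[List[float]]:
    """Unnormalized 2-D DCT-II of the whole frame, lowest keep x keep terms only."""
    h, w = len(frame), len(frame[0])
    coeffs = []
    for v in range(keep):          # vertical frequency index
        row = []
        for u in range(keep):      # horizontal frequency index
            total = 0.0
            for y in range(h):
                cy = math.cos(math.pi * (2 * y + 1) * v / (2 * h))
                for x in range(w):
                    cx = math.cos(math.pi * (2 * x + 1) * u / (2 * w))
                    total += frame[y][x] * cy * cx
            row.append(total)
        coeffs.append(row)
    return coeffs


def ac_distance(a: List[List[float]], b: List[List[float]]) -> float:
    """Euclidean distance between coefficient grids, skipping the DC term."""
    total = 0.0
    for v, (row_a, row_b) in enumerate(zip(a, b)):
        for u, (ca, cb) in enumerate(zip(row_a, row_b)):
            if u == 0 and v == 0:
                continue
            total += (ca - cb) ** 2
    return math.sqrt(total)
```

A small distance would flag a frame as a candidate match and a large one would filter it out; where to put the threshold is exactly the part that would need tuning against real footage.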
Not that hard (Score:2)
separation of the web (Score:5, Insightful)
Re: (Score:1)
Re: (Score:2, Informative)
Re:separation of the web (Score:4, Insightful)
Re: (Score:1)
Re: (Score:3, Insightful)
Re: (Score:2)
I'm against long term copyrights, but I'm for showing basic respect to others. I think the most damaging thing to the cause of copyright reform is the childish behavior of its supporters.
Re: (Score:2)
Really, the only thieves are the powerful corporations who maintain an ironclad, if weakening, grip on what ought to be in the public domain by now anyway.
Re:separation of the web (Score:4, Interesting)
Re: (Score:2)
LOL! UR2 Dum.
You just demonstrated one of the basic reasons your generation isn't getting taken seriously. You can't write a sentence.
Now to answer what I think your point was - Most people under 22 haven't worked to create content and then seen it get used by others without compensation. And no, YouTube uploads of you singing into a hair dryer don't count as actual content. Like it or not, Hollywood movies, rock stars, and quality books
Re: (Score:2)
Whereas you can't compose a sentence properly.
I keed, I keed.
Re: (Score:2)
Now, getting back to the substance of the argument, I'm not advocating the abolition of copyright; I don't think many are. However, quite a few people are beginning to see the harm caused by excessively long copyright terms and brutal restrictions on personal use, and I suspect they'll change copyright law sooner or later.
As far as Hollywood goes, reducing c
Re: (Score:2)
How exactly is automatically removing copyrighted content that shouldn't be posted in the first place evil?
Re: (Score:2)
I believe you missed my point.
"Do no Evil" - company motto, remove all copyrighted content
vs.
Survival - being the engine people use to find things
I was trying to point out that they may be opposing forces in this case. The more they remove, the less people will use Google to search.
Re: (Score:2)
Obfuscation? (Score:3, Funny)
Two-part Protection (Score:4, Interesting)
The second part sounds more promising, but someone may be able to get around the video hashing by inserting random one-frame images, as in the movie Fight Club, or by adding overlay text or effects. If they try to hash a few selected time slices, someone will figure it out eventually. As with all digital protection, this just pushes off the inevitable. At least it will make Google look good in court, since they're attempting to comply with the requests from Viacom and the other copyright holders not to post their material.
In the end, it won't count for much. It would make more sense to add in additional protections for false or malicious takedown notices, such as adding in a $50K fine for false claims. This would at least make the big companies scrutinize the videos that they're issuing a takedown notice for.
Re: (Score:2)
My impression is that most of the unauthorized clips on YouTube are put there by fans of the shows in question. They do it because it is easy and fun, and they want to share the thing they like with others. I don't think they will continue to bother if it becomes onerous to post the clips (requiring constant editing, posting, re-editing, re-posting, etc.). This is different from P2P networks, where once someone goes to the effort of making a
As "sophisticated" as FBI fingerprinting? (Score:3, Insightful)
And since they are making the comparison... just how reliable [truthinjustice.org] are fingerprints, really?
True, a character in Mark Twain's 1893 novel Pudd'nhead Wilson tells a court
"Every human being carries with him from his cradle to his grave certain physical marks which do not change their character, and by which he can always be identified -- and that without shade of doubt or question. These marks are his signature, his physiological autograph, so to speak, and this autograph can not be counterfeited, nor can he disguise it or hide it away, nor can it become illegible by the wear and mutations of time. This signature is not his face -- age can change that beyond recognition; it is not his hair, for that can fall out; it is not his height, for duplicates of that exist; it is not his form, for duplicates of that exist also, whereas this signature is each man's very own -- there is no duplicate of it among the swarming populations of the globe! This autograph consists of the delicate lines or corrugations with which Nature marks the insides of the hands and the soles of the feet."
and ever since Mark Twain said so everyone has believed it, but that doesn't necessarily make it true.
Re:As "sophisticated" as FBI fingerprinting? (Score:4, Informative)
The Newman link is from 2001.
The judge who decided the original Llera-Plaza motion, which is discussed and critiqued in the following article, reversed himself on March 13, 2002, holding that expert evidence of a "match" was admissible. Judge Pollak had granted the Government's motion for a reconsideration that is mentioned above, and he also reopened the record to hear additional testimony for the prosecution as well as for the defense. In reversing himself in a 60-page opinion, Judge Pollak stated, in part, "In short, I have changed my mind." The Reliability of Fingerprint Evidence: A Case Report [forensic-evidence.com]
You'll find links here to many articles on Identification Evidence. For example: Phenotype vs Genotype: Why Identical Twins Have Different Fingerprints [forensic-evidence.com]
Re: (Score:2)
Dumb. Really dumb. (Score:4, Insightful)
Re: (Score:2)
In other words, I have my hopes up that we might get rid of them pretty soon.
Re: (Score:2)
Why isn't Google fighting this out in court? (Score:2)
When notified, the company removes the offending videos, in compliance with the American Digital Millennium Copyright Act.
...
The trouble with the first system is that neither Google nor the copyright holders can possibly keep up with the vast number of copyrighted videos uploaded each day.
What exactly is the compelling legal argument that spawned three lawsuits?
That GooTube isn't complying with the DMCA?
That compliance with the DMCA isn't enough?
Depending on your POV, the 'right' thing to do is either to create new filters (business), or to try and win the lawsuits (users).
Re: (Score:3, Insightful)
Take that and the fact that Google is actually a big fat cash cow with a bulls-eye on the side of it, and it becomes obvious that the best strategy is one of accommodation rather than a long, drawn-out battle that would also hurt Google's stock price because of the uncertainty it creates.
So anyway you cut it, this looks like th
Just the start (Score:5, Funny)
Re: (Score:2)
Anonymous uploads, here we come... (Score:1)
Step1. Get out your laptop & your random MAC address generator toolkit
Step2. Drive down some random street until you find...
Step3. A neighbour with unprotected WIFI (or just crack their non-WPA2 secure connection)
Step4. Carry on & upload your Simpson's episodes to Youtube.
Step5. Cause profits (loss) for Simpson's authors
NEXT!
Re: (Score:1)
Re: (Score:2)
Step3. A neighbour with unprotected WIFI (or just crack their non-WPA2 secure connection)
Cheers!
'infringement' (Score:3, Interesting)
But YouTube is a little different in that many of the things people go there for are unique or one-time events. The only way you'll ever get a chance to see them again is if you recorded them yourself, or somebody else did and you are lucky enough to find it online.
The biggest issue I have is stuff that you'll NEVER BE ABLE TO ACTUALLY BUY OR SEE AGAIN being taken down. My favorite example is Prince performing at halftime at the Super Bowl. Now, not only are the videos gone from YouTube, but also all of the comments about the videos (which IMHO are just as valuable to the community).
Taking things like this down erodes our culture and destroys valuable records of what has gone on in our lives.
Re: (Score:2)
<stupid question>I seem to remember that one could buy DVD sets of the superbowl, no? Wouldn't said DVD sets include the half-time show?</stupid question>
I only ask because
Re: (Score:2)
Isn't not being able to see Prince perform a feature?
Re: (Score:2)
Why is Google doing this? (Score:2)
Doing this is manifestly against the interest of the people who made Google what it is today. What happened to doing no evil?
Just had my first experience with this on Soapbox (Score:1)
What next? (Score:1)
Google loses Common Carrier Protection (?) (Score:2)
Re: (Score:1, Informative)
Re: (Score:1)
Um, no. A carrier passes traffic. YouTube hosts content. A big difference.
YouTube is an Internet Service Provider. Generally they are not liable for content hosted on their servers as long as they honor removal requests from copyright holders in a timely manner.
ISPs are also not required to develop copy protection technology for the benefit of copyright holders.
ISPs do have several requirements in order to maintain their ISP status, but none of these have anything to do with copyright infringement or developing software.
Faster than you can say Napster... (Score:2)
The content industry should take a lesson learned from the past. Right now they have a large concentration of people looking at grainy, low resolution video in one place. Remove that and the sites will go underground, and maybe with even better quality video which would be a real threat to their model. They should take the opportunity to promote their product - h
Stage6 ! (Score:2)
http://stage6.divx.com/ [divx.com]
The image quality is amazingly good, totally opposite to Youtube and flash video in general.
Also, and very important to me (because I multitask), it consumes about the same CPU in both cases. Probably flash video consumes a little more.
Another Consideration (Score:2)
Fair Use and the Justice Droid (Score:2)
It's like being found guilty for murder because your fingerprints were found at the scene... on some groceries you bagged for the victim at the supermarket a week earlier, but that is of no concern to the justice droid.
watermark with hue shift (Score:2)
Wide open for abuse (Score:3, Insightful)
If Google are not going to check it, what is to stop me downloading a Quicktime trailer of a movie, generating the data and submitting it to Google for blocking? It will quickly become impossible for even sanctioned videos to appear. Cultists/Scientologists will be screwed too.
As usual, media companies are being idiots. They panicked about the VCR, they panicked about P2P, they are panicking about DVRs and YouTube. In the end, new technology tends to do them good in the long run, and besides, you can't fight it.
Kadokawa holdings deal (Score:1)