Copyright Tool Scans Web For Violations
Posted by Zonk on Tue Dec 19, 2006 11:31 AM
from the he-knows-when-you've-been-bad-or-good dept.
The Wall Street Journal is reporting on a tech start-up that proposes to offer the ultimate in assurance for content owners. Attributor Corporation is going to offer clients the ability to scan the web for their own intellectual property. The article touches on previous use of techniques like DRM and in-house staff searches, and the limited usefulness of both. They specifically cite the pending legal actions against companies like YouTube, and wonder about what their attitude will be towards initiatives like this. From the article: "Attributor analyzes the content of clients, who could range from individuals to big media companies, using a technique known as 'digital fingerprinting,' which determines unique and identifying characteristics of content. It uses these digital fingerprints to search its index of the Web for the content. The company claims to be able to spot a customer's content based on the appearance of as little as a few sentences of text or a few seconds of audio or video. It will provide customers with alerts and a dashboard of identified uses of their content on the Web and the context in which it is used. The content owners can then try to negotiate revenue from whoever is using it or request that it be taken down. In some cases, they may decide the content is being used fairly or to acceptable promotional ends. Attributor plans to help automate the interaction between content owners and those using their content on the Web, though it declines to specify how."
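The article does not say how Attributor's fingerprinting actually works. As a rough illustration only, one textbook way to spot "as little as a few sentences of text" is to hash overlapping word windows (often called shingles) and check how many of the original's hashes turn up in another page. The function names, the window size of 5, and the sample strings below are all assumptions made for this sketch, not anything taken from Attributor.

```python
import hashlib
import re


def shingle_fingerprints(text, k=5):
    """Return the set of short hashes of every k-word window in text."""
    # Lowercase and strip punctuation so trivial edits do not change the hashes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return set()
    return {
        hashlib.sha1(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()[:16]
        for i in range(len(words) - k + 1)
    }


def containment(original, candidate, k=5):
    """Fraction of the original's fingerprints that also appear in candidate."""
    source = shingle_fingerprints(original, k)
    if not source:
        return 0.0
    return len(source & shingle_fingerprints(candidate, k)) / len(source)


# Hypothetical sample texts for the sketch.
ORIGINAL = "It uses these digital fingerprints to search its index of the Web for the content."
COPIED = "Some blog says: it uses these digital fingerprints to search its index of the web for the content. Neat!"
UNRELATED = "The quick brown fox jumps over the lazy dog near the river bank today."
```

A page that embeds the sentence verbatim scores high on `containment`, while unrelated text scores zero. A real system would also need fingerprints for audio and video, which this sketch does not attempt.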
Wager (Score:4, Insightful)
Raise. (Score:4, Funny)
127.0.0.1: $ cat robots.txt
# robots.txt for 127.0.0.1
# This file is copyright 2006 by me.
User-agent: AttributorCorporationDMCABot
Disallow: *
And if they do honor robots.txt, I'll be able to sue the fuckers for infringing on my copyright, because they must have read it in order to honor it.
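Whether a crawler honors a file like this is entirely up to the crawler. Here is a minimal sketch of the check a well-behaved one might run, using Python's standard `urllib.robotparser`. The bot name is the poster's invention, `is_allowed` is a hypothetical helper, and the sketch writes the rule as `Disallow: /`, the conventional way to exclude a whole site.

```python
from urllib.robotparser import RobotFileParser

# Same idea as the file above, with the rule spelled as a path prefix.
ROBOTS_TXT = """\
# robots.txt for 127.0.0.1
User-agent: AttributorCorporationDMCABot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling `is_allowed(ROBOTS_TXT, "AttributorCorporationDMCABot", "http://127.0.0.1/index.html")` comes back false, while any other user agent is let through because no rule names it.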
Re: (Score:2)
Unless you also sell a few companies and put together a few billion as a stake to hand over to attorneys I suspect you'll fare as poorly as everyone else does.
Re: (Score:2, Insightful)
Most web sites have a copyright statement on them somewhere (even this one!). Technically speaking, if I go to that web site, my browser copies the page along with all its media content and caches it. Since many of those sites do not have a terms of service posted
Re: (Score:2)
Re: (Score:2)
how can you read it on the web then without having made a copy of it somewhere on your computer... you've pulled in a copy of it using your browser, there is now a copy of it in ram and also maybe in the cache... so you've made at least two unauthorised copies.
Re:Raise. (Score:5, Funny)
# robots.txt for 127.0.0.1
# This file is copyright 2006 by me.
User-agent: AttributorCorporationDMCABot
Disallow: *
Hahaha! You screwed up! I have your IP address now! I will send 127.0.0.1 to every company that uses the sniffer and tell them the person at that IP is an evil, evil person who exploits innocent people for their own profit and power!
Re:Raise. (Score:4, Interesting)
Re: (Score:3, Insightful)
You have to have more links than they have IP
Re: (Score:3, Informative)
Here's how to block two subnets
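The rules from this comment did not survive in the archive, so what follows is only a stand-in for the general idea, done at the application level with Python's `ipaddress` module. The two subnets are reserved documentation ranges used as placeholders, not any crawler's real addresses, and `is_blocked` is a hypothetical helper.

```python
import ipaddress

# Placeholder subnets (reserved documentation ranges), NOT real crawler addresses.
BLOCKED_SUBNETS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip):
    """Return True if client_ip falls inside any blocked subnet."""
    address = ipaddress.ip_address(client_ip)
    return any(address in subnet for subnet in BLOCKED_SUBNETS)
```

A request handler could call `is_blocked(remote_addr)` before serving a page. As the parent points out, this only helps while the crawler's addresses are known and few.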
Re: (Score:3, Informative)
Re:i don't like robots.txt anyway. (Score:5, Informative)
Let's take a fun legitimate site like, oh... Wikipedia [wikipedia.org]:
(They also disallow certain specially generated pages like Special:Random, and any of the pages which actually let you edit the site).
Let's see, what are some other sites? Ooh. Take a look at Slashdot's robots.txt [slashdot.org]! (disallows a variety of fun pages.) Microsoft's? [microsoft.com] How about whitehouse.gov [whitehouse.gov]? Google [google.com]?
Re:i don't like robots.txt anyway. (Score:5, Informative)
And dynamic content is, of course, the answer. If I'm going to put up copyrighted content in the future, I'd use one of a dozen schemes that regenerate the download link on a per-session basis. Obviously they're not going to honour robots.txt, but why are your links readable by such a basic spider? You need to:
Anyone who follows the above steps (and most sites already do most or all of this) won't be found by the spider. Period.
The only thing I can think of that this product would be useful for is to find people who have blatantly copied my website, but I'm sure you could find those people equally easily with Google.
mandelbr0t
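The post mentions schemes that regenerate the download link on a per-session basis without naming one. One common way to get that effect, sketched here under assumptions, is to sign each link with a key only the server knows and tie the signature to the visitor's session. The URL layout, `SERVER_SECRET`, and both function names are hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side key; a real site would load this from configuration.
SERVER_SECRET = secrets.token_bytes(32)


def _sign(session_id, file_id):
    """HMAC over the session and file, so a token only works for that pair."""
    message = f"{session_id}:{file_id}".encode("utf-8")
    return hmac.new(SERVER_SECRET, message, hashlib.sha256).hexdigest()


def make_link(session_id, file_id):
    """Build a download link that is only valid for this session."""
    return f"/download/{file_id}?token={_sign(session_id, file_id)}"


def is_valid(session_id, file_id, token):
    """Check a presented token in constant time."""
    return hmac.compare_digest(_sign(session_id, file_id), token)
```

A spider that scrapes the link from one session and replays it from another presents a token that no longer verifies, so the URL it indexed is useless to anyone else.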
Can't they just use google or torrent sites? (Score:3, Informative)
If users can find items they want, presumably the copyright holders could use the same methods...
Re: (Score:3, Funny)
Imagine a tool that could reliably return accurate search results for images and video. Does this exist yet? No, as one who searches the web daily for pics and video for my own sordid uses, let me assure you that it most certainly does not yet exist.
And what an horrific waste to have such a tool - if it works - for policing content for copyright violations. Bearing in mind also that such "violations" are no such thing in some
Re: (Score:2)
or else make it yourself... but then again you've got to pay the nickel for the bl00dy sheet music or tabs... and they don't half try to rip you off there as well... it's that or write your own... and then try and stop them from ripping you off...
buh (Score:5, Insightful)
Like quotations in a paper, or video snippets in an educational presentation?
Re:buh (Score:5, Insightful)
This is a scary product. Not so much because of the technology behind it, but because of how it is going to be implemented and (ab)used.
Spam obfuscation techniques suddenly useful... (Score:2)
Yeah.. good luck with that. (Score:2)
Fighting an avalanche with a snow shovel (Score:5, Insightful)
Re: (Score:2)
Today's world of copy protection is voluntary. You have the right to produce content that people want and to waive copyright on it. That's your free choice. Are you doing that? If not, then why not?
Yeah (Score:4, Interesting)
Its purpose aside, yes, it would be a fantastic thing to be able to scan the entire web and reliably identify the context and content of any specific media file type. Video, audio, image, etc. Particularly if it could identify purposely obfuscated content.
I'm in what is almost certainly a tiny minority of Slashdotters in that I actually create copyrightable material rather than only consume it. I'm again in the minority in that I think copyrights are a good thing and again in the minority in that I can separate out the purpose of copyrights and the evil actions of the legal arms of **AA companies.
Regardless, while scanning the internet for improperly used material sounds great on paper, this will probably end up being as effective as finding water with a divining rod. The current tactic of locking down things at the hardware and OS levels will get more support from the media companies, not that they seem all that good at choosing tactics when the internet is involved.
Re:Yeah (Score:4, Insightful)
Not everyone that creates content thinks that draconian enforcement attempts are a good idea, or even in the best interests of those that create content.
If your work can't survive in the marketplace, which includes the prospect of everyone on the planet getting to use it for free, then perhaps you should get some sort of more conventional day job.
The difference between a game that sells 50K and one that sells 5 Million has nothing to do with DRM.
Re:Yeah (Score:4, Interesting)
The **AA lawsuits are ridiculous, yes. But the ridiculous part is not the litigation itself, it's the laws under which the lawsuits are brought.
Re: (Score:3, Interesting)
and in little pieces, they will consume bandwidth (Score:2)
think about it, to do what they say, they have to request ALL the data they can lay their hands on,
and then chuck it.. and for comparative purposes, they'll have to do it again.
so Sony hires 'jfm copyright trackers'
and microsoft hires 'sco copyright trackers'
and mgm hires yo momma
and each of these 'ip owners' representatives have to scour the entire net, bit by byte by megabyte, for their clients.
holy crap! think about the potential
Software is in beta (Score:3, Funny)
Attributor plans to help automate the interaction between content owners and those using their content on the Web, though it declines to specify how.
And apparently being written by underpants gnomes.
Some interesting questions... (Score:5, Insightful)
Actually, can they even scan torrents without downloading the entire file? And what's to stop everyone from just blocking them from accessing their websites? Are they going to go in covertly, pretending to be actual users? I can see every legit website blocking their access as well, why pay for bandwidth to supply that?
Sure, youtube can be more efficiently attacked...but youtube has been dancing in front of the cannons since its inception, we all knew it was going to get shot eventually.
Dashboard (Score:2)
search by hash? (Score:4, Interesting)
But it looks like the real "innovation" these guys are pushing toward is fully automated filing of lawsuits. I think that was in Accelerando, which is fantastic, and which you can download free. [accelerando.org]
Re:search by hash? (Score:4, Informative)
But yeah, it might make sense for Google to become "aware" of unique content and variations of it.. but I doubt they'd ever use that openly for (aiding in) hunting down copyright infringement, simply for PR reasons.
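For context on why a plain hash search only goes so far: a cryptographic hash identifies byte-identical copies, but any change to the file, however small, produces an unrelated digest. A short sketch with made-up byte strings (the sample data and `file_hash` are illustrative only):

```python
import hashlib


def file_hash(data: bytes) -> str:
    """Hex SHA-256 digest of the raw bytes."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-ins for a media file, a straight copy, and a tweaked copy.
original = b"The quick brown fox jumps over the lazy dog"
exact_copy = bytes(original)
variation = original + b"."

same = file_hash(original) == file_hash(exact_copy)
changed = file_hash(original) != file_hash(variation)
```

Both `same` and `changed` are true here, which is why catching "variations" of a work takes some kind of similarity fingerprint instead of an exact hash.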
Re: (Score:3, Interesting)
Copying is great! (Score:2)
Re: (Score:2)
Negotiate Monetization? (Score:2)
Re: (Score:2, Funny)
I don't understand... your post seems to imply this is a Bad Thing?
Ringtone (Score:2)
If you want an album, buy it. If you want software that costs something, buy it or learn to use free/open software.
So where's the free/open alternative to an album?
Or... someone uses a popular song as the music bed in their Youtube video and the entire video clip is only 25 seconds long
A ringtone is 25 seconds long, as that's how long it takes for the call to be routed to voice mail.
or the quality is so poor that no one in their right mind would consider keeping it as something to put on their iPod.
Over a mobile phone's ringer, quality matters little.
Whatever happened to the concept of fair use and encouraging people to build upon the works of others?
Sonny Bono happened [pineight.com].
It's just a tool (Score:2, Insightful)
It all depends on how it's used. Many companies would prefer to avoid copyright-infringing material, and will take it down if its existence is pointed out to them. Many companies will simply be asking others to remove material which clearly and flagrantly breaches their copyright. This is perfectly reasonable behaviour.
Fair Use Issues (Score:2)
what's their probability of false alarm? (Score:2, Insightful)
First of all, what's their probability of a false alarm? Even if they false alarm fairly infrequently, the vast amount of content on the Web means they could easily have a flood of false alarms, in addition to whatever actual copies are found. The user of the system is then going to have to have human beings sift through that flood to identify A) what's really a copy, B) whether that copy is infringing or not, and C) if so, is it worth taking actio
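To put rough numbers on the base-rate problem raised here: every figure below is invented for illustration. The index size, the number of real copies, the false-alarm rate, and the detection rate are assumptions, not Attributor's numbers.

```python
def alert_breakdown(pages_scanned, true_copies, false_alarm_rate, detection_rate):
    """Return (true alerts, false alerts, fraction of alerts that are real)."""
    true_alerts = true_copies * detection_rate
    # Every page that is NOT a copy still has a small chance of raising an alert.
    false_alerts = (pages_scanned - true_copies) * false_alarm_rate
    total = true_alerts + false_alerts
    precision = true_alerts / total if total else 0.0
    return true_alerts, false_alerts, precision


# Hypothetical inputs: 10 billion pages, 1,000 real copies,
# a one-in-a-million false-alarm rate, and 99% detection.
true_alerts, false_alerts, precision = alert_breakdown(
    pages_scanned=10_000_000_000,
    true_copies=1_000,
    false_alarm_rate=1e-6,
    detection_rate=0.99,
)
```

With these made-up inputs, about ten thousand false alarms sit alongside roughly a thousand real hits, so only around 9% of alerts would be genuine, and someone still has to review all of them.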
Wait a minute (Score:2)
Ok, it's supposed to be unlawful to access copyrighted information on the Internet without the copyright holder's permission, right? I mean, that's the gist of the *AA's arguments right -- we hold the rights, you can't access this material unless we say so. So if the tool has to access the information to determine the copyright, wouldn't it be violating that principle? Nitpicking I know, but an interesting thought. They'd have to get dispensation from the *AAs to do it, wouldn't they?
If you value your "property" so much... (Score:2, Insightful)
...then do not put it on the Internet.
In fact, burn it to a DVD and lock it up in a safe, and never talk about it. That way nobody else will ever have access to your "intellectual property".
Scan Blocking (Score:2)
On another note, so now they are going to throw more traffic over the Internet?
Now SCO can continue... (Score:2)
This is the tool Micros - um, I mean - SCO has been waiting for. They can now just scan all those millions of Linux Servers on the intraweb and see their copyrighted code right there in the open....
...or maybe not.
What a waste (Score:2)
I've experienced it from both sides. (Score:3, Informative)
I've experienced this from both sides.
I have a bunch of my books on the web, and every once in a while I do a search on some text from my own books to see who else is mirroring them. The books happen to be copylefted (dual-licensed GFDL/CC-BY-SA), but I'd like to know who's mirroring them, and check whether they're violating the license. A lot of people just seem to be hoarding the PDF files on their university servers, maybe because they're afraid my web site will disappear; that's flattering. One guy was selling them on CDs on eBay, violating my license (claimed they were PD, didn't propagate the license). Another guy translated them to HTML, with lots of errors, changed the license to a more restrictive one, and put his own ads up; he fixed the licensing violation when I complained, and in a way it was a good thing, because it motivated me to make my own HTML versions (which are now bringing me a significant amount of money from AdSense every month). One kind of annoying thing about mirroring is that the people who are mirroring never bother to update their mirrors, but in general I just figure there's no such thing as bad publicity :-)
From the other side, I once received an e-mail from a museum in the UK that was complaining that I was using a 17th century oil painting of Isaac Newton. I guess they own the original, and they may also have been the ones who did the scan that I found in a Google image search, but under U.S. law (Bridgeman Art Library, Ltd. v. Corel Corp.), a realistic reproduction of a PD two-dimensional art work is not copyrightable. What really surprised me was that they came across it at all, because at that time I think my book was only in PDF format, and hadn't been indexed by Google because the file size was too big.
The whole thing doesn't seem negative to me in general. It makes just as much sense as people doing a vanity search in Google before they apply for a job, or authors watching their amazon.com sales rankings obsessively. I guess the most obvious potential for abuse would be if they send a nastygram to your webhost, and your webhost is a low-end one that figures it's not worth their time to keep your account, so they just shut off your account.
Re: (Score:3, Interesting)
It's not a dupe. (Unless you count anything that appears on Digg first to be a dupe.) However, it's also not the first story of its kind. About a gazillion companies have formed with the exact same business plan (save for the "hotness" at the time being digital music) and about a gazillion of those companies have failed to develop software that catches anything but the most obvious infractions.
Every so often, some RIAA/MP
Re:Dupe (Score:4, Interesting)
A real use on /. (Score:3, Funny)