
Image Detecting Search Engines' Legal Fight Continues 220

Mr. steve points to this New York Times article about sites like ditto.com and the new google image-search engine, writing: "Search engines that corral images are raising Napsteresque copyright issues." Expect to see a lot more sites with prominent copying policies and "no-download" images, and trivial circumvention of both. If an image is part of your site's design, you wouldn't truly want to prevent downloads, would you? ;)
  • the question is: are images on sites intellectual property
    • Re:property (Score:3, Interesting)

      by anticypher ( 48312 )
      The answer is YES! Maybe.

      Images on my site are my property. In every jpeg image (and powerpoint, word and text file) I create, I place my copyright statement. I also have a robots.txt file to prevent copying by search engines. To google's credit, they obey the robots.txt file, but others are not so considerate.

      Recently, I had the occasion to place a number of images and other copyrighted works on a website hosted on one of my machines. The copyrighted works were available for a period of about 20 minutes, long enough for my friend (who paid me in beer, including many pints tonight just before I typed this, apologies for typos and bad grammar) and his brother to retrieve the works. My friend used AOL Instant Messenger to tell his brother which URL to find the images, including the obscure URL.

      After I saw the two of them had retrieved the images, I left the site up for some stupid reasons (end of work day, beers calling, phone calls from idiots). Apache was running on an obscure port (28962) on an IP address with no DNS/reverse DNS entry. About 14 hours after my friend had sent the URL to his brother via AIM, I saw an AOL spider crawl my site for those works.

      It's pretty fucking obvious that AOL is sucking up every copyrighted work it can, presumably to keep copies of everything of value that passes by AIM. Their EULA grants them an unlimited license to anything that passes through their systems, even if it is hosted on a third-party system that never agreed to their EULA.

      The machines involved slowly crawled the site, about one hit per minute from 4 different IP addresses. Machines like:
      spider-loh-ta012.proxy.aol.com, spider-loh-ta016.proxy.aol.com, cache-loh-aa01.proxy.aol.com, and
      cache-loh-ab02.proxy.aol.com carefully worked the site, following every link and grabbing every (huge) jpeg and ppt file. It was stupid of me not to filter AOL from my website, but I've learned. From now on, only password-protected protocols that can't be easily picked up in plaintext streams.

      Since that incident, I've been able to work this demonstration into my security reports. A client can set up a totally fake URL on a random port, send a message by AIM, and within 24 hours, the site is spidered by AOL, regardless of the robots.txt file. Sending an FTP username and password will result in the site being accessed within 24 hours. AOL hasn't responded to any of my queries, so that makes the whole thing even more interesting from a security aspect, and makes me even more money.

      So don't place any intellectual property on any internet connected machine, if you want to retain control of your copyright. Large corporations will take your works, and if they happen to have great value later on, you won't see any recompense. I actually feel bad for the RIAA/MPAA giants, because they can't defend themselves, even with the DMCA and new European laws. You may own the IP for a work, but the internet doesn't care. "Get over it".

      the AC
  • What with all the policies and the gigantic ads taking up most of the room, pretty soon you won't be able to see anything on the site anyway.
  • bullsh*t (Score:2, Informative)

    by teknopurge ( 199509 )
    Google clearly posts comments about the copyrights possibly associated with the images that it returns.

    http://techienews.utropicmedia.com [utropicmedia.com] help us beta!!
    • Huh? So what? I can't legally redistribute a Walt Disney movie with the disclaimer "oh, by the way, this is copyright by Disney."
  • by cavemanf16 ( 303184 ) on Thursday September 06, 2001 @04:17PM (#2260383) Homepage Journal
    Here's the story without the signup restriction: http://archive.nytimes.com/2001/09/06/technology/circuits/06IMAG.html
  • If an image is part of your site's design, you wouldn't truly want to prevent downloads, would you? ;)

    I wish I could say the same, but that would involve thinking about the client (the web-surfer), which is against corporate policy.
    • A browser can't display anything unless it downloads it first.
      If you don't want your images or text to be downloaded, don't put them on the net.
  • Let's do the same for text. God forbid we end up with the napsterisation of copyrighted text.
  • well... (Score:4, Insightful)

    by fjordboy ( 169716 ) on Thursday September 06, 2001 @04:20PM (#2260405) Homepage
    my site (Peterswift.org [peterswift.org]) is cached on Google, and they have my images and pretty much everything I have on my site on their site. However, this doesn't bother me at all. They don't claim ownership of any of it; in fact, they blatantly say that they don't own any of it! I don't have a problem with them taking my page and putting it on their site. That just means more people access my page, and if my site ever happens to be down, then I don't have to worry as much. In fact, I hope Google caches my site today, because I just uploaded about 40 or 50 images in the last week to my pictures folder, and if they cache it, then I don't have to worry about screwing up the HTML or anything... I can always pull my site from Google. It is just another backup. :)
    • One other thing: if people are looking for an image on my page and they search for it through Google, they can find what they want easily without having to download other images and stuff from my page. It is easier and better this way.
    • Just because you (and probably most webmasters) benefit from this won't stop someone from stepping in and putting a stop to it. Someone will decide that it's violating their copyright, and, disclaimers or not, will decide to sue.

      After all, one could argue quite effectively that the copyright violations of Napster benefited the music industry and the artists. That didn't stop them from putting the smack down.

      Sooner or later we'll all realize that intellectual property is crap, a myth. People will still innovate because it's a natural human desire to, not because of any potential gain. If you're innovative solely for personal gain, chances are you're not coming up with anything very useful anyway.

      -J5K

      • That made no sense whatsoever. The discussion is not about whether or not these practices will be put to an end; the discussion is why they shouldn't be put to an end! My point still stands that these are useful services that benefit everyone. I am not denying that these services might be put out of business; they might be, if these idiots get their way. I think you are on the wrong level of this discussion: the article (the /. version) is talking about the possibility of these services being shut down, and no one denies that. What I am pointing out is the benefit of these services and why they should stay.
        • My point still stands that these are useful services that benefit everyone.


          As shown by giving your site as the only example? Here's an example. What if I run a "Nicole Kidman nude pictures site" with permission from Nicole Kidman and the photographers. I place ads at the top of my site, and nude pictures of Nicole Kidman at the bottom. Now Google comes along and lets people type in "Nicole Kidman nude" and see my pictures without my ads. Not only is that an illegal derivative work, it is harmful to my business.


          Caching is one thing. Extracting just the images without the rest of the site is another thing entirely, and as long as we continue to have copyright law it should not be legal.

    • Re:well... (Score:3, Informative)

      by jesser ( 77961 )
      Google cache does not contain your images. When you view the page from the google cache [google.com], Google adds <BASE HREF="http://www.iceball.net/peter/"> at the top of the page to instruct your browser to treat all relative URLs in the page not as relative to Google's cache of the page, but to your page. So when your browser sees <img src="PSORGLOGO.jpg"> later in the document, it interprets that as <img src="http://www.iceball.net/peter/PSORGLOGO.jpg"> and loads the image from your server. If your site was down, and I went to Google's cache of your site, I would not be able to see the images.
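
      To make the mechanism concrete, here is a rough sketch (not Google's actual markup, just an illustration built from the tags quoted above) of how a cached copy ends up pulling the image from the original server:

      <BASE HREF="http://www.iceball.net/peter/">
      ... rest of the cached page ...
      <img src="PSORGLOGO.jpg">
      <!-- with the BASE HREF above, the browser resolves this to
           http://www.iceball.net/peter/PSORGLOGO.jpg and fetches it
           from iceball.net, not from Google's cache -->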
  • I think this is just a simple stumbling block. In the future just about anything and everything will be searchable. Our society demands it! (except what Uncle Sam doesn't want you to see)
  • The main issue I see here is removing the web designer's right to say "no".

    I spend a lot of time editing my images in Photoshop until they are perfect. I generally don't care if someone thinks they are cool and would like to use them. I do have an issue with people who take my images and claim them as their own work.

    These search engines make it seem as though it is OK to take whatever you want and not credit the source.

    That is not cool at all. If you want an image, ask the developer if you can use it; nine times out of ten they don't care, as long as you give them credit and a link.
    • That is why images.google.com gives complete credit to the site with links and all. Go and check it out sometime. They even show the full webpage when you click on the picture.
    • Not credit the source? They link right to it!
    • You can use a little thing called robots.txt - look it up here [searchengineworld.com] or here [robotstxt.org] if you don't know what it is.

      Allows really useful features like marking given directories, pages, or files off-limits to a specific robot or all robots in general. Boy... a technical solution to a technical problem instead of a new round of lawsuits?

      Quickie examples (this is SO simple folks):
      User-agent: *
      Disallow: /

      Boom! No more google telling that horrible world of pirates and thieves about your site. Not many visitors either though....

      So maybe you want to exclude just googlebot from your images and image directory with the following:

      User-agent: googlebot
      Disallow: /image

      This will still allow your main pages to be indexed according to your meta keywords, but will disallow any 'napsterization'. Of course, since it requires people running sites to do work and understand technology, lots of people will probably decide lawsuits are easier.

      Robots.txt DOES require you to run your own domain. If you don't, try using meta tags in the head of the HTML code for a similar effect; it is harder to implement (must be on each page rather than site-wide) and less widely supported, but a minimal example appears at the end of this comment. Info here [robotstxt.org].

      If you spend that much time on the images... spend 5 minutes making a robots.txt file to indicate you don't want them taken by bots. But always consider anything you put on the net as published; if something's private, don't put it on the net.
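
      A minimal sketch of that per-page approach, using the robots META tag documented at robotstxt.org (place it inside the <head> of each page you want kept out of the indexes):

      <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

      NOINDEX asks a compliant robot not to index the page, and NOFOLLOW asks it not to follow the page's links. Like robots.txt itself, this is a request that well-behaved spiders honor, not an enforcement mechanism.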
      • And he should also check out Robot Exclusion Standard Revisited [kollar.com], which specifies META tags you can put in a page to prevent indexing a certain page, or following links from it.

        And he should also look at the trivial ways of setting up his webserver to prevent serving an image, if the referer isn't from his local site.

        This is a textbook example of an overcaffeinated ignoramus. God bless America, land of the dumb.

      • Robots.txt is a request, not an order -- it assumes that legitimate spiders are going to honor your request. There's nothing there that will prevent a spider from walking your site.

        If you run your own server, you could use .htaccess to require a password in order to access the site. The password could be displayed on the homepage as plain text or in an image; this would allow humans to get to "the good stuff" with a minimum of extra effort while making it nearly impossible for a generic 'bot to access the site.
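
        A minimal sketch of that approach with Apache Basic Auth in .htaccess (the realm text and file paths here are only examples; adjust them for your own layout):

        AuthType Basic
        AuthName "Enter the password shown on the home page"
        AuthUserFile /home/you/.htpasswd
        Require valid-user

        Create the password file with htpasswd -c /home/you/.htpasswd visitor, then publish the password on your front page as plain text or inside an image: humans can read it, but a generic spider crawling blindly cannot.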

    • That is why you should put a visible watermark / copyright notice on all your images, as well as an embedded copyright notice (in image formats that support it). That way, if someone puts your image up on their own site or posts it to Usenet, it's immediately obvious where the image originated.



      Granted, this won't stop someone from taking the GIMP and cropping or airbrushing the image to remove your logo, but at least you are making them work at it -- it would be very difficult to automate this process, particularly if you vary the placement, size, and color of your watermark. Yes, it obscures part of the image and pollutes the artistic purity of your work -- but it's the only simple way to discourage wholesale theft of your work.


      Of course, if you are really concerned about your art, don't put it on the web. Putting an image out on a public website is like putting cookies on a table with a big sign that says "Free Cookies, please help yourself".

  • by Mike Schiraldi ( 18296 ) on Thursday September 06, 2001 @04:22PM (#2260431) Homepage Journal
    Putting a picture on the web is like walking around in a public place.

    If someone takes a picture of me out on the street, I have no right to keep them from publishing it. If I don't want people to take pictures of me doing something, I don't do it in a public place.

    If you don't want Google picking up your pictures, and you don't want people saving your pictures to their hard drives, don't put the pictures on the web.
    • Putting a picture on the web is like walking around in a public place.

      If someone takes a picture of me out on the street, I have no right to keep them from publishing it. If I don't want people to take pictures of me doing something, I don't do it in a public place.


      I think you might be mistaken here. I believe they call these "slice-of-life" photos, and while generally they don't have any rights issues involved, I have heard of a number of legal cases where the person photographed successfully sued. It had something to do with a failure to attribute the photo *of* the person, *to* the person.
      Wish I could remember the details; everyone knows that legal rulings are all about the details, so...
    • by MemeRot ( 80975 ) on Thursday September 06, 2001 @05:07PM (#2260717) Homepage Journal
      A little thing called robots.txt - look it up here [searchengineworld.com] or here [robotstxt.org] if you don't know what it is.

      Allows really useful features like marking given directories, pages, or files off-limits to a specific robot or all robots in general. Boy... a technical solution to a technical problem? Who'd a thunk it?

      Quickie examples (this is SO simple folks):
      User-agent: *
      Disallow: /

      Boom! No more google telling that horrible world of pirates and thieves about your site. Not many visitors either though....

      So maybe you want to exclude just googlebot from your images and image directory with the following:

      User-agent: googlebot
      Disallow: /image

      If you want to do this for multiple directories, you add on more Disallow lines:

      User-agent: *
      Disallow: /image
      Disallow: /cgi-bin/

      Now if you put

      meta name="robots" content="All,INDEX"
      meta name="revisit-after" content="5 days"

      in your code to show up high on the search engines, you shouldn't be surprised or upset when you SHOW UP HIGH ON THE SEARCH ENGINES.

      Not all robots follow the robots.txt standard, and there's no way of forcing them to. But Google does, and that seems to be the big concern here.

      A real-life example, Slashdot's robots.txt file (at slashdot.org/robots.txt [slashdot.org]):

      # robots.txt for Slashdot.org
      User-agent: *
      Disallow: /index.pl
      Disallow: /article.pl
      Disallow: /comments.pl
      Disallow: /users.pl
      Disallow: /search.pl
      Disallow: /palm
      Disallow: index.pl
      Disallow: article.pl
      Disallow: comments.pl
      Disallow: users.pl
      Disallow: search.pl
  • If you took a great deal of your time to create yet ANOTHER Natalie Portman collage...would you really want that "sucked up" by someone's search engine or image archive? I mean, what's fair about that???

    No credit for all that hard work...for shame, for shame... you might want to check out Digimarc [digimarc.com], though

    -PONA-
  • Search Engines that display blurbs of matching hits are violating copyright.

    Search Engines that keep the data of webpages stored in a database are violating copyright.

    What next? Copyrighted URL's?
    • Exactly!

      Copyright is simply NOT an appropriate guide to IT policy. Society has spent trillions creating technology allowing information to be instantly copied.

      Copyright law was created to regulate BOOKS, not ELECTRONS. And it wasn't aimed at individuals, but at publishing houses.

      This is why we have absurd situations where publishers claim that the information in buffer memory represents a copy - that streamed audio is creating a copy in fixed tangible form. What copy? Where?

      Craziness... And of course we certainly wouldn't want to consider creating a copy to allow indexing by search engines to be fair use would we? Why that would instantly destroy our whole society and open an interdimensional gateway for demons to pour forth and devour our children :)
    • DMCA Section 512(d) explicitly exempts "information location tools", such as search engines, from liability for linking, indexing, referencing, etc., copyrighted material. Furthermore Section 512(b) exempts most forms of temporary caching (though not if you modify the material which might be important here).

      If you wondered why DMCA wasn't mentioned explicitly in the article, this is probably why. DMCA is relatively nice to search engines, but the issues here go beyond that. We are concerned with whether images are different from text and what is a fair use. The article does a good job of outlining the issues.

      For more on DMCA you might try this summary [loc.gov].
      And for the really adventurous there is always the full text [loc.gov].
      Note: Both links are PDF files.
  • Either one is implicitly copyrighted when it is published on a web server, and in both cases, barring situations where access to the website itself is restricted, it is also implicitly licensed for copying by client-side programs. On top of this, robots.txt can still apply just fine in this case, and any robot "violating" a robot directive not to download images would create a more complicated situation -- but the issue as it stands is really quite cut and dried.

    It's just a bunch of people wanting to raise a big hoopla and create a big stink about the bandwidth consumption problem that this poses.

  • by wbav ( 223901 ) <Guardian.Bob+Slashdot@gmail.com> on Thursday September 06, 2001 @04:24PM (#2260441) Homepage Journal
    Wouldn't that make IE/Netscape/Mozilla/Opera/etc. the program you are downloading with?

    I can just see it now:
    Judge shuts down Microsoft for distributing software that allows you to violate copyrights by downloading images. Microsoft was shut down Monday for its popular browser Internet Exploder. A representative from the company said "We were shocked. I mean, we didn't really expect the software to work in the first place."

    Of course we won't see such a headline, but still, turnabout is fair play.
  • I browse with most cookies filtered out by way of JunkBuster and have noticed that some sites will not let you view some of their images if you have cookies disabled. Enabling all cookies makes the problem go away. By requiring a cookie to be set, these sites are effectively disallowing web crawlers which ignore cookies from caching their images. Expect to see more of this, especially on sites full of copyrighted images or on sites which rely on advertising and where images are the main draw (i.e. the porn industry).

    • Except you can have cookies enabled, but have the cookies directory set to read-only. hehe. I love that.
      All you have to do is make the site think you're accepting its cookies.
      • Opera has a wonderful setting allowing sites to set all the cookies they like, and when you close the browser every single one goes into the trash. No problems viewing pages or placing orders and it makes tracking you a little bit harder.
        • Netscape also has this feature.

          Of course, it's undocumented and in fact is a hack. Here's how it works:

          • Win16/32: Find the file cookies.txt. Delete it. Start Netscape and exit it. cookies.txt has now been recreated as an empty file. Edit cookies.txt's file properties and set it to read-only.
          • Unix: Mostly the same as Win16/32, except: chmod 400 cookies.txt
          • MacOS: The file is now called MagicCookie instead of cookies.txt. Delete it, start Netscape, exit, and then lock MagicCookie.

          Same basic principle. The net effect is that when Netscape exits, it will lose any cookies that are set.

          It is also possible to set permanent cookies: for example, your /. login cookie. Simply logon to /. with Netscape prompting for each cookie that is set: only accept the login cookie, and then quit. The cookies.txt/MagicCookie file will remember it forever more. My Netscape has, I believe, three permanent cookies. However, I've mostly transitioned to IE5: partly because of its cookie powers, which are much better.

  • Child: Mommy, why was daddy taken?

    Mother: Because he forgot to disable his browser cache, honey.
  • by BrookHarty ( 9119 ) on Thursday September 06, 2001 @04:27PM (#2260465) Journal
    I just started using IE's image toolbar. Nice thing: I was on a site that tried to protect its images with JavaScript; I just clicked on the image, the toolbar popped up, and I clicked "save picture"...

    Is m$ breaking the DMCA with their circumvention?
  • robots.txt (Score:5, Insightful)

    by mj6798 ( 514047 ) on Thursday September 06, 2001 @04:27PM (#2260469)
    The guy's site (http://www.goldrush1849.com/) still does not have a robots.txt file. Either Kelly is incompetent, or he leaves it out deliberately to trick people into "using" his content so he can sue them later.
    • Sounds strikingly similar to the argument that stuff like Microsoft SmartTags or other content-warping stuff is ok, since there'll be an opt-out for sites.
      • The difference is that robots.txt has been around for longer than most of you young'uns have been on the net; it's not something that's being added now. If you don't know how to make a standard web site using standard technology... too bad.

        This seems so absurd to me... I remember when the hottest programs were ones to get you ranked higher on the search engines to drive traffic... has concern over IP really overwhelmed the desire for more visitors this much?
      • There is an important difference. Copyright confers two distinct rights: exclusive authorization for copying and exclusive authorization to make derivative works.

        By putting your work on a web server, you clearly authorize some copying by your action. You do not, however, authorize somebody to change your work.
    • The problem with your point is that there is nothing under US law or international law that states that a Robot/Crawler/Spider must read and obey the "robots.txt" file. It is a courtesy that programmers follow, but there is nothing to compel them to observe the request in "robots.txt".
      • To the best of my knowledge at least, don't know about ditto - anyone have any info on them?

        The problem with your point is that there is nothing under US law or international law that states that a Robot/Crawler/Spider must read and obey the "robots.txt" file.

        It's called copyright law, specifically 17 USC 106(1) [cornell.edu]. In practice, the need for explicit authorization would be conclusive.

        You must have authorization to make a copy (assuming it's not fair use). Clearly the authors of the robots.txt standard did not have the authority to make law, but contextual standards do have meaning. Any judge would start the analysis by placing the burden of proof on the copier to prove they had authorization. An explicit denial of permission in the standard place would require pretty strong counter-evidence. The mere act of placing it on a web server probably would not suffice, given the more granular meaning expressed by an entry in robots.txt.

        You would get laughed out of court if you said you wanted to hold the copyright owner to contextual opt-in but not contextual opt-out.

        The issue in this case is the converse. If no robots.txt is present, is recopy authorization granted by the fact that the file was placed on a web server? I argue "yes", because a convention is required for every opt-out system. After the initial opt-in, it's fine to have an opt-out, and local tradition and convention *has* to define that.
    • Re:robots.txt (Score:2, Interesting)

      I don't know whether or not Kelly is incompetent, but I don't see how this can be interpreted as a trick. He has an explicit terms of use statement on every page that bans reproduction, modification or storage of his images (along with about ten other possible uses).

      If Kelly were complaining about misuse of a paper copy of his images, it would be clear that the copier had deliberately violated his copyright. However ditto.com is collecting, processing and republishing images without a real person looking at the bottom of the page for this copyright statement.

      The real question here is who is responsible for preventing violations of a clearly expressed copyright. Is it Kelly, who will have to track all image-cataloging spiders and manually disallow them while still allowing text indexing if he wants to promote his site? Or is it ditto.com, who would have to instruct their spider to look for phrases like "Images copyright Bob Plaintiff 1999"?

      • by MemeRot ( 80975 ) on Thursday September 06, 2001 @05:53PM (#2260966) Homepage Journal
        User-agent: *
        Disallow: /image
        Put all image files in the /image directory.

        or I would recommend for him:
        User-agent: *
        Disallow: /
        - I don't think he has any 'right' to use the search sites to promote his site if he doesn't consent to them copying his data. Is HTML code protected by copyright? That reading would make all search sites illegal and destroy the internet as a usable resource. So because the consequences would be untenable, we should answer no.

        That's all. Meta tags, which you seem to be thinking of, are a pain in the ass, poorly supported, and only worth using if you don't control the domain and can't put up your own robots.txt file.

        If I put 10 pizzas on a picnic table with a note saying 'please don't eat my pizza' and leave it there for 3 days, it will be eaten. If I do this while ignoring the safe that's right there that I could use to lock them in, then I'm an idiot.
      • However ditto.com is collecting, processing and republishing images without a real person looking at the bottom of the page for this copyright statement.

        Yes, and that kind of functionality is very useful. Arguably, it falls under "fair use", whether or not Kelly likes it. But the web actually gives him a way of expressing his preferences in a machine-readable way that imposes no burden on him.

        If natural-language statements like Kelly's are found to be sufficient to exclude indexing robots, the web would suffer greatly, and for no good reason whatsoever.

        Is it Kelly, who will have to track all image-cataloging spiders and manually disallow them while still allowing text indexing if he wants to promote his site?

        Kelly has to do no such thing: the robots.txt mechanism is flexible enough that he can include and exclude parts of his site from indexing according to his preferences; he doesn't have to know what robot is used for what purpose.

        Not that Kelly has any legal right to make such choices to begin with: text search engines are under no obligation to index part of his site (in fact, I think any self-respecting search engine should blacklist him). Giving him an all-or-nothing choice would be entirely sufficient. He should count himself lucky that the mechanism he actually has at his disposal is so flexible.

      • Re:robots.txt (Score:3, Informative)

        by prizog ( 42097 )
        See the Ticketmaster case: copyright notices are not binding on spiders.

        Grep for "terms and conditions" in:
        http://www.gigalaw.com/library/ticketmaster-tickets-2000-03-27.html
    • There's lots of talk about robots.txt, but that's not a viable option for people who don't control the web server but just have control over a subdirectory.

      From using assorted mirroring software in the past and from what I recall of the robots.txt documentation I've seen, it needs to be at the root of the domain, not in a subdirectory. So, does that mean that only people with a domain of their own get to protect photos or artwork that they've created?

  • Come on now, this is getting ridiculous. If I can view something on a web page, why shouldn't I be able to save it to view later?

    How is posting a picture on a web site any different than putting out a table on the side of the road, with a pile of photographs and a sign that says "Free"?

    Now, I'm totally in favor of artists' rights and all... but let's ease off on the pervasiveness and invasiveness of copyrights.
    • You, an individual, have a right to save it and view it later. The issue is about a company that saves your image and redistributes it while possibly making profits on banner ads, etc.

      Posting on the web is not the same thing as having a table on the side of the road with a sign that says "free." By posting on a website, you let people look at your stuff, make a copy for personal use/commentary, etc. You don't give up ownership (i.e. copyright.)
      • >>The issue is about a company that saves your image and redistributes it while possibly making profits on banner ads, etc

        If Google makes money with banner ads, that doesn't really have anything to do with posting thumbnails of images. They AREN'T making money from the images. They are providing a FREE service pointing out where to find these images. If the artist doesn't want visitors to his site, I don't know why he has a web page. If he does want visitors, I don't know why he has a problem with a search engine pointing people his direction.
        • This is a hard position to defend... if a thumbnail (lower resolution) image is copyright free, this implies that I can, for example, release a VHS version of a DVD and be immune from copyright law.
          • Bad analogy. A better analogy would be to take a short clip from a DVD and distribute that -- which IS legal under fair use if it is done for commentary, criticism, teaching, review, etc. It's much safer, however, not to rely solely on fair use law and to get explicit permission to redistribute.
    • The problem with this is that sites such as ditto.com and google are making copies of the picture on their servers. Most often, if not always, it is to create a thumbnail of the image which can then be shown to the user.

      This doesn't bother me. I like looking up pictures. But I am going to play devil's advocate. If we were to extend this into the future, we may find that sites no longer reduce the image size to a thumbnail. Let's say your search results only returned a few hits. No need for a thumbnail, right? So far, all is ok. The user is happy.

      All is not ok, though. The person who created that image is left out completely. What if they wanted to know how many users were viewing their images to judge whether they should release it to a major magazine? The image could be generating a lot of hits but only through the search engine. The creator of the image never sees those users.

      I liken it to the TMBG issue a while back. They Might Be Giants freely gives away music via their web page. But they do it to create a community. They didn't like napster because it stole directly from that community (I am going off of memory, I hope this is correct).

      What they said in the article hits the nail on the head. Their picture has been reduced to that of clip-art.

      I like the idea of everything being free, but if the creator doesn't want it to be, well... tough luck for us, I guess.
  • by graveyhead ( 210996 ) <fletch@fletchtr[ ]cs.net ['oni' in gap]> on Thursday September 06, 2001 @04:31PM (#2260503)
    There is a very simple answer for the artist in the ditto.com case. Watermark all your production images. You can create yourself a Photoshop action to automate this very easily, and a GIMP script version wouldn't be all that tough either. Make them unusable unless they obtain a (non-web based) copy from you. I couldn't even finish reading the horrible article [nytimes.com] because they compared the pitiful ditto.com vs nobody case to Napster vs. RIAA twice before the article was half-finished.
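
    For those who would rather script the watermarking outside Photoshop or the GIMP, here is a rough sketch of the same batch idea in Python with the Python Imaging Library (the filenames and notice text are placeholders; treat it as an outline, not a drop-in tool):

    from PIL import Image, ImageDraw

    def stamp(src, dest, notice="(c) 2001 Your Name Here"):
        img = Image.open(src).convert("RGB")
        draw = ImageDraw.Draw(img)
        w, h = img.size
        # Put the notice near the lower-right corner; varying the position,
        # size and colour between batches makes automated removal harder.
        draw.text((max(0, w - 10 - 6 * len(notice)), h - 20), notice, fill=(255, 255, 255))
        img.save(dest, "JPEG")

    stamp("original.jpg", "web_version.jpg")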
  • Those who set up free porn sites realize quickly that bandwidth/image protection is an essential part of site development. Fortunately, Apache and .htaccess allow you to prevent hotlinking, at least with browsers that pass along the HTTP referrer.

    Example:

    # Requires mod_rewrite; place in .htaccess or the relevant vhost config.
    RewriteEngine On
    # If the Referer does not start with your own domain (with or without www)...
    RewriteCond %{HTTP_REFERER} !^http://www.yourdomain.com [NC]
    RewriteCond %{HTTP_REFERER} !^http://yourdomain.com [NC]
    # ...then any request for an image gets the 404 page instead.
    RewriteRule .*\.(jpg|gif|png)$ /404.html
  • Can you have your image and download it too? The answer is YES! Actually, it'd be pretty easy to set your site up so that crawlers don't index your images... can't you control referring access via .htaccess? I'm pretty sure you can: have your web server check that your own site is the only valid referring party before returning the image. Sorry, but I'm too lazy to find the docs to prove it, though. :)
  • I mean, don't we suffer from exactly the same problems with all text that's online? Google caches it all for Pete's sake! Isn't that making a copy? Isn't that making it easier to find? Aren't these exactly the same complaints people are making about the images?

    Authors are losing control over their works, which can be easily found and copied now that they're catalogued by search engines! Outrage!

    But is it the Right Thing to ban or penalize this?

    These are exactly the types of problems that we're coming up against now that copyright has been deemed a control mechanism. We've gone and screwed up the whole system to the point where it's going to be virtually unusable.

    But personally, I just want to know who I can sue for "copying" text and images from my site when they visit it. I need the money.
  • There's already a mechanism in place that, while informal, is supposed to prevent any content on your site from being indexed by a search engine spider, and that's the oft-forgotten robots.txt. Not only can it block whole directories from search engines, but specific file types as well. This ought to be a non-issue.

    However, while I would suspect that Google does the Right Thing with this, I know several newer search engines that completely ignore robots.txt and grab everything without even checking for this file. In addition, those new to the website game don't know about this mechanism, and thus don't know how to take steps to 'protect' their work.

    IMO, robots.txt support ought to be standard in both search engine software and publicly offered site-mirroring software. Particularly in the latter case, most of these clients ignore robots.txt completely and grab all content, including dynamic pages.

  • by geekoid ( 135745 )
    If you don't want people to see your stuff, don't put it on the internet.

    I've got news for you: if I look at your site, I am saving it digitally, in my RAM (and of course my cache).
  • Free Advertising (Score:3, Insightful)

    by B.B.Wolf ( 42548 ) on Thursday September 06, 2001 @04:43PM (#2260581)
    I spend much time and effort on my graphics. For me it is a form of art. When I see one of my backgrounds on a coworker's system, I am tickled pink. I would like to make money at this someday. The more my work spreads, the easier it will be to sell my services. As for calling downloading and distributing my backgrounds art theft, get real. The only valuable item is the 16-to-30-layer 1600x1200 GIMP file and all the various auxiliary files I used to generate the 1024x768 .png. It is just like photography: it isn't the print but the negative that is important, despite what some anal photo trade organizations are pushing for.
  • Can't someone easily prevent their site from being indexed by using an empty file named robots.txt?

    Many sites do this simply to get you to search from their site.

    AltaVista and Google (as I now notice) both make you visit the site to get to the pictures. Chances are, if you are complaining about someone stealing your pics, you are getting them with banner ad views.

    I know I've put a pic here and there on the net, my own works (mostly), but I would notice someone trying to pass off my work as their own. If the concern is over pr0n, what's the point? Your pics are already on alt.binaries.great.ass.paulina or some other such newsgroup. Pr0n pics are traded over IRC, Kazaa, Gnutella, whatever.

    If someone has downloaded your company logo... what's the fuss? Either they are making a desktop background or something, and if you still don't like that... sue them! Not the search engine!

    I have an Athlon boot screen for winblows 98 and I bet I broke a copyright law while making it! But why would AMD even think of getting mad when I'm advertising for them??

    If someone is editing your logo and putting sucks under it or some such thing... you probably do suck but have no time to sue anyone because you're doing The Next Horrible ThingTM.

    Case closed. People wanted publicity... don't give it to them. They just use the word Napster to get attention from John and Sally Newbite.

  • Heh, I always get a chuckle out of those pages that say "Images are copyright XXXX and may not be downloaded".

    Oops, too late! I already downloaded it so I can look at it! It's in my cache! It's in my RAM! It's in my squid proxy! I guess I better go turn myself in to the Kopyright Kops, eh?

    This is all just silly.
  • Next step.... (Score:4, Informative)

    by www.sorehands.com ( 142825 ) on Thursday September 06, 2001 @05:01PM (#2260683) Homepage
    What about the companies that build databases of images, websites, etc. from spidering the web?

    They sell access to these databases to their clients to search for illegal copies of their works, or to see any mention of them in an unfavorable light. Is this an infringement?

    • Re:Next step.... (Score:3, Insightful)

      by Angst Badger ( 8636 )
      Imagine a world in which there are no search engines impartially indexing the web. You'd end up with a few popular outlets that would either showcase their own subsidiaries or sell listings to their partners.

      Bing! You have reinvented TV, but with online ordering capabilities. Having failed to create interactive television, Big Business is systematically destroying those elements of the web that made it better than interactive TV.

      Your spoonfeeding, already in progress, will now resume.
  • I know that there are arguments about how if you place information on the web then it's practically public domain, and there's some merit to that I guess. After all, how can you stop people from downloading it?

    At the same time, though, I think it's silly to say that people can't put their stories, their artwork, and what is essentially their copyrighted material on the web where people can access it while still retaining the ability to tell people not to copy it.

    The Napster-like thing with many image search engines is a problem. Even when image search engines, including Google, give a good indication of where an image is coming from, they often show complete versions of it (even if reduced in size) that people can download without seeing any copyright information, and save without going through the source. For text searches there's a fair-use argument because most search engines only display a couple of lines, though I've heard some people complain about the Google cache. Also, it's possible to put meta information on pages saying you don't want them indexed. With image searches it's much more difficult.

    When Babylon 5 [babylon5.com] was on the air (and probably still), there was a policy that fan websites could use as many promotional images as they wanted, as long as they explicitly stated that the images were copyright of the studio, and required everyone grabbing them to say the same.

    Image search engines can't do this because they can't read things like watermarks. What we could do though is have a standard allowing for publishers to associate copyright information with files so that search engines know when and how they should be able to index and display other people's copyrighted work.

    It would be a voluntary thing, and if search engines want to make legal judgements about whether copyright claims are going too far, that would be completely up to them. But it would allow image search engines to operate cleanly and make sure they don't go further than the copyright holder wants them to, at least not without a good legal argument.

    Make it a text file in the same directory, or something. Requiring it to be at the top level directory of a domain would mean that some publishers without access to that dir won't easily be able to set meta information for their own stuff.
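
    Purely as a hypothetical illustration of that idea (no such standard exists, and the file name and field names here are invented for this example), a per-directory notice could look something like a robots.txt-style text file:

    # copyright.txt -- hypothetical per-directory notice
    Owner: Jane Photographer
    Contact: jane@example.com
    Applies-to: *.jpg, *.png
    Allow: thumbnails, indexing
    Disallow: full-size-redistribution

    A spider that chose to honor such a file could index and thumbnail only what the owner permits, without the owner needing access to the domain root.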

    • I think it's silly to say that people can't put their stories, their artwork, and what is essentially their copyrighted material on the web where people can access it while still retaining the ability to tell people not to copy it.

      People can view artwork, listen to music, etc. only after downloading the data, i.e. they have to get a copy of it anyway... is there any clearer way of rephrasing 'Information wants to be free'?

      It's like the hypothetical person with perfect memory and sufficient abilities who can reproduce a piece of music having once heard it. Is he committing a crime by having a good memory? If not, shouldn't normal mortals be allowed 'memory augmenting devices' like hard disk recording?

      Information has no volition to be free or anything else, but its only natural state is that of freedom. But then again, as Nietzsche (or someone else, I can't remember) put it, you don't love information enough if you disclose it to others...

      • Meta information isn't restrictive. It's descriptive.

        If you chose to ignore an author's copyright notices then that's completely up to you, and this wouldn't stop you from copying it.

        The main problem I have with the "information wants to be free" crowd is when someone takes my work and starts showing it off and taking credit for it as their own. Giving it away is one thing, but if it goes into the public domain I'd lose most of my incentive to create anything new.

        I'm not in favour of totalitarian restrictive measures like CSS that work like a broadsword in blocking people's rights to fair use. But allowing authors to hold copyrights on their work is perfectly okay the way I see it.

  • The obvious joke (as obviously pointed out in the writeup) is that you make a copy of the image when you display it in your browser.

    The issue is the rebroadcasting of the image by someone who doesn't hold the copyright on it.

    As a photographer, if I put my hard work on the internet and suddenly some business plasters it all over their site without my consent, I would be pissed off. A google image cache with a searchable image database would be similar, and objecting to it is reasonable.

  • Bill Gates bought Corbis [corbis.com] a while ago, and to the best of my knowledge their images (thumbnails really) are NOT indexed by any of these engines (their choice, bla bla).

    It's pathetic, but Corbis actually sells extremely low rez 640x480 images [corbis.com] to "suckers."

    I would argue that anything less than a print-quality TGA image is a sample image, analogous to a 128kbps MP3, i.e. it's free publicity in the eyes of real artists, and it's "copyright infringement" to greedy middlemen.

    e.g. I happen to have a tangible print of this Pat Rawlings painting [novaspace.com] on my wall......and this [google.com] is called free advertising.

    Anyway... Everyone benefits from abundance... except the selfish FEW who would like to profit from artificial scarcity.

  • Are our browser caches going to come under fire next? Where does it end?

    Copyright is just getting totally nuts...

    When is the next shuttle off this rock?

  • Files don't get on the internet by accident! It is inherent in the medium that when you put your files under the control of a web server and have it listen for connections on a network and respond that copies will be made. If that isn't good enough, password protect your website. Using a webserver without access control is an opt-in system that clearly authorizes some copying. The only question is how much copying is authorized.

    I fail to see any meaningful difference between infinite copying for free from the original site and transitive copying from a search engine. Since "deep linking" has been held to not be infringement, the argument that you aren't forced to see the whole page is bogus, since an URL can target the individual image file.

    You can explicitly unauthorize search engines by using robots.txt [robotstxt.org], right? Any splitting of hairs about the scope of copying authorized by the act of putting your file on a web server can be fully accommodated by using robots.txt. Since this standard is publicly available and well known, doesn't placing your files on the web without restricting them via this method constitute a grant of authority to copy, extended to everyone with access to your web server? Now, if these search engines ignore robots.txt, then that is another matter, but I doubt that is the case.

    When you opt-in to copying by placing your files in a web server, but fail to subsequently explicitly opt-out after that, you have authorized copying, so tough.

    The photographers say they might have to leave the net. Not so, if they follow robots.txt. I don't generally think that forcing off people who won't learn how the net works is a bad thing. These groups are essentially trying to use the courts to create standards. The net already created its own standard for this in 1994. Perhaps we will have the first ruling that essentially says "RTFM".
  • Expect to see a lot more sites with prominent copying policies and "no-download" images, and trivial circumvention of both.

    I especially like those sites with Javascript pop-up message boxes that appear when you right-click so you can't select 'save as.' As if you couldn't just go into your browser's cache and copy it from there. Or, even easier, simply hold the right mouse button down while you hit the spacebar to clear the popup.

  • How is this any different from any other issue facing search engines? Whether it's an image or text being deep-linked is a purely client-side issue, if you screw with your browser long enough you can easily get it to display one as the other or vice versa (YMMV with regards to reading it :) ).

    I could see a problem with sites such as Google that present a preview of the image found... Perhaps unfair-use claims could be avoided if the quality of the preview was lowered. A pixel-doubled image looks enough like the original that a human can make decisions based on it, but it's useless for anything a computer would want to do with it.
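
    A quick sketch of how such a degraded preview could be generated (Python with the Python Imaging Library; the filenames are placeholders and the halving factor is arbitrary):

    from PIL import Image

    def pixel_doubled_preview(src, dest):
        img = Image.open(src).convert("RGB")
        w, h = img.size
        # Throw away every other pixel, then scale back up with nearest-neighbour
        # so each remaining pixel is doubled: fine for a human eyeballing the result,
        # nearly useless for automated reuse.
        small = img.resize((max(1, w // 2), max(1, h // 2)), Image.NEAREST)
        small.resize((w, h), Image.NEAREST).save(dest, "JPEG")

    pixel_doubled_preview("found_image.jpg", "preview.jpg")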

  • WARNING!

    You may not download, save, reproduce, or otherwise illegally use images on this website. By clicking the link below, you attest to the fact that you will abide by this license and report all instances of its violation to the copyright holder at once.

    Click to enter FreePics.Com!


    Sigh...

    Hey, here are some images you can have for free:

    http://www.furinkan.net/art/ [furinkan.net]

    I'm the artist and owner, but maybe if my work gets around, people might be willing to pay me for large, high-resolution scans!
  • Perhaps I am being stupid or something, but if an image is visible on your computer, then it has already been downloaded.

    So how can sites say that an image may not be downloaded ? If that were the case then you wouldn't be able to see it.

  • I took all the images from the net and created one big image, then shrunk it down to a 1x1 pixel thumbnail, which is contained in the period at the end of this sentence. Sorry.

"In my opinion, Richard Stallman wouldn't recognise terrorism if it came up and bit him on his Internet." -- Ross M. Greenberg

Working...