Electronic Frontier Foundation Communications Crime Network Privacy Social Networks The Internet

EFF: Accessing Publicly Available Information On the Internet Is Not a Crime (eff.org) 175

An anonymous reader quotes a report from EFF: EFF is fighting another attempt by a giant corporation to take advantage of our poorly drafted federal computer crime statute for commercial advantage -- without any regard for the impact on the rest of us. This time the culprit is LinkedIn. The social networking giant wants violations of its corporate policy against using automated scripts to access public information on its website to count as felony "hacking" under the Computer Fraud and Abuse Act, a 1986 federal law meant to criminalize breaking into private computer systems to access non-public information.

EFF, together with our friends DuckDuckGo and the Internet Archive, have urged the Ninth Circuit Court of Appeals to reject LinkedIn's request to transform the CFAA from a law meant to target "hacking" into a tool for enforcing its computer use policies. Using automated scripts to access publicly available data is not "hacking," and neither is violating a website's terms of use. LinkedIn would have the court believe that all "bots" are bad, but they're actually a common and necessary part of the Internet. "Good bots" were responsible for 23 percent of Web traffic in 2016. Using them to access publicly available information on the open Internet should not be punishable by years in federal prison. LinkedIn's position would undermine open access to information online, a hallmark of today's Internet, and threaten socially valuable bots that journalists, researchers, and Internet users around the world rely on every day -- all in the name of preserving LinkedIn's advantage over a competing service. The Ninth Circuit should make sure that doesn't happen.

  • by kenh ( 9056 ) on Thursday December 14, 2017 @06:02PM (#55742071) Homepage Journal

    Using automated scripts to access publicly available data is not "hacking," and neither is violating a website's terms of use.

    If I'm reading this correctly, I'm not so sure I agree with that last bit, about "violating terms of use". So all terms of use are null and void (if my browser can find it, it's publicly accessible, no matter what I have to agree to in order to get access to it?)? For example, if I have a website that stipulates you must agree not to disseminate the information made available to you by agreeing to these terms of use, you remain free to ignore that agreement?

    Or are they saying that an automated script that can bypass a Term of Use agreement isn't hacking?

    • by ColaMan ( 37550 ) on Thursday December 14, 2017 @06:10PM (#55742133) Journal

      If:

      I can send a simple http request to your server, and

      Your server sends me the information without doing its homework, then

      Sucks to be you.

      Don't want your information to be scraped? Have it behind a login - free or otherwise - then ban accounts that are slurping down 10,000 pages a day.

      Ohhhhh then it wouldn't be easily indexed by search engines and thus findable by the general public and your site would fade into obscurity. What to do!? Courts to the rescue, it seems!
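
A minimal sketch of the "ban accounts that are slurping down 10,000 pages a day" idea, assuming content sits behind a login. The class name, the in-memory counters, and the allowlisted address are hypothetical and only for illustration; the exemption for known crawler IPs is an added assumption, not part of the comment above.

```python
from collections import defaultdict

DAILY_PAGE_LIMIT = 10_000          # threshold taken from the comment above
ALLOWLISTED_IPS = {"192.0.2.10"}   # hypothetical crawler address (documentation range)


class DailyPageLimiter:
    """Counts page views per account per day and bans heavy scrapers."""

    def __init__(self, limit=DAILY_PAGE_LIMIT, allowlist=ALLOWLISTED_IPS):
        self.limit = limit
        self.allowlist = set(allowlist)
        self.counts = defaultdict(int)   # (account, day) -> pages served
        self.banned = set()              # accounts that exceeded the limit

    def allow(self, account, ip, day):
        """Return True if the request should be served."""
        if ip in self.allowlist:
            return True                  # allowlisted crawlers are exempt
        if account in self.banned:
            return False                 # bans persist across days
        self.counts[(account, day)] += 1
        if self.counts[(account, day)] > self.limit:
            self.banned.add(account)
            return False
        return True
```

A real site would likely keep these counters in a shared store and expire them, but the trade-off is the same one the comment points at: anything behind this gate is also invisible to crawlers that are not on the allowlist.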

      • by kenh ( 9056 )

        Don't want your information to be scraped? Have it behind a login

        And the protection the login affords you is embedded in the "Term of Use" when you created your account, but if I'm reading this right (and I may not be), using your login and ignoring Terms of Use is A-OK.

        • Re: (Score:2, Informative)

          by Anonymous Coward

          using your login and ignoring Terms of Use is A-OK

          no, because at that point we are no longer talking about public information

        • by Desler ( 1608317 )

          Nope, because that’s no longer publicly available information. Do try to keep up with the argument the EFF is actually making, not your strawman version.

          • by kenh ( 9056 )

            Thanks - you did notice I questioned my interpretation, right?

            "if I'm reading this right (and I may not be)"

        • by Anonymous Coward

          but if I'm reading this right (and I may not be), using your login and ignoring Terms of Use is A-OK.

          Absolutely no login required.

          Literally, published to anybody who makes an HTTP request.

          Accessed, in violation of a ToS, and someone trying to make that a felony. See the insanity here?

        • by omnichad ( 1198475 ) on Thursday December 14, 2017 @11:04PM (#55743487) Homepage

          if I'm reading this right (and I may not be), using your login and ignoring Terms of Use is A-OK.

          You're reading it wrong. Using your login and ignoring terms of use is a breach of contract (albeit a unilateral EULA). It is not and should not, however, be considered felony computer hacking under the CFAA.

        • by rtb61 ( 674572 ) on Friday December 15, 2017 @12:34AM (#55743753) Homepage

          Dipstick, I can freely ignore all terms of service you do not specifically get me to agree to, and by law that means specifically. You must actively seek my agreement and obtain it prior to claiming I agree to anything. All you can do is deny service; you cannot make any claims beyond that. Otherwise, numbnuts, I could put a claim below the fold that to read anything above the fold means you agree to pay me a million dollars. You must actively seek actual agreement to terms of service prior to making claims; you can only deny service, not make claims for providing a service. You are clearly too wrapped up in the bullshit of post-purchase end user licence agreements, which are illegal in the majority of countries and only legal in the US because of corruption and bias towards corporations. It's like the old Reader's Digest bullshit of sending you stuff, claiming you bought it and that you owe them money if you did not send it back. Nope, a lie: they have no right to claim anything from you; they sent it to you for free, they gifted it to you. Same as the internet: unless you actively seek agreement and then refuse service if agreement is not achieved, you cannot claim payment for access to your service.

      • Don't want your information to be scraped? Have it behind a login - free or otherwise - then ban accounts that are slurping down 10,000 pages a day.

        Ohhhhh then it wouldn't be easily indexed by search engines and thus findable by the general public and your site would fade into obscurity. What to do!?

        Seems to me it would be trivial to code in exceptions to the slurp limit for the IP address of known search engines. And leave a link on your site for search engines you don't know about to request an exce

        • by Quirkz ( 1206400 )

          Course the next step in this arms race is to use a botnet to do distributed slurping. Each compromised machine accesses a handful of pages, then sends them to a master server for aggregation. I'm not sure what you could do to prevent that, technically or legally.

          Wait until they post the data, accuse them of running a botnet, and then bring a slightly more valid "hacking" case against them?

        • And leave a link on your site for search engines you don't know about to request an exception.

          The "automated" "homework" that someone else claimed the website needed to do is already a standard. There's a "robots.txt" exclusion file that is a standard way of specifying what a robot is allowed to do and not allowed to do.

          It used to be very poor netiquette for a robot to ignore that file. Now it seems that the attitude is fuck the website operator, he's got to hide his stuff behind closed doors to keep abusive robots from taking his website down. It's A-OK to do anything that a website physically al

      • Your server sends me the information without doing its homework, then

        Sucks to be you.

        Two problems with this:
        1) That's the argument for DRM. It already is password protected, by requiring users to log their bot in. So they're violating the terms of service by logging their bot in.
        2) It's like a burglar using the "The door wasn't locked therefore the property owner didn't do their due diligence to keep me out."

        I don't think it should be a felony but it should be something. For instance if I sneak into a movie theater I am not a burglar but I'm also not there lawfully and I am taking a valuable

        • by suutar ( 1860506 )

          You're assuming they had to log in to access the information in question. The impression I had was that they were accessing data that didn't require a login. The materials I've found are not extremely clear on that point, but it does specify that hi-Q was accessing the publicly available portions of LinkedIn's site, which to me implies that it's the parts that don't need a login.

          If a login was required to get to the data then I agree with you, but I don't think it was.

    • by mycroft822 ( 822167 ) on Thursday December 14, 2017 @06:13PM (#55742171)
      I think they are only making the argument that you can't charge someone with felony hacking because they are accessing the information you make publicly available in a way you don't like.
      • Re: (Score:3, Funny)

        by greenwow ( 3635575 )

        CNN confirmed that looking at Wikileaks is a crime, and they helped the public by informing us many times of that. If they hadn't, then more people probably would have been put in prison for reading the DNC leaks and voted for Trump. That would have been horrible.

    • by Ichijo ( 607641 )

      Disseminating information without authorization (violating a license agreement) is not the same as accessing information without authorization (hacking).

      • by jedidiah ( 1196 )

        > Disseminating information without authorization (violating a license agreement) is not the same as accessing information without authorization (hacking).

        Both can be felonies under US Federal law. If you violate a license, then standard copyright rules apply. This is how the GPL works.

    • There's hacking, and then there's hacking. Current law treats any piddly little violation as if you're some l33t mastermind expert criminal.
    • by Desler ( 1608317 )

      They are not arguing all terms of use are null and void. They are arguing that accessing publicly available information is not felony hacking just because it violates a ToS. It could very well still be a civil offense, but it should in no way be a felony crime. To think otherwise is insanity.

    • Using automated scripts to access publicly available data is not "hacking," and neither is violating a website's terms of use.

      If I'm reading this correctly, I'm not so sure I agree with that last bit, about "violating terms of use". ...

      I'm sure we can all agree that hacking a website and violating its terms of use are not necessarily the same thing. If the terms say you must be wearing a blue shirt to use the site and your shirt is red, I can't really see any hacking going on. I imagine that a bot retrieving information automatically and a person retrieving the same information manually are pretty much the same thing, unless the bot operates so much faster that it affects operation of the site. Even then, no reasonable person would call that "

    • [This is my own take; I'm not with EFF]

      For example, if I have a website that stipulates you must agree not to disseminate the information made available to you by agreeing to these terms of use, you remain free to ignore that agreement?

      You need to get whoever/whatever reads the data to agree.

      If you send the data before they agree, or without finding out whether or not they agreed, then it wasn't really terms, was it? Did you actually stipulate, or did you just plan to do so and then not follow through?

      Me

    • Using automated scripts to access publicly available data is not "hacking," and neither is violating a website's terms of use.

      If I'm reading this correctly, I'm not so sure I agree with that last bit, about "violating terms of use".

      Or are they saying that an automated script that can bypass a Term of Use agreement isn't hacking?

      I break down "using automated scripts to access publicly available data is not 'hacking,' and neither is violating a website's terms of use" into two clauses:

      • using automated scripts to access publicly available data is not "hacking"
      • violating a website's terms of use is not "hacking,"

      The outcome being that neither the use of automated scripts to access publicly available data nor the violation of a website's terms of use should be prosecuted under the Computer Fraud and Abuse Act.

    • If you must be logged in to access that data, then it isn't publicly available, so it has nothing to do with what we are talking about. However, if you violate the TOS and disseminate the data, it is not a criminal act but rather a civil matter, because you violated a contract.
  • robots.txt (Score:4, Insightful)

    by Anonymous Coward on Thursday December 14, 2017 @06:04PM (#55742095)

    Shouldn't a "good bot" abide by https://www.linkedin.com/robots.txt?

    • by Anonymous Coward

      sure, but that doesn't make it a felony to ignore robots.txt

      "There is no law stating that /robots.txt must be obeyed, nor does it constitute a binding contract between site owner and user, but having a /robots.txt can be relevant in legal cases."
      http://www.robotstxt.org/faq/legal.html

      "It is solely up to the visiting robot to consult this
      information and act accordingly. Blocking parts of the Web site
      regardless of a robot's compliance with this method are outside

    • Yes, a "good bot" should follow robots.txt. But failing to implement a standard is not "hacking".
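
For readers who have not seen it, a minimal sketch of what "following robots.txt" looks like from the bot's side, using Python's standard `urllib.robotparser`. The sample rules, the `ExampleBot` user agent, and the `example.com` URLs are made up for illustration and are not LinkedIn's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real bot would download the site's own file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/public/page", "/private/page"):
        url = "https://example.com" + path
        print(url, may_fetch(SAMPLE_ROBOTS_TXT, "ExampleBot", url))
```

Nothing enforces the result: the check is entirely voluntary on the bot's side, which is the distinction being drawn between ignoring a convention and "hacking".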
      • Re: (Score:2, Insightful)

        by Obfuscant ( 592200 )

        Yes, a "good bot" should follow robots.txt. But failing to implement a standard is not "hacking".

        Tell that to Volkswagen, who "failed to implement" the standard regarding how their vehicles respond to situations they can interpret as being "testing".

        Robots that ignore the robots.txt standard are not just "failing to implement." They're ignoring the standard for their own gain.

        • It's not a standard with any legal weight - just something some people agreed to do. And failing to implement it is still not hacking - and are you saying Volkswagen was hacking?

        • "Failing to implement" means it was a failure, not an intent to circumvent or violate.

          Volkswagen's violation was intentional.

          While a "robots.txt" file is unknown, and is intentionally hidden from the user. It is intended to only be visible to "bots" which are actively looking for a "robots.txt". As bots are not sentient, they must be programmed by an individual aware of the existence and purpose of "robots.txt", and understands it to be more than a request.

          Though, if "robots.txt" isn't required to
          • While a "robots.txt" file is unknown, and is intentionally hidden from the user.

            Not "robot users". And hardly unknown, to any reasonable practitioner of the craft.

            • A "reasonable practitioner of the craft" is not a new recruit or amateur.

              It is possible to create bots without being a fully realized "reasonable practitioner", or otherwise being aware of the robots.txt file. Discovering or being introduced to the "robots.txt" file is part of the process of becoming a "reasonable practitioner". Let's not make lacking experience a crime before we can download Bachelor's degree levels of job knowledge and experience.
        • It wasn't a standard, it was a regulation. Two different things.
    • by Desler ( 1608317 )

      It should, but just because one might not abide by it does not make the bot writer a felon.

  • by Anonymous Coward

    As far as I'm concerned, LinkedIn themselves are guilty of massive fraud and deception, by tricking users into providing email contacts so that LinkedIn can send invite spam supposedly from the user. It was a carefully designed "dark pattern" to increase their userbase early on.

    Of course, by the time they eventually got sued over this, they were big enough to shrug off the financial penalty and keep making money off all the data they had collected illegitimately.

    LinkedIn is a socially malignant business an

  • by Chelloveck ( 14643 ) on Thursday December 14, 2017 @06:09PM (#55742129)
    Who's a good bot? You're a good bot! Yes you are. YES YOU ARE!
  • . "Good bots" were responsible for 23 percent of Web traffic in 2016.

    Nearly one-fourth of all internet traffic is from the innocently-named "Good bots"? That's kind of amazing.

    • Re: (Score:1, Informative)

      by Wheels17 ( 780115 )
      How do you think search engines work? from: https://www.google.com/search/... [google.com] "As we speak, Google is using web crawlers to organize information from webpages and other publicly available content in the Search index."
      • by kenh ( 9056 )

        How do you think search engines work?

        Are you trying to claim that one-fourth of all traffic on the web is search engines crawling the network? Doesn't that seem like a lot of traffic?

        That's like saying one-fourth of the cars on the road are "Google Cars" updating Google's Street View database.

  • by Anonymous Coward

    Ironically enough, LinkedIn scrapes its users browser for known extensions. See https://github.com/prophittcorey/nefarious-linkedin for details.

    • by jedidiah ( 1196 )

      > Ironically enough, LinkedIn scrapes its users browser for known extensions. See https://github.com/prophittcor... [github.com] for details.

      Server interrogates clients for their functionality.

      That doesn't sound nearly as nefarious as you think it does.

  • ...not YET, anyway
  • Arrest records... (Score:5, Interesting)

    by b0s0z0ku ( 752509 ) on Thursday December 14, 2017 @06:34PM (#55742319)

    Let's use a different example. Arrest records and mugshots on police agencies' websites. Let's say Jane Doe, born 1/1/1970 got arrested for a particularly heinous crime. Murder, or robbery at gunpoint.

    Six months later, a court ruled her not guilty. She was able to petition to have the public arrest record on the Yoknapatawpha County Sheriff's office website deleted.

    However, in the interim, it's been scraped and archived by database companies using the data for employer background checks. Every time she applies for a job with a large employer, her application either gets round-filed, or she has a lot of explaining to do.

    What's worse, in the state of Winnemac, there are six Jane Does with that same birthday, all of whom have the same record in their background check database...

    Does information still want to be free?

    • by Desler ( 1608317 )

      She’s perfectly free to use civil courts to right the matter. Those companies did not get the information through “hacking” and thus those companies being prosecuted under the CFAA would be a gross abuse of the law.

      • See, I disagree. I think invasion of privacy by corporate entities should be strictly punished -- whether it's retention of data past strict legal limits or when a user specifically opts for account deletion, mass surveillance, or dissemination of inaccurate or prejudicial information affecting people's ability to earn a living. How is someone going to sue if they don't have a job because of something incorrect being in a database used by employers?

        Violate data retention laws? In an ideal world, one lash

        • It's a fair argument that people *should* have more rights in this area. However, this still doesn't mean we should expand the definition of computer hacking to be "anything I don't like." We should make rules specific to this.
    • by mark-t ( 151149 )

      So she has explaining to do.

      News flash - life isn't fair. Does Jane Doe seriously think she's the only one in the world with problems that she might not reasonably deserve to have to live with?

      • Why make it easier for predatory data pimps to spread false (and/or outdated) data about her? Why do Slashdotters have a disturbing tendency to take the side of the corepirate entity over the little guy (or gal)?
    • methinks you misunderstand that expression. Information wants to be free, just like nature abhors a vacuum, or water wants to escape a vessel.

      It's totally unfair to Ms Doe, but that's the way the cookie crumbles; and there's nothing that can really be done*. Once it's out there, it's out there.

      *Without extreme social costs to everyone else.

      • What would be the social cost of penalizing the mass distribution of such information? Media articles/journalism would still go on -- they'd just be required to include a disclaimer that anyone arrested is innocent till proven guilty, and include the final disposition of the case in an update to articles about arrests.
        • by Desler ( 1608317 )

          What would be the social cost of penalizing the mass distribution of such information?

          The social cost is that someone down the line will then begin to use this new authority to attack and punish people for purely political reasons. Are you really so naive as to not see that? Public record data is in the public domain which means it’s free for anyone to share. You sound like quite the fascist.

        • by rogoshen1 ( 2922505 ) on Thursday December 14, 2017 @07:46PM (#55742741)

          Every single man, woman, and child in the US has heard the phrase "innocent until proven guilty", and look at the effectiveness of that caveat.

    • However, in the interim, it's been scraped and archived by database companies using the data for employer background checks.

      It sounds like they are finding whatever hearsay they can find, and selling it to employers as though it were relevant background information. That might be an honest mistake, and it might be fraud.

      (That's always the big question, whenever you don't really have a good cache invalidation algorithm and just say "oh, cache it for n seconds." Do you have the right value for n?)

      Either way

    • I am quite confident that if there is a background check company in the U.S. which is using records archived by a non-government organization (for more than a vanishingly short period of time) to provide employers with arrest record information, they are currently facing, or will soon be facing, a class action lawsuit for Civil Rights violations. They are also likely being investigated by, or soon to be investigated by, both the federal Civil Rights Commission and various state equivalents.
      Perhaps you are
      • (1) You have too much faith in the American system. Cute.
        (2) They won't refuse to hire her -- they'll just ignore her resume before she ever gets called for an interview based on background check data. It's only grounds for her to sue if she knows about the policy. Company's excuse would be they never got the resume.

        • That is not faith in the American system...it is faith in the greed of trial lawyers. If there is a company providing archived arrest records as the basis for employment background checks, sooner or later some trial lawyer will become aware of this fact (and I would bet on sooner). When that happens they will file a class action lawsuit against that company, and their clients. Such a lawsuit would be easy money for the lawyers.
        • They won't refuse to hire her -- they'll just ignore her resume before she ever gets called for an interview based on background check data. It's only grounds for her to sue if she knows about the policy. Company's excuse would be they never got the resume.

          You don't get permission to run the background check until after the job offer. If the offer is then rescinded on the basis of the background check, the applicant has a legal right to a copy of the report and a formal dispute process with whoever did the check.

          • You're assuming that big corp HR departments follow the rules to the letter. The rules are meant for little people, not big corps.
    • by kenh ( 9056 )

      What's worse, in the state of Winnemac, there are six Jane Does with that same birthday

      Do you have any idea how unlikely it is to have two people share both the same name AND birthdate?

      People sharing names OR sharing birthdates is quite common, sharing both is rare - not anywhere near impossible, but very, very unlikely.
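
How unlikely depends heavily on how many people share the name. A rough birthday-problem sketch, assuming birthdates are uniformly spread over about 80 years (80 × 365 = 29,200 possible dates) and taking the number of people with the same name as a free, hypothetical input:

```python
def prob_shared_birthdate(n_people, n_dates):
    """Probability that at least two of n_people share a date,
    assuming each of n_dates is equally likely (a simplification)."""
    p_all_distinct = 1.0
    for k in range(n_people):
        # The (k+1)-th person must avoid the k dates already taken.
        p_all_distinct *= max(0.0, 1 - k / n_dates)
    return 1 - p_all_distinct


if __name__ == "__main__":
    n_dates = 80 * 365  # assumed span of plausible birthdates
    for same_name in (2, 20, 200):  # hypothetical counts of people sharing one name
        print(same_name, prob_shared_birthdate(same_name, n_dates))
```

For any two specific people the chance is tiny, but it grows roughly with the square of the number of people sharing the name, so a common name in a large state changes the picture considerably.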

      • A lot more common in the Latino and Black communities, where the range of surnames tends to be smaller.
  • by Ol Olsoc ( 1175323 ) on Thursday December 14, 2017 @06:55PM (#55742461)
    LinkedIn, who want you to violate your TOS by giving them the password for your email account so they can mine it.

    Seriously what kind of idiot buys into an outfit that has as a basis of operation, asking for something that in most places will get you fired?

    I started to sign up, and when they asked for my password it was 1FuckYouLinkedin!

  • Don't put it on the Internet!

    PERIOD!

    I don't give a flying inverse sideways hate-fuckathon HOW secure you're promised it is.
    In the end, YOU are responsible for disseminating it.
    If you put it online in ANY capacity whatsoever, it WILL be compromised and it WILL be disseminated without your say-so.

    END OF DISCUSSION!

  • Just go to your local politician and buy a law. You can make anything illegal!! (Outlaw Lobbyists).
  • by raymorris ( 2726007 ) on Thursday December 14, 2017 @08:19PM (#55742895) Journal

    I'm thinking LinkedIn is wrong here, but a simple, clear-cut, and correct statement of public policy is more difficult than it first appears.

    "accessing publicly available information" sounds pretty clear and simple, but the more I think about it, the murkier it becomes. Suppose in each of the following scenarios the data is by the owner's terms not to be accessed by bots and:

    A) The system pops up a user/ password dialog before allowing access. User "admin" and an empty password works

    B) The system pops up a user/ password dialog before allowing access. User "admin" and password "password" works

    C) The system pops up a user/ password dialog before allowing access. User "admin" and password "correct horse battery staple" works

    D) The system pops up a user/ password dialog before allowing access. Sending 17,000 requests each with a password that consists of a million null bytes followed by carefully crafted machine code to overwrite memory sometimes works

    The thing is, ANY data that has been hacked over the internet was accessible to the public, if the public tried hard enough and was clever enough in defeating access control measures. That makes it difficult to legislate a bright-line rule.
