EFF: Accessing Publicly Available Information On the Internet Is Not a Crime (eff.org)
An anonymous reader quotes a report from EFF: EFF is fighting another attempt by a giant corporation to take advantage of our poorly drafted federal computer crime statute for commercial advantage -- without any regard for the impact on the rest of us. This time the culprit is LinkedIn. The social networking giant wants violations of its corporate policy against using automated scripts to access public information on its website to count as felony "hacking" under the Computer Fraud and Abuse Act, a 1986 federal law meant to criminalize breaking into private computer systems to access non-public information.
EFF, together with our friends DuckDuckGo and the Internet Archive, have urged the Ninth Circuit Court of Appeals to reject LinkedIn's request to transform the CFAA from a law meant to target "hacking" into a tool for enforcing its computer use policies. Using automated scripts to access publicly available data is not "hacking," and neither is violating a website's terms of use. LinkedIn would have the court believe that all "bots" are bad, but they're actually a common and necessary part of the Internet. "Good bots" were responsible for 23 percent of Web traffic in 2016. Using them to access publicly available information on the open Internet should not be punishable by years in federal prison. LinkedIn's position would undermine open access to information online, a hallmark of today's Internet, and threaten socially valuable bots that journalists, researchers, and Internet users around the world rely on every day -- all in the name of preserving LinkedIn's advantage over a competing service. The Ninth Circuit should make sure that doesn't happen.
Wait just a minute... (Score:5, Interesting)
Using automated scripts to access publicly available data is not "hacking," and neither is violating a website's terms of use.
If I'm reading this correctly, I'm not so sure I agree with that last bit, about "violating terms of use". So all terms of use are null and void (if my browser can find it, it's publicly accessible, no matter what I have to agree to in order to get access to it?)? For example, if I have a website that stipulates you must agree not to disseminate the information made available to you by agreeing to these terms of use, you remain free to ignore that agreement?
Or are they saying that an automated script that can bypass a Term of Use agreement isn't hacking?
Re:Wait just a minute... (Score:5, Insightful)
If:
I can send a simple http request to your server, and
Your server sends me the information without doing its homework, then
Sucks to be you.
Don't want your information to be scraped? Have it behind a login - free or otherwise - then ban accounts that are slurping down 10,000 pages a day.
Ohhhhh then it wouldn't be easily indexed by search engines and thus findable by the general public and your site would fade into obscurity. What to do!? Courts to the rescue, it seems!
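The per-account cap suggested above (ban accounts pulling 10,000 pages a day) could be sketched roughly as follows. This is a minimal illustration, not anything LinkedIn is known to run: the class name `DailyPageLimiter`, its parameters, and the in-memory storage are all assumptions made for the example.

```python
from collections import defaultdict, deque


class DailyPageLimiter:
    """Sliding-window counter that refuses requests past a page budget.

    Hypothetical sketch: names, defaults, and in-memory storage are
    assumptions for illustration only.
    """

    def __init__(self, max_pages=10_000, window_seconds=86_400):
        self.max_pages = max_pages
        self.window_seconds = window_seconds
        # account id -> timestamps (in seconds) of recently served pages
        self._hits = defaultdict(deque)

    def allow(self, account_id, now):
        """Return True and record the hit if the account is under budget."""
        hits = self._hits[account_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_pages:
            return False
        hits.append(now)
        return True
```

A caller would pass the current time (for example `time.time()`) as `now` and deny or flag the account whenever `allow` returns False. Whether to ban, throttle, or just log at that point is a policy choice the sketch leaves open.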
Re: (Score:2)
Don't want your information to be scraped? Have it behind a login
And the protection the login affords you is embedded in the "Term of Use" when you created your account, but if I'm reading this right (and I may not be), using your login and ignoring Terms of Use is A-OK.
Re: (Score:2, Informative)
using your login and ignoring Terms of Use is A-OK
no, because at that point we are no longer talking about public information
Re: (Score:2)
Nope, because that’s no longer publicly available information. Do try to actually keep up with the argument the EFF is actually making not your strawman version.
Re: (Score:2)
Thanks - you did notice I questioned my interpretation, right?
"if I'm reading this right (and I may not be)"
Re: (Score:1)
Absolutely no login required.
Literally, published to anybody who makes an HTTP request.
Accessed, in violation of a ToS, and someone trying to make that a felony. See the insanity here?
Re: (Score:3)
It also reflects badly on Microsoft Corporation, the owner of LinkedIn.
Re:Wait just a minute... (Score:5, Insightful)
if I'm reading this right (and I may not be), using your login and ignoring Terms of Use is A-OK.
You're reading it wrong. Using your login and ignoring terms of use is a breach of contract (albeit a unilateral EULA). It is not and should not, however, be considered felony computer hacking under the CFAA.
Re:Wait just a minute... (Score:5, Informative)
Dipstick, I can freely ignore all terms of service you specifically do not get me to agree to, and by law that means specifically. You must actively seek my agreement and obtain it prior to claiming I agree to anything. All you can do is deny service; you cannot make any claims beyond that. Otherwise, numbnuts, I could put a claim below the fold that to read anything above the fold means you agree to pay me a million dollars. You must actively seek actual agreement to terms of service prior to making claims; you can only deny service, nothing more, not make claims for providing a service. You are clearly too wrapped up in the bullshit of post-purchase end user licence agreements, which are illegal in the majority of countries and only legal in the US because of corruption and bias towards corporations. It's like the old Reader's Digest bullshit of sending you stuff, claiming you bought it and that you owe them money if you did not send it back. Nope, a lie: they have no right to claim payment off you, they sent it to you for free, they gifted it to you. Same as the internet: unless you actively seek agreement and then refuse service if agreement is not achieved, you cannot claim payment for accessing your service.
Re: (Score:2)
Seems to me it would be trivial to code in exceptions to the slurp limit for the IP address of known search engines. And leave a link on your site for search engines you don't know about to request an exce
Re: (Score:2)
Course the next step in this arms race is to use a botnet to do distributed slurping. Each compromised machine accesses a handful of pages, then sends them to a master server for aggregation. I'm not sure what you could do to prevent that, technically or legally.
Wait until they post the data, accuse them of running a botnet, and then bring a slightly more valid "hacking" case against them?
Re: (Score:3)
And leave a link on your site for search engines you don't know about to request an exception.
The "automated" "homework" that someone else claimed the website needed to do is already a standard. There's a "robots.txt" exclusion file that is a standard way of specifying what a robot is allowed to do and not allowed to do.
It used to be very poor netiquette for a robot to ignore that file. Now it seems that the attitude is fuck the website operator, he's got to hide his stuff behind closed doors to keep abusive robots from taking his website down. It's A-OK to do anything that a website physically al
Re: (Score:2)
Your server sends me the information without doing its homework, then
Sucks to be you.
Two problems with this:
1) that's the argument for DRM. It already is password protected, by requiring users to log their bot in. So they're violating terms of services by logging their bot in.
2) It's like a burglar using the "The door wasn't locked therefore the property owner didn't do their due diligence to keep me out."
I don't think it should be a felony but it should be something. For instance if I sneak into a movie theater I am not a burglar but I'm also not there lawfully and I am taking a valuable
Re: (Score:3)
You're assuming they had to log in to access the information in question. The impression I had was that they were accessing data that didn't require a login. The materials I've found are not extremely clear on that point, but it does specify that hi-Q was accessing the publicly available portions of LinkedIn's site, which to me implies that it's the parts that don't need a login.
If a login was required to get to the data then I agree with you, but I don't think it was.
Re: (Score:1)
Require authorization through a login to access the information. If people then break into their servers and violate those access controls then it would apply under the CFAA.
Re: (Score:3)
A website that provides public information (no login) could use a rate limit to control how much of its resources can be commandeered by a third party. A website that provides information only after a login (and has terms of service for same) can revoke the login privileges if the terms of service are violated. Neither case seems to warrant aligning violation of terms of service or slurping too much data with committing a felony hack, just as trying to take too many brochures from the distribution rack do
Re:Wait just a minute... (Score:5, Insightful)
Re: (Score:3, Funny)
CNN confirmed that looking at Wikileaks is a crime, and they helped the public by informing us many times of that. If they hadn't, then more people probably would have been put in prison for reading the DNC leaks and voted for Trump. That would have been horrible.
Re: (Score:3)
They sucked you in to their free service, before facebook was popular. MySpace wasn't really appropriate for work colleagues and other professional relationships.
They then made a mobile web site and redirected all mobile users to that site, to keep you investing your time and using their service
Then they removed the mobile website and redirect you to install their app, which you can't install without giving it permission to access to all your contacts.
When you log in to the app, it uploads all your contacts
Re: (Score:2)
Disseminating information without authorization (violating a license agreement) is not the same as accessing information without authorization (hacking).
Re: (Score:2)
> Disseminating information without authorization (violating a license agreement) is not the same as accessing information without authorization (hacking).
Both can be felonies under US Federal law. If you violate a license, then standard copyright rules apply. This is how the GPL works.
Re: (Score:2)
Re: (Score:1)
They are not arguing all terms of use are null and void. They are arguing that accessing publicly available information is not felony hacking just because it violates a ToS. It could very well still be a civil offense, but it should in no way be a felony crime. To think otherwise is insanity.
make it an felony crime with court overview of con (Score:2)
Make it a felony crime only with court review of the contract, and a contract that you have to hunt for on a website does not count.
Re: (Score:2)
Using automated scripts to access publicly available data is not "hacking," and neither is violating a website's terms of use.
If I'm reading this correctly, I'm not so sure I agree with that last bit, about "violating terms of use". ...
I'm sure we can all agree that hacking a website and violating their terms of use are not necessarily the same thing. If their terms say you must be wearing a blue shirt to use the site and your shirt is red, I can't really see any hacking going on. I imagine that a bot retrieving information automatically and a person retrieving the same information manually are pretty much the same thing, unless the bot operates so much faster that it affects operation of the site. Even then, no reasonable person would call that "
Re: (Score:3)
You need to get whoever/whatever reads the data to agree.
If you send the data before they agree, or without finding out whether or not they agreed, then it wasn't really terms, was it? Did you actually stipulate, or did you just plan to do so and then not follow through?
Me
Re: (Score:2)
Using automated scripts to access publicly available data is not "hacking," and neither is violating a website's terms of use.
If I'm reading this correctly, I'm not so sure I agree with that last bit, about "violating terms of use".
Or are they saying that an automated script that can bypass a Term of Use agreement isn't hacking?
I break down "using automated scripts to access publicly available data is not 'hacking,' and neither is violating a website's terms of use" into two clauses:
The outcome being that neither the use of automated scripts to access publicly available data nor the violation of a website's terms of use should be prosecuted under the Computer Fraud and Abuse Act.
Re: Wait just a minute... (Score:2)
Re: (Score:1)
Except that story didn’t say what they did was illegal or a violation of the CFAA. In fact the story said the following:
It’s possible Uber’s data gathering did not violate any laws—much of it occurred internationally, and the data was often collected from publicly-available websites and apps
But, hey, don’t let facts get in the way of your post.
Re: (Score:2)
They're saying if you can access it publicly, claiming it's "felony hacking" is complete bullshit.
The information is there, on a web server to be accessed, is totally unsecured .. the notion that you've committed a felony by pointing a browser or agent at it and downloading a page without bypassing any security mechanism is ridiculous.
It is gross over-reach to call accessing information published on your website (without any authentication required) "hacking" or a "crime".
Imagine if the copyright lobby tried to argue that by looking at someone else's newspaper on the bus you've violated their copyright because you weren't licensed to do so.
The standard of "hacking" has to be more than "made http request and got a response". They may not want you to access it without their permissions, but it's not like anybody was bypassing any actual security ... don't want that? Make your site login-only and don't serve up data.
This is LinkedIn (and by proxy Microsoft) trying to over apply a badly written law.
Look at it this way. If they manage to get their way, then there will be a much quicker and easier means to have people you don't approve of be stripped of not only their right to vote, but also their 2nd Amendment right. Because, you know...felon!
Re: (Score:2)
Imagine if the copyright lobby tried to argue that by looking at someone else's newspaper on the bus you've violated their copyright because you weren't licensed to do so.
It depends on what you then do with what you read in the newspaper. There are several things that you could do:
Re: (Score:2)
I think they've already tried that
They tell the people running the bots they're banned from their website, but have nothing to physically stop them.
After telling them "you're not allowed to access our public website" they then try to get them prosecuted for illegally accessing a computer system when they keep doing it.
what about have links under a pay wall with no log (Score:2)
What about having links behind a paywall with no login needed, and charging anyone who hits the paid zones without paying as a hacker? Even when they can get to them from outside the site without ever hitting the "you must pay" page? And what if that paid zone was something like /docs or some other common name that some bots just auto-scan for when indexing the web?
Re: (Score:2)
Either you're confused or your inability to grasp the English language has turned what I'm sure made sense in your head into gibberish as you typed it.
Re: (Score:2)
Experts Exchange used to be very easy to bypass: easy as in you did not really need to do anything more than keep scrolling down to the end of the page.
robots.txt (Score:4, Insightful)
Shouldn't a "good bot" abide by https://www.linkedin.com/robots.txt?
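For anyone wondering what "abiding by robots.txt" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules in `EXAMPLE_ROBOTS_TXT`, the bot name, and the example.com URLs are made up for illustration and are not LinkedIn's actual file. A real bot would fetch the site's own /robots.txt first.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not any real site's robots.txt.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/public/page",
                "https://example.com/private/page"):
        print(url, may_fetch(EXAMPLE_ROBOTS_TXT, "ExampleBot", url))
```

Note that this check is entirely voluntary on the bot's side, which is the point being argued in this thread: nothing in the protocol stops a client that never runs it.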
Re: (Score:1)
sure, but that doesn't make it a felony to ignore robots.txt
"There is no law stating that /robots.txt must be obeyed, nor does it constitute a binding contract between site owner and user, but having a /robots.txt can be relevant in legal cases."
http://www.robotstxt.org/faq/legal.html
"It is solely up to the visiting robot to consult this information and act accordingly. Blocking parts of the Web site regardless of a robot's compliance with this method are outside
Re: (Score:3)
Re: (Score:2, Insightful)
Yes, a "good bot" should follow robots.txt. But failing to implement a standard is not "hacking".
Tell that to Volkswagen, who "failed to implement" the standard regarding how their vehicles respond to situations they can interpret as being "testing".
Robots that ignore the robots.txt standard are not just "failing to implement." They're ignoring the standard for their own gain.
Re: (Score:2)
It's not a standard with any legal weight - just something some people agreed to do. And failing to implement it is still not hacking - and are you saying Volkswagen was hacking?
Re: robots.txt (Score:3)
Volkswagen's violation was intentional.
While a "robots.txt" file is unknown, and is intentionally hidden from the user. It is intended to only be visible to "bots" which are actively looking for a "robots.txt". As bots are not sentient, they must be programmed by an individual aware of the existence and purpose of "robots.txt", and understands it to be more than a request.
Though, if "robots.txt" isn't required to
Re: (Score:2)
While a "robots.txt" file is unknown, and is intentionally hidden from the user.
Not "robot users". And hardly unknown, to any reasonable practitioner of the craft.
Re: robots.txt (Score:2)
It is possible to create bots without being a fully realized "reasonable practitioner", or otherwise being aware of the robots.txt file. Discovering or being introduced to the "robots.txt" file is part of the process of becoming a "reasonable practitioner". Let's not make lacking experience a crime before we can download Bachelor's degree levels of job knowledge and experience.
Re: robots.txt (Score:2)
Re: (Score:1)
It should, but just because one might not abide by it does not make the bot writer a felon.
Fuck LinkedIn (Score:2, Funny)
As far as I'm concerned, LinkedIn themselves are guilty of massive fraud and deception, by tricking users into providing email contacts so that LinkedIn can send invite spam supposedly from the user. It was a carefully designed "dark pattern" to increase their userbase early on.
Of course, by the time they eventually got sued over this, they were big enough to shrug off the financial penalty and keep making money off all the data they had collected illegitimately.
LinkedIn is a socially malignant business an
Who's a good bot? (Score:5, Funny)
Re: (Score:2)
You misspelled 'whose' repeatedly. What are they teaching you Americans in your schools? German is my first language, but even I wouldn't make a mistake like that.
EU education > US education
Sweet. It's an actual Grammar Nazi!
Re: Who's a good bot? (Score:2)
Good for "whom," exactly? (Score:2)
. "Good bots" were responsible for 23 percent of Web traffic in 2016.
Nearly one-fourth of all internet traffic is from the innocently-named "Good bots"? That's kind of amazing.
Re: (Score:1, Informative)
Re: (Score:3)
How do you think search engines work?
Are you trying to claim that one-fourth of all traffic on the web is search engines crawling the network? Doesn't that seem like a lot of traffic?
That's like saying one-fourth of the cars on the road are "Google Cars" updating Google's Street View database.
LinkedIn Scrapes users browsers (Score:1)
Ironically enough, LinkedIn scrapes its users' browsers for known extensions. See https://github.com/prophittcorey/nefarious-linkedin for details.
Re: (Score:2)
> Ironically enough, LinkedIn scrapes its users browser for known extensions. See https://github.com/prophittcor... [github.com] for details.
Server interrogates clients for their functionality.
That doesn't sound nearly as nefarious as you think it does.
Well... (Score:2)
Arrest records... (Score:5, Interesting)
Let's use a different example. Arrest records and mugshots on police agencies' websites. Let's say Jane Doe, born 1/1/1970 got arrested for a particularly heinous crime. Murder, or robbery at gunpoint.
Six months later, a court ruled her not guilty. She was able to petition to have the public arrest record on the Yoknapatawpha County Sheriff's office website deleted.
However, in the interim, it's been scraped and archived by database companies using the data for employer background checks. Every time she applies for a job with a large employer, her application either gets round-filed, or she has a lot of explaining to do.
What's worse, in the state of Winnemac, there are six Jane Does with that same birthday, all of whom have the same record in their background check database...
Does information still want to be free?
Re: (Score:1)
She’s perfectly free to use civil courts to right the matter. Those companies did not get the information through “hacking” and thus those companies being prosecuted under the CFAA would be a gross abuse of the law.
Re: (Score:3)
See, I disagree. I think invasion of privacy by corporate entities should be strictly punished -- whether it's retention of data past strict legal limits or when a user specifically opts for account deletion, mass surveillance, or dissemination of inaccurate or prejudicial information affecting people's ability to earn a living. How is someone going to sue if they don't have a job because of something incorrect being in a database used by employers?
Violate data retention laws? In an ideal world, one lash
Re: (Score:2)
Re: (Score:2)
The terms of use typically prohibit distribution or mass downloading. With good reason.
I'm generally against looser laws (drugs, prostitution between consenting adults), but I think privacy should be sacrosanct. I'm all for throwing the book at corporate entities that violate people's privacy, AND writing new laws having explicit time limits for data retention and mandating deletion. The EU's "right to be forgotten" is a good thing in an age where privacy is slipping away.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
The terms of use typically prohibit distribution or mass downloading. With good reason.
There are no such terms of use on public records.
I'm generally against looser laws (drugs, prostitution between consenting adults), but I think privacy should be sacrosanct. I'm all for throwing the book at corporate entities that violate people's privacy,
But again, public record data is not private. Do you not understand what the term “public” means?
The EU's "right to be forgotten" is a good thing in an age where privacy is slipping away.
I’m not arguing against the right to be forgotten. I’m arguing against stupidity that claims that accessing public record data is hacking. The “right to be forgotten” makes no such conflation. Like I said, she should use the civil court system to address the issue which is exactly what the EU’s “right to be forgotten
Re: (Score:2)
Re: (Score:2)
So she has explaining to do.
News flash - life isn't fair. Does Jane Doe seriously think she's the only one in the world with problems that she might not reasonably deserve to have to live with?
Re: (Score:2)
Re: (Score:2)
I think that whole "right to be forgotten" shit is an idiotic concept as well. Nobody has any intrinsic right to be forgiven for something that happened in the past, or deserves to have any record of a mistake expunged... we only have to appeal to our fellow man's better nature and ask that they overlook past flaws. Ideally, they will, but you can't legislate that we can't form opinions about other people based on arbitrary information, regardless of its source.
And on top of it all, how are we expected
Re: (Score:2)
The issue isn't "someone's mistake." The issue is some dumb and/or malicious cop's mistake in arresting someone for a crime they didn't commit. There's an issue with people being punished for OTHER people's mistakes (or malice) without trial.
Especially if the record only has a name and DOB, and it can "punish" multiple people who weren't even involved in the incident, but happen to have the same name and DOB.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
I'm reminded of an expression: "First world problems".
Half of the people in the world have it worse than you or I are ever liable to know, no matter how unfair things might get.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Nobody excluding almost everyone under the age of 16 or 18 in almost every high-income social democracy. In some countries, people under the age of 18 amount to half the population.
Note that I'm using a definition of "expunge" roughly equal to "subject to the least necessary controlled and restricted sharing among professionals who would immediately lose their profes
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
methinks you misunderstand that expression. Information wants to be free, just like nature abhors a vacuum, or water wants to escape a vessel.
It's totally unfair to Ms Doe, but that's the way the cookie crumbles; and there's nothing that can really be done*. Once it's out there, it's out there.
*Without extreme social costs to everyone else.
Re: (Score:2)
Re: (Score:2)
What would be the social cost of penalizing the mass distribution of such information?
The social cost is that someone down the line will then begin to use this new authority to attack and punish people for purely political reasons. Are you really so naive as to not see that? Public record data is in the public domain which means it’s free for anyone to share. You sound like quite the fascist.
Re:Arrest records... (Score:4, Insightful)
Every single man, woman, and child in the US has heard the phrase "innocent until proven guilty", and look at the effectiveness of that caveat.
Re: (Score:2)
It sounds like they are finding whatever hearsay they can find, and selling it to employers as though it were relevant background information. That might be an honest mistake, and it might be fraud.
(That's always the big question, whenever you don't really have a good cache invalidation algorithm and just say "oh, cache it for n seconds." Do you have the right value for n?)
Either way
Re: (Score:3)
Perhaps you are
Re: (Score:3)
(1) You have too much faith in the American system. Cute.
(2) They won't refuse to hire her -- they'll just ignore her resume before she ever gets called for an interview based on background check data. It's only grounds for her to sue if she knows about the policy. Company's excuse would be they never got the resume.
Re: (Score:2)
Re: (Score:2)
They won't refuse to hire her -- they'll just ignore her resume before she ever gets called for an interview based on background check data. It's only grounds for her to sue if she knows about the policy. Company's excuse would be they never got the resume.
You don't get permission to run the background check until after the job offer. If the offer is then rescinded on the basis of the background check, the applicant has a legal right to a copy of the report and a formal dispute process with whoever did the check.
Re: (Score:2)
Re: (Score:2)
What's worse, in the state of Winnemac, there are six Jane Does with that same birthday
Do you have any idea how unlikely it is for two people to share both the same name AND birthdate?
People sharing names OR sharing birthdates is quite common, sharing both is rare - not anywhere near impossible, but very, very unlikely.
Re: (Score:3)
Re: (Score:2)
Re: (Score:2)
You can't make something unpublic. If your state/jurisdiction makes something public that shouldn't be, it really is on them.
Re: (Score:2)
Re: (Score:2)
> A better question is why US police departments are publishing personal details in public arrest reports at all.
Call it transparency. If they didn't publish this stuff, then someone would probably try to say that they have something to hide and are therefore up to no good.
Someone will likely object just because some people will object to what cops do regardless.
Figures (Score:3)
Seriously what kind of idiot buys into an outfit that has as a basis of operation, asking for something that in most places will get you fired?
I started to sign up, and when they asked for my password it was 1FuckYouLinkedin!
If you don't want it revealed... (Score:2)
Don't put it on the Internet!
PERIOD!
I don't give a flying inverse sideways hate-fuckathon HOW secure you're promised it is.
In the end, YOU are responsible for disseminating it.
If you put it online in ANY capacity whatsoever, it WILL be compromised and it WILL be disseminated without your say-so.
END OF DISCUSSION!
Money says.. NO! (Score:2)
If they aren't using clever tricks to get it (Score:3, Insightful)
I'm thinking LinkedIn is wrong here, but a simple, clear-cut, and correct statement of public policy is more difficult than it first appears.
"accessing publicly available information" sounds pretty clear and simple, but the more I think about it, the murkier it becomes. Suppose in each of the following scenarios the data is by the owner's terms not to be accessed by bots and:
A) The system pops up a user/ password dialog before allowing access. User "admin" and an empty password works
B) The system pops up a user/ password dialog before allowing access. User "admin" and password "password" works
C) The system pops up a user/ password dialog before allowing access. User "admin" and password "correct horse battery staple" works
D) The system pops up a user/ password dialog before allowing access. Sending 17,000 requests each with a password that consists of a million null bytes followed by carefully crafted machine code to overwrite memory sometimes works
The thing is, ANY data that has been hacked over the internet was accessible to the public, if the public tried hard enough and was clever enough in defeating access control measures. That makes it difficult to legislate a bright-line rule.
Re: (Score:2)
Re: (Score:2)
Put the information behind a free login or a paywall. Or sue them in civil court instead of abusing criminal statutes that were never meant to apply to publicly available information.
Re: (Score:2)
WikiLeaks did.