Censorship Social Networks The Internet

Tumblr Blocked Archivists Just Before Starting the NSFW Content Purge (techdirt.com) 204

An anonymous reader quotes a report from Techdirt: By now, of course, you're aware that the Verizon-owned Tumblr (which was bought by Yahoo, which was bought by Verizon and merged into "Oath" with AOL and other no longer relevant properties) has suddenly decided that nothing sexy is allowed on its servers. This took many by surprise because apparently a huge percentage of Tumblr was used by people to post somewhat racy content. Knowing that a bunch of content was about to disappear, the famed Archive Team sprung into action -- as they've done many times in the past. They set out to archive as much of the content on Tumblr that was set to be disappeared down the memory hole as possible... and it turns out that Verizon decided as a final "fuck you" to cut them off. Jason Scott, the mastermind behind the Archive Team announced over the weekend that Verizon appeared to be blocking their IPs. Thankfully, it didn't take long for the Archive Team to get past the blocks. Scott tweeted on Sunday: "why look at that the archiving of tumblr restarted how did that happen must be a bug surely a crack team of activist archivists didn't see an ip block as a small setback and then turned everything up to 11."
This discussion has been archived. No new comments can be posted.

  • by 91degrees ( 207121 ) on Wednesday December 19, 2018 @08:03AM (#57829440) Journal

    "why look at that the archiving of tumblr restarted how did that happen must be a bug surely a crack team of activist archivists didn't see an ip block as a small setback and then turned everything up to 11."

    Huh? This is the most incomprehensible sentence since "Has Anyone Really Been Far Even as Decided to Use Even Go Want to do Look More Like?". Do I just need more Covfefe?

    • by Anonymous Coward on Wednesday December 19, 2018 @08:06AM (#57829454)

      If 'Scott' had known of punctuation:

      Why look at that, the archiving of tumblr restarted! How did that happen? Must be a bug - surely a crack team of activist archivists didn't see an IP block as a small setback and then turned everything up to 11?

    • I had the same problem until I realized it should be "well, look at that!" Turns out punctuation is important, despite what Twits think.

      • Well punctuation does save lives. It could make the difference between you and Granny sharing a nice meal together; and Granny entangling and strangling herself using her dugs as bolas as she tries to ward you off because she thinks you are a cannibal.

        Let's eat, Grandma! vs. Let's eat Grandma!

        Likewise, proper capitalization can mean the difference between you assisting your Uncle Jack in dismounting from a horse, and you and your uncle going to prison and being registered as sex offenders (if helping your

    • by Anonymous Coward

      It's called eloquence. If you are not a professional grade porn archiver, you probably will not understand.

    • i whipped out my mad literacy skills and translated it forthwith
    • by Headw1nd ( 829599 ) on Wednesday December 19, 2018 @09:47AM (#57830018)
      Somewhat ashamed to know this, but the lack of punctuation is a tumblr convention - Usage indicates that the writer is "speaking" quickly or off-the-cuff, and is frequently used in situations where someone is mocking another's actions or intent. It's meant to imitate a certain speech pattern where one speaks quickly at a consistent tempo.
  • Oh yeah I know: Pornhub, Youporn, Xvideos...

  • by wierd_w ( 1375923 ) on Wednesday December 19, 2018 @08:17AM (#57829522)

    OK. In a legal battle between legitimate archival of content, and the laws governing unauthorized computer system access, which one wins?

    It is quite clear that Verizon DID NOT authorize the archivists to archive the data prior to the mass purge, as evidenced by the imposition of the IP blocking. As such, there is a strong case to be made that Archive.org was in contravention of the CFAA, and the workaround could be said to be a technical means of circumvention of that restriction of access (and thus, technically 'hacking', even though I REALLY hate to use such a word for such a simple solution.)

    It is also quite clear that there is a cultural asset that was going to be removed, purely for PR reasons by Verizon-- which was in need of preservation, and the Archive.org folks acted to accomplish that preservation.

    So... Which wins here? Just curious.

    • by Anonymous Coward

      If a website is publicly accessible 24/7, how can you make a case for unauthorized access?

      • Re: (Score:3, Interesting)

        by wierd_w ( 1375923 )

        Archive.org admits that their IP range was explicitly blocked.

        This is like saying "Hey, I noticed there was a lock on the front door, so I went in the back. Clearly, this was the proper thing. There was never a lock there before!"

        Nevermind that the very presence of the lock, indicates that the building's owner wishes to restrict entry.

        • Eh... if we are going to do a home analogy, it is more like putting up a privacy fence on one side of your yard, but that neighbor can see in the other side's picket fence when walking his dog on the public right of way.
          • A more appropriate analogy would be:

            A bouncer at a night club that is open to the public, has been given explicit instructions not to let a certain person into the club.

            That certain person gets turned away at the door.

            Rather than accept that they were denied entry to the club, they put on a ridiculous fake nose and mustache disguise, and go in anyway.

            • If there was a login, maybe, but I'm not sure something that could be accidentally defeated by a dynamic IP is equivalent to a bouncer.
            • You want the ideal analogy? Archive.org probably just downloaded such a vast amount that the tumblr operators noticed it and classed it as some sort of DDoS or abuse.

              So this is like an all-you-can-eat restaurant banning a customer because they have a stomach like a trash compactor.

        • by Uberbah ( 647458 )

          This is like saying "Hey, I noticed there was a lock on the front door, so I went in the back. Clearly, this was the proper thing. There was never a lock there before!"

          More like a "no Homers" sign on the front, as the web site was publicly accessible from the rest of the internet.

          • Indeed. It is more like a bouncer at a publicly accessible night club.

            "Hey, If this one fat chick stops by, tell her she can't come in. Here's her picture. In the past, she has done things we don't like on premises."

            That one fat chick stops by, and the bouncer says "No, you can't come in."

            The fat chick is undaunted, and puts on some outrageous lady-gaga disguise, (or cross-dresses, if you prefer), then proceeds to do the very things she was barred entry for. Proudly proclaims how she easily circumvented t

        • Well, to make the analogy accurate it's more like saying, "Hey, I noticed that out of the millions of halls that lead to your house, you seem to have blocked off the set of halls I usually use to get there. So I went down one of the other halls."

          The building's owner did not wish to restrict entry. The building's owner wished to bar entry to a very specific entrant but otherwise remain public.

        • by Falos ( 2905315 )

          >there was a lock on the bathroom doors in a public park
          >there was a lock on a door on property marked private

          This analogy has always disingenuously leaned on a hidden implication.

          A public facing server with no credentials or any "authorization" demand is intentfully and openly broadcasting files.

          If your accidentally-posted spreadsheet says "For viewing by Supertech Inc employees only" at the top, I'm willing to acknowledge even that tiny shitstain as a mark of private property.

          If you accidentally sta

        • It was more akin to having a guard at the door telling people they would not be let in if they came from the direction of the local library, while the door stays open for everybody else not coming from that direction. Then the local librarian went past the house and came back from the other way, and the guard let him in. Nothing illegal and nothing was broken.
        • by Etcetera ( 14711 )

          Archive.org admits that their IP range was explicitly blocked.

          This is like saying "Hey, I noticed there was a lock on the front door, so I went in the back. Clearly, this was the proper thing. There was never a lock there before!"

          Nevermind that the very presence of the lock, indicates that the building's owner wishes to restrict entry.

          That seems fuzzy though. Explicitly blocked *because they're Archive.org*? Or explicitly blocked because they're making 100's of thousands of connections while they try to download 85% of the entire website in a few hours?

          • Why does everyone think I am Verizon's bitch on this?

            Do I need to point out what a Devil's Advocate is, in the opening statement, from now on?

            dev·il's ad·vo·cate /ˈdevəlz ˈadvəkət/
            noun
            unpunctuated: devils advocate; noun: devil's advocate; plural noun: devil's advocates

            a person who expresses a contentious opinion in order to provoke debate or test the strength of the opposing arguments.
            "the i

        • by Solandri ( 704621 ) on Wednesday December 19, 2018 @11:10AM (#57830600)

          This is like saying "Hey, I noticed there was a lock on the front door, so I went in the back. Clearly, this was the proper thing. There was never a lock there before!"

          I'd agree with the analogy if tumblr had created all the content in the first place, content which made them famous on the web, then they decided to remove it from the web.

          That's not the case here though. Users created the content, and that content was what made the site famous. Then the company unilaterally decided to pull the users' content off the web, which they're allowed to do since it's their servers hosting it. Verizon doesn't own that content though, the users who uploaded it to tumblr did. As such, Verizon doesn't have the right to selectively block archive.org from accessing that content. The copyright holder has control over distribution, not Verizon. So Verizon has no right to discriminate against who can view the artworks (unless the copyright holders ceded that right to them - I dunno what tumblr's TOS say).

          So the more appropriate analogy here would be an art studio allowed people to hang their artwork on the walls of their building for the public to view. This became quite popular, making the building famous and a popular destination for tourists, and also making it quite valuable. Then suddenly the studio decides that it wants to remove some of that artwork (which it has the right to do since it's their building). Prior to the date of removal, the door is never locked. The public is still allowed to come in and view/make copies of the artwork. But when a photographer arrives to take photos of all the artwork to be removed, the studio blocks him (and only him) from entry.

        • by Ichijo ( 607641 )

          the very presence of the lock [on the front door], indicates that the building's owner wishes to restrict entry.

          ...through the front door.

      • by Desler ( 1608317 )

        Because they were being blocked and publicly announced they were circumventing that very block. If Verizon wanted to be malicious that would provide perfect fodder for a case against them.

        • What case? No one needs permission to access a publicly accessible website. Good grief. Have people sunk this low to think you need permission from everyone before doing anything?
        • If they didn't want them to access the website they should have contacted them and made that clear. Otherwise it was simply a technical glitch that was resolved, that may or may not have been intentional.
      • How about reading this case law [volokh.com]? It is similar, though not exactly the same, and it is from a California court.

    • Comment removed based on user account deletion
    • It's an interesting question. However I think that in this case tumblr is considered a public site and service, so I would liken it more to a store deciding to not allow one or two customers in when clearly everyone else is able to go in. They were not doing anything malicious, so denying normal access to an otherwise public site falls into a different realm.
      • There is precedent against Archive.org, in the form of CraigsList vs 3Taps.

        https://en.wikipedia.org/wiki/... [wikipedia.org].

        Craigslist is clearly a public site, however, they explicitly blocked 3Taps by IP range. 3Taps circumvented the block.

        The court ruled against them, and in favor of CraigsList.

        • The key question here is whether tumblr wants to admit that they deliberately tried to block archive.org. The case between CL and 3T was clear cut: 3T tried to cut into the market of CL. archive.org doesn't, simply by virtue of tumblr explicitly stating that they want to withdraw from exactly this market.

    • by 110010001000 ( 697113 ) on Wednesday December 19, 2018 @09:23AM (#57829866) Homepage Journal
      I have noticed that the younger generation doesn't understand this: if you put it on a public network and don't require authorization, it isn't "unauthorized access". You don't need permission. Oddly the younger generation seems fine with data collection by corporations without any "authorization" at all.
      • The issue is not with the content.

        The issue is with circumventing an access control technology to a network that contains it.

        See also, this case.

        https://en.wikipedia.org/wiki/... [wikipedia.org].

        CraigsList is clearly publicly available data; However, the operators of Craigslist explicitly blocked 3Taps from scraping their data. (much like Verizon explicitly blocked Archive.org). 3Taps circumvented that lockout. The court handed them their teeth.

        • I didn't say anything about the content. Courts make wrong decisions all the time. You don't need permission to access a public website.
          • For what it is worth, I agree with you.

            However, what you or I assert, is not what holds authority.

            • For what it is worth, I agree with you.

              However, what you or I assert, is not what holds authority.

              If enough of us assert it, it does. Either the laws are explicitly changed, or the courts "interpret" them to suit the current zeitgeist. I would argue that the "no permission needed to access any public website" convention is already firmly established in people's expectations, and IP range blocking does not constitute access control. When literally any other member of the public can access the site, an IP range block is the equivalent of "No Negros" on a public bathroom door.

              I am choosing that analogy

        • I think the core question will be less whether archive tried to circumvent a block and more whether Verizon would admit that they tried to block them (which they would have to if they want to claim a circumvention). Because between 3T and CL it's easy for CL to "admit" the blocking. 3T tried to steal their revenue, so there's no bad press to be expected from this.

          Blocking an archivist trying to preserve content from being lost is not as easily sold.

          • If it were any other organization I would agree with you.

            This *IS* Verizon though. They continue to PERSIST with policies that have earned them a slot on the "Most hated company" list for nearly a decade solid.

            https://www.usatoday.com/story... [usatoday.com]

            What would normally be considered reasonable to assume, does not seem so in this particular instance. More than likely, Verizon was so concerned about the data throughput of a complete archival dumping process, that they explicitly tried to block Archive. They *COULD

      • by AmiMoJo ( 196126 )

        That's like claiming that if you do a public reading of a poem you wrote then reproducing or reusing that poem does not require authorization. It does, copyright does not get invalidated with the first public performance.

        Stuff on Tumblr is still protected by copyright law and while you have been given the right to view it on Tumblr anything else is still at the discretion of the copyright holder.

        • by lgw ( 121541 )

          You're conflating two issues here. One is copyright. Archive.org is in the clear on that, thanks to language in the DMCA that explicitly allows archiving (and prior copyright law always allowed archiving). The other is "unauthorized access". When you make something public for everyone to see, there can be no unauthorized access. That may have changed when Verizon blocked their IPs, hard to say, but before that point there clearly was no issue.

      • I wouldn't pin this on 'the younger generation'. Plenty of old dudes seem to think this is 'unauthorized access'. See: court cases where 'hackers' were able to access 'private' data on a site through a publicly available URL but are convicted anyway.

      • I have noticed that the younger generation doesn't understand this

        Interestingly there doesn't seem to be anyone young writing or proposing laws on this topic, only old people. Rather than misattributing blame to some arbitrary age group, why not just call them by the label they deserve: Stupid people.

    • It is quite clear that Verizon DID NOT authorize the archivists to archive the data prior to the mass purge, as evidenced by the imposition of the IP blocking

      How is that clear? How do you know that Verizon didn't make an honest mistake of auto-categorizing their traffic as a DoS?

      It seems to me they fanned out a bit to keep the Verizon IPS happy and kept some network engineers from having to deal with false alarms.

  • Comment removed based on user account deletion
  • by Entropius ( 188861 ) on Wednesday December 19, 2018 @08:55AM (#57829694)

    In the old days, the internet was built on protocols. "Social media" mostly meant things like Usenet and IRC, and people hosted websites by spinning up an Apache instance that spoke HTTP and would serve their content to anyone who asked. And so there was never that big of a stink about censorship-by-nonprovision-of-services, since anyone could run an IRC server. Communities themselves were responsible for their own infrastructure. Don't like a particular IRC client? Use a different one. Don't like the folks who run a particular IRC server? Run your own.

    But now "I have apache running on a linux box in my basement hosting my blog" has given way to these "services", where communication platforms usually involve a for-profit company running all the infrastructure themselves in an opaque way. Aside from all the other issues that come from a corporate advertising-supported model, people are now learning that you can't trust these companies. The people I know who use tumblr as a primary means of communication are all going "gee, I wonder who else we can trust? We thought we could trust these folks."

    But ... this isn't inevitable, and there's no reason that the next big thing in social networking can't be designed as an open protocol, with no central point of control -- a system where people may choose to provide the infrastructure required to power their Facegram or Instabook or whatever themselves, or (more likely) hire someone replaceable to do it for them. Open protocols can't be sold out and can't be owned.

    Hardware capability is through the roof now. My smartphone has more storage, more processing power, and more bandwidth than the machines hosting IRC servers not that many years ago. There are no technical barriers to crowd-hosted social media.

    • by tepples ( 727027 ) <tepples AT gmail DOT com> on Wednesday December 19, 2018 @09:12AM (#57829792) Homepage Journal

      people hosted websites by spinning up an Apache instance that spoke HTTP and would serve their content to anyone who asked. And so there was never that big of a stink about censorship-by-nonprovision-of-services

      How did people become "anyone who asked" in the first place?

      there's no reason that the next big thing in social networking can't be designed as an open protocol, with no central point of control

      The IndieWeb community [indieweb.org] is trying to build a more protocol-centric social web. Each IndieWeb user registers a domain and buys hosting to hold his or her own posts, and IndieWeb sites use Webmention requests [indieweb.org] (similar to pingbacks) to notify other sites that replies have been posted. Right now, the biggest missing piece of IndieWeb is a recommendation engine [indieweb.org] to suggest related works by other authors.

      Hardware capability is through the roof now.

      IPv4 address space, by contrast, is not. Nor is IPv6 routing; I haven't seen evidence that an IPv6-only website can become successful in gaining and keeping readers.

      My smartphone has more storage, more processing power, and more bandwidth than the machines hosting IRC servers not that many years ago.

      But it's missing one thing: the ability to accept incoming connections on IPv4. Most cellular ISPs put their subscribers behind carrier-grade NAT [wikipedia.org], as do even home ISPs in some countries [slashdot.org]. These ISPs give the same public IP address to a whole neighborhood and will refuse to forward inbound port 443 on your neighborhood's IP address to your machine.
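
For anyone curious what the Webmention notification mentioned above looks like on the wire, here is a minimal sketch in Python. It only builds the request and does not send it. The `build_webmention` helper and all the `example` URLs are made up for illustration, and it assumes the receiving site's Webmention endpoint is already known; discovering that endpoint from the target page is left out. The one thing taken from the Webmention spec is the shape of the request: a form-encoded POST carrying a `source` (the reply) and a `target` (the post being replied to).

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_webmention(endpoint: str, source: str, target: str) -> Request:
    """Build (but do not send) a Webmention notification.

    endpoint: the receiver's Webmention endpoint (assumed already discovered)
    source:   URL of the page that contains the reply
    target:   URL of the page being replied to
    """
    # Webmention bodies are ordinary form-encoded key/value pairs.
    body = urlencode({"source": source, "target": target}).encode("ascii")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical sites: alice.example replies to a post on bob.example.
req = build_webmention(
    "https://bob.example/webmention",
    "https://alice.example/reply-1",
    "https://bob.example/post-1",
)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Actually sending it would be a matter of passing `req` to `urllib.request.urlopen`. The receiver is then expected to fetch the source page itself and check that it really links to the target before showing the reply.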

    • Re: (Score:3, Interesting)

      by Anonymous Coward

      Agreed. Unfortunately, there's not a clear path from here to there.

      The good news is that in the lists of Tumblr alternatives, I did see some people seriously considering Plume [fediverse.blog], which is a federated blogging platform that can connect to other Fediverse [wikipedia.org] federated blogs.

      In practice, the vast majority of people are not technical and aren't going to figure out how to run their own servers. That doesn't mean they'll never run their own servers; it means people with technical skills have to make running your own s

      • The good news is that in the lists of Tumblr alternatives, I did see some people seriously considering Plume, which is a federated blogging platform that can connect to other Fediverse federated blogs.

        In practice, the vast majority of people are not technical and aren't going to figure out how to run their own servers. That doesn't mean they'll never run their own servers; it means people with technical skills have to make running your own server user-friendly before they will. FreedomBox is one project working on that; the current state doesn't look super-user-friendly, but I think the goal is to be able to sell a box that you plug into your home internet that already has the software installed and can be configured over an easy web interface.

        FreedomBox has been around since freaking 2010. And it has Eben freaking Moglen behind it. And it's STILL too difficult.

        The last 20% of usability is 80% of the work. Open source is notorious for never exerting itself past the 80% mark. The difference between Linux on the Desktop and the Microsoft OSs is Microsoft spent the money to hammer away at that last 20%. It isn't much in functionality but it is a gulf of usability. Without it, FreedomBox and projects like it simply won't gain traction.

        Plex and

    • The main use of social media isn't just hosting, it's discovery. You can post whatever you want on your own site, but it's almost certainly going to remain unknown - social media matches up your content with people who might actually want to see it.

    • But ... this isn't inevitable, and there's no reason that the next big thing in social networking can't be designed as an open protocol, with no central point of control -- a system where people may choose to provide the infrastructure required to power their Facegram or Instabook or whatever themselves, or (more likely) hire someone replaceable to do it for them. Open protocols can't be sold out and can't be owned.

      Hardware capability is through the roof now. My smartphone has more storage, more processing power, and more bandwidth than the machines hosting IRC servers not that many years ago. There are no technical barriers to crowd-hosted social media.

      There's one huge technical barrier, which an anonymous coward two levels down touched on before disappearing down a Tor rabbit hole.

      Home connections are asymmetrical. Massively asymmetrical. An order of magnitude or TWO asymmetrical. Your home bandwidth can't handle serving up the data to support even your immediate family in hits, let alone your several hundred Facebook "friends". Not when you're posting high resolution images and video footage. It could eventually transmit everything to everybody the

  • by jeti ( 105266 )
    And around the same time Facebook prohibits even the slightest sexual innuendo. Is this all coincidence or is it a response to the SESTA [wikipedia.org]?
    • I'm pretty sure it's related. There are big moves afoot both against anonymity and for greater control of social media postings, all with the best of intentions (as always). I've recently seen otherwise reasonable people suggesting that the government needs to act to end anonymous posting, and I find it equal parts baffling, repugnant, and frightening.
  • Where does the archive team host the content now?

    Asking for a friend.

  • Holy Crap. A searchable list of 2.6million archived tumblr blogs: https://transfer.sh/13Aa3n/tum... [transfer.sh]

    By trying to rid the internet of porn, Verizon may have given us the best source yet.

    • Nice find. I managed to archive a couple-thousand images myself, but that's a drop in the ocean.

    • On closer look, it's a list alone... where's the actual data?

      • Sadly on a closer look you're right. This was a list that the Archivers were hoping to scrape. Just doing a search around their Wiki it would seem they didn't even scratch the surface.

    • I saw some people on /r/datahorders who had downloaded a good percentage. Some archivist might spend the next couple decades stitching it all together. One presumes that Verizon would rather crush the drives than help the archivists.

  • Tumblr some time ago provided a switch that would allow blog owners to set their blogs as explicit or exclude them from search results (the latter toggled on automatically if you activate the former, although if you wanted to remove your SFW blog from search you could do so).

    Recently they disabled the ability to deactivate these toggles - once you opt out of search or mark your blog as 'explicit', it's permanent, the toggles grayed out. That is, unless you edit the HTML source (e.g. through built in dev too [tumblr.com]

    • (oops, didn't notice Tumblr deleted the original post... but since reblogs are actual copies and not just links, good luck deleting all the posts by people who reblogged it... e.g. link [slashdot.org])

