Government United States

The EPA Plans To Sunset Its Online Archive (theverge.com) 30

Come July, the EPA plans to retire the archive containing old news releases, policy changes, regulatory actions, and more. The Verge reports: The archive was never built to be a permanent repository of content, and maintaining the outdated site was no longer "cost effective," the EPA said to The Verge in an emailed statement. The EPA announced the retirement early this year, after finishing an overhaul of its main website in 2021, but says that the decision was years in the making. The agency maintains that it's abiding by federal rules for records management and that not all webpages qualify as official records that need to be preserved.

The EPA says it plans to migrate much of the information to other places. Old news releases will go to the current EPA website's page for press releases. When it comes to the rest of the content, the EPA has a process for making case-by-case decisions on what content can be deleted -- and what is relevant enough to move to the modern website. Some content might be deemed important enough to join the National Archives. The public will be able to request that content through the Freedom of Information Act.

The archive is the only comprehensive way that public information about agency policies, like fact sheets breaking down the impact of environmental legislation, and actions, like how the agency implements those laws, have been preserved, [says Gretchen Gehrke, one of the cofounders of a group called Environmental Data and Governance Initiative (EDGI) that's fighting for public access to resources like the EPA's online archives]. That makes the archive vital for understanding how regulation and enforcement have changed over the years. It also shows how the agency's understanding of an issue, like climate change, has evolved. And when the Trump administration deleted information about climate change on the EPA's website, much of it could still be found on the archive. Besides that, Gehrke says the content should just be available on principle because it's public information, paid for by taxpayer dollars.

This discussion has been archived. No new comments can be posted.

  • by Frobnicator ( 565869 ) on Thursday March 24, 2022 @06:28PM (#62387619) Journal

    The problem is everywhere, and nothing new or unique to the EPA website.

    While the web has made many resources more available, they are also more fragile than ever. Published legal documents get silently modified or vanish. Research papers get amended, or vanish.

    Long ago you could go to an archive to find an ancient newspaper for a specific date, magazine, or a university collection of government publications for an obscure data collection. This has been dying off for the past 30 years, sometimes more detrimental than others.

    • ...then they want the past records to disappear.

      • Bingo!

        Had a situation where the local magistrate online listing of a law was seemingly changed. Had recollection of the previous version, so asked when the law had been revised.

        No record of that, and after much hemming and hawing they put it down to a typo. A typo that ended up costing several thousand dollars and threats of lawsuits.

        Memory holing of official documents. Especially now, storage is cheap. There is no reason for this.

      • by PPH ( 736903 )

        Those who control the past, control the future.

    • I understand your point, but I worked for the EPA as a computer scientist. This is about information hiding.
    • I would think this collection is something that should be shipped to the Library of Congress.

    • by jhecht ( 143058 )
      Too often the problem is just some pointy-haired marketing type who decides everything needs to be redesigned and rebranded so they can continue receiving their paycheck. Every time they rebrand something all the archives get lost.
  • Is the arguments for it 10-15 years ago. Those tidings of doom which never came to pass. It is best that these cannot be found. Those truths are... inconvenient.
    • If only the research was based on decades of work by scientists worldwide in multiple disciplines and not remotely reliant on the EPA being the only source of data—oh wait it is!
      • Actually, no it isn't. It's based on a very small, and manipulated, set of data sources most of which are the EPA and NOAA.
        • So the EPA/NOAA somehow controls research done worldwide by scientists in multiple fields? Research that is published by other countries and institutions? If only other countries made their own conclusions based on data—Oh wait they do!
  • If not that many people are using the archive, it shouldn't cost that much to keep a couple terabytes sitting on disk.
    If it's about paying for bandwidth, if that many are hitting that much data, it needs to be retained and upgraded.
    If it's about hiding and deleting information, those responsible should be replaced with a competent staff.

    • If not that many people are using the archive, it shouldn't cost that much to keep a couple terabytes sitting on disk. If it's about paying for bandwidth, if that many are hitting that much data, it needs to be retained and upgraded. If it's about hiding and deleting information, those responsible should be replaced with a competent staff.

      It's not just about the data. It's about the interface for your access to the data. You need the website which manages the interface, the servers which process requests for the website, and other servers managing things like caching and data storage. Those all run software at a lower level which needs to be kept up to date, which costs time and money and most notably someone's attention - especially across major software updates. I would guess they are most likely making this change to try and consolidate t

    • Er what? The cost of maintaining an online archive is not as cheap as a few drives. That’s like saying it should cost Apple only a few disks to sell music from an artist that sold only a few thousand albums a year. You do know that 1) the EPA is not the sole repository of the data 2) some of the information is being moved to other places 3) information is still available via FOIA
    • Fancy CMS based systems are the problem. They can keep static html pages forever and make some kind of index of them for an archive CHEAPLY with almost no cost. The problem becomes when you build a complex CMS with database to power the website and then have to maintain that database and CMS software forever even when it's no longer supported - that is a long term COST that goes on forever. Sure you could run a VM with it forever... but it gets hacked forever etc. Still costs long term.

      Everything they do

  • You might not like this data, you might not need this data, but nothing gives you the right to destroy this data - or let it be destroyed - Forever.
  • If they made the site correctly then it is already in the internet archive.

    If they didn't, that's the real crime here.

  • When we as a planetary race switched from paper conservancy of historic record to store it and forget it data. Data which in a relatively short time frame will be rendered unreadable by time or simply discarded as irrelevant since there is now just so much noise within it.

    Well, unless there are still Monks somewhere transcribing important documents and twitter posts..
  • Will that be Standard or DST? :-)

  • Web 4.0 needs to be based on something a little more robust. Maybe some of the ideas from the interplanetary filesystem can be applied in practice. Content addressable data. Decentralized peer-to-peer sharing. Etc.
  • by virtig01 ( 414328 ) on Thursday March 24, 2022 @10:20PM (#62387969)

    The central problem is, er, centralization. One authority maintains a copy of the content, so they control the content's lifetime. Archive.org is great, but it's still a centralized repository. There are other tools that can be used to keep content available to everyone.

    With IPFS [ipfs.io], content is distributed as people access it. As long as one node has a copy, it can be retrieved.

    Arweave [arweave.org] is enabling the permaweb, allowing data to be stored "permanently", in a distributed fashion. There's a browser plugin; just click "Archive this page", and it'll be stored forever.
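
    For readers unfamiliar with "content addressable data", here is a minimal sketch of the idea behind the comment above: a piece of content is looked up by the hash of its bytes rather than by where it lives, so any node holding a copy can serve it and the requester can verify it. This is only a toy illustration, not how IPFS or Arweave is implemented; real IPFS uses its own identifier format and a peer-to-peer network, while this uses plain SHA-256 and in-memory dictionaries. The names `ContentStore` and `fetch` are hypothetical.

```python
import hashlib
from typing import Iterable, Optional


class ContentStore:
    """Toy node: stores blobs keyed by the SHA-256 hex digest of their bytes."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content, not chosen by the publisher.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> Optional[bytes]:
        return self._blocks.get(address)


def fetch(address: str, nodes: Iterable[ContentStore]) -> Optional[bytes]:
    """Ask each node in turn; accept the first copy whose hash matches."""
    for node in nodes:
        data = node.get(address)
        # Re-hashing lets the requester detect a tampered or corrupted copy.
        if data is not None and hashlib.sha256(data).hexdigest() == address:
            return data
    return None


# Hypothetical example: the original publisher drops the page,
# but one other node still holds a copy.
page = b"<html>archived press release</html>"
publisher, mirror = ContentStore(), ContentStore()
address = publisher.put(page)
mirror.put(page)
publisher = ContentStore()  # publisher "sunsets" its copy
recovered = fetch(address, [publisher, mirror])
```

    In this sketch `recovered` is the original page even though the publisher no longer has it, which is the "as long as one node has a copy" property. The flip side is that the content survives only while some node chooses to keep it.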

  • Anything that does not comport with current doctrine goes down the memory hole. Nothing to see here, folks.

  • The author of the article doesn't really have a good understanding of what the EPA Archive really is - it's an archive of the agency's webpages, *not* environmental data, records, publications, etc. Those can all be found elsewhere and aren't affected by this "sunset".
