News for nerds, stuff that matters

Bush Administration's E-Mail Deluge May Overload Archive System

Posted by Soulskill on Sat Nov 22, 2008 09:16 AM
from the hello-sir-madam dept.
Lucas123 writes "The Clinton administration generated 32 million e-mails. Bush's administration has generated 50 times as much data — 140TB, 20TB of which is email — which soon will have to be archived through a new government-built records management system. The new system may not be up to the task because the technology behind it may not be able to handle the sheer volume of data along with the fact that the Bush administration has been slow in providing the National Archives and Records Administration (NARA) with needed information about the records, according to a Computerworld story. Questions have also been raised about millions of missing e-mails from between March 2003 and October 2006. 'It wasn't until this summer that an intensive effort began to share information,' said Ken Thibodeau, director of NARA's Electronic Records Archives."

Related Stories

[+] Politics: Thousands of White House E-mails Deleted 799 comments
kidcharles writes "The Washington Post reports that in the midst of an investigation by the U.S. Congress into the firing of eight U.S. Attorneys by the Department of Justice, numerous White House e-mails have been lost. Among them are communications from presidential adviser Karl Rove. Parallels are being drawn with the infamous '18 minutes' missing from the Nixon Watergate tapes. Also at issue is the use of Republican National Committee e-mail domains (such as gwb43.com and georgewbush.com) rather than the official White House domain. This is a violation of the Presidential Records Act."
[+] Politics: White House Says Hard Drives Were Destroyed 411 comments
wanderindiana brings us an update on the White House missing emails mess, which we have discussed before. It seems the hard drives of many White House computers are gone beyond the possibility of recovery. Is it unusual in your experience for, say, a corporate IT department to destroy hard drives by policy? "Older White House computer hard drives have been destroyed, the White House disclosed to a federal court Friday in a controversy over millions of possibly missing e-mails from 2003 to 2005. The White House revealed new information about how it handles its computers in an effort to persuade a federal magistrate it would be fruitless to undertake an e-mail recovery plan that the court proposed."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by Anonymous Coward on Saturday November 22 2008, @09:22AM (#25857391)

    The other 120 TB was probably just Clinton's porn stash that the Bush administration found while purging off records.

  • by Ice Wewe (936718) on Saturday November 22 2008, @09:23AM (#25857401)

    "The Clinton administration generated 32 million e-mails. Bush's administration has generated 50 times as much data -- 140TB, 20TB of which is email -- which soon will have to be archived through a new government-built records management system.

    Well, to be fair, email wasn't quite as popular during Clinton's administration as it is now. Then again, the 400GB of e-mails that the Clinton administration must have generated (if it is 50 times less than 20TB) must have been rather hard to store when he left office.
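The 400GB estimate above can be checked with a few lines of arithmetic. This is only a sketch of the commenter's reasoning: it assumes the "50 times" ratio applies to the email portion alone (the summary applies it to total data) and uses decimal units; the variable names are made up for illustration.

```python
# Decimal storage units, as drive vendors use them (an assumption).
TB = 10**12
GB = 10**9

bush_email_bytes = 20 * TB        # "20TB of which is email" (from the summary)
ratio = 50                        # "50 times as much data", applied to email here
clinton_messages = 32_000_000     # "32 million e-mails" (from the summary)

# Implied size of the Clinton-era email archive under these assumptions.
clinton_email_bytes = bush_email_bytes // ratio
# Implied average size of one message in that archive.
avg_bytes_per_message = clinton_email_bytes / clinton_messages

print(clinton_email_bytes / GB)   # 400.0
print(avg_bytes_per_message)      # 12500.0
```

Under those assumptions the average message comes out at about 12.5 KB, which is plausible for mostly plain-text mail.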

    • Re: (Score:3, Interesting)

      Well, most of that 400GB from Clinton's administration was dirty pictures of interns. In all seriousness, though, I don't think the problem will be finding a way to store all that data. The real kicker will be finding information you need in it. Seems to me like the best way to hide relevant and/or damaging e-mails would be to have them stored right alongside truckloads of chain letters.
      • Re: (Score:3, Interesting)

        It isn't storage and it isn't finding it, the problem is preserving it long enough to look through and index it. I'm sure that Google and companies that do similar work have the technology to do it. I'm also quite sure that for the right price the Federal government could obtain software to do most of the heavy lifting.

        The problem is that the Bush administration deliberately migrated only partially to a new system leaving it in a state of constant risk for bit rot and corruption. It's hard to say how much o

    • My knee-jerk reaction was the same. Then I realized the Clinton reference is probably there to provide some term of comparison.
    • Well, to be fair, email wasn't quite as popular during Clinton's administration as it is now.

      Good point. I mean hell, Al Gore had just invented the Internet the year prior. Cut the guy some slack.

  • by Anonymous Coward on Saturday November 22 2008, @09:24AM (#25857405)

    ...Now too many emails.

    Whining is Washington's most favorite thing to do.

  • Text only, no html (Score:5, Insightful)

    by Teun (17872) on Saturday November 22 2008, @09:33AM (#25857449) Homepage
    Start by mandating text only mail.

    No more fancy signatures and html crap will cause a 60-80% drop in volume if not more.
    Mandate the Usenet way with replies after the original, (it will) teach people to cut irrelevant repeats.
    Stop the addition of stupid and ineffective disclaimers.

    Teach the use of (ftp) servers for sharing large documents, no more Microsoft sized attachments, send a link.

    • by ai3 (916858) on Saturday November 22 2008, @09:40AM (#25857487)
      I would rather buy another hard disk than waste precious time editing the mail I'm replying to; in most cases it simply isn't necessary. For Usenet it's a different story as many people read it, so it's worth the effort.
      • by malkavian (9512) on Saturday November 22 2008, @10:33AM (#25857761) Homepage

        Longer email threads seem to end up forwarded and brought to the attention of many people you never expected at the outset.
        Judicious editing of the emails to include only the relevant sections for the replies, giving the context of the emerging thread of conversation means that someone being brought up to speed with that segment of the conversation doesn't need to trawl through masses of irrelevant junk to get at the meat of the issue.
        I tend to do it as an efficiency gain, rather than taking storage space into account. All comes back to that quote you hear people come out with after sitting through a bad movie "Well, that's an hour of my life I'll never get back". It may only be a few minutes at a time, but they mount up over time. Plus, crafting things to cut to the heart of the matter puts things into sharp perspective, and means people are far less likely to digress, saving even more wasted time.

        • Yea, and there's an aesthetic feel to it too. If I'm in a 20 reply discussion, I like to edit out anything more than 2 exchanges old, and I change the subject title every two mails.

          Nothing annoys me more than 20 mails titled "re: call"

      • Re: (Score:3, Informative)

        When not on Slashdot we're expected to read the message we reply to.

        Deleting the bit that's already answered, not relevant or whatever can hardly be called 'editing', it has more to do with comprehension.

        One of the worst things for the latter is a typical corporate Outlook mail exchange (I know that word...) with at the bottom text that hasn't been read for the last ten replies.

      • Re: (Score:3, Insightful)

        140 1TB hard disks (plus another for RAID) probably cost less than a couple of government office chairs, so what's the problem?

        [Most likely the fact that it's in secret, proprietary formats and spread across hundreds of PCs instead of being archived by the mail gateway]

    • by Neoprofin (871029) on Saturday November 22 2008, @10:14AM (#25857669)
      If it's anything like our corporate mail server, I would bet you the number one space filler is people making minor changes to documents, then reattaching them and forwarding them back to the same 50 people who just got the previous version of the document, repeated over 100 iterations as the email soon becomes a 2GB mess.
      • Re: (Score:3, Informative)

        Some database driven mail servers like Citadel, Exchange, Zimbra and probably Domino support only storing the message and attachments once no matter how many people it was sent to.

        It goes a long way in preventing the attachment * user mess.
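Single-instance storage as described above can be illustrated with a toy content-addressed store. This is a minimal sketch, not how Exchange, Citadel, Zimbra or Domino actually implement it; the class and method names are hypothetical.

```python
import hashlib


class SingleInstanceStore:
    """Toy store that keeps each distinct payload once, keyed by its hash."""

    def __init__(self):
        self.blobs = {}      # sha256 hex digest -> payload bytes
        self.mailboxes = {}  # user name -> list of digests delivered to them

    def deliver(self, recipients, payload: bytes) -> str:
        """Store the payload once and give every recipient a reference to it."""
        digest = hashlib.sha256(payload).hexdigest()
        self.blobs.setdefault(digest, payload)
        for user in recipients:
            self.mailboxes.setdefault(user, []).append(digest)
        return digest

    def stored_bytes(self) -> int:
        """Bytes actually held by the store."""
        return sum(len(blob) for blob in self.blobs.values())

    def naive_bytes(self) -> int:
        """Bytes needed if every mailbox held its own private copy."""
        return sum(
            len(self.blobs[digest])
            for refs in self.mailboxes.values()
            for digest in refs
        )


store = SingleInstanceStore()
attachment = b"x" * 1_000_000
store.deliver([f"user{i}" for i in range(50)], attachment)
print(store.stored_bytes(), store.naive_bytes())  # 1000000 50000000
```

Note that this only collapses byte-identical copies. A document that is slightly edited and re-sent hashes differently, so the "100 iterations" case described by the parent would still be stored 100 times in this sketch.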

          • Re: (Score:3, Informative)

            Your statement doesn't make sense. Exchange supports and automatically takes advantage of single instance storage right out of the box. What do you need 3rd party software for that disables it?

            I run Exchange on a NetApp SAN so everything gets deduped and archived to tier 2 storage if it hasn't been accessed within 90 days. Tier 2 is a lot of SATA disks that are backed up to tape. It's not even an expensive solution when you start talking about the cost of enterprise storage.

      • Re: (Score:3, Insightful)

        If it's anything like our corporate mail server, I would bet you the number one space filler is people making minor changes to documents, then reattaching them and forwarding them back to the same 50 people who just got the previous version of the document, repeated over 100 iterations as the email soon becomes a 2GB mess.

        In our organisation (government but non-US) we just give people a document management system and we educate people about why they should use it. If that doesn't work we point out it's policy a

    • That will never happen in the real world. People always and forever will reply above the original. You can't force the way people reply to emails, it's personal choice.
    • by kevin_conaway (585204) on Saturday November 22 2008, @11:16AM (#25858045) Homepage
      Did you stop paying attention to email in 1995?

      No more fancy signatures and html crap will cause a 60-80% drop in volume if not more.

      I know you hate it when your mom or the boss' secretary at work sends out a cutesy formatted email but some people can actually use HTML email effectively in lieu of sending a document or a link

      Mandate the Usenet way with replies after the original, (it will) teach people to cut irrelevant repeats.

      Irrelevant repeats for you may be important context for someone else.

      Stop the addition of stupid and ineffective disclaimers.

      Often times, those disclaimers are required by law. Most people don't add them for fun or to make themselves feel important.

      Teach the use of (ftp) servers for sharing large documents, no more Microsoft sized attachments, send a link.

      FTP? Are you serious? Sending documents by carrier pigeon is more secure and reliable than FTP

      • Stop the addition of stupid and ineffective disclaimers.

        Often times, those disclaimers are required by law. Most people don't add them for fun or to make themselves feel important.

        I don't know about the situation in the USA, but in most parts of the world, this is exactly the reason.
        "Because everybody else does it" is another.
        In multiple European countries, those disclaimers are entirely worthless, and in some cases have even come back to bite those using them in court by proving that the sender was aware that some piece of information might end up in the wrong place.

        Disclaimers don't replace common sense or encryption.

    • Sheesh. I call this phenomenon "technological puritanism". All tech must be ugly! 80 columns should be enough for anyone! Fixed-width fonts were good enough for my granddaddy, they were good enough for me, and they should be good enough for everyone! Words are worth a thousand pictures! Get off my damn lawn!

      Nothing personal, but if people like you were in charge of the world, we'd all be living in gray, cast concrete cubes. Think of the efficiency! No more wasted paint. You can just make a bigger house by stacking the blocks and adding a ladder.

      Most of us *like* color, pictures, paragraphs, and most of all, convenience. Use FTP when I can just add an attachment that goes directly to the source? Give me a frickin' break. No one gives you respect points when you prove how miserably you can live.

      Let's put this in perspective... that 120 terabytes costs 12,000 dollars in hard drives. Retail at Fry's. The entire output of the Bush Administration costs less than what they probably spend on coffee in a month.

      P.S. And, yes, this is from someone who used a teletype in high school, and was ecstatic when we got a 300 baud modem (whoa! It's almost 3 times faster than the ol' 110!) and a Televideo terminal. Those days were not better.
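The $12,000 figure above implies roughly $100 per 1TB drive. A short sketch of that arithmetic, with the per-drive price taken as an assumption from the comment rather than a verified 2008 retail price, and with no allowance for redundancy, controllers or indexing:

```python
def drive_cost(total_tb: int, drive_tb: int = 1, price_per_drive: float = 100.0):
    """Return (number of drives, raw cost) for a given capacity.

    price_per_drive is the price implied by the comment, not a quoted figure.
    """
    drives = -(-total_tb // drive_tb)  # ceiling division on integers
    return drives, drives * price_per_drive


print(drive_cost(120))  # (120, 12000.0)
print(drive_cost(140))  # (140, 14000.0)
```

This covers raw capacity only, which is the comment's point; the cost of searching and preserving the data is a separate question raised elsewhere in the thread.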

      • But maybe I'm the decision maker at the end of the chain...

        Various departments work together on a proposal and eventually I am involved for the final say.

        It's not uncommon that I ask for some information on how they have come to the proposal and I'm confronted with these weird -read-from-bottom-to-top- conversations; they're not conducive to a smooth process.

      • Madam Secretary would likely not know how to change from text to html, the issue starts with mail clients that are defaulting to rich text or whatever.
        Besides, the mail server can strip all html before storing, all it needs is support from a corporate policy.
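Stripping HTML before storing, as suggested above, could look something like the following. This is a minimal sketch using only the Python standard library; the class and function names are hypothetical, and a real mail gateway would also have to handle MIME parts, encodings and attachments.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring tags and <script>/<style> contents."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0  # depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return the text content of an HTML body with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


sample = "<html><body><p>Budget <b>approved</b>.</p><style>p{color:red}</style></body></html>"
print(html_to_text(sample))  # Budget approved.
```

For an archive that must preserve records as sent, discarding the original markup may not be acceptable, so this would more likely be applied to a searchable copy than to the record itself.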
  • What's up? (Score:3, Interesting)

    by Anonymous Coward on Saturday November 22 2008, @09:34AM (#25857459)

    It hasn't helped that the Bush administration has been slow in providing NARA with needed information about the types and volume of data that will need to be archived. It wasn't until this summer that an intensive effort began to share information, Thibodeau says.

    I can understand the reasoning that for national security, some information needs to be kept secret. The thing is, the more I hear of this administration's obfuscation of their communications and dealings, I can't help but wonder what in the World they are hiding.

  • Shadowy Government (Score:4, Interesting)

    by GMonkeyLouie (1372035) <gmonkeylouie@gmailYEATS.com minus poet> on Saturday November 22 2008, @09:36AM (#25857473)
    Whenever I receive news that information that we're supposed to have access to from the Bush administration has gone missing, it makes me queasy. There's so much secrecy surrounding random little things that it's started to make me paranoid. Maybe it's just me wanting to blame the last eight years on a scapegoat, but I feel like someone at the top is trying to hide something really big and succeeding.
    • Perhaps you're too young to remember, but Clinton's administration had a problem with missing emails [cnn.com] during investigations too (Lewinsky, why hundreds of FBI records on their political enemies ended up in the White House, illegal campaign donations from China, etc).

      I'd say it's par for the course and if you think just one side is doing shady stuff, it might be because you're a bit partisan.
      • Re: (Score:2, Interesting)

        Well, Clinton never tried to insist that his VP wasn't part of the executive branch, never tried to put Harriet Miers on the supreme court... Actually I think the shadiest person in the administration is Cheney. He's certainly one of the only members of the 2001 Bush team left, and he keeps so many secrets! Also, he shot a man in the face one time. I love adding that to the end of my Dick Cheney rants. Is that too partisan?
        • Well, Clinton never tried to insist that his VP wasn't part of the executive branch,

          It's called "Unitary Executive." That is, there's only one guy in the executive branch that gets to make the decisions. It's entirely up to the President how much of a role he gives the Vice President. Under George Washington, John Adams lamented that the only thing he could do was preside over the Senate and then, he had no say on anything unless there was a tie. It drove him nuts.

          If you read the Constitution, Article II groups the Vice President in with the executive branch, but the ONLY place it provi

      • by rtfa-troll (1340807) on Saturday November 22 2008, @10:38AM (#25857771)

        No; you are partisan when you think an accusation against one side can be answered by an accusation against the other side. They are both bad (they are US politicians; corruption is so endemic that it's legal and called lobbying), but Clinton's presidency ended about eight years ago and isn't something worth discussing now.

        The questions are; how to make sure Bush follows the law for what he still does? How to make sure Obama doesn't start off like Bush?

      • by Frosty Piss (770223) on Saturday November 22 2008, @11:31AM (#25858169)

        Perhaps you're too young to remember, but Clinton's administration had a problem with missing emails during investigations too (Lewinsky, why hundreds of FBI records on their political enemies ended up in the White House, illegal campaign donations from China, etc).

        Yes, but there is a magnitude of difference in importance between lost emails about blow jobs and a little dirty money, and emails about the loss of privacy and civil liberties of US citizens, torture of POWs, and the various other nastiness that GWB et al are suspected of. Much different.

    • by mhollis (727905) on Saturday November 22 2008, @10:49AM (#25857851) Journal

      I have understood this outgoing administration to be more than secretive. They're positively paranoid, and the only administration in memory that was similar was Nixon's. All internal memos have been classified first. Declassification only happens when there is a strong and abiding reason why the memo should be declassified. Contrast that with Clinton, where internal memos were not classified unless there was a strong and abiding reason why the memo(s) should be classified.

      When Bush announced that his administration would immediately prepare for a transition (before the 4th of November, which was election day in the US), I assumed that the first course of action was that this Bush administration would do what the last Bush administration did: [Rip] the hard drives out of their computers and tried to erase "sensitive" computer files in the White House and West Wing. [consortiumnews.com]

      To say that the Clinton Administration started with a "clean slate" was an understatement. Later, Clinton lawyers ignored the dangers of historical archive deletion when faced with Republican destruction of historical records. Presumably, they wanted a "pass" from future Republican administrations.

      Republican administrations tend to be very secretive. Democratic administrations tend to not. I shall expect the Obama administration shall have to purchase all new computers -- or at least hard drives -- in order to simply start up in their first week. This is a horrid waste of taxpayers' money all in the name of whitewashing one's past deeds (for good or ill).

      Due to record-keeping, we now know that Nixon did know about the Watergate break-in. And we do know that he was very interested in its coverup. Nobody can be prosecuted at this time for that (those who were found guilty have already served their time). I would be very interested to know if Reagan's CIA planted the stacks of AK-47s used as evidence by his administration that the attack on Grenada was justified. And we still do not know everything about the Iran-Contra affair. These historical records are worth keeping because, well after the Statute of Limitations, America gets another look at how an administration dealt with the world.

      It is a shame that any Administration is that interested in "rewriting history" in order to unfairly burnish a legacy, which in the case of "W" is hardly salvageable.

  • For hiding all your nefarious emails in the noise.
    Half those old geezers will be dead before anyone gets around to reading them.

  • Great, now they've got to deal with the same sort of things we do. Archiving every bit of email that comes into the system, and making sure it's available online for searching and retrieval.

    I'm interested in how they're going to be doing it. I've been looking at Global Relay [globalrelay.com] for my own mail archiving. I wonder what they'll end up going with. I asked this a while ago on my blog [blogspot.com], too.

  • Although this is about White House e-mails, this sort of stuff shows how ridiculous it is to ask ISPs to record all traffic. At least here taxpayer money is being used, but an ISP simply does not have that sort of budget. I feel all too often the layman confuses IT with magic and the people in the field with magicians. We are lucky enough if we manage to become a level one mage :)

    • ISPs are to keep records of the subscribers' contacts, not the messages themselves.

      A rather big difference.

  • How much is spam? (Score:4, Interesting)

    by houghi (78078) on Saturday November 22 2008, @10:18AM (#25857689) Homepage

    How much of that is spam? I can imagine they are not allowed to delete spam. Spam has increased, so this would mean that all of it is still there.

    The rest can mean a lot of different things. I am forced to work (otherwise no food) with 150MB excel files that I would love to put in a database and would take up at least 10 times less space. And I am not even talking about speed increase and ease of use, because somebody else has the file open, so I can not change the content.

    Or perhaps Clinton did not keep everything. Or ...

  • The Bush administration moved the White House from a Notes/Domino based system to a Microsoft Exchange based system.

    Before moving, they'd had no downtime -- even when Congress was taken out for 2 days by the Code Red worm (they were on Exchange).

    In moving, they mysteriously 'lost' all their backups for a period of time that was suspicious as hell, and now they can't scale to handle the capacity issues they face.

    In a Notes/Domino world, this kind of archiving problem wouldn't be all that hard to deal with. You'd just need enough storage for it, and create archives per week/month/year (or an archive per individual's mailbox, or whatever) to put on as much hardware as was required. A single checkbox would be all that was needed to have it encrypted as well.

    Oh well. I guess if conveniently "losing" mail when you don't want it found is one of your design goals, then you probably want to migrate to something less reliable.
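The idea of one archive per week, month or year can be sketched as a simple partitioning function. This is a generic illustration, not Domino's actual archiving mechanism; the function names and the sample message IDs are hypothetical.

```python
from collections import defaultdict
from datetime import date


def archive_key(sent: date, granularity: str = "month") -> str:
    """Name the archive bucket a message belongs to."""
    if granularity == "year":
        return f"{sent.year:04d}"
    if granularity == "month":
        return f"{sent.year:04d}-{sent.month:02d}"
    if granularity == "week":
        iso_year, iso_week, _ = sent.isocalendar()
        return f"{iso_year:04d}-W{iso_week:02d}"
    raise ValueError(f"unknown granularity: {granularity}")


def partition(messages, granularity: str = "month"):
    """Group (sent_date, message_id) pairs into per-period archives."""
    buckets = defaultdict(list)
    for sent, message_id in messages:
        buckets[archive_key(sent, granularity)].append(message_id)
    return dict(buckets)


messages = [
    (date(2003, 3, 14), "m1"),
    (date(2003, 3, 30), "m2"),
    (date(2006, 10, 2), "m3"),
]
print(partition(messages))  # {'2003-03': ['m1', 'm2'], '2006-10': ['m3']}
```

Each bucket can then live on whatever hardware is available, which is the "throw more storage at it" property the comment describes. Keying on the mailbox owner instead of the date would give the per-individual variant.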

      • There's an inherent architectural difference between storing mail in a database built on Microsoft's JET technology, and one which stores its data in something that is (although distinctly odd) very much like an xml data store. The Domino architecture makes segmenting the archive into manageable parts by date, by person, or by any combination thereof much simpler.

        Essentially, the Domino architecture results in exactly what you describe -- throw more storage space at it and you can keep storing more data.

        • Re: (Score:3, Informative)

          The theoretically competent staff with perfect hindsight and unlimited budget could just alias all the emails to also go to a decent system that actually works properly (and can be backed up easily) - however release cycles with MS Exchange are fairly short and the advertising is so good that people could be convinced that it is a half decent system THIS time. Backups have been a horrible problem with MS Exchange for years and it just seems bandaids have been placed over the problems to keep things going i
  • Dear staff (Score:5, Insightful)

    by keraneuology (760918) on Saturday November 22 2008, @11:01AM (#25857937) Journal
    It has come to my attention that as I prepare to leave office my previous instructions to make all email and other documentation available to the shredder were incorrect. The correct policy is to make everything available to the archiver. If you have any concerns please feel free to pick up a copy of the standard presidential pardon boilerplate from my secretary's desk. Thank you, W
  • Self-proclaimed "Most Advanced nation on earth" that doesn't have enough hard drives ... my ass.
  • See what happens when you keep sending that same Excel spreadsheet back and forth to the whole distribution list?

  • by omb (759389) on Saturday November 22 2008, @11:46AM (#25858267)
    As with almost all problems where electronic/internet technologies bump into real-life issues, e.g. privacy, non-repudiability and simple confidence, it is because the Law has not kept up with technology, and that in the USA is the responsibility of the Congress. Writing was thousands of years old, and the printing press more than 300 years old, when the Constitution was adopted on September 17, 1787. The drafters understood the technology.

    Today we are blessed with ignorant, self-serving legislators who do not, and are far too happy to follow hard-cases-make-bad-law herd thought, e.g. children, porn, paedophilia, drugs and terrorism. The courts have long held that you can read post-cards, but that if your letter-in-an-envelope is opened then a felony is committed or the information is normally inadmissible.

    For this to work people have to start encrypting and signing their e-mails and the Congress and the SCOTUS must enforce identical rules for electronic and hand-written communication.

    Specifically, you cannot go out and discover the entire contents of someone's library and papers in a lawsuit, and expect to go on a search-engine-enabled fishing expedition.
  • by An dochasac (591582) on Saturday November 22 2008, @01:30PM (#25858917)
    At the current presidential email growth rate, NTFS isn't gonna cut it for Obama.
  • by RyuuzakiTetsuya (195424) <taiki.cox@net> on Saturday November 22 2008, @03:21PM (#25859699)

    Of course that happens when you embed the 1600x1200 raw image of Dick Cheney giving everyone the finger with each email

    • by jimicus (737525) on Saturday November 22 2008, @09:50AM (#25857529) Homepage

      Besides, only 140TB (or 20 TB)? That's child's play for any competent DB admin, never mind only about $2k worth of hardware to hold it.

      Assuming that none of it's been put into the archival system yet, that means they're dumping 140TB on it in one go.

      You index 140TB on $2k worth of hardware and come back to me when you're done. Hopefully I won't have died by then.

      • Re: (Score:3, Insightful)

        Maybe not $2k worth of hardware but $200k will do. Which is still peanuts in government terms. They probably spend that amount on paperclips and toilet paper in the Pentagon alone.
        Honestly, storing and indexing 140TB of e-mail is a trivial task when you can apply a six digit budget to it.

        If their "archival system" blinks at the sight of 140TB of mostly text then it doesn't even deserve the name.