Spam Communications The Internet Your Rights Online

You've Got Mail -- Tons Of It

Daniel Goldman writes "The Baltimore Sun has an article about the City of Baltimore's email problem." A snippet: "Millions of old e-mail messages are clogging Baltimore's municipal computers, so the city is going to start automatically deleting any messages older than 90 days. A common practice in private business, the move raises questions when made by a municipality, which has a responsibility to retain certain public records." Goldman points out "Just think about all the potential lawsuits; 'if it's not there, they can't subpoena it.'"
This discussion has been archived. No new comments can be posted.

  • Simple... (Score:5, Funny)

    by Anonymous Coward on Sunday June 06, 2004 @02:30PM (#9351595)
    Outsource each employee's email to GMail. Problem solved.
    • by hype7 ( 239530 ) <u3295110&anu,edu,au> on Sunday June 06, 2004 @03:02PM (#9351786) Journal
      Outsource each employee's email to GMail. Problem solved.


      yeah, and if the budget's looking a bit bad for that year, they could always put a few of the email accounts up on ebay.

      -- james
    • Comment removed (Score:4, Insightful)

      by account_deleted ( 4530225 ) on Sunday June 06, 2004 @03:16PM (#9351857)
      Comment removed based on user account deletion
      • Re:Simple... (Score:4, Insightful)

        by BasilBrush ( 643681 ) on Sunday June 06, 2004 @05:28PM (#9352556)
        That's fine. Disk storage is cheap. Certainly cheaper than paying hundreds of staff for the time taken to go through all their old mail sorting the wheat from the chaff. The right solution to running out of disk space for email is to add more disks.
        • Re:Simple... (not) (Score:4, Informative)

          by Zen ( 8377 ) on Sunday June 06, 2004 @05:50PM (#9352668)
          There comes a point where that, too, gets very expensive. At my company (large US healthcare provider, with governmental and private contracts both HMO and PPO), after saying 3, 5, and 7 years, our lawyers have told us we have to archive all email potentially forever that the end user doesn't specifically delete. They may do an end-run around the deletion and archive those, too, but I don't know. Anyway, our email system (Lotus Notes, which is an extreme HOG) eats somewhere between 100GB and 1TB per week. I was told it was well over 1TB, but I don't believe them. This is of course due to older Notes versions' inability to store attachments in public directories, so they simply send a copy to each and every recipient (and the stupidity of no size limits on internal email). There is a limit to how many drives you can add to a SAN, and then you have to get a whole extra chassis, which is where the expensive part comes in. To keep buying new SAN units every 6 months or so, as well as the hard drives to put in them (plus the maintenance contracts, 24/7 support, etc.) could easily add up to $1 million/year or more. Which is definitely more costly than 10 average low-mid level administrators' salaries.
          • Re:Simple... (not) (Score:3, Insightful)

            by BasilBrush ( 643681 )
            The 10 secretaries in question were only using 1 GB each per year. 10GB per year in total. If your company is as large as you imply, the amount of work hours involved in sorting through old emails will be larger than that. Each person (or their PA) would need to do their own. That's a lot of hours.
            • Re:Simple... (not) (Score:3, Interesting)

              by Zen ( 8377 )
              Right. Our company originally tried to instate size limits when we went to Notes (only 3 years ago), but then the lawyers said we need to keep everything anyway (HIPAA requirements). So even with the exorbitant expense of the system, it is probably still cheaper to keep expanding every couple months rather than pay people to sit there and sort through their own email. Anything from an external party must be kept, and anything remotely regarding a customer must be kept as well. It's a huge pain, and they
      • Re:Simple... (Score:3, Insightful)

        by Doppler00 ( 534739 )
        How to save 90% of disk space:

        Sort all users' e-mail received by size for a given year.

        Delete 5% of the largest e-mails. These will probably account for around 90% of all disk usage. They probably represent file attachments which should have been stored on a server instead of in an e-mail account anyway.

        Just think, when you mail a 2MB attachment to 3,000 people in a division, that could use quite a bit of disk space.
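A minimal Python sketch of the size-sorting idea above. The function name `largest_share` and the sample mailbox are invented for illustration; whether the largest 5% of messages really hold around 90% of the space depends entirely on the real mail store.

```python
def largest_share(sizes, percent=5):
    """Return the fraction of total bytes held by the largest `percent` of messages."""
    if not sizes:
        return 0.0
    ordered = sorted(sizes, reverse=True)
    # Integer arithmetic avoids rounding surprises; always look at one message at least.
    count = max(1, len(ordered) * percent // 100)
    return sum(ordered[:count]) / sum(ordered)


# Hypothetical mailbox: 95 plain-text mails of 10 KB and 5 mails carrying 2 MB attachments.
sample = [10_000] * 95 + [2_000_000] * 5
print(f"largest 5% of messages hold {largest_share(sample):.1%} of the space")
```

Running the same calculation over real per-message sizes would show quickly whether the 5%/90% guess holds before anything is deleted.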
    • by mikael ( 484 ) on Sunday June 06, 2004 @06:50PM (#9352959)
      When I heard my city were outsourcing their garbage collection services, I imagined office blocks of staff in India sifting through online hex editors looking for spare memory blocks to delete.
  • Beowulf cluster (Score:2, Insightful)

    by b0lt ( 729408 )
    This might be a practical use of one: determine which emails are valid and which aren't, like a spam filter. Allow users to flag 25% or so of their emails as important, and archive those.
  • by Animats ( 122034 ) on Sunday June 06, 2004 @02:31PM (#9351605) Homepage
    Either Google or the Internet Archive would be happy to archive that data for the City of Baltimore and keep it available for public reference.
    • And I'm sure they'd love to offer it up to the general public, as well. The question comes -- should all of it be public? I'm guessing that there are bits of it which shouldn't be, and it'd be more costly in the long run to try to analyze it and determine what would have to remain confidential than to just store it all in the first place.

      I'd prefer that people who are familiar with the actual data being stored make the determination if it should be publicly available.
  • by Dave419 ( 776133 ) on Sunday June 06, 2004 @02:32PM (#9351612)
    Since they need to delete tons of old messages spam included, but want to save official email, why don't they train a Bayesian Filter to sort through and save as much as possible. Since they can't rely on their employees actually saving each message which was official to their hard drives.
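For readers unfamiliar with the technique, here is a rough Python sketch of a word-count naive Bayes classifier of the kind such a filter is built on. The class name, the "keep"/"junk" labels and the four training sentences are all made up; a real deployment would need far more training data and per-department tuning, and it assumes at least one training example per label.

```python
import math
from collections import Counter


class NaiveBayes:
    """Tiny two-class word-count classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"keep": Counter(), "junk": Counter()}
        self.doc_counts = Counter()

    def train(self, label, text):
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def log_score(self, label, text):
        counts = self.word_counts[label]
        vocab = set(self.word_counts["keep"]) | set(self.word_counts["junk"])
        total = sum(counts.values())
        # Start from the prior, then add one smoothed log-likelihood per word.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / (total + len(vocab)))
        return score

    def classify(self, text):
        # Ties go to "keep": wrongly deleting a record is the costlier mistake.
        keep = self.log_score("keep", text)
        junk = self.log_score("junk", text)
        return "junk" if junk > keep else "keep"


nb = NaiveBayes()
nb.train("keep", "council meeting agenda budget")
nb.train("keep", "permit application zoning hearing")
nb.train("junk", "cheap pills buy now")
nb.train("junk", "win money now click")
print(nb.classify("zoning permit for council hearing"))
print(nb.classify("buy cheap pills now"))
```

Because the classifier only ever produces a best guess, some official mail will still be scored as junk, which is the false-positive worry raised in the replies.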
    • by Kenja ( 541830 ) on Sunday June 06, 2004 @02:42PM (#9351667)
      because even one false positive can get them in trouble?
      • You could have a manual meta-check on all the positives to make sure they aren't anything vital. Computers don't make mistakes, they just don't think like we do so sometimes it's necessary to sort through it all.
        • Won't work. The point of purging the 90+ day old messages is so that no one has to meta-check them for importance. Unless you want to hire a cadre of Trained Monkeys to look at the positives. It'd be a 1-banana job, and would have to pay bargain-basement peanuts.
      • "because even one false positive can get them in trouble?"

        You should probably go take a class on probability. When you're dealing with millions of email, there are going to be some false positives.
        What's the alternative, hand sort them?
        Yeah, that's a good idea right? But with bayesian filtering, you can do a lot of refining when you're dealing with millions of email.
        And who says that you need to use the same filters for the health dept and the transport dept.

        Jesus christ, there are lots of companies that

    • Since they need to delete tons of old messages spam included, but want to save official email, why don't they train a Bayesian Filter to sort through and save as much as possible

      Mostly, corporations do not want competent customer support (unless it's a big client).

      Why?

      - Truly good customer support costs money and requires that the CS people know what they're talking about. That costs money.
      - Corporations want script-trained "shooters" that can deflect responsibility.
      - Corporations think that they
    • by Gonoff ( 88518 ) on Sunday June 06, 2004 @04:30PM (#9352230)
      We have to spend a lot of time telling people to **NOT** save to local drives. If it is important or confidential, or may be in the future, this should not be saved locally unless you want to lose it or explain to an enquiry why it was found on sale in a car boot sale after a break in. This is what a network is for.

      The answer to the problem in the article is quotas. *nix has them, Novell has them and even Windows has them. Our email quota works as follows:
      Limit 1 - email user once per day marked high importance that they are getting close.
      Limit 2 - disable sending and continue with (2k) warning message.
      Limit 3 - disable receiving apart from one final message saying that it would all start working again when the user clears some space

      When they can't send, they get a dialogue box reminding them when they try; when they can't receive, the sender gets a message.
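The three limits described above can be sketched in Python roughly as follows. The class name, state names and byte thresholds are placeholders, since the post gives no actual numbers, and the assumption that sending stays disabled once receiving is cut off is mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaPolicy:
    # Threshold values are placeholders; the post does not give real numbers.
    warn_bytes: int
    no_send_bytes: int
    no_receive_bytes: int

    def state(self, used_bytes: int) -> str:
        """Map mailbox usage to the strictest limit it has reached."""
        if used_bytes >= self.no_receive_bytes:
            return "receive_disabled"  # Limit 3: receiving off (sending assumed to stay off)
        if used_bytes >= self.no_send_bytes:
            return "send_disabled"     # Limit 2: sending off, warnings continue
        if used_bytes >= self.warn_bytes:
            return "warn_daily"        # Limit 1: one high-importance warning per day
        return "ok"


policy = QuotaPolicy(warn_bytes=80_000_000,
                     no_send_bytes=100_000_000,
                     no_receive_bytes=120_000_000)
for used in (10_000_000, 85_000_000, 105_000_000, 130_000_000):
    print(used, policy.state(used))
```

Checking the strictest limit first keeps the tiers from overlapping: a mailbox past the third limit never falls back to a mere warning.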

      This does make for support calls like...

      "Why does my computer tell me that the email is full up and I can't send any more?"
      "Because your email is full up. You have a message explaining this to you."

      "X tried to send me an email and it bounced saying that my mailbox was full up. Why?"
      "Because your mailbox is full up."

      • by Grell ( 9450 ) on Sunday June 06, 2004 @06:03PM (#9352725) Homepage
        3 warnings?

        Must be nice.

        I field 250-320 emails a week.

        All replies are "reply with history" often with screenshots as company policy and due to the complexity of the job. (3rd level insurance support w/ story problems galore)

        I have a personal storage quota of 75 megs,
        the mailbox I save to has a personal storage quota
        of 75 megs. (personal space? about 7% and holding, corporate box? 90-100% at all times)

        They cannot share, or transfer any storage quota from one user or resource to another.

        They will not buy *any* new drive space.

        They will not examine *any* redistribution of present drive shares. (like oh I dunno, *USAGE*)

        And the first warning we get that the drive is
        filling (about 3-4X a week) is the cannot write to
        drive warning.

        We delete somewhere in the neighborhood of 90 megs a week of potentially subpoenable documentation and there are no plans in place, or even spoken of, to correct this. (don't ask, don't tell)

        Save it to a local drive? No that would violate security protocols.

        Grell
    • by Raven42rac ( 448205 ) * on Sunday June 06, 2004 @04:31PM (#9352235)
      Five points for excellent use of buzzwords. I would say compress messages older than 90 days and save them. The government is not supposed to just willy nilly throw things away. I would invest in more hard drive space to hedge against lawsuits.
  • Why not... (Score:5, Insightful)

    by Izago909 ( 637084 ) <.moc.liamg. .ta. .dogsiuat.> on Sunday June 06, 2004 @02:34PM (#9351618)
    figure out what percentage is spam, and sue spammers to recover damages for lost resources.
  • by Anonymous Coward on Sunday June 06, 2004 @02:34PM (#9351619)
    can't they, like, just buy a big hard drive and stuff?

    If the average message is 10 KB (10,000 bytes, to make the math easier), and compresses down to 3 KB (probably even better if you compress a bunch together), then you'd need roughly 30 GB to store 10 million of them. Can you even buy hard drives that small any more?
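The arithmetic in that estimate, written out as a quick Python check. The constants are the poster's round numbers, not measurements, and decimal gigabytes (10^9 bytes) are assumed.

```python
MESSAGES = 10_000_000        # 10 million messages, as in the post
AVG_BYTES = 10_000           # the post's rounded average message size
COMPRESSED_BYTES = 3_000     # the post's assumed size after compression

raw_gb = MESSAGES * AVG_BYTES / 1e9
compressed_gb = MESSAGES * COMPRESSED_BYTES / 1e9
print(f"raw: {raw_gb:.0f} GB, compressed: {compressed_gb:.0f} GB")
# prints "raw: 100 GB, compressed: 30 GB"
```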

    Add some search index, throw a crappy web interface on it, and call it a day. Never delete an email again!
    • The math is more like this.

      With gzip -9, you get either 20% or 10% of the original size of a text document (this is what the Internet Archive used for its archive of HTML when I was there).

      An index into a text collection is on the order of 10% - 100% of the size of the original collection, depending on what features you want to offer at speed. 10-50% is a reasonable size.

      So for 10M messages at 10k each, assuming the compression ratio above (which might not hold for MS Word attachments - a big caveat) you
    • by ThisIsFred ( 705426 ) on Sunday June 06, 2004 @02:54PM (#9351746) Journal
      can't they, like, just buy a big hard drive and stuff?
      Here's the problem with this: The longer the stuff is retained, the more expensive it gets to hold on to it. IT is usually a very low budget priority for government agencies, so it's going to be hard to purchase high-reliability mass storage devices every couple of years. Since the goal is permanent archival, cheap, high-cap ATAPI fixed disks are going to be the last thing you want to store the stuff on. The other issue is that the user of the mailbox has complete control over the contents, so retaining everything is going to be really difficult to do, and accidental deletion will be a very credible alibi.

      There are rumblings about FOI and permanent archival among my Governmental Overlords, so I'm thinking hard about potential solutions to the problem. Trust me, it's a very complicated issue, more so than I care to illustrate here (especially considering my habit of rambling on).

      The simplest solution is responsibility. If it's official policy, it's on dead-trees and filed away.
      • what about cds as archival media?
        something about breaking down, but is that real?

        then there's dvds and magneto-optical (my personal favourite)
    • The problem is, the average e-mail is not necessarily 10kb. While HTML can be part of the problem by making e-mails several times bigger than need be, my experience is that large attachments are generally the biggest culprits. A 20Mb powerpoint presentation sent by pointy-haired manager to all his minions can easily swamp the system. And trust me, there are plenty of clueless managers out there sending out Very Large Attachments. I've received 50Mb Excel spreadsheet once, which contained nothing but a singl
    • They're estimating that it would cost $250K/year in management and hardware to expand their system. Assuming that half of that's hiring a good sysadmin (:-), the only way I can see them spending that much money just for expansion is that they must be running some clumsy proprietary mail system - probably MS Exchange, or possibly some antique from IBM (worst case = PROFS.) Exchange has the advantage that it encourages bloatmail - sending attached Word documents instead of simply writing text, or sending Po
  • by Richard_L_James ( 714854 ) on Sunday June 06, 2004 @02:35PM (#9351629)
    You wouldn't expect a public office to hang onto every piece of paper, so why should they be expected to hang onto every email they have ever received?

    There are always going to be things like replies to an original question and subsequent follow up questions going back and forth, so normally hanging onto the latest/final reply would be sufficient (providing it had the previous history - clearly showed the conclusion).

    Now if they were to use this as an excuse to accidentally lose records that would be a different matter. This however is where auditors should be playing a role to ensure that they are keeping the right records and discarding the rubbish.

    • There are always going to be things like replies to an original question and subsequent follow up questions going back and forth, so normally hanging onto the latest/final reply would be sufficient (providing it had the previous history - clearly showed the conclusion).

      I'm old-school when it comes to email (probably because I've been a BBS sysop who had to worry about bandwidth consumption), but you've touched on one of the two big problems with most corporate email cultures:

      • Top-posting a reply, whil
      • Any HTML at all is unacceptable in an email. And you forgot the most egregious error of all, Word attachments.
    • Try 30 days (Score:2, Informative)

      by Anonymous Coward
      Though I don't work in the auditors office in my state, here is what they implemented. Any document (digital or not) over 30 days must be made public. Solution, any e-mail over 30 days is deleted. It allows them to not worry about keeping all e-mail till the end-of-time and not worry about making e-mail public. Great solution in that scenario.
  • incremental backup (Score:5, Insightful)

    by Anonymous Coward on Sunday June 06, 2004 @02:36PM (#9351632)
    "Baltimore officials, who approved the new e-mail policy at a Board of Estimates meeting last month, say they have no choice but to delete old messages, which are slowing city computers to a crawl. They say the system is so overburdened that creating a daily backup has become impossible; there is so much data that it takes more than 24 hours to copy it."

    What?!? What's wrong with an incremental backup? Surely all those millions of messages aren't *changing* every day?!?

    Think of all the children that will suffer from this!!!
    • by Piquan ( 49943 ) on Sunday June 06, 2004 @04:10PM (#9352120)

      What?!? What's wrong with an incremental backup? Surely all those millions of messages aren't *changing* every day?!?

      That depends on how their email system works. If it stores each user in a single file, then that file is changing every day. If they're using a file-based backup system...
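A small Python sketch of why the storage layout matters to a file-based incremental backup. The function name, the fake timestamps and the mailbox sizes are all invented for illustration.

```python
def bytes_to_copy(files, last_backup_time):
    """Total size of files modified since the last backup.

    `files` is an iterable of (size_in_bytes, modified_time) pairs.
    """
    return sum(size for size, modified in files if modified > last_backup_time)


LAST_BACKUP = 100   # arbitrary clock ticks, for illustration only
TODAY = 200

# One 500 MB file per user: any new mail touches the whole file.
single_file_store = [(500_000_000, TODAY)]

# One file per message: 50,000 messages of 10 KB, 100 of them new today.
per_message_store = [(10_000, 50)] * 49_900 + [(10_000, TODAY)] * 100

print(bytes_to_copy(single_file_store, LAST_BACKUP))
print(bytes_to_copy(per_message_store, LAST_BACKUP))
```

With the same 500 MB of mail in both layouts, the single-file store is copied in full every night while the per-message store only contributes the day's new files.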

        • Sure, but saving the user's copies of the e-mail is friggen retarded. It is much easier to store the incoming and outgoing streams of e-mail on the server and use logrotate to create a new file every week, or every day even. As soon as logrotate moves a file to backup, it won't change anymore - ever.
        With Postfix, use always_bcc to forward all outgoing mail to a user called outlog, then use procmail to save all outgoing mail to a log file.
        Likewise, procmail can save all incoming mail - after the crap filte
    • Just print the damn things out and file them. Anyone who wants to subpoena them had better have a fleet of trucks and hundreds of spare staff...
  • by buelba ( 701300 ) on Sunday June 06, 2004 @02:37PM (#9351642)
    There are two technical culprits here:

    1. On-line storage. There's no reason to keep all of everyone's mail on-line on the server (a la IMAP or proprietary MS Exchange) instead of offline on their PCs (a la POP, most often seen with Eudora for non-techies). With offline storage, the servers don't clog, and you can keep as much mail as you like.

    The biggest rap against off-line storage is that you can't control what people do with their mail or how they store it. My old job had a neat solution for this: Eudora downloaded your mail, but stored it on a file server. Each employee had 100 GB or something very large. It worked great; the SMTP/POP servers were never full, and everyone could keep their email.

    2. Ridiculous stupid bullshit HTML rich-text mail crap. Can you tell I have a bias here? Aside from being annoying, HTML mail can take up to ten times the size of plain old text. Some of the HTML generated by common email programs is just terrible; filled with repeating tags for every line, and just wasting an incredible amount of space for absolutely zero benefit. (Outlook is bad, but there are others that are just as bad.)

    There's no excuse for not fixing these problems. Someday someone's going to tell a court they had to delete mail for these reasons, and someone else is going to explain exactly why they're wrong. Until then, people who want to delete mail for legal reasons will hide behind false technical reasons.
    • by Anonymous Coward on Sunday June 06, 2004 @02:48PM (#9351708)
      1. On-line storage.
      Actually, storing the messages on local computers in an organization is about the worst thing to do. Most/all user computers are not backed up the way the servers are.

      For legal requirements for some organizations, various backups must be maintained. Just because the active mailstore does not maintain messages older than X days in it does not mean that the data is lost forever (and thus it remains subpoenable).

      To do this right, first, the City needs to create a policy that establishes that active e-mail messages will not be retained in the "inboxes" more than 30 days. They should also set up mailstores for everyone in a different area on the same or different server (but NOT on user PCs. They need to define a policy against this, also, because user computers can be subpoenaed, so if a user has been retaining e-mail messages on their own computer, this could undermine the overriding policy, aka "Smoking Gun").

      HTML/Rich-text e-mail messages
      No argument there!

      It is LEGAL to not retain e-mail messages past a reasonable amount of time as long as there is an organization-wide POLICY in place and reasonably applied over the entire organization, but the policy has to be in place first.

      There is lots of information on the net about this already. I would maybe google for "email retention policy"...
    • Eudora downloaded your mail, but stored it on a file server. Each employee had 100 GB or something very large. It worked great; the SMTP/POP servers were never full, and everyone could keep their email.

      Why couldn't you have a 100GB account for each employee on the mail server instead? - what is the bloody point of getting mail from one network server and moving it to another one?
      • Why couldn't you have a 100GB account for each employee on the mail server instead? - what is the bloody point to get mail from one network server and move it to another one?

        Lots of reasons. First, mail servers just don't work very well when storing large quantities of mail for large quantities of people. I've never seen one that works well. If I'm wrong, please tell me. Second, the file server model is much more flexible: you can spread the accounts out across lots of file servers, but still have o

        • by imroy ( 755 )

          Check out project Cyrus [cmu.edu]. I haven't used it for large projects, but I notice it does support distributing mailboxes across multiple backend servers (The Murder stuff).

        • by vladj ( 716394 )
          Try CommunigatePro [stalker.com]: it's not open source but extremely reliable and flexible and can handle huge volumes of mail, and can do clustering over multiple servers too. I've had experience with it on a couple of large-scale installations, 50K accounts+ with millions of msgs per day.
  • wrong approach (Score:5, Insightful)

    by yppiz ( 574466 ) on Sunday June 06, 2004 @02:38PM (#9351647) Homepage
    This has to be the stupidest approach to the problem. Their networks are too slow, so instead, they're going to have each employee go through their old email and save individually important messages to their local hard disk? Not only are they going to tie up employees with this manual effort, they're also going to lose key documents and a key service - the ability to centrally search and reply to requests for information. In the future, each department will have to search their local hard drives for this information.

    They've taken a simple problem of old or improperly specced equipment and turned it into a manual labor solution instead. That's an insane waste of time and salary. They should just upgrade their network and storage. If I can build a 4 terabyte RAIDed PC for a few thousand dollars, they can centralize their mailserver and back it up for say a hundred thousand, even with extra redundancy and inefficiencies and admin costs.

    By contrast, forcing every current employee to perform a task that would eat up weeks of time per employee per year, in a city of Baltimore's size, will cost tens of millions of dollars.

    Dumb, dumb, dumb.

    --Pat / zippy@cs.brandeis.edu
    • The problem is that using MBOX format for mails, it -really- bogs down the mail server when it has to parse that whole thing.

      Moving older stuff into folders (that are still on the server) would probably make more sense.
  • Seems like a perfect application of Google's mail [gmail.com] technology. Baltimore has tons of mail which needs to be searchable. Google has a scheme for holding and searching large quantities of mail. Plus, even the spam is worth keeping, should Baltimore decide to file suit against someone for attacking the City's technological infrastructure by flooding their servers with spam. Scott Richter, I'm looking in your direction.

  • Baltimore should offer some municipal service, say free waste disposal for a year in Baltimore, per gmail swap [gmailswap.com]. Problem solved.
  • Temporary Fix (Score:5, Insightful)

    by Rie Beam ( 632299 ) on Sunday June 06, 2004 @02:41PM (#9351664) Journal
    Back up all e-mails from the last 4½ years into permanent storage, and then from there, get organized. Put spam filters on, force people to sort any important mail or else it gets deleted after, say, two weeks. People always seem to want to "start from scratch" without looking at the situation rationally. Five years of documents, gone overnight. How can anyone not be at least outraged by that?
  • by Anonymous Coward
    I'm posting anonymously because this may risk my relationship with my employer.

    We see old e-mails as a resource to be harnessed and turned into profit. Thanks to old e-mails we can ensure that no employee leaves with a spotless record since everyone always e-mails something incriminating sooner or later from the company e-mail address.

    We also find that the e-mails are great for data repositories; we fill all of our databases with text and when our clients come in, we tell them that those data warehouses
  • by jafo ( 11982 ) on Sunday June 06, 2004 @02:46PM (#9351700) Homepage
    The spam problem is unlikely to go away until people start treating it like the attack on the Internet that it is.

    I've noticed an annoying trend lately that e-mail sent to businesses is frequently getting just ignored. Certainly it seems much more frequent this year than in the past. I've wondered if this is simply because so many e-mail boxes are getting filled up as fast as the spammers can send.

    I'd suspect that the city of Baltimore wouldn't be having any problems if spam weren't such a problem. If the number of messages they had to deal with dropped by 5 to 20 times (depending on which estimates of current spam levels you believe), they could probably just leave the mail where it is.

    This is all something I've been struggling with, being a small business owner doing business on the net. My company of 5 people gets between 4,000 and 20,000 borderline spams per day. By borderline, I mean that we throw away obvious viruses and things which score above a certain score in SpamAssassin (I think it's 9). So, that doesn't count the super spammy messages.

    If it weren't for our fairly strict and complicated spam blocker setup, and a very powerful machine, we couldn't get the few hundred messages per day that are of interest to us. Spam is killing e-mail. I'm not sure why more people aren't treating it as an attack, but it's really hard to get anyone's interest to take some action. Canceling accounts doesn't even begin to solve the problem.

    In the mean time, the City of Baltimore is suffering...

    Sean
  • by DAQ42 ( 210845 ) on Sunday June 06, 2004 @02:48PM (#9351709)
    to dump it off to tape and then just store the tapes instead of just deleting it. Though they are probably running an Exchange server so offloading data stores wouldn't be the easiest thing to do. If they were using something with a simple mbox store, they could easily just parse it through a date filter and dump the older than 90 day stuff to tape. At least then it could be retrieved at a later date.

    Oh, wait, let me guess, they aren't using tape backups...
    • to dump it off to tape and then just store the tapes instead of just deleting it. Though they are probably running an Exchange server so offloading data stores wouldn't be the easiest thing to do. If they were using something with a simple mbox store, they could easily just parse it through a date filter and dump the older than 90 day stuff to tape.

      It's a lot tougher on backup systems to deal with mbox systems, because every time a flag is changed in a mail or a mail is added, the entire mailbox is ad

      • Re-read my comment.
        If they were using something with a simple mbox store, they could easily just parse it through a date filter and dump the older than 90 day stuff to tape.
        If the creation date is older than 90 days, off load to tape and delete original. End of story.
        Sure, means they can't go through stuff older than 90 days, but if they need it, restore from tape. Geez.
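As a rough illustration of the date filter being described, here is a Python sketch using the standard library's `mailbox` module. The function name, the file paths and the choice to treat dates without a timezone as UTC are assumptions; it filters on the `Date` header rather than a file creation date, and a live mail store would also need locking around this, which is left out to keep the sketch short.

```python
import mailbox
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def split_old_messages(src_path, archive_path, max_age_days=90, now=None):
    """Move messages older than max_age_days from one mbox file to another.

    Returns the number of messages moved. Messages with a missing or
    unparseable Date header are left where they are.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    src = mailbox.mbox(src_path)
    archive = mailbox.mbox(archive_path)
    moved = 0
    try:
        for key in list(src.keys()):
            message = src[key]
            try:
                sent = parsedate_to_datetime(message["Date"])
            except (TypeError, ValueError):
                continue
            if sent.tzinfo is None:
                sent = sent.replace(tzinfo=timezone.utc)  # assumption: treat as UTC
            if sent < cutoff:
                archive.add(message)
                src.remove(key)
                moved += 1
        # Write the archive first so a failure cannot lose mail.
        archive.flush()
        src.flush()
    finally:
        archive.close()
        src.close()
    return moved
```

The resulting archive file is what would go to tape; restoring a message means reading it back out of that mbox with the same module.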
  • Archive the whole lot of it, and/or compress it and store it. Don't even try to sift through it all. If and when it is needed, then get it out and pay somebody to sort through it.
    Then it's not clogging anything anymore, and also it's there if you ever need it.
  • by swb ( 14022 ) on Sunday June 06, 2004 @03:02PM (#9351785)
    OMFG, we nearly had a lynch mob attack us when we began deleting mail older than *two years* -- it eventually took the intervention of the CFO and a faked mail system "crash" to make 2-year max retention work, and even then there are people still pissed about it, or who claim that "the client" requires them to retain all correspondence (nope, sorry, we checked the contract).

    90 days seems both unrealistic to implement and way too much reliance on .PST files, which often max out at 2 gig and can get corrupted way too easily, not to mention being fdisked into eternity by clueless helpdesk people.

  • I question whether the article author understood 'take offline' and 'delete' as the same thing, though they are so very different.

    Data is valuable, and sysadmins know it. (It has value when combating a lawsuit, as the poster suggests, or for trend analysis, contact information, or other historical purposes.)

    That said, hard drive space is inexpensive and archiving to optical media is even LESS expensive. When 47 GB of DVD media can be had at Target for less than $10, it makes NO sense to destroy this data.
  • So, what's new? The community I live in is famous for losing hard copies of just about anything you can imagine. I'm not sure whether it has really been lost or whether it was decided that it was better to be without certain possibly troublesome papers. So now they're intentionally losing stuff in order to avoid being drowned in cruft. It just goes to remind that whenever you deal with authority, you should keep hard copies of your correspondence :)
  • by Anonymous Coward
    Where I work, which is government, we only keep backups for about a week, since they are public records.

    We don't want someone to be able to request something from backups that the user thinks is gone.

    This way it's up to the user to decide if they want their data archived. And the onus is on the user to comply with however long the data is supposed to be kept before being destroyed.

  • Problem with email (Score:3, Insightful)

    by sydbarrett74 ( 74307 ) <<sydbarrett74> <at> <gmail.com>> on Sunday June 06, 2004 @03:14PM (#9351847)
    This highlights a fundamental problem with email -- many people pass documents as attachments, or in the body of the email, instead of using email as a sort of metadata describing their works in progress. Documents shouldn't be passed around in email; they should be stored on a network share, where proper controls for mutual exclusion and such can be employed.
    • I back you 100%. I cannot seem to drill into the heads of my customers that they should not be sending files by email. We have tried everything: we have given secure drop boxes to deliver files to people, we have put up a public access scratch drive (w/ 7-day auto delete) to dump jokes and such, we have posted step by step instructions to make links to files in an email. We limit live mail boxes to 100 MB w/ a nag screen, and at 200 MB send privileges are taken away. We limit the attachment size to 30 MB external
  • I am not sure if they can use email as official communication? There would be problems with repudiation ("we never received it"), privacy ("someone intercepted it who was not supposed to") and authentication ("it wasn't me who sent it, it was my dog"). Can they use an email in court then? What would have to be done is to have all the messages signed and encrypted with a public key, and perhaps have some way for the sender to get a receipt back when the receiver reads the message.
  • According to Georgetown University [georgetown.edu] that I found regarding retention of records in Maryland:

    "8. Has any public records legislation/administrative regulation been proposed calling for "permanent public access" to electronic public records? _x__ Yes ___ No a. If "Yes," cite to and briefly discuss the legislation/proposed regulation; what was the outcome? Arguably, Maryland has such a provision in MD. REGS. CODE tit 14.18.04. Certain electronic records may be considered "permanent electronic records" in they
  • by Ohm2k ( 262274 ) on Sunday June 06, 2004 @03:23PM (#9351883)
    Working at a law firm, we have to keep everything for 7 years. We have a system in place that takes all mail over 90 days old, pulls it out of Exchange, and moves it to the SAN. As a plus, it puts a link back into the information store to make it look like the message is still there. If a user wants an old message, he can still get it himself without an IT person having to dig up a tape, restore the file, and then e-mail it to him (thus creating MORE mail). The messages are still searchable, and it makes retrieval a snap when needed.

    Mind you, we are only a 700 user shop. But nothing gets deleted. If it gets by the spam filter it gets saved.
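
    The archive-and-leave-a-link approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory dictionaries standing in for the mail store and the SAN, and the names `archive_old_messages` and `read_message`, are hypothetical and are not how Exchange or any particular archiving product actually works.

```python
from datetime import datetime, timedelta

def archive_old_messages(mailbox, archive, now, max_age_days=90):
    """Move bodies of messages older than max_age_days into `archive`,
    leaving a small stub (the "link") behind in `mailbox`.
    Both stores are plain dicts here, purely for illustration."""
    cutoff = now - timedelta(days=max_age_days)
    moved = 0
    for msg_id, msg in list(mailbox.items()):
        if msg.get("stub") or msg["received"] >= cutoff:
            continue  # already archived, or still recent enough to stay live
        archive[msg_id] = msg["body"]
        mailbox[msg_id] = {
            "received": msg["received"],
            "subject": msg["subject"],
            "stub": True,
            "archive_key": msg_id,  # pointer back into the archive
        }
        moved += 1
    return moved

def read_message(mailbox, archive, msg_id):
    """Return the body, following the stub into the archive when needed."""
    msg = mailbox[msg_id]
    if msg.get("stub"):
        return archive[msg["archive_key"]]
    return msg["body"]
```

    The point of the stub is that the user-facing lookup stays the same whether the body is live or archived, so nobody has to ask IT for a restore.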
  • by Flounder ( 42112 ) * on Sunday June 06, 2004 @03:23PM (#9351885)
    I work in the IT department of a county close to Baltimore. Our server can retain e-mail indefinitely (there is a space limit per mailbox, but not a time limit). However, our backups only go back 30 days. This is stipulated by the county lawyers.

    As far as I've been able to figure out, this arose from a lawsuit against the county where an e-mail retrieved from two years previous proved a county commissioner to be taking bribes in a zoning issue.

    Rather than fix the corruption, just ensure that it's covered up more efficiently. Gotta love local governments.

  • by jhines ( 82154 ) <john@jhines.org> on Sunday June 06, 2004 @03:25PM (#9351897) Homepage
    Once an actual human person has read and acted on the mail, they should be able to mark it "official business" and/or move the email into an "official business" folder which does get kept as required.

    Better procedures and training go a long way here. These same folks have no problems with snail mail.
    • You mention a good point. Email messages have different purposes. Some are chit-chat, some worthless jokes, some spam, some messages external to the organization, some internal, some from your boss, some from your employees, etc. But it's all the same in the inbox, governed by the same disk quota and archival policies. Perhaps the file/folder dynamic for email is a little lacking (I think of IBM's Remail [ibm.com], though I'm not sure if this really hits the problem). We just recently got "junk" as a new categor
  • The way to deal with spam in a situation like this, where you may be legally required by state record keeping laws to archive records, including email, for a long time, is to only accept encrypted mail from the public.

    Actually, it doesn't have to be encrypted--any hoop that you can make people jump through to mail you is fine, as long as it isn't something that spammers will be able to automate. For example, you could also use a randomly generated email address that changes frequently, and provide a website w

    • How much of the public, though, has the slightest clue about encryption?

      I once tried using X509 to everyone, but Outlook express just refuses to display the message and puts up a huge warning about a corrupt email (all other mailers handled it fine - OE just doesn't support X509 correctly), so I'd just get an email back that said 'your mail was corrupted and I couldn't read it'.

      PGP is worse. It isn't supported by *any* widely used mailer (installing an extra 'plugin' does not count - most of the pe
    • I think the last thing the government wants to do is encourage MORE of its citizenry to start communicating via encrypted mechanisms. The police would never allow such a policy to be enacted; it would severely limit their ability to "gather information to prevent terrorist and criminal activity" or some such bullshit.

  • by phr1 ( 211689 ) on Sunday June 06, 2004 @03:49PM (#9352026)
    Just dump the old email to DVD-R and archive it somewhere. If someone wants to subpoena it, burn off copies and wish 'em luck. Even if the city is getting a million pieces of spam a day, at 5 KB each after data compression, that's just about one DVD-R per day at a buck or so each, peanuts compared to what the city already must spend xeroxing memos for records retention purposes.
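
    The arithmetic in the comment above can be checked with a few lines of Python. This is a back-of-the-envelope sketch: the 4.7 GB figure is the nominal single-layer DVD-R capacity, 5 KB is taken as 5,000 bytes, and the function names are made up for illustration.

```python
import math

DVD_R_BYTES = 4_700_000_000  # nominal single-layer DVD-R capacity (assumption)

def discs_needed(messages_per_day, bytes_per_message):
    """Fractional number of DVD-Rs one day of mail would fill."""
    return messages_per_day * bytes_per_message / DVD_R_BYTES

def discs_per_year(messages_per_day, bytes_per_message):
    """Whole discs needed for a year, assuming discs are filled completely."""
    return math.ceil(discs_needed(messages_per_day, bytes_per_message) * 365)

daily = discs_needed(1_000_000, 5_000)   # a million messages at 5 KB each
yearly = discs_per_year(1_000_000, 5_000)
print(f"{daily:.2f} discs per day, {yearly} discs per year")
```

    Under those assumptions a million 5 KB messages is about 5 GB, so slightly more than one disc a day and a few hundred discs a year.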
  • by crimoid ( 27373 ) on Sunday June 06, 2004 @03:54PM (#9352053)

    A better option would be to archive old messages rather than remove them entirely. From the article it sounds like they are keeping ALL messages active all the time. For example:

    "They say the system is so overburdened that creating a daily backup has become impossible; there is so much data that it takes more than 24 hours to copy it."

    So, it seems like the solution would be to periodically lop off old messages to offline storage (tape, spare drives, whatever). In the event of a lawsuit the old messages could be reasonably recovered and the cost for such a system would be extremely minimal.

  • by EconomyGuy ( 179008 ) on Sunday June 06, 2004 @04:01PM (#9352087) Homepage
    Unlike a legal office where communications are governed by extensive regulation, governments are really only required to keep records of official documents and decisions. The myriad of e-mails leading up to a decision are not generally protected under such an act, nor are snail mail or phone conversations. In fact, the whole idea of there being a digital trail to follow for governmental decision making is really very new. Does it make sense to change that practice? Do we really think our government officials should be so closely watched that EVERY e-mail/phone conversation/smoke signal should be recorded and exposed to public scrutiny? Talk about making an unattractive job even less enticing.

    In response to the poster's question about all those subpoenas: welcome to the world of civil litigation, where the first one to destroy the evidence wins!
    • Yes. When a government official is supposed to be acting in the best interest of the people they should be subject to scrutiny at any level that is reasonably available.
      Storing older emails is a rather trivial issue of collecting, compressing and copying to an inexpensive tape or hard drive which can be archived. A 250GB IDE drive is quite inexpensive and could probably archive several hundred million emails, many more than the city is claiming it will delete.

      In a time when the government is fading furthe
      • I think spending a few years in government might get you to sing a slightly different tune on the question of scrutiny. In my experience most government employees are really trying their best to do what they can with limited budgets. When we start to place bureaucrats under heavy inspection (different from elected officials, who have a whole other set of methods for evaluation) we end up having employees who are more worried about how they appear than the actual quality of the work performed.

        There is a c
  • They say the system is so overburdened that creating a daily backup has become impossible; there is so much data that it takes more than 24 hours to copy it

    I find that rather hard to believe. They only need to back up the new emails, then they can delete them at any time without actually losing them. I doubt they see many terabytes of new email every day. Nine times out of ten, any IT tech who says something is "impossible" is just lazy and/or incompetent.
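
    A minimal sketch of the "only back up the new emails" idea, assuming (purely for illustration) that each message is a file on disk and that file modification time is a good enough signal of what is new. Real mail servers usually keep mail in database files, so this is not how any particular product does it; `incremental_backup` is a hypothetical helper.

```python
import shutil
from pathlib import Path

def incremental_backup(mail_dir, backup_dir, last_backup_time):
    """Copy only files modified after last_backup_time (a Unix timestamp).
    Returns the relative paths that were copied, in sorted order."""
    mail_dir, backup_dir = Path(mail_dir), Path(backup_dir)
    copied = []
    for src in sorted(mail_dir.rglob("*")):
        if src.is_file() and src.stat().st_mtime > last_backup_time:
            rel = src.relative_to(mail_dir)
            dest = backup_dir / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # copy2 also preserves timestamps
            copied.append(rel.as_posix())
    return copied
```

    Each run then touches only the mail that arrived since the previous run, so the nightly job scales with new mail rather than with the size of the whole store.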
  • ILM is the next big thing. It's the logical extension to the ever-increasing SAN/NAS server/workstation exponentially-increasing-data problem (go google for pretenders to the law).

    You can't oversee growing data storage without a parallel increase in administration costs. Instead, the idea is to build automatic archiving into your storage architecture.

    In practice this means you build tiers of storage/archive methods. Tier 1 is a high-ticket Shark SAN etc., Tier 2 is lower-priced SATA RAID and Tier 3 is a DAS tape library. Build retention guidelines into the storage management platform (Tivoli etc.). Older items are automatically moved to the tier corresponding to that retention/access policy. Really old items "live" on tape. Frequently accessed data lives on the high-speed boxes near the users/application. You snapshot updates to a DR replica offsite or burn periodic tape sets etc. It's a good idea to team this with storage virtualization (virtual LUNs / metadata directory servers) so you can add/rotate/modify the storage tiers when necessary without any downtime.

    From a user perspective, you click on the link and, if applicable, get notified the item is being retrieved from media x (it's mostly transparent). Worst case - access times are in the minutes.

    Of course, all this comes with a high price. Enterprise storage systems are not cheap. Recent legislated policy (Sarbanes-Oxley etc.) enforces the retention of some media (e.g. email). You cannot rely on end users to enforce data retention. This lets you mandate tiers of protection and is highly configurable to support per-application monitoring.

    Nothing is foolproof. It's still being finessed, but if you can afford it - it's truly a thing of beauty.
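
    As a rough illustration of the tiering policy described in this comment, here is a sketch of an age-based placement rule. The comment names the tiers but not the cut-offs, so the 90-day and 365-day thresholds, the tier labels, and the `choose_tier` function are all hypothetical; real products such as Tivoli express this through their own policy configuration.

```python
from datetime import date, timedelta

# Hypothetical thresholds, ordered from fastest tier to slowest.
TIER_RULES = [
    (timedelta(days=90), "tier1_san"),         # recently used: high-speed SAN
    (timedelta(days=365), "tier2_sata_raid"),  # older: cheaper SATA RAID
]
FALLBACK_TIER = "tier3_tape"                   # really old items live on tape

def choose_tier(last_accessed, today):
    """Pick the storage tier for an item from how long ago it was last accessed."""
    age = today - last_accessed
    for max_age, tier in TIER_RULES:
        if age <= max_age:
            return tier
    return FALLBACK_TIER

print(choose_tier(date(2004, 6, 1), date(2004, 6, 6)))
```

    A background job would apply a rule like this on a schedule and migrate anything whose computed tier differs from where it currently sits.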
  • Screw the Lawyers (Score:4, Interesting)

    by Detritus ( 11846 ) on Sunday June 06, 2004 @06:14PM (#9352776) Homepage
    At one company that I worked for, they got the brilliant idea to delete all email older than 30 days. They also didn't want employees to make backups of their personal mailboxes. They intentionally wanted all traces of old email to disappear. While I'm sure that it made the lawyers happy, it caused a lot of grief for the people actually doing work for the customer. Many design decisions, bug reports and other important things were only documented in email messages. This is supposed to be the age of the paperless office, right? When you are involved in a multi-year project, you often need to refer to old messages. It also had the effect of making old policy memos disappear, whose existence had proved to be very inconvenient to management on several notable occasions.
    • Re:Screw the Lawyers (Score:3, Interesting)

      by geekoid ( 135745 )
      I worked at a company with that policy.
      Then one day, in a meeting with VPs, a manager tried to put me on the spot and use me as a scapegoat with some bald-faced lies.
      I'll never forget the look on his face when I produced hard copies of our email exchange...
      ahh, memories.
      I also got the VP to change the email policy.
