White House CIO Describes His 'Worst Day' Ever 333

Posted by Soulskill
from the need-a-bipartisan-effort-to-swap-out-a-machine dept.
dcblogs writes "In the first 40 days of President Barack Obama's administration, the White House email system was down 23% of the time, according to White House CIO Brook Colangelo, the person who also delivered the 'first presidential Blackberry.' The White House IT systems inherited by the new administration were in bad shape. Over 82% of the White House's technology had reached its end of life. Desktops, for instance, still had floppy disk drives, including the one Colangelo delivered to Rahm Emanuel, Obama's then-chief of staff and now Mayor of Chicago. There were no redundant email servers."

  • by gimmebeer (1648629) on Wednesday March 14, 2012 @12:21AM (#39348629)
    The problem is the procurement process. It takes a hell of a long time to get IT resources ordered, and often by the time they are actually put into service half of their warranty lifetime has expired. It has nothing to do with a lack of knowledge on the OMB IT front; it's got everything to do with the red tape they have to cut through to make anything happen.
  • Re:Floppy Drives! (Score:5, Interesting)

    by kenh (9056) on Wednesday March 14, 2012 @02:33AM (#39349433) Homepage Journal

    OK, perspective is called for - Obama took the White House in 2009, up until 2009 HP had floppy drives STANDARD on business desktops - so as Obama took the White House, HP was still shipping floppy drives as STANDARD.

    Yes, sitting in 2012 we can all agree that floppy drives have been obsolete for years, but in 2009 HP was still shipping them as standard.

    The note about Dell Dimensions is nice, but those are "home" computers, not "professional".

    And that 6-year-old software? I can guarantee you it was Office 2003 - sure, as Bush was preparing to leave office his staff certainly could have gone around and upgraded everyone to the latest/greatest version of Office (Office 2007), but it is now 2012, and the latest version of Office on PCs is 2010 - does that have 100% market penetration, or are there a few stragglers on 2007 or even 2003?

    Maybe, like most office users at the time, the Bush White House wasn't a big fan of the ribbon interface introduced in Office 2007 [wikipedia.org]

  • Re:Not a bad number (Score:4, Interesting)

    by nahdude812 (88157) * on Wednesday March 14, 2012 @08:30AM (#39351083) Homepage

    Gates: We need five nines of uptime
    Ballmer: Engineering, we need 9 + 9 + 9 + 9 + 9 uptime.
    Engineering Manager: Guys, our uptime goal is 45%
    Engineering: We already deliver about 72%.
    Engineering Manager: Steve, we actually have 9 + 9 + 9 + 9 + 9 + 9 + 9 + 9 uptime!
    Ballmer: Bill, we're so stable we have 8 nines of uptime! Let's see our competitors beat that!
    Gates: Great Steve, let's add some more bloat and see if we can bring that number down some so we leave ourselves with room for improvement.
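
For the real numbers behind the joke, here is a minimal sketch of the arithmetic. The helper names `nines` and `downtime_hours` are made up for illustration, and it assumes a 365-day year and that the 23% figure applies evenly across the 40 days quoted in the summary.

```python
def nines(n: int) -> float:
    """Availability fraction for n nines, e.g. 5 -> 0.99999."""
    return 1.0 - 10.0 ** (-n)


def downtime_hours(availability: float, period_days: float) -> float:
    """Hours of downtime implied by an availability fraction over a period."""
    return (1.0 - availability) * period_days * 24.0


# Five nines allows only a few minutes of downtime per (365-day) year.
five_nines_minutes_per_year = downtime_hours(nines(5), 365) * 60

# The summary's figure: down 23% of the time over the first 40 days.
white_house_days_down = downtime_hours(1.0 - 0.23, 40) / 24

# The joke's version of the math: adding nines instead of stacking them.
ballmer_five_nines = 9 * 5   # the "45%" uptime goal
ballmer_eight_nines = 9 * 8  # the "72%" that engineering already delivers

print(f"five nines: about {five_nines_minutes_per_year:.2f} minutes down per year")
print(f"23% of 40 days: about {white_house_days_down:.1f} days down")
```

Put side by side, five nines works out to roughly five minutes of downtime a year, while 23% of 40 days is a little over nine days.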

  • by Anonymous Coward on Wednesday March 14, 2012 @09:03AM (#39351311)

    This article is partially correct but leaves out the actual technical issues involved.

    Someone *from* that Datacenter here at that time. Here's what really happened.

    The old administration did not care about the existing IT infrastructure because they were on their way out. They wanted no changes made- just that things be left up. Yes the email system was old and past EOL, but the outages were really the perfect storm of everything that could hit the fan actually hitting the fan at the same time.

    The facility was doing work on the power system- the UPS to be specific. Somewhere along the line they messed up, and cut the power. *All* of the power. Datacenter goes dark. They brought the power back up, but then tripped it again before bringing it up for good. This detail is what caused the weekend of hell.

    The SAN that held the stores for the clustered email servers (yes, clustered, they *were* redundant) was an EMC Symmetrix. It has a built-in battery backup system so that if the SAN loses power it has enough stored to flush the cache to disk. The power going off started this process. The power going back on triggered the response to stop flushing the cache and start checking and rebuilding. Then the power went off again. This is the part where the specific details get hazy, but in effect the SAN did not like this. I don't believe it had enough power to totally flush the cache and/or it did not have the logic built in to handle an outage while in recovery mode. The result was a downed SAN that *would not come back up*. Now all of the data was down and nothing could be done but wait for the vendor to show up and try to fix it.

    At the same time we were dealing with *every* server being off and having to come back up. There were hundreds. Luckily most did. Some did not. Some were important, such as *both* servers in one clustered system, neither of which would boot, and which just so happened to be the system that some of the, say, "more important" VIPs were on. These were old systems running Exchange 2000 on Windows 2000. Long past due, but kept up by the staff since the EOP would not approve a new email infrastructure.

    Eventually the systems would be restored and everything would be back on-line. In the meantime, though, Brook thought it would be a good idea to spend untold amounts of money to bring in MS Engineers to look at things. They cost a lot of money and made a bunch of reports but they didn't fix a damn thing. The staff that was already there found the issues with the servers and fixed them.

    There were later headaches, such as when, as mentioned, the SONET was cut (thanks Verizon!) and further SAN maintenance, but that was the weekend from hell.

    Things to note:

    • There has been a 24x7 NOC (Network Operations Center) for the EOP data center since long before the current administration.
    • There was a DR (disaster recovery) data center. It wasn't *great* but it was there. Due to the SAN outage and estimated time to fail over it was determined by those in charge that the best call was to repair instead of failing over.
    • The "some previous experience" listed for Brook was *all* of his previous experience.
    • The GOALIE position is a joke. They took perfectly good technical government people off of doing technical work and put them in a useless role basically overseeing time sheets. Unfortunately things were now slower because changes had to go through the GOALIEs, who generally didn't have quite the overall end-to-end system expertise to make final decisions.
    • Brook really wanted to push a "mobile desktop initiative" that was a joke. He wanted the remote experience to be "just like working in the office" with requirements of the laptops being encrypted. Let's disregard that this can never happen, if only because of bandwidth constraints. But still, they tried. Vista SP1 would have been perfect for this (because of BitLocker), or hell, even waiting until Windows 7. But no, Brook just said "no" to Vista because it was Vista and forced the engineers to
