Hey, what is this switch for?
by Volker Weber
We've now learned more about the outage at 365 Main's San Francisco datacenter that knocked some of the Web's most popular sites offline. The latest theory: An employee, reportedly drunk, hit the emergency-power-off switch in 365 Main's Colo 4 room.
Want to hire an operator with a drinking problem?
[via Barry]
Update: Time and again it proves that good stories don't have to be true. :-)
Comments
And I always thought Bill would have done something like that...
Want to hire an operator with a drinking problem?
Should be readily available in vast numbers once Vattenfall have done their homework ;-)
What I don't understand is how all of these huge e-commerce sites have no "hot site" disaster recovery plan.
Think they started serving Guinness in datacenters? I wonder where they got the idea.
I investigated this a bit further, since we are particularly sensitive to power-outage problems around here at the moment, and it does seem that there may have been a more general power outage problem in that area of San Francisco.
Of course there would/should have been redundant power systems and backup servers in place, but as someone pointed out elsewhere on that site, if someone has physical access to the servers and heaves the plugs out, there is not a lot you can do...
Which does raise the question: unless you have a duplicated 'nuclear missile keys' type arrangement, there is always going to be some one individual with sufficient access privileges to take out an entire data centre if they feel they want to. How safe are any of our web presences?
Vitor, I heard that Rob Novak was giving a presentation at the data center. Since the "hey, let's order 80 pints of Guinness" worked just great in Dublin, he thought he'd try that stunt again. The rest is history ...
Don't mention 'that' button! Especially when architects and health & safety guys get together. The architect said we had to have one in our new Data Centre, and the H&S guys said it HAD to be at the door so it could be pressed 'in an emergency' when the last person was running from the room (fire etc).
Only problem was, it was a single push button located right beside the light switches and door exit button..... and these guys have degrees!!!
Finally got it moved and still had a battle getting a double action set-up (lift cover and THEN press the button).
No amount of resilience would cater for someone pressing the button. Even with a DC site failover.
Oh the joys......
Heard about this earlier from a colleague:
One cable collapses, plunging Barcelona into total darkness.
Puts the DC power button into the halfpenny place...
FWIW, the "drunken operator" theory has been debunked... mentioned, for example, here
Interesting week so far. First the shake (4.2) and then power-outs. What's next?
My company hosts at 365. According to the post-mortem we've received, one of the two backup generators did not kick in, sending all load to the second backup, which subsequently overloaded when two other generators (of 10 total) failed. Sounds like they need a third backup. ;-)
Our power kill button was located behind the door, which had no stop, so if you opened it and forgot about it, it would swing and hit the button. At least for the first year it was never connected to the fuse panel, and it was finally moved when someone got sick of restarting servers.