Perhaps the server was mad. Maybe it was seeking to avenge its sibling that got its fans cleaned. Or possibly I just have the worst luck of anyone alive. Regardless, one of the Arthmoor servers decided to kill a harddrive. Dead. As in click of death dead.
The irony here was that I was just about to make a global backup of all of the MySQL databases. I think it sensed my intent. The command was issued, all the files were created, and then nothing. The thing began clicking, and seconds later the kernel panicked and the system rebooted. It then hung at the BIOS screen, unable to boot.
The timing on this couldn't have been worse. The middle of the night, on a Saturday. I've got a voicemail in to an emergency data recovery place that will not void the warranty on the drive, but I'm not sure I'm too comfortable with the fact that the call went to voicemail. But as the last backups I took from the site are months old, I don't really see there being much of a choice. Damn life for not providing me the time to do this stuff properly.
I'm not particularly harsh on computers. It's not like I torture them to death with SETI folding or any of that other CPU-intensive crap. The servers lead decent lives as far as hardware goes. So why should a drive that's barely 6 months old just up and die? Is it too much to ask companies to make quality equipment anymore?
So anyway. Until I can get this resolved, the following sites will be offline:
www.arthmoor.com
www.smaugmuds.org
www.iguanadons.net
www.alsherok.net
Server02 for the MudBytes IMC2 network
My mud
And several other sites belonging to others.
Another 30 seconds was all I needed too.
Keywords: bad luck, click of death, harddrive, poor quality, western digital



Comments
Wow, that really sucks, and to happen on the weekend when nothing can really be done to recover it... have you tried the old freezer trick? I've recovered more than one hard drive with that. Also, SpinRite from GRC.com does wonders on a drive that is clicking. I've had it recover enough for me to boot and transfer data a few times. It doesn't solve the problem, but sometimes it will give you a little time.
What timing.. I had the same thing happen, just not quite as badly, to my server on Thursday night. My server is still up, so the static web pages and muds are still running, but anything that uses MySQL is returning errors due to the journaling system having gone into read-only mode until I can get home this evening to reboot the server and start seeing what I can do about replacing the drive that bears my /, /boot, and several bad sectors. *sigh*
In any event, good luck Samson, at least on mine I know I can still get it back up to recover the data and replace the drive. *sigh*
Well apparently emergency service doesn't even mean what they say. I've yet to be called back on the emergency number. Weekend or not, holiday in the morning or not, if you're going to advertise emergency service it damn well better happen. I'm not thrilled with the potential prospect of having to wait until Tuesday to get this done.
No, I haven't tried the old freezer trick. The drive heads are physically damaged in some way. I took it out last night, and just turning it on its side you can hear the things jiggling around in there. Normally that's not possible. So whatever happened, it was bad. This is leaving a really sour taste in my mouth about the state of customer service and product durability.
To top things off, my Windows box has a bad fan in the uber-expensive video card that spins up and vibrates like mad until it gets warm. Someone somewhere is trying to tell me I should just bring in a huge magnet and put an end to all of it. :(
Well, the first verdict is in. According to datamechanix.com the "servo track" on the drive has either been corrupted or destroyed. He couldn't tell me which. For what it might be worth, the drive is mechanically undamaged. It's a media issue of some sort. He apparently wasn't interested in telling me what that meant, just that he couldn't recover anything from it. Considering he was nasty, rude, and didn't want to be there on a Sunday, I'm not inclined to believe him. So I'm seeking a second opinion from a couple of other places.
But, chances are, this is going to turn out to be a rather devastating loss since what little I do have backed up is too old to be of use, and three sites have no backups at all, including my own blog which I'm now faced with having to recover from Google's cache.
Yes, I can hear the cries now. Why didn't I make backups regularly? To tell you the truth, I don't know. It's one of those things you always intend to do, but somehow never get around to doing. I've had drives fail before but always had >90% of the data stored elsewhere and recoverable. But that was before we all became overdependent on databases to store content. Right now it's those databases that are preventing a total recovery. I can restore the HTML pages. I can restore the lost C++ code. I can even restore some of the pfiles from our own MUD. But without those databases there's very little point.
So, this drive failure is turning into a long term mess and I'll just have to deal with it as best I can.
The guy implied that it's toast, as in the media is unusable now. He hinted that I couldn't even reformat it and use it again elsewhere. Whatever this "servo track" is supposed to be, he kept insisting there was nothing that could be done about it. Like I said, he had a nasty attitude and acted like he didn't want to be there, so I just grabbed my drive and left.
Mechanically, according to him, there's nothing wrong. The heads weren't damaged and the platters are pristine. So it doesn't make any sense to me that the thing just died.