Monday, December 17, 2012

RAID errors and other dramas

A bag full of lessons, and prophetic words.

"You have got a backup, haven't you ?"

So we had a fantastic curry with a native Sikh friend and his wife. He's into web design stuff, designing marketing emails & what not. He's a tad OCD too which is always fun for a wind up.

We had a great night - he and I are both a little geeky. He has a nice little setup with a hefty chunk of metal running Windows Server as a desktop OS, and a lovely 4 monitor setup. The monitors are on the end of some seriously long and seriously expensive cables so that his box is tucked quietly away in a nicely built, well ventilated cupboard. Nice touch. SO quiet.

All was well in the world, till his small home-built Windows Server 2008 box started having a wobble. It appeared he was losing his RAID volume, and the RAID card was the first suspect.

Bearing in mind the conversation we had had only a week before about backups etc, when he told me he thought the RAID array and data had gone west, the conversation went like this :


"So what's the problem ?"

"I think I lost all my data"

"So you have got a good backup, haven't you ?"

The pregnant pause is like a poker player's 'tell'.....

"Ahhhhhhh......"

You know what's coming next.

"Well no, not exactly......"

"And you have got a spare RAID controller card like I said that you should get ?"

Another pregnant pause. I really ought to try this poker malarkey.....

So we have a panic on our hands. Now I remember him telling me proudly about his RAID card some while before, but it was not a big thing so I never really took that much notice of the specifics. Now I asked him about it in a bit more detail.

Seems he had an old PCI RocketRAID 1740 card made by HighPoint. On further investigation it seemed like the card had very kindly set up his array as RAID 0. Except I don't think it was exactly, as he had 2TB drives which, if striped, should show 4TB, when in fact it showed only 2TB as one drive. I did wonder if it had done some form of striping/mirroring but I'm not sure.
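For what it's worth, the sums behind that suspicion can be sketched in a few lines of Python. This is only a rough illustration, assuming a pair of identical 2TB drives (I never did pin down how the card had really arranged them), and the function name is made up for the example.

```python
def usable_capacity_tb(level, drive_tb, drives):
    """Rough usable capacity of a simple array of identical drives.

    Illustrative helper only: "raid0" stripes across every drive,
    "raid1" mirrors them, so only one drive's worth is usable.
    """
    if level == "raid0":
        return drive_tb * drives
    if level == "raid1":
        return drive_tb
    raise ValueError(f"unsupported level: {level}")


# Assumed setup: two 2TB drives.
striped = usable_capacity_tb("raid0", 2, 2)
mirrored = usable_capacity_tb("raid1", 2, 2)
print(f"RAID 0 would show {striped}TB, RAID 1 would show {mirrored}TB")
```

A single 2TB volume fits the mirrored figure rather than the striped one, which is why "RAID 0" never quite rang true.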

Either way I realised that this was not even a proper RAID card, but really just a SATA controller card with a bit of software to do the RAID. And without the card we were probably sunk as well.

It reminded me of my own decision making about such things around the same time he chose the card. I was on a budget and I too thought about a hardware RAID card, but realised I either had to get a decent one, plus a spare, or forget it and use the OS to do its own software RAID.

All the bits I read about it said that it was the sensible way to go - why buy a SATA PCI card which will be seriously handicapped by the PCI bus speed ? Pointless. That has stood me in good stead down the years. Apart from anything, I can still read the drives easily if the OS or motherboard fall flat on their faces.

Anyway, onwards and upwards. He could occasionally get the lot to boot but it was struggling. So I ran round and took a look. For someone with OCD, the inside was a dusty mess. I took the opportunity to wind him up :-)

I needed to look twice. Something wasn't right but you know that bit when the brain doesn't quite register ? I was looking for a RAID card fault. And then I saw the plethora of capacitors with domes to rival St Paul's Cathedral.

One board change later and a happy friend is on the phone telling me how lovely his new box is. Unfortunately the next call wasn't flushed with the same good news. There was an air of dread in his voice, again.

He had decided to clean up his old backup drive. Downloaded some fancy bit of drive cleaning software (WHY??????), got cocky as it was all going so well (famous mistake) and let it rip on the drive. Except it was the WRONG DRIVE, Gromit.


"Bother" said Pooh, as the pin fell out of the hand grenade. How much bad luck can one man have ?
 
Ever had that feeling ??? Bad. Very bad. Probably akin to throwing yourself off a cliff with no bungy or something. Gut wrenching, puke inducing, sheer, absolute terror. You only do that once.

He'd managed to get another piece of software to view and recover files from trashed disks and could see his files, but then the box popped up asking about the colour of his money.....

I checked to see what the original file destroyer had done by firing up a Win XP virtual machine with a new spare disk, then using the software to destroy it. Fortunately he had decided to use the non-default 'Quick' settings (cos he didn't want to wait), so instead of rewriting the drive with random 1s and 0s several times, it looked like it had just blown the partition table away. I could probably recover that.
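A quick partition-table wipe like that is easy enough to spot. The sketch below is a minimal illustration in Python, assuming a classic MBR layout (four 16-byte entries starting at byte 446 of the first sector, ending with the 0x55AA signature). I can't say for certain what that particular tool overwrote, and the function names are my own invention for the example.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446   # the MBR partition table starts here
ENTRY_SIZE = 16      # four entries of 16 bytes each
# status byte, 3 CHS bytes skipped, type byte, 3 CHS bytes skipped,
# then little-endian start LBA and sector count
ENTRY_FORMAT = "<B3xB3xII"


def read_mbr_partitions(sector):
    """Return (has_boot_signature, non-empty partition entries)."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need the full 512-byte first sector")
    has_signature = sector[510:512] == b"\x55\xaa"
    partitions = []
    for index in range(4):
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        status, ptype, lba_start, sectors = struct.unpack_from(
            ENTRY_FORMAT, sector, offset
        )
        if ptype != 0:  # type 0 means the slot is unused
            partitions.append({
                "index": index,
                "bootable": status == 0x80,
                "type": ptype,
                "lba_start": lba_start,
                "sectors": sectors,
            })
    return has_signature, partitions


def make_sample_sector(ptype, lba_start, sectors):
    """Build a fake first sector with one partition, for demonstration."""
    sector = bytearray(SECTOR_SIZE)
    struct.pack_into(ENTRY_FORMAT, sector, TABLE_OFFSET,
                     0x00, ptype, lba_start, sectors)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


# A healthy-looking sector versus one that has been zeroed out.
print(read_mbr_partitions(make_sample_sector(0x07, 2048, 1000)))
print(read_mbr_partitions(bytes(SECTOR_SIZE)))
```

Fed the first 512 bytes of a real disk (opened read-only, with admin rights), an empty list on a drive that used to hold data is a fair hint that only the table has gone and the files are still sitting there underneath.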

So off I trots with a couple of Linux ISOs and a few USB sticks. I knew the one problem was going to be getting Linux to see the RAID card.

Anyway, after a few false starts with USB sticks and installing drivers, we grabbed an old spare disk and installed an old copy of Ubuntu on it (9.04 Jaunty, as that was the last release the card's drivers were built for) in about 20 minutes ! The drivers installed seamlessly and up came the card.

Next was my favourite piece of software. TestDisk. And with that we could see the files. I was still pretty sure it was a quick 'rewrite partition table', but I was taking no chances and took the opportunity to grab and save the data to a backup drive just in case. And then we did it to a second drive just for good measure. That kept him busy for a few hours !!!!
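If you want some reassurance that a rescue copy like that is actually complete before rewriting anything, a checksum comparison is one way to get it. This is just a sketch of the idea in Python, not what we ran on the night, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so big files don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_dir, backup_dir):
    """List files under source_dir that are missing or differ in backup_dir.

    Paths are returned relative to source_dir, in sorted order.
    """
    source_dir = Path(source_dir)
    backup_dir = Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        relative = src.relative_to(source_dir)
        copy = backup_dir / relative
        if not copy.is_file() or sha256_of(src) != sha256_of(copy):
            problems.append(relative.as_posix())
    return problems
```

An empty list back from `find_mismatches` means every file in the source turned up in the backup with matching contents. Anything else is worth a second look before the original gets touched.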

Then we rewrote the partition table, and Lo ! All was well with the world.

One chastened friend..............
