Data loss nightmare with RackShack.net
At about 1am, my machine had a very high load average (over 100!) for some reason. I suspect that either Apache or AdCycle chewed up all the available memory (the machine gets over 250,000 hits per day), although I never managed to get a "top" display to check. After 15 minutes of trying to "su" (I got as far as the "Password:" prompt but it just sat there), I asked RackShack for a reboot:
Problem Description:
1/20/02 1:26:50 AM
1:39am up 20 days, 15:29, 14 users, load average: 168.51, 139.18, 82.77
Machine is frozen. Please reboot.
I quickly got back a response that the machine was rebooted.
Resolution Description:
1/20/02 1:30:34 AM
rebooted.
However, the machine did not come back up after the reboot. Three hours later, I finally got this response:
1/20/02 4:27:16 AM
server has some strange behavior. customer has installed the LILO loader and what looks like a recompiled kernel. It does not seem to finish booting and it returned some I/O HD errors the last boot. Possible bad drive, definite restore.
1/20/02 4:31:03 AM
We will need authorization to do a restore and replace the drive. Let us know immediately whether you wish to procede with this. We will not be responsible for any restoration of content.
The part about the kernel seems to be irrelevant (BTW it was a reconfigured kernel, not a recompiled one) since the machine had been running fine since the beginning of the month.
My guess is that with a load average of over 100, many files were open and never properly closed, and the disk was corrupted when they rebooted the machine.
So now I have to wait for them to restore the server. I can't even get my e-mail right now since it was on that server. I have a 20-hour-old backup of the server, but it only has data, so I will have to recompile all the programs, which will take me a whole day.

Anyway, the reason I posted about this was to ask you guys: do you think RackShack was at fault here, either for purchasing bad hardware or for not being able to recover the hard drive?