Well, that sucked
Posted: Wednesday, February 3, 2010 @ 2:24 AM
Updated: Thursday, February 4, 2010 @ 1:13 AM
Earlier, one of the laptops in the house wasn't able to acquire a DHCP lease. I was in a hurry, so I assigned it a static IP address and it worked fine. While it was perplexing, it didn't seem like a fatal issue, so I put it on my TODO list for later. Boy, was I wrong about it not being a fatal issue.
Fast-forward a few hours and my friend was unable to connect to my VPN. I checked my VPN server and it was fine. The next logical step was to check the firewall, whose web administration interface failed to respond. OK, that's odd. So I went downstairs to the server-wall and checked the console messages. The system was going haywire with messages about being unable to allocate inodes. That's not good, because it means there's a file system or drive issue. Turns out, it was the latter. The hard drive had finally died. Great. Network connectivity and routing continued to work because the relevant pages were already in memory and hadn't been paged out. However, the userland stuff was dying off pretty fast.
So I set about yanking the system off the shelf and figuring out the best way to get back up and running quickly. I had recently cleaned the space down there, and the next step was to reconfigure and rebuild the network. However, that's a good 12-14 hours' worth of work, and it just wasn't high on the priority list. Given the current situation, I could either spend the night rebuilding whatever I could from scratch, or patch up the current server and wait until a later date to rebuild. I chose the latter, since within the next few weeks I should be getting a crate full of newer hardware to deploy on the network, after which I can phase out the old servers. In addition, my friend and I would be doing the rebuild together, which cuts the time down from 12-14 hours to 6-8 hours. Once this was decided, it was off to patch up the box.
I actually ended up grabbing a different box that I wasn't using, but that I knew to be good, to replace the now-defunct gateway-firewall. The other big pain in the neck was that, for whatever reason, I couldn't find my saved XML configuration file for the router. Well, that sucked, since it added a couple of hours to the time needed to get back up and running. I ended up installing a newer version of the gateway-firewall software and then configuring it from memory. Much to my surprise, I was actually successful the first time around. Go me!
So there you have it. Sorry about the downtime of a few hours if you tried to go to any of the hosted websites and they didn't come up. I did my best.