How I fsck'd my mail server
Hi everyone,

First post, and I thought I'd share my recent troubles so that hopefully they won't happen to you. I'm still trying to figure out when I screwed up, but I definitely know HOW I did. I also know I won't do it again!
I'm running a Debian mail server with messagewall, amavis, spamassassin, qmail & vpopmail. Ext3 FS.
We had just installed messagewall and changed some settings in amavis, and we were running out of memory. It was all caused by spam. I was trying to see where it was coming from, and instead of being smart and doing things the right way, I decided to be lazy and set up one of my busier mail clients with a temporary "catch-all" account. I set it up and checked whether the messages were getting tagged, where they were coming from, etc. In my haste, I guess I deleted the directory where that mail account stores new mail (no, I don't know WHY)... and then I forgot to turn off the catch-all.
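For anyone who wants to check for the same mistake: a Maildir is expected to contain three subdirectories, tmp, new and cur. Here is a rough sketch in Python that reports which of them are missing. The function name is just something I made up for illustration, and it only looks at a path you hand it; it doesn't know anything about vpopmail itself.

```python
import os

# The three subdirectories a Maildir is expected to contain.
MAILDIR_SUBDIRS = ("tmp", "new", "cur")


def missing_maildir_subdirs(maildir_path):
    """Return the names of the standard Maildir subdirectories that are
    missing under maildir_path (hypothetical helper, not part of any tool)."""
    return [
        name
        for name in MAILDIR_SUBDIRS
        if not os.path.isdir(os.path.join(maildir_path, name))
    ]
```

Calling missing_maildir_subdirs() on the catch-all account's Maildir and getting back anything other than an empty list would have told me something was wrong long before the reboot did.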
A month or two later (two weeks ago), I rebooted the server for the first time in about a year. It didn't come back up. I sent a panicked email to the network admin. He gave me the console login and we walked through it. There was a disk problem, and I needed to run fsck. I did ... and it ran all morning ... all day ... and all night. And it still wasn't done. Unluckily, I didn't have a good backup solution, but luckily there weren't many clients on the box and it was a weekend. So we did a fastboot and ran with the errors on the disk all week. There was nothing actually wrong with the disk, just a ton of inodes that had a setting of 2 that fsck was resetting to 1.
I looked in lost+found, and there were a ton of individual files - 11GB in total. And it wasn't done.
Sometime late in the week, I watched syslog for a few minutes and saw several error messages like this:
Code:
Feb 11 11:52:02 host qmail: 1108140722.047433 delivery 213544: \ failure: user_does_not_exist,_but_will_deliver_to_/var/lib/vpopmail/domains/domain.com/catchall //link_REALLY_failed_/var/lib/vpopmail/domains/domain.com/catchall/Maildir/tmp/1108140722.20655.tminus0,S=14773_ /var/lib/vpopmail/domains/domain.com/postmaster/Maildir/new/1108140722.20655.tminus0,S=14773_errno_=_2/system_error/
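The errno 2 in that message is ENOENT, "No such file or directory". If you'd rather count these failures than eyeball them, here is a minimal Python sketch that tallies them per account directory. It assumes the log lines look like the one above; the regular expression and the function name are my own guesses for illustration, not anything that ships with qmail or vpopmail.

```python
import re
from collections import Counter

# Captures the account directory that follows "link_REALLY_failed_" and
# precedes "/Maildir/tmp/", based on the format of the log line shown above.
FAILED_LINK = re.compile(r"link_REALLY_failed_(/\S+?)/Maildir/tmp/")


def count_failed_links(lines):
    """Count 'link_REALLY_failed' deliveries per account directory
    (hypothetical helper; the log format is assumed from one sample line)."""
    counts = Counter()
    for line in lines:
        match = FAILED_LINK.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

You can feed it any iterable of lines, for example an open log file. Where your qmail messages actually end up depends on your syslog setup, so point it at whatever file you were watching.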
However, I did learn my lesson: do it right the first time, or you'll ruin your weekends.
