Ran into a major issue with my setup overnight:
We have a Win2K AD domain running SFU with a master and one slave NIS server. Our mail server is an FC1 box that runs dovecot and a MailScanner/sendmail config, and it is configured as a NIS client.

We lost power overnight and all of the boxes shut down after the UPSes ran out of battery. When the power came back, all of the boxes rebooted automatically. Unfortunately, our mail server is somewhat faster than the rest of the servers, so it booted before either the master or the slave NIS server was up and running. Since the mail server authenticates through PAM against NIS, it knows nothing about our users unless it can connect to a NIS server.

So this morning users were getting "unknown username" type errors in Outlook, and the mail logs show that sendmail was rejecting mail with "unknown user" errors. Mail was not received and users could not access it until we rebooted the mail server and it was able to reconnect to the NIS servers.

This seems like a fundamental problem with our setup. I'll eventually be updating the box to a newer OS and configuration, so for now I need to plug this hole without changing the authentication scheme. I'm also concerned that we'd see similar results if the master or slave went down while the mail server was still up. I'm considering one of two options to fix this, but wanted to run them by everyone to see if there is a better way:
Configure the boot loader to wait 5 minutes before loading the OS (it uses GRUB, so I'll set it to display the OS menu screen for 5 minutes and then boot). If I'm doing maintenance on the box I can just hit Enter to boot immediately, and hopefully the 5 minute delay will let the other boxes boot first after a power outage. This won't cover me if the cause isn't a power outage (e.g., the master or slave dies).
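For reference, a minimal sketch of what this option might look like, assuming GRUB legacy with its config in /boot/grub/grub.conf (the usual location on FC1, as far as I know) and an existing timeout line. The function name set_grub_timeout is made up for illustration; it only rewrites the timeout value and keeps a backup copy of the file.

```shell
#!/bin/sh
# Hypothetical helper: set the menu timeout in a GRUB legacy config file.
# Usage: set_grub_timeout FILE SECONDS
# Assumes FILE already has a line starting with "timeout=" or "timeout ";
# if it does not, the file is left unchanged apart from the backup.
set_grub_timeout() {
    conf="$1"
    secs="$2"

    # Refuse anything that is not a plain non-negative integer.
    case "$secs" in
        ''|*[!0-9]*) echo "timeout must be a number of seconds" >&2; return 1 ;;
    esac

    # Keep a backup so a bad edit can be undone from a rescue boot.
    cp -p "$conf" "$conf.bak" || return 1

    # Rewrite only the timeout line; every other line is copied as-is.
    sed "s/^timeout[= ].*/timeout=$secs/" "$conf.bak" > "$conf"
}
```

Running set_grub_timeout /boot/grub/grub.conf 300 as root would give the 5 minute delay. If the file also contains a hiddenmenu line, I believe the menu stays hidden during the countdown until a key is pressed, so that line may need to be commented out for the menu to display.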
Configure the mail server as a NIS slave. I'm thinking that this will basically "copy" the user info (username, password, homedir, etc.) on a schedule and store it locally on the mail server (is this how it works?). This covers both issues: a power outage and a server dying. But I've read about problems getting password changes to sync quickly. Ideally I'd like the sync to happen immediately (without any manual intervention), but I don't think this is possible.
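A rough sketch of what the slave option might involve, assuming the stock Linux ypserv package is installed on the mail server and that the SFU master will accept a Linux slave (I have not confirmed either for this setup). The one-time map copy is normally done with ypinit -s, and the ypserv package ships helper scripts that pull maps from cron. The function name nis_slave_cron_entries and the schedule are only illustrative.

```shell
#!/bin/sh
# One-time setup on the would-be slave (shown as a comment, not run here).
# MASTER_HOST is a placeholder for the real NIS master's hostname:
#   /usr/lib/yp/ypinit -s MASTER_HOST
#
# Hypothetical helper: print crontab lines that pull maps on a schedule
# using the ypxfr_* helper scripts from the ypserv package.
# Usage: nis_slave_cron_entries [YPDIR]   (YPDIR defaults to /usr/lib/yp)
nis_slave_cron_entries() {
    ypdir="${1:-/usr/lib/yp}"

    # Hourly pull; this helper is the one that covers the passwd maps.
    echo "20 * * * * $ypdir/ypxfr_1perhour"
    # Less frequently changing maps, once and twice a day.
    echo "40 6 * * * $ypdir/ypxfr_1perday"
    echo "55 6,18 * * * $ypdir/ypxfr_2perday"
}
```

The printed lines could be appended to root's crontab after review. With a schedule like this, a password change might take up to an hour to reach the mail server unless the master also pushes changes to its slaves, which matches the sync delay concern above.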
Thoughts?
Jeff Graves, MCSA Customer Support Engineer Image Source, Inc. 10 Mill Street Bellingham, MA 02019
508.966.5200 - Phone 508.966.5170 - Fax jeff@image-src.com - Email www.image-src.com