Greetings all,
I'm trying to understand a problem we are facing here. We're currently running Dovecot 1.0-beta3 and have a long-standing issue of system crashes on our mail server (Debian Linux 2.4.27-2-k7-smp).
Here's what is happening:
The machine hangs and the system load climbs as high as 80.0+. Yet the system response is not affected: the command line still responds instantly. There are multiple running dovecot PIDs, even if I stop the service. If I try to kill or kill -9 the PIDs, they will not die. The machine is DOA and must be forcefully restarted. Issuing a reboot will cause the machine to hang when it attempts to unmount network shares.
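Processes that ignore kill -9 while the load climbs on an otherwise responsive machine are typically in uninterruptible sleep (state D), which Linux counts toward the load average. For anyone who wants to check for this the next time it happens, here is a minimal sketch that lists such processes. It assumes a Linux /proc filesystem; the function names are only illustrative and it has not been run on our 2.4 kernel.

```python
import os


def parse_stat_line(line):
    """Return (pid, command, state) from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the first '(' and the last ')'.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren].strip())
    command = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, command, state


def processes_in_state(wanted="D", proc_root="/proc"):
    """List (pid, command) for every process whose state equals `wanted`."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "net"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, command, state = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the file was unreadable
        if state == wanted:
            found.append((pid, command))
    return sorted(found)


if __name__ == "__main__":
    # Print every process currently in uninterruptible sleep.
    for pid, command in processes_in_state("D"):
        print(pid, command)
```

If the stuck dovecot PIDs show up in this list, they are most likely blocked inside the kernel waiting on I/O, which would fit with the hang when unmounting network shares.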
Here's the setup:
- Dovecot 1.0-beta3
- lock_method = dotlock
- mmap_disable = yes
/var/mail is stored locally on the mail server and accessed via NFS by ALL remote machines. All remote machines have /var/mail symlinked to the NFS share on Mail.
/home on Mail is NFS-mounted from another set of servers, where IMAP mail folders reside in mbox format. All client machines have /home symlinked to the second NFS server.
In other words, there are a lot of NFS shares, and one mail transaction can involve 3 machines.
What I'm trying to find out is the current state of NFS locking with Dovecot. This system hang happens 1-3 times a week. The current /home NFS mounts are served from SGI machines running IRIX 6.5. Clients are all Linux (Debian) 2.4 or Linux (Ubuntu) 2.6.
Is our setup too much for Dovecot to handle? Are there other variables we're not looking at here?
Thanks everyone.