I wanted to follow up on my NFS lock issue with dovecot-uidlist.
After doing some research, the current FreeBSD NFS client (as of 6.2-
STABLE at least) appears to have a long-standing bug with caching on
files with high create/removal rates. With the NFS access cache
enabled or disabled, the NFS client still uses another cache for
certain file attributes and requires at least a second to go by
before it will invalidate an entry if it was deleted. If the file
attributes are accessed before the second is up, the timer is restarted.
Since the dotlocking code in Dovecot micro-sleeps for less than a
second between each check for the .lock file, the entry is never
removed from the client's cache, so the lstat() on the lock file
always returns 0 (success). As a result, the lock file cannot be
re-created until the stall timeout is reached. All Dovecot processes
(IMAP, POP3, deliver) hang until the kernel invalidates the entry,
causing the problem. Using a sleep() call > 1 second after removing
the lock and before attempting to use it again helps, but is
obviously not very performance-friendly for a high-volume mail server.
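To make the timing concrete, here is a toy model in C of the cache behaviour as I understand it. It is only a sketch of the symptoms described above, not FreeBSD kernel code; the struct, the function names, and the exact 1000 ms hold time are assumptions for illustration.

```c
#include <stdbool.h>

/* Toy model only; all names are hypothetical. Times are in milliseconds. */
#define CACHE_HOLD_MS 1000 /* assumed: "at least a second" before invalidation */

struct attr_cache_entry {
    bool file_exists_on_server; /* ground truth on the NFS server */
    long last_access_ms;        /* last time the cached attributes were read */
};

/* Returns true if a cached lstat() would still report the file as present. */
static bool cached_lstat_sees_file(struct attr_cache_entry *e, long now_ms)
{
    if (!e->file_exists_on_server &&
        now_ms - e->last_access_ms >= CACHE_HOLD_MS)
        return false;           /* entry expired: the deletion becomes visible */

    e->last_access_ms = now_ms; /* every access restarts the timer */
    return true;
}

/* The lock file is deleted at t=0, then polled every interval_ms.
 * Returns the time at which the deletion is noticed, or -1 on timeout. */
static long poll_until_gone(long interval_ms, long timeout_ms)
{
    struct attr_cache_entry e = { false, 0 };

    for (long t = interval_ms; t <= timeout_ms; t += interval_ms) {
        if (!cached_lstat_sees_file(&e, t))
            return t;
    }
    return -1;
}
```

In this model, polling every 200 ms never notices the deletion before the timeout, while polling every 1100 ms notices it on the first check. That matches what I see with the sub-second sleeps in the dotlock code versus the sleep() longer than a second.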
The other solution I've found that seems to work is updating the
mtime on the .lock file if all other dotlocking checks fail in
check_lock() in src/lib/file-dotlock.c (see attached patch). This
invalidates the cached entry in the kernel and allows lstat() to
return the correct response (-1), as the .lock file no longer
exists. I didn't check whether the utime() call fails: a failure just means
the kernel invalidated the entry when it should have, so it can be ignored.
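The idea can be sketched as a standalone helper. This is not the attached patch and not Dovecot's actual check_lock(); the function name lock_file_still_exists() is hypothetical, and it only shows the order of operations (touch the mtime first, then lstat()).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <utime.h>

/* Hypothetical helper, for illustration only.
 * Returns 1 if the lock file exists, 0 if it is gone, -1 on other errors. */
static int lock_file_still_exists(const char *lock_path)
{
    struct stat st;

    /* Set atime/mtime to "now". The return value is deliberately ignored:
     * if the file is already gone, the call simply fails. */
    (void)utime(lock_path, NULL);

    if (lstat(lock_path, &st) == 0)
        return 1;
    return errno == ENOENT ? 0 : -1;
}
```

On a local filesystem this behaves the same as a plain lstat(); the extra utime() call only matters on a client that would otherwise keep serving the stale cached attributes.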
I have performed some high-volume delivery (deliver) and pickup
testing (imap and pop3) using the workaround, and so far everything
has worked as expected for all Dovecot control files, including indexes.
Does anyone know of any side effects the forced mtime update may have
that I may not be seeing?
Thanks again for any assistance.
-Doug
On May 17, 2007, at 10:45 AM, Doug Council wrote:
We are in the process of migrating away from Courier-IMAP/POP3 and
Maildrop. I want to use Dovecot (LDA, IMAP, POP3). During my
testing, it has worked great except for dotlocking on the dovecot-uidlist file.

The problem:
When a delivery is being made with deliver and a mail client has
the mailbox open (Thunderbird in this case), neither Thunderbird nor
deliver can get a dotlock on the dovecot-uidlist file, causing both
deliver and Thunderbird to hang until the dotlock timeout runs out
and the lock gets replaced. Once the lock is replaced, both will
go about their business until the next lock miss and hang again.
Eventually, everything is delivered and Thunderbird wakes up.

truss shows each of the processes looping, trying to stat the
dovecot-uidlist.lock file, which no longer exists.

We are using NFS, and based on reading through the mailing list
archives, it can be a little difficult to get working reliably.
But, I've read quite a few posts with our same or similar
configuration having good luck with the setup. To reduce multiple
box access-issues for now, I've been doing all testing with a
single NFS client.

Our configuration:
NetApp filers for storage
FreeBSD 6.2-RELEASE NFS clients
Postfix 2.3.9 MTA
Dovecot 1.0.0 LDA for local deliveries
Dovecot 1.0.0 IMAP for pickup
My dovecot.conf file is at the end of this message. NFS access
caching on the FreeBSD clients has been disabled
(vfs.nfs.access_cache_timeout = 0, see NFS mount options below).
Postfix destination recipient and concurrency limit for the Dovecot
LDA is set to 1.

The NFS mount options:
rw,tcp,-r=32768,-w=32768,nfsv3,dumbtimer,noatime,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0
The dovecot.conf file:
protocols = imap imaps pop3 pop3s
disable_plaintext_auth = no
syslog_facility = local0
ssl_cert_file = /nethere/conf/dovecot/ssl-nh-cert.pem
ssl_key_file = /nethere/conf/dovecot/ssl-nh-key.pem
login_greeting = Server ready.
login_log_format_elements = user=<%u> ip=[%r] method=%m encryption=%c pid=%p
login_log_format = %U$: %s
mail_location = maildir:~/Maildir:INDEX=MEMORY
mmap_disable = yes
dotlock_use_excl = no
lock_method = dotlock
first_valid_uid = 200
last_valid_uid = 200
first_valid_gid = 200
last_valid_gid = 200
maildir_copy_with_hardlinks = yes

namespace private {
  prefix = INBOX.
  inbox = yes
}

protocol imap {
  login_executable = /usr/local/libexec/dovecot/imap-login
  mail_executable = /usr/local/libexec/dovecot/imap
  imap_client_workarounds = outlook-idle delay-newmail
}

protocol pop3 {
  login_executable = /usr/local/libexec/dovecot/pop3-login
  mail_executable = /usr/local/libexec/dovecot/pop3
  pop3_uidl_format = UID%u-%v
  pop3_client_workarounds = outlook-no-nuls oe-ns-eoh
}

protocol lda {
  postmaster_address = postmaster@nethere.com
  sendmail_path = /usr/sbin/sendmail
  auth_socket_path = /var/run/dovecot/auth-master
  syslog_facility = mail
}

auth_executable = /usr/local/libexec/dovecot/dovecot-auth

auth default {
  mechanisms = plain digest-md5 cram-md5
  passdb ldap {
    args = /nethere/conf/dovecot/dovecot-ldap.conf
  }
  userdb ldap {
    args = /nethere/conf/dovecot/dovecot-ldap.conf
  }
  user = root
  socket listen {
    master {
      path = /var/run/dovecot/auth-master
      mode = 0600
      user = mailuser
      group = mailuser
    }
  }
}

It may just be "how it works", but the lock contention seems a
little too fragile for busy mailboxes.

Does anyone have any ideas? Thanks in advance for any assistance.
-Doug