[Dovecot] Solution: NFS & Dovecot!
Hi folks!
It looks like Timo nailed down the NFS issues we've been seeing here in last night's CVS build.
Anyone running Dovecot in a high-usage environment should check this out.
Here are some relevant ChangeLog updates that seem to address the problem:
2006-05-02 11:11 Timo Sirainen <timo.sirainen@movial.fi>

    * src/lib-storage/index/maildir/: maildir-save.c,
      maildir-storage.h, maildir-sync.c, maildir-uidlist.c,
      maildir-uidlist.h: Adding mail to index while saving it had a
      race condition. Fixing it required a bit larger changes. Switched
      uidlist/index locking order so that uidlist is now locked first.

2006-05-02 11:04 Timo Sirainen <timo.sirainen@movial.fi>

    * src/lib-index/: mail-index-private.h, mail-index-sync-update.c,
      mail-index.c: mmap_disable: When syncing in-memory index from
      transaction log, we didn't skip external transactions which were
      already been in our in-memory mapping, causing "Append with UID
      n, but next_uid = m" errors.

The build is available here:
http://dovecot.org/nightly/dovecot-20060503.tar.gz
Thus far, I've seen no complaints of corrupted indexes or core dumps. This is absolutely fantastic.
As always... hats off to Timo. Great work!
Steve
On Wed, May 03, 2006 at 09:51:54AM -0400, Apps Lists wrote:
> Hi folks!
> It looks like Timo nailed down the NFS issues we've been seeing here in last night's CVS build.
I can confirm that the UID complaints in the logs have vanished.
I'm still seeing, very occasionally (on a very busy server) the following inexplicable sequence:
May 4 15:02:19 mailstore dovecot: pop3-login: Login: user=<user2@example.net>, method=PLAIN, rip=192.0.0.1, lip=192.0.0.2
May 4 15:02:23 mailstore dovecot: POP3(user2@example.net): Disconnected: Logged out top=0/0, retr=0/0, del=0/7, size=295127
May 4 15:02:44 mailstore dovecot: pop3-login: Login: user=<user2@example.net>, method=PLAIN, rip=192.0.0.1, lip=192.0.0.2
May 4 15:02:44 mailstore dovecot: POP3(user2@example.net): Disconnected: Logged out top=0/0, retr=0/0, del=0/7, size=295127
May 4 15:12:19 mailstore dovecot: pop3-login: Login: user=<user2@example.net>, method=PLAIN, rip=192.0.0.1, lip=192.0.0.2
May 4 15:12:19 mailstore dovecot: POP3(user2@example.net): Corrupted index cache file /var/mailstore/mail/user2@example.net/dovecot.index.cache: invalid field header size
May 4 15:12:19 mailstore dovecot: POP3(user2@example.net): Disconnected: Logged out top=0/0, retr=0/0, del=0/7, size=295127
May 4 15:22:20 mailstore dovecot: pop3-login: Login: user=<user2@example.net>, method=PLAIN, rip=192.0.0.1, lip=192.0.0.2
May 4 15:22:20 mailstore dovecot: POP3(user2@example.net): Disconnected: Logged out top=0/0, retr=0/0, del=0/7, size=295127
May 4 15:22:44 mailstore dovecot: pop3-login: Login: user=<user2@example.net>, method=PLAIN, rip=192.0.0.1, lip=192.0.0.2
May 4 15:22:45 mailstore dovecot: POP3(user2@example.net): Disconnected: Logged out top=0/0, retr=0/0, del=0/7, size=295127
May 4 15:32:20 mailstore dovecot: pop3-login: Login: user=<user2@example.net>, method=PLAIN, rip=192.0.0.1, lip=192.0.0.2
May 4 15:32:20 mailstore dovecot: POP3(user2@example.net): Corrupted index cache file /var/mailstore/mail/user2@example.net/dovecot.index.cache: invalid field header size
May 4 15:32:20 mailstore dovecot: POP3(user2@example.net): Disconnected: Logged out top=0/0, retr=0/0, del=0/7, size=295127
There were no deliveries and no IMAP accesses to this mailbox during that time.
(identities genericised)
Haven't tracked this down yet. So far baffled and looking for patterns. Seeing it about 10 times in every 100,000 logins (about an hour's worth).
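
For anyone who wants to measure the same ratio on their own logs, here is a rough Python sketch. The regular expressions are derived only from the log lines quoted above, and the function names (summarize, rate_per_100k) are made up for illustration; other syslog or Dovecot log formats would need different patterns.

```python
import re
from collections import Counter

# Patterns modelled on the log lines quoted in this thread only.
LOGIN_RE = re.compile(r"dovecot: pop3-login: Login: user=<([^>]+)>")
CORRUPT_RE = re.compile(
    r"dovecot: POP3\(([^)]+)\): Corrupted index cache file "
)


def summarize(lines):
    """Count POP3 logins and cache-corruption errors per user."""
    logins = 0
    corrupt_by_user = Counter()
    for line in lines:
        if LOGIN_RE.search(line):
            logins += 1
            continue
        match = CORRUPT_RE.search(line)
        if match:
            corrupt_by_user[match.group(1)] += 1
    return logins, corrupt_by_user


def rate_per_100k(logins, errors):
    """Errors per 100,000 logins; 0.0 when there were no logins."""
    if logins == 0:
        return 0.0
    return errors * 100000.0 / logins
```

Feeding it an open log file (an iterable of lines) gives the login count plus a per-user tally, which may help when looking for patterns such as the same few accounts recurring.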
Steve, are you using dotlocks or fcntl locks? (I see this with both, talking to a NetApp filer.)
/JG
On Thu, 2006-05-04 at 16:21 +1000, Joshua Goodall wrote:
> May 4 15:12:19 mailstore dovecot: POP3(user2@example.net): Corrupted index cache file /var/mailstore/mail/user2@example.net/dovecot.index.cache: invalid field header size
> May 4 15:32:20 mailstore dovecot: POP3(user2@example.net): Corrupted index cache file /var/mailstore/mail/user2@example.net/dovecot.index.cache: invalid field header size
I think these are the same errors that I can easily get with imaptest. They happen around the time the cache file is compressed and another process tries to access the new file using old offsets for some reason. I haven't bothered to look at these too closely yet, since they're completely transparent to users.
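
To make the "old offsets against a new file" failure mode concrete, here is a toy Python sketch. It is not Dovecot's cache format or code, and it is not a diagnosis of the actual bug: the record layout, the function names and the inode check are all assumptions chosen for illustration. A writer replaces the file with a shorter copy via rename, and a reader that keeps its previously computed offsets ends up pointing at the wrong record unless it notices the replacement first.

```python
import os
import struct
import tempfile

HEADER = struct.Struct("<I")    # toy header: number of records
RECORD = struct.Struct("<I8s")  # toy record: uid + 8-byte payload


def write_cache(path, records):
    """Write a fresh file and rename it over the old one."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(HEADER.pack(len(records)))
        for uid, payload in records:
            f.write(RECORD.pack(uid, payload))
    os.replace(tmp, path)


def build_offsets(path):
    """Return (inode, {uid: byte offset}) for the file currently at path."""
    offsets = {}
    with open(path, "rb") as f:
        inode = os.fstat(f.fileno()).st_ino
        (count,) = HEADER.unpack(f.read(HEADER.size))
        for _ in range(count):
            pos = f.tell()
            uid, _payload = RECORD.unpack(f.read(RECORD.size))
            offsets[uid] = pos
    return inode, offsets


def read_record(path, uid, inode, offsets):
    """Read one record, rebuilding offsets if the file was replaced."""
    if os.stat(path).st_ino != inode:
        inode, offsets = build_offsets(path)
    with open(path, "rb") as f:
        f.seek(offsets[uid])
        got_uid, payload = RECORD.unpack(f.read(RECORD.size))
    return got_uid, payload, inode, offsets


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "toy.cache")
    write_cache(path, [(1, b"a" * 8), (2, b"b" * 8), (3, b"c" * 8)])
    inode, offsets = build_offsets(path)
    # "Compress": drop uid 1, so the remaining records move forward.
    write_cache(path, [(2, b"b" * 8), (3, b"c" * 8)])
    with open(path, "rb") as f:
        f.seek(offsets[2])  # stale offset from the old file
        print("stale read:", RECORD.unpack(f.read(RECORD.size))[0])
    print("checked read:", read_record(path, 2, inode, offsets)[0])
```

The stat-then-open check here is itself racy, and over NFS attribute caching can hide the replacement for a while, so this only shows the shape of the problem rather than a fix.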
Participants (3): Apps Lists, Joshua Goodall, Timo Sirainen