[Dovecot] Maildir + NFS + multiple machines = spectacular failure
I'm running beta7 on two machines, with maildir on NFS. I have lockd running on all machines. I've found that Dovecot is highly unstable with NFS when accessing a mailbox on more than one machine at the same time.
Both dovecot machines have:
mmap_disable = yes lock_method = fcntl
NFS is version 3, exported from a third linux machine. All machines are running 2.6.9 kernel.
Any ideas what's going wrong here?
Steve
*** dev4 ***
Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146230258.P2889Q0M407851.dev4.neonova.net:2,
Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension introduction for unknown id 0
Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): file mail-index.c: line 883 (mail_index_sync_from_transactions): assertion failed: (hdr.log_file_int_offset == (*map)->hdr.log_file_int_offset)
Apr 28 09:17:41 dev4 dovecot: child 2889 (imap) killed with signal 6
*** dev3 ***
Apr 28 09:17:39 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146230254.P2888Q0M651598.dev4.neonova.net
Apr 28 09:17:41 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4036 -> 1772
Apr 28 09:17:44 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4036 -> 1774)
Apr 28 09:17:45 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146145480.P5976Q0M245663.dev3.neonova.net:2,
Apr 28 09:17:46 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4038 -> 1774
Apr 28 09:17:47 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4038 -> 1775)
Apr 28 09:17:48 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146145480.P5976Q0M245663.dev3.neonova.net:2,
Apr 28 09:17:49 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4039 -> 1775)
Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension introduction for unknown id 0
Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): file mail-index.c: line 881 (mail_index_sync_from_transactions): assertion failed: (hdr.messages_count == (*map)->hdr.messages_count)
Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4039 -> 1775)
Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with UID 1775, but next_uid = 4039
Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with UID 1775, but next_uid = 4039
Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Unexpected transaction log desync with index /var/mailstore/72/af/375887/Maildir/dovecot.index
Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Disconnected: Mailbox is in inconsistent state, please relogin.
Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4039 -> 1776
On Fri, 2006-04-28 at 09:25 -0400, Apps Lists wrote:
Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146230258.P2889Q0M407851.dev4.neonova.net:2,
I saw these kind of problems with 2.6.16 kernel. They got fixed after I applied utime() fix.
Also in general with current code you're going to have problems if you have attribute cache enabled. See http://wiki.dovecot.org/NFS
Hi all. We also had problems with client NFS locking using RHEL4, kernel 2.6.9 Some of them were solved updating the kelnel from RedHat. Hope this helps.
Regards.
Apps Lists wrote:
I'm running beta7 on two machines, with maildir on NFS. I have lockd running on all machines. I've found that Dovecot is highly unstable with NFS when accessing a mailbox on more than one machine at the same time.
Both dovecot machines have:
mmap_disable = yes lock_method = fcntl
NFS is version 3, exported from a third linux machine. All machines are running 2.6.9 kernel.
Any ideas what's going wrong here?
Steve
*** dev4 ***
Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146230258.P2889Q0M407851.dev4.neonova.net:2,
Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension introduction for unknown id 0
Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): file mail-index.c: line 883 (mail_index_sync_from_transactions): assertion failed: (hdr.log_file_int_offset == (*map)->hdr.log_file_int_offset)
Apr 28 09:17:41 dev4 dovecot: child 2889 (imap) killed with signal 6
*** dev3 ***
Apr 28 09:17:39 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146230254.P2888Q0M651598.dev4.neonova.net
Apr 28 09:17:41 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4036 -> 1772
Apr 28 09:17:44 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4036 -> 1774)
Apr 28 09:17:45 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146145480.P5976Q0M245663.dev3.neonova.net:2,
Apr 28 09:17:46 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4038 -> 1774
Apr 28 09:17:47 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4038 -> 1775)
Apr 28 09:17:48 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146145480.P5976Q0M245663.dev3.neonova.net:2,
Apr 28 09:17:49 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4039 -> 1775)
Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension introduction for unknown id 0
Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): file mail-index.c: line 881 (mail_index_sync_from_transactions): assertion failed: (hdr.messages_count == (*map)->hdr.messages_count)
Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4039 -> 1775)
Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with UID 1775, but next_uid = 4039
Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with UID 1775, but next_uid = 4039
Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Unexpected transaction log desync with index /var/mailstore/72/af/375887/Maildir/dovecot.index
Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Disconnected: Mailbox is in inconsistent state, please relogin.
Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4039 -> 1776
-- +----------------------------------------------^-----------------------+ | Luis Meléndez Aganzo ^ Email: luism@uco.es | | Servicio de Informática ^ Tlf: 34-(9)57-211022 | | Analista - Área de Sistemas ^ Fax: 34-(9)57-218116 | | Universidad de Córdoba (SPAIN) ^ http://www.uco.es | +----------------------------------------------^-----------------------+
We're using 2.6.9-34, the latest RHEL4 kernel, over NFSv3 with 3
Dovecot servers. Our most concerning episode was a failed APPEND
operation (with a set of folders containing 200 messages):
IMAP(username): Corrupted transaction log file /var/mail/username/ Maildir/dovecot.index.log: unexpected end of file while reading header
That resulted in one of the new APPENDed mail folders and content
being lost. We see those errors at other times too, although they
don't always end in a noticeable failure.
Continuous other errors we see that might be NFS related:
IMAP(username): close() failed with index cache file /var/mail/
username/Maildir/.Deleted Messages/dovecot.index.cache: Stale NFS
file handle
IMAP(username): Corrupted index cache file /var/mail/username/
Maildir/.OldMail/dovecot.index.cache: invalid record size
IMAP(username): close() failed with index file /var/mail/username/
Maildir/.INBOX.Security/dovecot.index: Bad file descriptor
We also see errors with POP3, which would rarely benefit from indexes
(since 92% of our users RETR-and-DELE messages):
POP3(username): file_dotlock_create() failed with file /var/mail/
username/Maildir/dovecot.index.log: Stale NFS file handle
POP3(username): Corrupted transaction log file /var/mail/username/
Maildir/dovecot.index.log: record size too small (type=0x8080,
offset=9056, size=0)
POP3(username): Our dotlock file /var/mail/username/Maildir/
dovecot.index.log.lock was deleted (kept it -118 secs)
POP3(username): Timeout while waiting for release of dotlock for
transaction log file /var/mail/username/Maildir/dovecot.index.log
We've tried manually removing "dovecot*" from all folders, allowing
the indexes to be re-built, but the errors return.
Excerpt from our config:
mail_cache_fields = mail_never_cache_fields = mail_cache_min_mail_count = 1000 mailbox_idle_check_interval = 10 mail_full_filesystem_access = no mail_read_mmaped = no mmap_disable = yes mmap_no_write = no lock_method = dotlock maildir_stat_dirs = yes maildir_copy_with_hardlinks = no
Our NFS mount:
nfshost:/var/mail /var/mail nfs
rw,hard,intr,rsize=8192,wsize=8192 0 0
I've read some of the list archives about disabling the index files.
We realize that indexes should be beneficial, but we'd like to at
least experiment with a "no index" install. This would solve some of
the NFS locking issues, eliminate the need to modify our MTA
behavior, and remove (potentially) millions of files from the system.
Did anyone discover a method to cleanly disable the indexes? At least
for POP3?
Thanks!
Rich Sandberg richs@whidbey.net
On Apr 28, 2006, at 6:59 AM, Luis Melendez wrote:
Hi all. We also had problems with client NFS locking using RHEL4, kernel
2.6.9 Some of them were solved updating the kelnel from RedHat. Hope this helps.Regards.
Apps Lists wrote:
I'm running beta7 on two machines, with maildir on NFS. I have lockd running on all machines. I've found that Dovecot is highly
unstable with NFS when accessing a mailbox on more than one machine at the same
time.Both dovecot machines have:
mmap_disable = yes lock_method = fcntl
NFS is version 3, exported from a third linux machine. All
machines are running 2.6.9 kernel.Any ideas what's going wrong here?
Steve
*** dev4 ***
Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in
uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146230258.P2889Q0M407851.dev4.neonova.net:2,Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): Corrupted transaction
log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension introduction for unknown id 0Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): file mail-index.c: line
883 (mail_index_sync_from_transactions): assertion failed: (hdr.log_file_int_offset == (*map)->hdr.log_file_int_offset)Apr 28 09:17:41 dev4 dovecot: child 2889 (imap) killed with signal 6
*** dev3 ***
Apr 28 09:17:39 dev3 dovecot: IMAP(jtest): Duplicate file in
uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146230254.P2888Q0M651598.dev4.neonova.netApr 28 09:17:41 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4036 -> 1772
Apr 28 09:17:44 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was
lowered (4036 -> 1774)Apr 28 09:17:45 dev3 dovecot: IMAP(jtest): Duplicate file in
uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146145480.P5976Q0M245663.dev3.neonova.net:2,Apr 28 09:17:46 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4038 -> 1774
Apr 28 09:17:47 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was
lowered (4038 -> 1775)Apr 28 09:17:48 dev3 dovecot: IMAP(jtest): Duplicate file in
uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146145480.P5976Q0M245663.dev3.neonova.net:2,Apr 28 09:17:49 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was
lowered (4039 -> 1775)Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): Corrupted transaction
log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension introduction for unknown id 0Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): file mail-index.c: line
881 (mail_index_sync_from_transactions): assertion failed:
(hdr.messages_count == (*map)->hdr.messages_count)Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was
lowered (4039 -> 1775)Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction
log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with
UID 1775, but next_uid = 4039Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction
log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with
UID 1775, but next_uid = 4039Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Unexpected transaction log desync with index /var/mailstore/72/af/375887/Maildir/dovecot.index
Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Disconnected: Mailbox
is in inconsistent state, please relogin.Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4039 -> 1776
-- +---------------------------------------------- ^-----------------------+ | Luis Meléndez Aganzo ^ Email:
luism@uco.es | | Servicio de Informática ^ Tlf: 34-(9) 57-211022 | | Analista - Área de Sistemas ^ Fax: 34-(9) 57-218116 | | Universidad de Córdoba (SPAIN) ^ http:// www.uco.es | +---------------------------------------------- ^-----------------------+
participants (4)
-
Apps Lists
-
Luis Melendez
-
richs@whidbey.net
-
Timo Sirainen