Rich,
We're also in the process of setting up a GFS/Cluster with CentOS. I'm hopeful this will eliminate some of the issues.
I'll keep everyone posted.
Steve
Hi Steve,
We'll give that a try. I'm sure users would prefer a slightly slower mail experience if it ensured accurate mail lists and reduced the chance for lost mail.
Rich Sandberg richs@whidbey.net Systems & Network Administrator Whidbey Telecom Network Operations
On Apr 28, 2006, at 9:37 AM, Apps Lists wrote:
Hi Rich.
I've seen a lot of the same IMAP errors... haven't played with POP yet.
You might want to try adding "noac" to the mount options... your line..
nfshost:/var/mail /var/mail nfs rw,hard,intr,rsize=8192,wsize=8192 0 0
.... would become:
nfshost:/var/mail /var/mail nfs rw,hard,intr,rsize=8192,wsize=8192,noac 0 0
.... and see if that helps. I haven't seen much if any difference.
What I don't understand is how some people are running what appears to be the same configuration ... RHEL/CentOS ... NFS3, 2+ Dovecot machines with IMAP and have no problems.
I'd love to get to the bottom of this.
Steve
We're using 2.6.9-34, the latest RHEL4 kernel, over NFSv3 with 3 Dovecot servers. Our most concerning episode was a failed APPEND operation (with a set of folders containing 200 messages):
IMAP(username): Corrupted transaction log file /var/mail/username/ Maildir/dovecot.index.log: unexpected end of file while reading header
That resulted in one of the new APPENDed mail folders and content being lost. We see those errors at other times too, although they don't always end in a noticeable failure.
Continuous other errors we see that might be NFS related:
IMAP(username): close() failed with index cache file /var/mail/ username/Maildir/.Deleted Messages/dovecot.index.cache: Stale NFS file handle IMAP(username): Corrupted index cache file /var/mail/username/ Maildir/.OldMail/dovecot.index.cache: invalid record size IMAP(username): close() failed with index file /var/mail/username/ Maildir/.INBOX.Security/dovecot.index: Bad file descriptor
We also see errors with POP3, which would rarely benefit from indexes (since 92% of our users RETR-and-DELE messages):
POP3(username): file_dotlock_create() failed with file /var/mail/ username/Maildir/dovecot.index.log: Stale NFS file handle POP3(username): Corrupted transaction log file /var/mail/username/ Maildir/dovecot.index.log: record size too small (type=0x8080, offset=9056, size=0) POP3(username): Our dotlock file /var/mail/username/Maildir/ dovecot.index.log.lock was deleted (kept it -118 secs) POP3(username): Timeout while waiting for release of dotlock for transaction log file /var/mail/username/Maildir/dovecot.index.log
We've tried manually removing "dovecot*" from all folders, allowing the indexes to be re-built, but the errors return.
Excerpt from our config:
mail_cache_fields mail_never_cache_fields mail_cache_min_mail_count
1000 mailbox_idle_check_interval = 10 mail_full_filesystem_access = no mail_read_mmaped = no mmap_disable = yes mmap_no_write = no lock_method = dotlock maildir_stat_dirs = yes maildir_copy_with_hardlinks = no
Our NFS mount:
nfshost:/var/mail /var/mail nfs rw,hard,intr,rsize=8192,wsize=8192 0 0
I've read some of the list archives about disabling the index files. We realize that indexes should be beneficial, but we'd like to at least experiment with a "no index" install. This would solve some of the NFS locking issues, eliminate the need to modify our MTA behavior, and remove (potentially) millions of files from the system.
Did anyone discover a method to cleanly disable the indexes? At least for POP3?
Thanks!
Rich Sandberg richs@whidbey.net
On Apr 28, 2006, at 6:59 AM, Luis Melendez wrote:
Hi all. We also had problems with client NFS locking using RHEL4, kernel 2.6.9 Some of them were solved updating the kelnel from RedHat. Hope this helps.
Regards.
Apps Lists wrote:
I'm running beta7 on two machines, with maildir on NFS. I have lockd running on all machines. I've found that Dovecot is highly unstable with NFS when accessing a mailbox on more than one machine at the same time.
Both dovecot machines have:
mmap_disable = yes lock_method = fcntl
NFS is version 3, exported from a third linux machine. All machines are running 2.6.9 kernel.
Any ideas what's going wrong here?
Steve
*** dev4 ***
Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146230258.P2889Q0M407851.dev4.neonova.net:2,
Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension introduction for unknown id 0
Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): file mail-index.c: line 883 (mail_index_sync_from_transactions): assertion failed: (hdr.log_file_int_offset == (*map)->hdr.log_file_int_offset)
Apr 28 09:17:41 dev4 dovecot: child 2889 (imap) killed with signal 6
*** dev3 ***
Apr 28 09:17:39 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146230254.P2888Q0M651598.dev4.neonova.net
Apr 28 09:17:41 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4036 -> 1772
Apr 28 09:17:44 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4036 -> 1774)
Apr 28 09:17:45 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146145480.P5976Q0M245663.dev3.neonova.net:2,
Apr 28 09:17:46 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4038 -> 1774
Apr 28 09:17:47 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4038 -> 1775)
Apr 28 09:17:48 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146145480.P5976Q0M245663.dev3.neonova.net:2,
Apr 28 09:17:49 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4039 -> 1775)
Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension introduction for unknown id 0
Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): file mail-index.c: line 881 (mail_index_sync_from_transactions): assertion failed: (hdr.messages_count == (*map)->hdr.messages_count)
Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4039 -> 1775)
Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with UID 1775, but next_uid = 4039
Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with UID 1775, but next_uid = 4039
Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Unexpected transaction log desync with index /var/mailstore/72/af/375887/Maildir/dovecot.index
Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Disconnected: Mailbox is in inconsistent state, please relogin.
Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4039 -> 1776
-- +---------------------------------------------- ^-----------------------+ | Luis Meléndez Aganzo ^ Email: luism@uco.es | | Servicio de Informática ^ Tlf: 34-(9) 57-211022 | | Analista - Área de Sistemas ^ Fax: 34-(9) 57-218116 | | Universidad de Córdoba (SPAIN) ^ http:// www.uco.es | +---------------------------------------------- ^-----------------------+