[Dovecot] Maildir + NFS + multiple machines = spectacular failure

Apps Lists appslists at neonova.net
Fri Apr 28 21:27:37 EEST 2006


Rich,

  We're also in the process of setting up a GFS/Cluster with CentOS.  I'm
hopeful this will eliminate some of the issues.

  I'll keep everyone posted.

Steve

> Hi Steve,
>
> We'll give that a try.  I'm sure users would prefer a slightly slower
> mail experience if it ensured accurate mail lists and reduced the
> chance for lost mail.
>
> Rich Sandberg
> richs at whidbey.net
> Systems & Network Administrator
> Whidbey Telecom Network Operations
>
>
>
> On Apr 28, 2006, at 9:37 AM, Apps Lists wrote:
>
>> Hi Rich.
>>
>> I've seen a lot of the same IMAP errors... haven't played with POP
>> yet.
>>
>> You might want to try adding "noac" to the mount options... your
>> line..
>>
>> nfshost:/var/mail      /var/mail     nfs
>> rw,hard,intr,rsize=8192,wsize=8192 0 0
>>
>> .... would become:
>>
>>
>> nfshost:/var/mail      /var/mail     nfs
>> rw,hard,intr,rsize=8192,wsize=8192,noac 0 0
>>
>>
>> .... and see if that helps.  I haven't seen much if any difference.
>>
>> What I don't understand is how some people are running what appears
>> to be
>> the same configuration ... RHEL/CentOS ... NFS3, 2+ Dovecot
>> machines with
>> IMAP and have no problems.
>>
>> I'd love to get to the bottom of this.
>>
>> Steve
>>
>>> We're using 2.6.9-34, the latest RHEL4 kernel, over NFSv3 with 3
>>> Dovecot servers. Our most concerning episode was a failed APPEND
>>> operation (with a set of folders containing 200 messages):
>>>
>>> IMAP(username): Corrupted transaction log file /var/mail/username/
>>> Maildir/dovecot.index.log: unexpected end of file while reading
>>> header
>>>
>>> That resulted in one of the new APPENDed mail folders and content
>>> being lost.  We see those errors at other times too, although they
>>> don't always end in a noticeable failure.
>>>
>>> Continuous other errors we see that might be NFS related:
>>>
>>> IMAP(username): close() failed with index cache file /var/mail/
>>> username/Maildir/.Deleted Messages/dovecot.index.cache: Stale   NFS
>>> file
>>> handle
>>> IMAP(username): Corrupted index cache file /var/mail/username/
>>> Maildir/.OldMail/dovecot.index.cache: invalid record size
>>> IMAP(username): close() failed with index file /var/mail/username/
>>> Maildir/.INBOX.Security/dovecot.index: Bad file descriptor
>>>
>>> We also see errors with POP3, which would rarely benefit from indexes
>>> (since 92% of our users RETR-and-DELE messages):
>>>
>>> POP3(username): file_dotlock_create() failed with file /var/mail/
>>> username/Maildir/dovecot.index.log: Stale NFS file handle
>>> POP3(username): Corrupted transaction log file /var/mail/username/
>>> Maildir/dovecot.index.log: record size too small (type=0x8080,
>>> offset=9056, size=0)
>>> POP3(username): Our dotlock file /var/mail/username/Maildir/
>>> dovecot.index.log.lock was deleted (kept it -118 secs)
>>> POP3(username): Timeout while waiting for release of dotlock for
>>> transaction log file /var/mail/username/Maildir/dovecot.index.log
>>>
>>> We've tried manually removing "dovecot*" from all folders, allowing
>>> the indexes to be re-built, but the errors return.
>>>
>>> Excerpt from our config:
>>>
>>> mail_cache_fields mail_never_cache_fields mail_cache_min_mail_count
>>> >> 1000 mailbox_idle_check_interval = 10 mail_full_filesystem_access
>>> = no
>>> mail_read_mmaped = no
>>> mmap_disable = yes
>>> mmap_no_write = no
>>> lock_method = dotlock
>>> maildir_stat_dirs = yes
>>> maildir_copy_with_hardlinks = no
>>>
>>> Our NFS mount:
>>>
>>> nfshost:/var/mail      /var/mail     nfs
>>> rw,hard,intr,rsize=8192,wsize=8192 0 0
>>>
>>> I've read some of the list archives about disabling the index files.
>>> We realize that indexes should be beneficial, but we'd like to at
>>> least experiment with a "no index" install.  This would solve some of
>>> the NFS locking issues, eliminate the need to modify our MTA
>>> behavior, and remove (potentially) millions of files from the system.
>>>
>>> Did anyone discover a method to cleanly disable the indexes? At least
>>> for POP3?
>>>
>>> Thanks!
>>>
>>> Rich Sandberg
>>> richs at whidbey.net
>>>
>>>
>>> On Apr 28, 2006, at 6:59 AM, Luis Melendez wrote:
>>>
>>>> Hi all.
>>>> We also had problems with  client NFS locking using RHEL4, kernel
>>>> 2.6.9
>>>> Some of them were solved updating the kelnel from RedHat.
>>>> Hope this helps.
>>>>
>>>> Regards.
>>>>
>>>> Apps Lists wrote:
>>>>
>>>>> I'm running beta7 on two machines, with maildir on NFS.  I have
>>>>> lockd
>>>>> running on all machines.  I've found that Dovecot is highly
>>>>> unstable with
>>>>> NFS when accessing a mailbox on more than one machine at the same
>>>>> time.
>>>>>
>>>>> Both dovecot machines have:
>>>>>
>>>>> mmap_disable = yes
>>>>> lock_method = fcntl
>>>>>
>>>>> NFS is version 3, exported from a third linux machine.  All
>>>>> machines are
>>>>> running 2.6.9 kernel.
>>>>>
>>>>> Any ideas what's going wrong here?
>>>>>
>>>>> Steve
>>>>>
>>>>>
>>>>> *** dev4 ***
>>>>>
>>>>> Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in
>>>>> uidlist file
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist:
>>>>> 1146230258.P2889Q0M407851.dev4.neonova.net:2,
>>>>>
>>>>> Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): Corrupted transaction
>>>>> log file
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension
>>>>> introduction for unknown id 0
>>>>>
>>>>> Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): file mail-index.c: line
>>>>> 883
>>>>> (mail_index_sync_from_transactions): assertion failed:
>>>>> (hdr.log_file_int_offset == (*map)->hdr.log_file_int_offset)
>>>>>
>>>>>
>>>>> Apr 28 09:17:41 dev4 dovecot: child 2889 (imap) killed with
>>>>> signal 6
>>>>>
>>>>>
>>>>> *** dev3 ***
>>>>>
>>>>>
>>>>>
>>>>> Apr 28 09:17:39 dev3 dovecot: IMAP(jtest): Duplicate file in
>>>>> uidlist file
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist:
>>>>> 1146230254.P2888Q0M651598.dev4.neonova.net
>>>>>
>>>>> Apr 28 09:17:41 dev3 dovecot: IMAP(jtest): Fixed index file
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index:
>>>>> first_recent_uid_lowwater 4036 -> 1772
>>>>>
>>>>> Apr 28 09:17:44 dev3 dovecot: IMAP(jtest):
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was
>>>>> lowered
>>>>> (4036 -> 1774)
>>>>>
>>>>> Apr 28 09:17:45 dev3 dovecot: IMAP(jtest): Duplicate file in
>>>>> uidlist file
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist:
>>>>> 1146145480.P5976Q0M245663.dev3.neonova.net:2,
>>>>>
>>>>> Apr 28 09:17:46 dev3 dovecot: IMAP(jtest): Fixed index file
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index:
>>>>> first_recent_uid_lowwater 4038 -> 1774
>>>>>
>>>>> Apr 28 09:17:47 dev3 dovecot: IMAP(jtest):
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was
>>>>> lowered
>>>>> (4038 -> 1775)
>>>>>
>>>>> Apr 28 09:17:48 dev3 dovecot: IMAP(jtest): Duplicate file in
>>>>> uidlist file
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist:
>>>>> 1146145480.P5976Q0M245663.dev3.neonova.net:2,
>>>>>
>>>>> Apr 28 09:17:49 dev3 dovecot: IMAP(jtest):
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was
>>>>> lowered
>>>>> (4039 -> 1775)
>>>>>
>>>>> Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): Corrupted transaction
>>>>> log file
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension
>>>>> introduction for unknown id 0
>>>>>
>>>>> Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): file mail-index.c: line
>>>>> 881
>>>>> (mail_index_sync_from_transactions): assertion failed:
>>>>> (hdr.messages_count
>>>>> == (*map)->hdr.messages_count)
>>>>>
>>>>> Apr 28 09:17:50 dev3 dovecot: IMAP(jtest):
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was
>>>>> lowered
>>>>> (4039 -> 1775)
>>>>>
>>>>> Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction
>>>>> log file
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with
>>>>> UID
>>>>> 1775, but next_uid = 4039
>>>>>
>>>>> Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction
>>>>> log file
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with
>>>>> UID
>>>>> 1775, but next_uid = 4039
>>>>>
>>>>> Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Unexpected
>>>>> transaction log
>>>>> desync with index /var/mailstore/72/af/375887/Maildir/dovecot.index
>>>>>
>>>>> Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Disconnected:
>>>>> Mailbox   is
>>>>> in
>>>>> inconsistent state, please relogin.
>>>>>
>>>>> Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Fixed index file
>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index:
>>>>> first_recent_uid_lowwater 4039 -> 1776
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> +----------------------------------------------
>>>> ^-----------------------+
>>>> | Luis Meléndez Aganzo                         ^  Email:
>>>> luism at uco.es  |
>>>> | Servicio de Informática                      ^  Tlf: 34-(9)
>>>> 57-211022 |
>>>> | Analista - Área de Sistemas                  ^  Fax: 34-(9)
>>>> 57-218116 |
>>>> | Universidad de Córdoba (SPAIN)               ^  http://
>>>> www.uco.es    |
>>>> +----------------------------------------------
>>>> ^-----------------------+





More information about the dovecot mailing list