[Dovecot] Maildir + NFS + multiple machines = spectacular failure

richs at whidbey.net richs at whidbey.net
Fri Apr 28 18:51:09 EEST 2006


We're using 2.6.9-34, the latest RHEL4 kernel, over NFSv3 with 3  
Dovecot servers. Our most concerning episode was a failed APPEND  
operation (with a set of folders containing 200 messages):

IMAP(username): Corrupted transaction log file /var/mail/username/ 
Maildir/dovecot.index.log: unexpected end of file while reading header

That resulted in one of the new APPENDed mail folders and content  
being lost.  We see those errors at other times too, although they  
don't always end in a noticeable failure.

Continuous other errors we see that might be NFS related:

IMAP(username): close() failed with index cache file /var/mail/ 
username/Maildir/.Deleted Messages/dovecot.index.cache: Stale NFS  
file handle
IMAP(username): Corrupted index cache file /var/mail/username/ 
Maildir/.OldMail/dovecot.index.cache: invalid record size
IMAP(username): close() failed with index file /var/mail/username/ 
Maildir/.INBOX.Security/dovecot.index: Bad file descriptor

We also see errors with POP3, which would rarely benefit from indexes  
(since 92% of our users RETR-and-DELE messages):

POP3(username): file_dotlock_create() failed with file /var/mail/ 
username/Maildir/dovecot.index.log: Stale NFS file handle
POP3(username): Corrupted transaction log file /var/mail/username/ 
Maildir/dovecot.index.log: record size too small (type=0x8080,  
offset=9056, size=0)
POP3(username): Our dotlock file /var/mail/username/Maildir/ 
dovecot.index.log.lock was deleted (kept it -118 secs)
POP3(username): Timeout while waiting for release of dotlock for  
transaction log file /var/mail/username/Maildir/dovecot.index.log

We've tried manually removing "dovecot*" from all folders, allowing  
the indexes to be re-built, but the errors return.

Excerpt from our config:

mail_cache_fields =
mail_never_cache_fields =
mail_cache_min_mail_count = 1000
mailbox_idle_check_interval = 10
mail_full_filesystem_access = no
mail_read_mmaped = no
mmap_disable = yes
mmap_no_write = no
lock_method = dotlock
maildir_stat_dirs = yes
maildir_copy_with_hardlinks = no

Our NFS mount:

nfshost:/var/mail      /var/mail     nfs      
rw,hard,intr,rsize=8192,wsize=8192 0 0

I've read some of the list archives about disabling the index files.   
We realize that indexes should be beneficial, but we'd like to at  
least experiment with a "no index" install.  This would solve some of  
the NFS locking issues, eliminate the need to modify our MTA  
behavior, and remove (potentially) millions of files from the system.

Did anyone discover a method to cleanly disable the indexes? At least  
for POP3?

Thanks!

Rich Sandberg
richs at whidbey.net


On Apr 28, 2006, at 6:59 AM, Luis Melendez wrote:

> Hi all.
> We also had problems with  client NFS locking using RHEL4, kernel  
> 2.6.9
> Some of them were solved updating the kelnel from RedHat.
> Hope this helps.
>
> Regards.
>
> Apps Lists wrote:
>
>> I'm running beta7 on two machines, with maildir on NFS.  I have lockd
>> running on all machines.  I've found that Dovecot is highly  
>> unstable with
>> NFS when accessing a mailbox on more than one machine at the same  
>> time.
>>
>> Both dovecot machines have:
>>
>> mmap_disable = yes
>> lock_method = fcntl
>>
>> NFS is version 3, exported from a third linux machine.  All  
>> machines are
>> running 2.6.9 kernel.
>>
>> Any ideas what's going wrong here?
>>
>> Steve
>>
>>
>> *** dev4 ***
>>
>> Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in  
>> uidlist file
>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist:
>> 1146230258.P2889Q0M407851.dev4.neonova.net:2,
>>
>> Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): Corrupted transaction  
>> log file
>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension
>> introduction for unknown id 0
>>
>> Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): file mail-index.c: line  
>> 883
>> (mail_index_sync_from_transactions): assertion failed:
>> (hdr.log_file_int_offset == (*map)->hdr.log_file_int_offset)
>>
>>
>> Apr 28 09:17:41 dev4 dovecot: child 2889 (imap) killed with signal 6
>>
>>
>> *** dev3 ***
>>
>>
>>
>> Apr 28 09:17:39 dev3 dovecot: IMAP(jtest): Duplicate file in  
>> uidlist file
>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist:
>> 1146230254.P2888Q0M651598.dev4.neonova.net
>>
>> Apr 28 09:17:41 dev3 dovecot: IMAP(jtest): Fixed index file
>> /var/mailstore/72/af/375887/Maildir/dovecot.index:
>> first_recent_uid_lowwater 4036 -> 1772
>>
>> Apr 28 09:17:44 dev3 dovecot: IMAP(jtest):
>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was  
>> lowered
>> (4036 -> 1774)
>>
>> Apr 28 09:17:45 dev3 dovecot: IMAP(jtest): Duplicate file in  
>> uidlist file
>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist:
>> 1146145480.P5976Q0M245663.dev3.neonova.net:2,
>>
>> Apr 28 09:17:46 dev3 dovecot: IMAP(jtest): Fixed index file
>> /var/mailstore/72/af/375887/Maildir/dovecot.index:
>> first_recent_uid_lowwater 4038 -> 1774
>>
>> Apr 28 09:17:47 dev3 dovecot: IMAP(jtest):
>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was  
>> lowered
>> (4038 -> 1775)
>>
>> Apr 28 09:17:48 dev3 dovecot: IMAP(jtest): Duplicate file in  
>> uidlist file
>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist:
>> 1146145480.P5976Q0M245663.dev3.neonova.net:2,
>>
>> Apr 28 09:17:49 dev3 dovecot: IMAP(jtest):
>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was  
>> lowered
>> (4039 -> 1775)
>>
>> Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): Corrupted transaction  
>> log file
>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension
>> introduction for unknown id 0
>>
>> Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): file mail-index.c: line  
>> 881
>> (mail_index_sync_from_transactions): assertion failed:  
>> (hdr.messages_count
>> == (*map)->hdr.messages_count)
>>
>> Apr 28 09:17:50 dev3 dovecot: IMAP(jtest):
>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was  
>> lowered
>> (4039 -> 1775)
>>
>> Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction  
>> log file
>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with  
>> UID
>> 1775, but next_uid = 4039
>>
>> Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction  
>> log file
>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with  
>> UID
>> 1775, but next_uid = 4039
>>
>> Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Unexpected transaction log
>> desync with index /var/mailstore/72/af/375887/Maildir/dovecot.index
>>
>> Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Disconnected: Mailbox  
>> is in
>> inconsistent state, please relogin.
>>
>> Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Fixed index file
>> /var/mailstore/72/af/375887/Maildir/dovecot.index:
>> first_recent_uid_lowwater 4039 -> 1776
>>
>>
>>
>>
>>
>>
>>
>
>
> -- 
> +---------------------------------------------- 
> ^-----------------------+
> | Luis Meléndez Aganzo                         ^  Email:  
> luism at uco.es  |
> | Servicio de Informática                      ^  Tlf: 34-(9) 
> 57-211022 |
> | Analista - Área de Sistemas                  ^  Fax: 34-(9) 
> 57-218116 |
> | Universidad de Córdoba (SPAIN)               ^  http:// 
> www.uco.es    |
> +---------------------------------------------- 
> ^-----------------------+
>



More information about the dovecot mailing list