[Dovecot] Dovecot 1.2.9 Crash, NFS
Hi,
I'm stuck with a problem we have with dovecot. My suspicion is, that it has to do with accessing the same mailbox/mail stored on a NFS-share from two machines at the same time.
setup
We have to mail servers running, both run a Ubuntu 10.04, Postfix 2.70 and Dovecot 1.2.9. The mailboxes are stored in maildir format on a NFS-Share. In front of those to mail servers we have a load balancer. Unfortunately it can't be set up to use the same server for each domain, but it uses the same server for the same source-ip for at least 1 hour.
Here is the output of dovecot -n: # 1.2.9: /etc/dovecot/dovecot.conf # OS: Linux 2.6.32-33-server x86_64 Ubuntu 10.04.3 LTS nfs log_timestamp: %Y-%m-%d %H:%M:%S protocols: imap imaps pop3 pop3s ssl_ca_file: /etc/ssl/ca-bundle/SSL123_CA_Bundle.pem ssl_cert_file: /etc/ssl/mail.newmedia.ch/mail.newmedia.ch.crt ssl_key_file: /etc/ssl/mail.newmedia.ch/mail.newmedia.ch.key ssl_verify_client_cert: yes disable_plaintext_auth: no login_dir: /var/run/dovecot/login login_executable(default): /usr/lib/dovecot/imap-login login_executable(imap): /usr/lib/dovecot/imap-login login_executable(pop3): /usr/lib/dovecot/pop3-login mail_max_userip_connections: 25 mail_privileged_group: mail mail_location: maildir:/data/vmail/%d/%n:INDEX=/data/vmail/%d/%n/indexes mmap_disable: yes dotlock_use_excl: no mail_nfs_storage: yes mail_nfs_index: yes mbox_write_locks: fcntl dotlock mail_executable(default): /usr/lib/dovecot/imap mail_executable(imap): /usr/lib/dovecot/imap mail_executable(pop3): /usr/lib/dovecot/pop3 mail_plugins(default): quota imap_quota mail_plugins(imap): quota imap_quota mail_plugins(pop3): mail_plugin_dir(default): /usr/lib/dovecot/modules/imap mail_plugin_dir(imap): /usr/lib/dovecot/modules/imap mail_plugin_dir(pop3): /usr/lib/dovecot/modules/pop3 imap_client_workarounds(default): outlook-idle delay-newmail imap_client_workarounds(imap): outlook-idle delay-newmail imap_client_workarounds(pop3): auth default: passdb: driver: sql args: /etc/dovecot/dovecot-mysql.conf userdb: driver: passwd userdb: driver: sql args: /etc/dovecot/dovecot-mysql.conf plugin: quota: maildir:storage=409600 sieve_global_path: /data/vmail/globalsieverc dict: quotadict: mysql:/etc/dovecot-dict-quota.conf
problem
the problem happens with a client's mailbox that is used by multiple users.
From time to time he cannot see any Emails in the mailbox, neither with his mail clients (Apple Mail) nor with in the webmail (Roundcube).
Around this time I get the following entries in the log files:
Mar 6 08:26:31 mail02 dovecot: IMAP(user@example.com): fdatasync(/data/vmail/example.com/user/dovecot-uidlist) failed: Input/output error Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): Maildir /data/vmail/example.com/user: Expunged message reappeared, giving a new UID (old uid=1522, file=1326961561.V15I4d8562M567017.mail02:2,Sad) Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): Maildir /data/vmail/example.com/user: Expunged message reappeared, giving a new UID (old uid=1523, file=1326705103.V15I90105M613353.mail01:2,Sad) Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): /data/vmail/example.com/user/dovecot-uidlist: Duplicate file entry at line 4: 1326961561.V15I4d8562M567017.mail02:2,Sad (uid 1522 -> 1598) Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): /data/vmail/example.com/user/dovecot-uidlist: Duplicate file entry at line 5: 1326705103.V15I90105M613353.mail01:2,Sad (uid 1523 -> 1599) Mar 6 08:42:30 mail02 dovecot: IMAP(user@example.com): /data/vmail/example.com/user/dovecot-uidlist: Duplicate file entry at line 4: 1326961561.V15I4d8562M567017.mail02:2,Sad (uid 1522 -> 1598) Mar 6 08:42:30 mail02 dovecot: IMAP(user@example.com): /data/vmail/example.com/user/dovecot-uidlist: Duplicate file entry at line 5: 1326705103.V15I90105M613353.mail01:2,Sad (uid 1523 -> 1599) Mar 6 08:42:30 mail02 dovecot: IMAP(user@example.com): /data/vmail/example.com/user/dovecot-uidlist: Duplicate file entry at line 4: 1326961561.V15I4d8562M567017.mail02:2,Sad (uid 1522 -> 1598) Mar 6 08:42:30 mail02 dovecot: IMAP(user@example.com): /data/vmail/example.com/user/dovecot-uidlist: Duplicate file entry at line 5: 1326705103.V15I90105M613353.mail01:2,Sad (uid 1523 -> 1599) Mar 6 08:42:31 mail02 dovecot: IMAP(user@example.com): Maildir /data/vmail/example.com/user: Expunged message reappeared, giving a new UID (old uid=1524, file=1327500903.V15I5722c8M210039.mail01:2,Se) Mar 6 08:42:31 mail02 dovecot: IMAP(user@example.com): /data/vmail/example.com/user/dovecot-uidlist: Duplicate file entry at line 6: 1327500903.V15I5722c8M210039.mail01:2,Se (uid 1524 -> 1600) Mar 6 08:42:31 mail02 dovecot: IMAP(user@example.com): Panic: file maildir-uidlist.c: line 403 (maildir_uidlist_records_array_delete): assertion failed: (pos != NULL) Mar 6 08:42:31 mail02 dovecot: IMAP(user@example.com): Raw backtrace: imap(+0xaeb5a) [0x7f37602b8b5a] -> imap(+0xaebc7) [0x7f37602b8bc7] -> imap(+0xae238) [0x7f37602b8238] -> imap(+0x497d7) [0x7f37602537d7] -> imap(maildir_uidlist_refresh+0x6f2) [0x7f37602545c2] -> imap(+0x4bb06) [0x7f3760255b06] -> imap(maildir_uidlist_sync_init+0x4d) [0x7f376025652d] -> imap(+0x46ed4) [0x7f3760250ed4] -> imap(maildir_storage_sync_init+0x147) [0x7f3760251557] -> imap(imap_sync_init+0x70) [0x7f376023b190] -> imap(+0x2411e) [0x7f376022e11e] -> imap(io_loop_handle_timeouts+0xcc) [0x7f37602c069c] -> imap(io_loop_handler_run+0x60) [0x7f37602c1000] -> imap(io_loop_run+0x18) [0x7f37602c0448] -> imap(main+0x58e) [0x7f376023bc5e] -> /lib/libc.so.6(__libc_start_main+0xfd) [0x7f375f877c4d] -> imap(+0x21979) [0x7f376022b979] Mar 6 08:42:31 mail02 dovecot: dovecot: child 16934 (imap) killed with signal 6 (core dumps disabled) Mar 6 08:42:31 mail02 dovecot: IMAP(user@example.com): /data/vmail/example.com/user/dovecot-uidlist: Duplicate file entry at line 6: 1327500903.V15I5722c8M210039.mail01:2,Se (uid 1524 -> 1600) Mar 6 08:42:31 mail02 dovecot: IMAP(user@example.com): Panic: file maildir-uidlist.c: line 403 (maildir_uidlist_records_array_delete): assertion failed: (pos != NULL) Mar 6 08:42:31 mail02 dovecot: IMAP(user@example.com): Raw backtrace: imap(+0xaeb5a) [0x7ff81b415b5a] -> imap(+0xaebc7) [0x7ff81b415bc7] -> imap(+0xae238) [0x7ff81b415238] -> imap(+0x497d7) [0x7ff81b3b07d7] -> imap(maildir_uidlist_refresh+0x6f2) [0x7ff81b3b15c2] -> imap(maildir_uidlist_sync_init+0x105) [0x7ff81b3b35e5] -> imap(+0x46ed4) [0x7ff81b3aded4] -> imap(maildir_storage_sync_force+0x52) [0x7ff81b3ae392] -> imap(maildir_file_do+0x99) [0x7ff81b3b3cb9] -> imap(+0x4d944) [0x7ff81b3b4944] -> imap(index_mail_set_seq+0x148) [0x7ff81b3c8ed8] -> imap(index_storage_search_next_nonblock+0x162) [0x7ff81b3cd622] -> imap(mailbox_search_next_nonblock+0x20) [0x7ff81b3db2c0] -> imap(mailbox_search_next+0x26) [0x7ff81b3db316] -> imap(imap_fetch_more+0x2bf) [0x7ff81b39295f] -> imap(cmd_fetch+0x36c) [0x7ff81b38a9ec] -> imap(+0x28fad) [0x7ff81b38ffad] -> imap(+0x2908d) [0x7ff81b39008d] -> imap(client_handle_input+0x135) [0x7ff81b3902c5] -> imap(client_input+0x5f) [0x7ff81b390baf] -> imap(io_loop_handler_run+0xbd) [0x7ff81b41e05d] -> imap(io_loop_run+0x18) [0x7ff81b41d448] -> Mar 6 08:42:31 mail02 dovecot: IMAP(user@example.com): imap(main+0x58e) [0x7ff81b398c5e] -> /lib/libc.so.6(__libc_start_main+0xfd) [0x7ff81a9d4c4d] -> imap(+0x21979) [0x7ff81b388979] Mar 6 08:42:31 mail02 dovecot: dovecot: child 13712 (imap) killed with signal 6 (core dumps disabled)
Mar 6 08:42:31 mail01 dovecot: IMAP(user@example.com): /data/vmail/example.com/user/dovecot-uidlist: Duplicate file entry at line 4: 1326961561.V15I4d8562M567017.mail02:2,Sad (uid 1522 -> 1598) Mar 6 08:42:31 mail01 dovecot: IMAP(user@example.com): Panic: file maildir-uidlist.c: line 403 (maildir_uidlist_records_array_delete): assertion failed: (pos != NULL) Mar 6 08:42:31 mail01 dovecot: IMAP(user@example.com): Raw backtrace: imap(+0xaeb5a) [0x7f4a311fcb5a] -> imap(+0xaebc7) [0x7f4a311fcbc7] -> imap(+0xae238) [0x7f4a311fc238] -> imap(+0x497d7) [0x7f4a311977d7] -> imap(maildir_uidlist_refresh+0x6f2) [0x7f4a311985c2] -> imap(+0x47023) [0x7f4a31195023] -> imap(maildir_storage_sync_init+0x147) [0x7f4a31195557] -> imap(imap_sync_init+0x70) [0x7f4a3117f190] -> imap(+0x2411e) [0x7f4a3117211e] -> imap(+0x64c0e) [0x7f4a311b2c0e] -> imap(io_loop_handle_timeouts+0xcc) [0x7f4a3120469c] -> imap(io_loop_handler_run+0x60) [0x7f4a31205000] -> imap(io_loop_run+0x18) [0x7f4a31204448] -> imap(main+0x58e) [0x7f4a3117fc5e] -> /lib/libc.so.6(__libc_start_main+0xfd) [0x7f4a307bbc4d] -> imap(+0x21979) [0x7f4a3116f979] Mar 6 08:42:31 mail01 dovecot: dovecot: child 24257 (imap) killed with signal 6 (core dumps disabled)
Note: the first part is on a mail server, while the smaller second part is on the other one.
Unfortunately I'm not able to reproduce this error.
My suspicion/speculation what happens is the following: Multiple users are accessing the Mailbox from their offices (all on the same server), one (or more) uses the Webmail or accesses the Mailbox from a different IP. Somehow this leads to problems with Locks on NFS, which leads to the crash.
I have no idea how to solve this problem and any help is greatly appreciated.
If you need further information, please say so. Mit freundlichen Grüssen Lukas Müller Systems Engineer
NEWMEDIA Südostschweiz Newmedia AG Kasernenstrasse 1 Postfach 508, CH-7007 Chur http://www.newmedia.chhttp://www.newmedia.ch/
TYPO3 & Drupal - Wir wissen wie. Ihre professionelle Web Agentur in Chur, Ilanz, Glarus und Zürich.
On 21.3.2012, at 17.45, Müller Lukas wrote:
Mar 6 08:26:31 mail02 dovecot: IMAP(user@example.com): fdatasync(/data/vmail/example.com/user/dovecot-uidlist) failed: Input/output error Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): Maildir /data/vmail/example.com/user: Expunged message reappeared, giving a new UID (old uid=1522, file=1326961561.V15I4d8562M567017.mail02:2,Sad) Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): Maildir /data/vmail/example.com/user: Expunged message reappeared, giving a new UID (old uid=1523, file=1326705103.V15I90105M613353.mail01:2,Sad) Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): /data/vmail/example.com/user/dovecot-uidlist: Duplicate file entry at line 4: 1326961561.V15I4d8562M567017.mail02:2,Sad (uid 1522 -> 1598) ..
My suspicion/speculation what happens is the following: Multiple users are accessing the Mailbox from their offices (all on the same server), one (or more) uses the Webmail or accesses the Mailbox from a different IP. Somehow this leads to problems with Locks on NFS, which leads to the crash.
Yes, most likely this is what's happening. Although your errors are more severe than what normally happens. I guess your NFS server is also partially to blame (microsecond resolution timestamps are at least helpful).
I have no idea how to solve this problem and any help is greatly appreciated.
The only way to fully fix this is: http://wiki2.dovecot.org/Director
Thanks for the quick answer.
I realised, that the error didn't occur since quite a while, opposed to what our client suggested. Back then I activated the two workarounds (imap_client_workarounds = outlook-idle delay-newmail) and increased mail_max_userip_connections for IMAP.
Is it possible that those could have improved the situation?
For now I don't have a way of reproducing the problem, so I will have to wait for an error to happen. Until then I will consider the problem as "Solved until happens again ;-) ", since the last error occured a while back (as mentioned above).
Mar 6 08:26:31 mail02 dovecot: IMAP(user@example.com): fdatasync(/data/vmail/example.com/user/dovecot-uidlist) failed: Input/output error Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): Maildir /data/vmail/example.com/user: Expunged message reappeared, giving a new UID (old uid=1522, file=1326961561.V15I4d8562M567017.mail02:2,Sad) Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): Maildir /data/vmail/example.com/user: Expunged message reappeared, giving a new UID (old uid=1523, file=1326705103.V15I90105M613353.mail01:2,Sad) Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): /data/vmail/example.com/user/dovecot-uidlist: Duplicate file entry at line 4: 1326961561.V15I4d8562M567017.mail02:2,Sad (uid 1522 -> 1598) ..
My suspicion/speculation what happens is the following: Multiple users are accessing the Mailbox from their offices (all on the same server), one (or more) uses the Webmail or accesses the Mailbox from a different IP. Somehow this leads to problems with Locks on NFS, which leads to the crash.
Yes, most likely this is what's happening. Although your errors are more severe than what normally happens. I guess your NFS server is also partially to blame (microsecond resolution timestamps are at least helpful).
I had a quick look a tour NFS (NetApp), but didn't find anything useful. In case the problem persists, I will check with the coworker responsible for NetApp.
I will check what sort of locking is used by postfix, since I'm not sure if postfix and dovecot are configured to use the same mechanisms. If not, I think it could be part of the Problem.
I have no idea how to solve this problem and any help is greatly appreciated.
The only way to fully fix this is: http://wiki2.dovecot.org/Director
Unfortunately that is not an option right now, but I will keep it in mind.
Thanks again.
On 26.3.2012, at 18.25, Müller Lukas wrote:
Thanks for the quick answer.
I realised, that the error didn't occur since quite a while, opposed to what our client suggested. Back then I activated the two workarounds (imap_client_workarounds = outlook-idle delay-newmail) and increased mail_max_userip_connections for IMAP.
Is it possible that those could have improved the situation?
Unlikely.
Mar 6 08:26:31 mail02 dovecot: IMAP(user@example.com): fdatasync(/data/vmail/example.com/user/dovecot-uidlist) failed: Input/output error Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): Maildir /data/vmail/example.com/user: Expunged message reappeared, giving a new UID (old uid=1522, file=1326961561.V15I4d8562M567017.mail02:2,Sad) Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): Maildir /data/vmail/example.com/user: Expunged message reappeared, giving a new UID (old uid=1523, file=1326705103.V15I90105M613353.mail01:2,Sad) Mar 6 08:42:29 mail02 dovecot: IMAP(user@example.com): /data/vmail/example.com/user/dovecot-uidlist: Duplicate file entry at line 4: 1326961561.V15I4d8562M567017.mail02:2,Sad (uid 1522 -> 1598) ..
My suspicion/speculation what happens is the following: Multiple users are accessing the Mailbox from their offices (all on the same server), one (or more) uses the Webmail or accesses the Mailbox from a different IP. Somehow this leads to problems with Locks on NFS, which leads to the crash.
Yes, most likely this is what's happening. Although your errors are more severe than what normally happens. I guess your NFS server is also partially to blame (microsecond resolution timestamps are at least helpful).
I had a quick look a tour NFS (NetApp), but didn't find anything useful. In case the problem persists, I will check with the coworker responsible for NetApp.
I guess you just had very bad luck. NetApp has a good NFS implementation and normally works fine with Dovecot.
participants (2)
-
Müller Lukas
-
Timo Sirainen