[Dovecot] Index cache file problems in Dovecot 1.2.0
I've been seeing lots of index cache file errors (using mbox on Solaris 8 sparc 64-bit, but with 32-bit Dovecot) since I switched my account to Dovecot 1.2.0 (we're still on 1.0.15 mostly, but I'm hoping to upgrade to 1.1.17+ this summer, or 1.2.x if it's stable enough).
e.g.
Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: field index too large (47 >= 25)
I've tried renaming/removing the .imap/INBOX directory, but the "Corrupted index cache file" errors return within a day or so.
This morning, Thunderbird got itself in a mess retrying something that caused multiple core dumps:
Jul 14 08:52:53 dovecot: Error: child 17694 (imap) killed with signal 10 (core dumped) Jul 14 08:52:54 dovecot: Error: child 17697 (imap) killed with signal 10 (core dumped) Jul 14 08:52:55 dovecot: Error: child 17700 (imap) killed with signal 10 (core dumped)
Here's a backtrace
#0 mail_cache_get_record (cache=0x124e28, offset=289, rec_r=0xffbeeb6c) at mail-cache-lookup.c:30 30 if (rec->size < sizeof(*rec)) { (gdb) bt #0 mail_cache_get_record (cache=0x124e28, offset=289, rec_r=0xffbeeb6c) at mail-cache-lookup.c:30 #1 0x84e2c in mail_cache_lookup_iter_next_record (ctx=0xffbeeb60) at mail-cache-lookup.c:186 #2 0x84f14 in mail_cache_lookup_iter_next (ctx=0xffbeeb60, field_r=0xffbeeb50) at mail-cache-lookup.c:219 #3 0x85650 in mail_cache_lookup_headers_real (view=0x128510, dest=0x129728, seq=11416, field_idxs=0xffbeec74, fields_count=2) at mail-cache-lookup.c:500 #4 0x85848 in mail_cache_lookup_headers (view=0x128510, dest=0x129728, seq=11416, field_idxs=0xffbeec74, fields_count=2) at mail-cache-lookup.c:555 #5 0x6ba94 in index_mail_get_raw_headers (mail=0x128f20, field=0xff142ee8 "Message-ID", value_r=0xffbeecfc) at index-mail-headers.c:614 #6 0x6c084 in index_mail_get_first_header (_mail=0x128f20, field=0xff142ee8 "Message-ID", decode_to_utf8=false, value_r=0xffbeede4) at index-mail-headers.c:771 #7 0x7893c in mail_get_first_header (mail=0x128f20, field=0xff142ee8 "Message-ID", value_r=0xffbeede4) at mail.c:118 #8 0xff1419a0 in mail_log_add_hdr (mail=0x128f20, str=0x10edc8, name=0xff142d68 "msgid", header=0xff142ee8 "Message-ID") at mail-log-plugin.c:277 #9 0xff141c5c in mail_log_action (dest_trans=0x128f20, mail=0x128f20, event=MAIL_LOG_EVENT_DELETE, data=0x0) at mail-log-plugin.c:318 #10 0xff141e5c in mail_log_mail_update_flags (_mail=0x128f20, modify_type=MODIFY_ADD, flags=12) at mail-log-plugin.c:376 #11 0x78af8 in mail_update_flags (mail=0x128f20, modify_type=MODIFY_ADD, flags=12) at mail.c:184 #12 0x2ab3c in cmd_store (cmd=0x11bec8) at cmd-store.c:192 #13 0x2b5f8 in cmd_uid (cmd=0x11bec8) at cmd-uid.c:27 #14 0x2ccc4 in client_command_input (cmd=0x11bec8) at client.c:611 #15 0x2cb68 in client_command_input (cmd=0x11bec8) at client.c:660 #16 0x2cd4c in client_handle_next_command (client=0x11bc40, remove_io_r=0xffbef23f) at client.c:701 #17 0x2ce40 in client_handle_input (client=0x11bc40) at client.c:713 #18 0x2cf78 in client_input (client=0x11bc40) at client.c:752 #19 0xc6c04 in io_loop_handler_run (ioloop=0x1186d8) at ioloop-poll.c:203 #20 0xc63f4 in io_loop_run (ioloop=0x1186d8) at ioloop.c:338 #21 0x369f4 in main (argc=1107968, argv=0xffbef484, envp=0xffbef494) at main.c:323
I didn't have these problems with 1.1.x. I guess I ought to get out the imaptest tool and see if I can reproduce this reliably ...
Best Wishes, Chris
-- --+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+- Christopher Wakelin, c.d.wakelin@reading.ac.uk IT Services Centre, The University of Reading, Tel: +44 (0)118 378 8439 Whiteknights, Reading, RG6 2AF, UK Fax: +44 (0)118 975 3094
On Tue, 2009-07-14 at 11:08 +0100, Chris Wakelin wrote:
Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: field index too large (47 >= 25)
Wonder if http://hg.dovecot.org/dovecot-1.2/rev/ae3e0ff64c94 fixes it?
Jul 14 08:52:53 dovecot: Error: child 17694 (imap) killed with signal 10 (core dumped) .. #0 mail_cache_get_record (cache=0x124e28, offset=289, rec_r=0xffbeeb6c) at mail-cache-lookup.c:30 30 if (rec->size < sizeof(*rec)) {
Timo Sirainen wrote:
On Tue, 2009-07-14 at 11:08 +0100, Chris Wakelin wrote:
Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: field index too large (47 >= 25)
Wonder if http://hg.dovecot.org/dovecot-1.2/rev/ae3e0ff64c94 fixes it?
I've not seen that one in my stress testing with imaptest on a recent dovecot-latest (1.2.2 plus patches, almost the same as the 1.2.3 release candidate) which I did in earlier versions, so good stuff :)
Jul 14 08:52:53 dovecot: Error: child 17694 (imap) killed with signal 10 (core dumped) .. #0 mail_cache_get_record (cache=0x124e28, offset=289, rec_r=0xffbeeb6c) at mail-cache-lookup.c:30 30 if (rec->size < sizeof(*rec)) {
Likewise.
I'm still getting errors like
Aug 04 18:01:22 IMAP 23842 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: used_file_size too large Aug 04 18:01:22 IMAP 23842 <username> 127.0.0.1 : Error: fcntl(write-lock) locking failed for file <path>/.imap/INBOX/dovecot.index.log: Deadlock situation detected/avoided Aug 04 18:01:22 IMAP 23842 <username> 127.0.0.1 : Error: mail_index_wait_lock_fd() failed with file <path>/.imap/INBOX/dovecot.index.log: Deadlock situation detected/avoided Aug 04 18:01:32 IMAP 23840 <username> 127.0.0.1 : Error: stat() failed with index cache file <path>/.imap/INBOX/dovecot.index.cache: No such file or directory
on Solaris 8 (64-bit Sparc, but 32-bit binary) and
Aug 05 16:42:46 IMAP 29142 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: used_file_size too large Aug 05 16:42:46 IMAP 29143 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: used_file_size too large Aug 05 16:43:16 IMAP 29148 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: used_file_size too large Aug 05 16:43:16 IMAP 29148 <username> 127.0.0.1 : Error: fcntl() failed with mbox file <path>/INBOX: Deadlock situation detected/avoided Aug 05 16:43:16 IMAP 29148 <username> 127.0.0.1 : Error: fcntl() failed with mbox file <path>/INBOX: Deadlock situation detected/avoided Aug 05 16:43:17 IMAP 29149 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: used_file_size too large Aug 05 16:43:17 IMAP 29147 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: used_file_size too large Aug 05 16:43:18 IMAP 29150 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: record points outside file
on Solaris 10 (64-bit Sparc, but 32-bit binary) but no errors on Ubuntu 8.04.3 x86_64. When not using concurrent logons, I think it was fine even in 1.2.0 (which made we wonder about a possible race condition?).
I've been using something like
./imaptest host=127.0.0.1 port=6143 user=<username> pass=<passwd> mbox=./dovecot-crlf clients=5 msgs=10000 box=INBOX copybox=test1407 seed=140709
for all the tests, resetting the INBOX (which is a ~90MB mbox file with 10,000+ messages) and removing the indexes and test1407 folders each time.
Is there a way to make Dovecot abort (or even assert crash for a backtrace) on these errors (which of course, it usually manages transparently to the user)?
dovecot -n output ("%J" is a local hash function):
# 1.2.2: <path>/etc/dovecot-test1.2.conf # OS: SunOS 5.10 sun4u ufs base_dir: <path>/var/run/dovecot-test1.2/ log_path: <path>/var/log/dovecot-test1.2.log info_log_path: <path>/var/log/dovecot-test1.2.log protocols: imap imaps pop3 pop3s listen(default): *:6143 listen(imap): *:6143 listen(pop3): *:6110 ssl_listen(default): *:6993 ssl_listen(imap): *:6993 ssl_listen(pop3): *:6995 ssl_cert_file: <path>/certs/<hostname>.rdg.ac.uk.crt ssl_key_file: <path>/certs/<hostname>.rdg.ac.uk.key disable_plaintext_auth: no shutdown_clients: no login_dir: <path>/var/run/dovecot-test1.2/login login_executable(default): <path>/libexec/dovecot/imap-login-test1.2 login_executable(imap): <path>/libexec/dovecot/imap-login-test1.2 login_executable(pop3): <path>/libexec/dovecot/pop3-login-test1.2 login_greeting(default): University of Reading IMAP test1.2 ready. login_greeting(imap): University of Reading IMAP test1.2 ready. login_greeting(pop3): University of Reading POP3 test1.2 ready. login_process_per_connection: no max_mail_processes: 4096 verbose_proctitle: yes mail_location: mbox:/export/folders/%Jn/%n/mail/:INBOX=/export/mail/%Jn/%n/INBOX:INDEX=/export/indexes/%Jn/%n mbox_very_dirty_syncs: yes mail_drop_priv_before_exec: yes mail_executable(default): <path>/libexec/dovecot/imap-test1.2 mail_executable(imap): <path>/libexec/dovecot/imap-test1.2 mail_executable(pop3): <path>/libexec/dovecot/pop3-test1.2 mail_plugins(default): zlib mail_plugins(imap): zlib mail_plugins(pop3): mail_plugin_dir(default): <path>/libexec/plugins-1.2 mail_plugin_dir(imap): <path>/libexec/plugins-1.2 mail_plugin_dir(pop3): <path>/lib/dovecot/pop3 mail_log_prefix: %Us %p %u %r : imap_capability(default): IMAP4 IMAP4rev1 IDLE LOGIN-REFERRALS MAILBOX-REFERRALS NAMESPACE LITERAL+ UIDPLUS CHILDREN imap_capability(imap): IMAP4 IMAP4rev1 IDLE LOGIN-REFERRALS MAILBOX-REFERRALS NAMESPACE LITERAL+ UIDPLUS CHILDREN imap_capability(pop3): imap_client_workarounds(default): outlook-idle delay-newmail imap_client_workarounds(imap): outlook-idle delay-newmail imap_client_workarounds(pop3): pop3_reuse_xuidl(default): no pop3_reuse_xuidl(imap): no pop3_reuse_xuidl(pop3): yes pop3_uidl_format(default): %08Xu%08Xv pop3_uidl_format(imap): %08Xu%08Xv pop3_uidl_format(pop3): %08Xv%08Xu pop3_client_workarounds(default): pop3_client_workarounds(imap): pop3_client_workarounds(pop3): outlook-no-nuls namespace: type: private separator: / inbox: yes list: yes subscriptions: yes namespace: type: private separator: / prefix: mail/ hidden: yes list: yes subscriptions: yes namespace: type: private separator: / prefix: ~/mail/ hidden: yes list: yes subscriptions: yes namespace: type: private separator: / prefix: ~%u/mail/ hidden: yes list: yes subscriptions: yes lda: postmaster_address: postmaster@example.com auth default: mechanisms: plain login executable: <path>/libexec/dovecot/dovecot-auth-test1.2 username_format: %Ln verbose: yes debug: yes passdb: driver: passwd-file args: <path>/etc/test_users passdb: driver: passwd-file args: /export/mailconfig/%Jn/%n/migrating deny: yes passdb: driver: passwd-file args: /export/mailconfig/%Jn/%n/proxy.%Ls passdb: driver: ldap args: <path>/etc/dovecot-ldap.conf userdb: driver: passwd-file args: <path>/etc/test_users userdb: driver: passwd-file args: <path>/etc/userdb
Best Wishes, Chris
-- --+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+- Christopher Wakelin, c.d.wakelin@reading.ac.uk IT Services Centre, The University of Reading, Tel: +44 (0)118 378 8439 Whiteknights, Reading, RG6 2AF, UK Fax: +44 (0)118 975 3094
Chris Wakelin wrote:
I'm still getting errors like
Aug 04 18:01:22 IMAP 23842 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: used_file_size too large Aug 04 18:01:22 IMAP 23842 <username> 127.0.0.1 : Error: fcntl(write-lock) locking failed for file <path>/.imap/INBOX/dovecot.index.log: Deadlock situation detected/avoided Aug 04 18:01:22 IMAP 23842 <username> 127.0.0.1 : Error: mail_index_wait_lock_fd() failed with file <path>/.imap/INBOX/dovecot.index.log: Deadlock situation detected/avoided Aug 04 18:01:32 IMAP 23840 <username> 127.0.0.1 : Error: stat() failed with index cache file <path>/.imap/INBOX/dovecot.index.cache: No such file or directory
on Solaris 8 (64-bit Sparc, but 32-bit binary) and
Aug 05 16:42:46 IMAP 29142 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: used_file_size too large Aug 05 16:42:46 IMAP 29143 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: used_file_size too large Aug 05 16:43:16 IMAP 29148 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: used_file_size too large Aug 05 16:43:16 IMAP 29148 <username> 127.0.0.1 : Error: fcntl() failed with mbox file <path>/INBOX: Deadlock situation detected/avoided Aug 05 16:43:16 IMAP 29148 <username> 127.0.0.1 : Error: fcntl() failed with mbox file <path>/INBOX: Deadlock situation detected/avoided Aug 05 16:43:17 IMAP 29149 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: used_file_size too large Aug 05 16:43:17 IMAP 29147 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: used_file_size too large Aug 05 16:43:18 IMAP 29150 <username> 127.0.0.1 : Error: Corrupted index cache file <path>/.imap/INBOX/dovecot.index.cache: record points outside file
on Solaris 10 (64-bit Sparc, but 32-bit binary) but no errors on Ubuntu 8.04.3 x86_64. When not using concurrent logons, I think it was fine even in 1.2.0 (which made we wonder about a possible race condition?).
OK, I've tried various combinations of Solaris 8 and 10, both 32 and 64 bit binaries, both local mailboxes and over NFS (to the same NetApp filer volume, with indexes always local), and even compiled up GCC 4.x (4.3.4) on Solaris 10 to see why I get errors on Solaris/Sparc but not Linux/Intel 64-bit.
Wild speculations:
- Is it an endian problem?
- Is it a speed problem (race condition?); the Solaris machines are generally slower.
I'm not sure whether "record points outside file" has occured in 64-bit, but "used_file_size too large" occurs in all Solaris tests.
I've yet to try with a real client, but I had problems with searches taking a very long time in Thunderbird and version 1.2.0 (presumably while it rebuilt a "corrupt index") and once a very bad corruption of the headers cached by Thunderbird (I had to delete the Dovecot indexes and rebuild Thunderbird's index).
Best Wishes, Chris
-- --+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+- Christopher Wakelin, c.d.wakelin@reading.ac.uk IT Services Centre, The University of Reading, Tel: +44 (0)118 378 8439 Whiteknights, Reading, RG6 2AF, UK Fax: +44 (0)118 975 3094
participants (2)
-
Chris Wakelin
-
Timo Sirainen