[Dovecot] Rare problem with Indexes?
We've been running 1.1 rc3 now for about a month, and overall it's been very fast and stable.
There is however an Index-related problem that keeps cropping up which prevents a user from accessing their mailbox until their Indexes are manually removed.
Somehow, an Index becomes problematic, and Dovecot hangs forever when the mailbox is accessed. It manifests when a user logs in and issues a "SELECT INBOX", or "GETQUOTAROOT" when "maildirsize" needs rebuilding.
When Dovecot hangs, no data is returned and no further commands are accepted, and the daemon must be killed. We see no errors in the logs, but have yet to catch the problem with "mail_debug" enabled.
We're using NFS with Maildirs and these related options:

mail_location = maildir:/mail/%Lu/Maildir:INDEX=/mail/%Lu/Cache
mmap_disable = yes
mail_nfs_storage = yes
mail_nfs_index = yes
mail_drop_priv_before_exec = yes
mail_cache_min_mail_count = 128
mailbox_idle_check_interval = 5
maildir_stat_dirs = yes
maildir_copy_with_hardlinks = yes
maildir_copy_preserve_filename = no
Has anyone else run across the need to purge Indexes manually? Could this be unrecoverable Index corruption, or a locking issue?
We've made a copy of the mailbox when the problem occurs. However, when we test the copy later, Dovecot rebuilds the "dovecot.index" file itself just fine and functions normally.
We plan to enable "mail_debug" when this happens next to see if we can get more information.
Thanks!
-Rich
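
For reference, the manual workaround described above (removing a user's indexes so Dovecot rebuilds them) could look roughly like the sketch below. It assumes the INDEX=/mail/%Lu/Cache layout from the mail_location setting in this message. The function name, the MAIL_ROOT variable, and the ".broken.<timestamp>" suffix are hypothetical choices for illustration. The directory is moved rather than deleted so the problematic index files stay available for later inspection.

```shell
#!/bin/sh
# Sketch only: move a user's index directory aside so Dovecot rebuilds it
# on the next mailbox access. Assumes INDEX=/mail/%Lu/Cache as in the
# mail_location above; MAIL_ROOT and the suffix are illustrative.
MAIL_ROOT="${MAIL_ROOT:-/mail}"

move_indexes_aside() {
    # %Lu in mail_location is the lowercased username
    user=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    index_dir="$MAIL_ROOT/$user/Cache"
    if [ ! -d "$index_dir" ]; then
        echo "no index directory at $index_dir" >&2
        return 1
    fi
    # Keep the old indexes under a timestamped name instead of deleting them
    saved="$index_dir.broken.$(date +%Y%m%d%H%M%S)"
    mv "$index_dir" "$saved" && echo "$saved"
}
```

Calling move_indexes_aside with a username prints the path where the old indexes were saved. It is probably safest to run it while the user has no active session.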
On Wed, 2008-04-09 at 09:04 -0700, richs@whidbey.net wrote:
> We've been running 1.1 rc3 now for about a month, and overall it's been very fast and stable.
> There is however an Index-related problem that keeps cropping up which prevents a user from accessing their mailbox until their Indexes are manually removed.
> Somehow, an Index becomes problematic, and Dovecot hangs forever when the mailbox is accessed. It manifests when a user logs in and issues a "SELECT INBOX", or "GETQUOTAROOT" when "maildirsize" needs rebuilding.
> When Dovecot hangs, no data is returned and no further commands are accepted, and the daemon must be killed.

Where is it hanging? Does strace show anything? Could you get gdb backtrace? (gdb attach <pid>, bt full)

> We plan to "mail_debug" when this happens next to see if we can get more information.

mail_debug isn't really useful for debugging anything except configuration mistakes.
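
The strace and gdb steps suggested above can be prepared ahead of time so they are ready when the hang next occurs. The sketch below only prints the two commands for a given PID and does not run them. The function name and the /tmp output paths are hypothetical. "gdb -p <pid> -batch -ex 'bt full'" is a non-interactive way to attach and get the same "bt full" output.

```shell
#!/bin/sh
# Sketch only: print the commands for inspecting a hung imap process.
# The helper name and the /tmp output paths are illustrative.
hang_debug_commands() {
    pid="$1"
    # Accept only a numeric PID
    case "$pid" in
        ''|*[!0-9]*) echo "usage: hang_debug_commands <pid>" >&2; return 2 ;;
    esac
    # strace keeps running until interrupted; stop it with Ctrl-C after a few seconds
    echo "strace -p $pid -o /tmp/imap-$pid.strace"
    # Attach, dump a full backtrace with local variables, then detach
    echo "gdb -p $pid -batch -ex 'bt full' > /tmp/imap-$pid.bt 2>&1"
}
```

Both commands normally need to be run as root or as the user that owns the process. The backtrace is more readable if Dovecot was built with debugging symbols.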
participants (2)
- richs@whidbey.net
- Timo Sirainen