On 21.07.2017 20:47, Bruce Guenter wrote:
I am running Dovecot IMAP on Linux, on a LizardFS storage cluster with Maildir storage. This has worked well for most of the accounts for several months.
However, in the last couple of weeks we have been seeing an increasing number of errors regarding corrupted index files. Some of the affected accounts are unable to retrieve messages due to timeouts.
It appeared the problems were due to the accounts being accessed from multiple servers simultaneously, so I forced them all to access one server, but the errors remained. It looks like it has something to do with file locking, but LizardFS supports advisory file locking and I do have it enabled.
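One way to check the locking claim is a small test run inside a directory on the LizardFS mount. The following is only a sketch: it assumes Python 3 is available on the mail servers, and the helper name `lock_is_enforced` is made up for illustration. It uses fcntl locks, which is also Dovecot's default lock_method. Note that it only checks two processes on the same host; checking locking between servers would mean holding the lock on one machine and running the child part on another.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Code run in a second process: try to take the same lock without waiting.
CHILD_SOURCE = """
import fcntl, sys
with open(sys.argv[1], "r+b") as handle:
    try:
        fcntl.lockf(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        print("acquired")
    except OSError:
        print("blocked")
"""


def lock_is_enforced(directory):
    """Return True if a second process is refused a lock this process holds.

    Hypothetical helper: creates a scratch file in `directory`, takes an
    exclusive fcntl lock on it, and asks a child process to lock it too.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "r+b") as handle:
            fcntl.lockf(handle, fcntl.LOCK_EX)
            result = subprocess.run(
                [sys.executable, "-c", CHILD_SOURCE, path],
                capture_output=True,
                text=True,
                timeout=30,
                check=True,
            )
            return result.stdout.strip() == "blocked"
    finally:
        os.unlink(path)
```

Calling `lock_is_enforced("/path/on/the/lizardfs/mount")` (the path is a placeholder) should return True if the filesystem honours advisory locks between processes on that host.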
Deleting the corrupted indexes fixes the problem for a while, but it eventually returns, particularly for some accounts.
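For reference, the index cleanup described above can be sketched as follows. This is an illustration under assumptions, not the exact procedure used here: the function names are hypothetical, and it assumes the index files live inside the Maildir itself (as the paths in the errors suggest). It only touches files whose names start with dovecot.index, which Dovecot rebuilds when they are missing, and leaves dovecot-uidlist alone. It defaults to a dry run that only lists the files.

```python
from pathlib import Path


def find_index_files(maildir_root):
    """List dovecot.index, dovecot.index.log*, dovecot.index.cache, etc.

    Searches the Maildir and all of its folders (including dot-folders
    such as .Drafts). dovecot-uidlist does not match the pattern.
    """
    root = Path(maildir_root)
    return sorted(p for p in root.rglob("dovecot.index*") if p.is_file())


def remove_index_files(maildir_root, dry_run=True):
    """Return the index files found; delete them only if dry_run is False."""
    targets = find_index_files(maildir_root)
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

As a general precaution, it is best run while the affected user has no active IMAP sessions, so that no process is holding the old files open.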
Here are some errors I'm seeing (just a random grab). Actual home directories are munged for confidentiality.
imap[25157]: (clientes.standby) Error: Failed to fix view for HOME/clientes:standby/dovecot.index: Missing middle file seq=1 (between 1..1, we have seqs 8): File is already open
imap[5565]: (stadiumchair) Error: Transaction log file HOME/stadiumchair/.Drafts/dovecot.index.log: marked corrupted
imap[5005]: (stadiumchair) Error: Corrupted transaction log file HOME/stadiumchair/.Drafts/dovecot.index.log seq 2: indexid changed 1418941056 -> 1500658549 (sync_offset=0)
imap[20243]: (martha) Error: Transaction log HOME/martha/dovecot.index.log: duplicate transaction log sequence (539)
imap[4665]: (emsspam) Error: Index file HOME/emsspam/dovecot.index: indexid changed: 1500658479 -> 1297175382
imap[4665]: (emsspam) Error: Corrupted transaction log file HOME/emsspam/dovecot.index.log seq 3: indexid changed: 1500658479 -> 1297175382 (sync_offset=316)
imap[22985]: (emsspam) Error: Corrupted transaction log file HOME/emsspam/dovecot.index.log seq 10742: Invalid transaction log size (9296 vs 9296): HOME/emsspam/dovecot.index.log (sync_offset=9296)
imap[3267]: (emsspam) Error: Failed to map view for HOME/emsspam/dovecot.index: Failed to map file seq=10742 offset=9052..18446744073709551615 (ret=0): corrupted, indexid=0
imap[3267]: (emsspam) Error: HOME/emsspam/dovecot.index view is inconsistent: uid=3062271 inserted in the middle of mailbox
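To see which accounts are hit most often, errors in this format can be tallied per user. The snippet below is a rough sketch: it assumes the log prefix "%s[%p]: (%u) " from the configuration, and the function name `errors_per_user` is invented for the example. It matches entries by pattern rather than by line, so it also copes with log text where entries have run together.

```python
import re
from collections import Counter

# Matches the start of one entry, e.g. "imap[4665]: (emsspam) Error: ".
ENTRY_START = re.compile(r"imap\[(?P<pid>\d+)\]: \((?P<user>[^)]*)\) Error: ")


def errors_per_user(log_text):
    """Count error entries per user name in Dovecot imap log text."""
    return Counter(match.group("user") for match in ENTRY_START.finditer(log_text))
```

Feeding it the whole log and calling `.most_common()` on the result gives the accounts sorted by number of errors.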
The output of dovecot -n is pasted in below. Note that some of the boxes are running kernel 4.9 and some 4.4; all have the same problems. Also note that I am using a custom authentication front end for our virtual mailboxes, but it just sets up the minimal environment variables and runs imap.
Is there anything I can change to eliminate these problems? Are there any other diagnostics I can provide to shed light on this?
# 2.2.31 (65cde28): /etc/dovecot/dovecot.conf
# OS: Linux 4.4.66 x86_64 Gentoo Base System release 2.3
log_path = /dev/stderr
mail_debug = yes
mail_fsync = always
mail_location = maildir:~/.maildir
mail_log_prefix = "%s[%p]: (%u) "
mmap_disable = yes
namespace inbox {
  inbox = yes
  location =
  mailbox Drafts {
    special_use = \Drafts
  }
  mailbox Junk {
    special_use = \Junk
  }
  mailbox Sent {
    special_use = \Sent
  }
  mailbox "Sent Messages" {
    special_use = \Sent
  }
  mailbox Trash {
    special_use = \Trash
  }
  prefix = INBOX
  separator =
  type = private
}
passdb {
  args = *
  driver = pam
}
passdb {
  args = /etc/dovecot/dovecot-sql.conf.ext
  driver = sql
}
plugin {
  mail_log_events = delete undelete expunge copy mailbox_delete mailbox_rename
}
ssl_cert = </etc/ssl/dovecot/server.pem
ssl_key = # hidden, use -P to show it
userdb {
  driver = passwd
}
userdb {
  args = /etc/dovecot/dovecot-sql.conf.ext
  driver = sql
}
Do you have users accessing the files concurrently from more than one dovecot instance at a time?
Aki