On 21.07.2017 at 19:47, Bruce Guenter wrote:
I am running Dovecot IMAP on Linux, on a LizardFS storage cluster with Maildir storage. This has worked well for most of the accounts for several months.
However in the last couple of weeks we are seeing increasing errors regarding corrupted index files.
You should avoid this. One solution is to use load balancers with persistence and/or, for example, Director:
https://wiki2.dovecot.org/Director
I don't know LizardFS, but the problems are much the same with all storage clusters, and there are different solutions for handling them, so I can't say which would be best in your case.
I would read up on, and ask here about, settings for storage clusters. A good start could be:
https://wiki2.dovecot.org/NFS
https://wiki2.dovecot.org/SharedMailboxes/ClusterSetup
https://wiki2.dovecot.org/MailLocation/SharedDisk
Some of the accounts affected are unable to retrieve messages due to timeouts.
Index settings and mailbox format have an impact on this. Maildir is mostly self-healing, but that may sometimes fail on a cluster.
It appeared the problems were due to the accounts being accessed from multiple servers simultaneously, so I forced them all to access one server, but the errors remained. It looks like it has something to do with file locking, but LizardFS supports advisory file locking and I do have it enabled.
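One way to check whether advisory locks are actually honoured on the storage mount is a small two-process test. The following is a minimal sketch, assuming Python 3 on Linux and fcntl-style (POSIX) locks; the function name `second_process_is_blocked` is hypothetical. It only tests two processes on the same host, so a cross-server check would need the two halves run on different machines.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Code run in a second process: try to take the same lock without blocking.
# Exit status 1 means the lock was refused, 0 means it was acquired.
CHILD_CODE = """
import fcntl, sys
with open(sys.argv[1], "r+b") as handle:
    try:
        fcntl.lockf(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        sys.exit(1)
sys.exit(0)
"""


def second_process_is_blocked(directory):
    """Return True if a lock held here blocks a second process on the same file."""
    fd, path = tempfile.mkstemp(dir=directory, prefix="locktest.")
    try:
        with os.fdopen(fd, "r+b") as handle:
            # Hold an exclusive advisory lock in this process.
            fcntl.lockf(handle, fcntl.LOCK_EX)
            # A separate process should now fail to lock the same file.
            result = subprocess.run([sys.executable, "-c", CHILD_CODE, path])
            return result.returncode == 1
    finally:
        os.unlink(path)
```

Calling `second_process_is_blocked("/path/on/the/cluster/mount")` (the path is a placeholder) should return True if locking works between processes on that mount.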
Deleting the corrupted indexes fixes the problem for a while, but it eventually returns, particularly for some accounts.
Yes, that is perhaps by design.
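For the index cleanup described above, a minimal sketch could look like the following. It assumes Python 3 and that the index files are the ones whose names start with `dovecot.index`, as in the error messages; the function names are hypothetical. It defaults to a dry run, and it leaves other Dovecot files such as `dovecot-uidlist` untouched.

```python
import os


def find_index_files(maildir):
    """Return all files named dovecot.index* below maildir, sorted by path."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(maildir):
        for name in filenames:
            if name.startswith("dovecot.index"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def remove_index_files(maildir, dry_run=True):
    """List the index files, and delete them only when dry_run is False."""
    paths = find_index_files(maildir)
    if not dry_run:
        for path in paths:
            os.unlink(path)
    return paths
```

Running it with `dry_run=True` first shows which files would be removed. It is probably safest to do the real deletion while the account has no active sessions, though that is an assumption rather than something stated in this thread.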
Here are some errors I'm seeing (just a random grab). Actual home directories are munged for confidentiality.
imap[25157]: (clientes.standby) Error: Failed to fix view for HOME/clientes:standby/dovecot.index: Missing middle file seq=1 (between 1..1, we have seqs 8): File is already open
imap[5565]: (stadiumchair) Error: Transaction log file HOME/stadiumchair/.Drafts/dovecot.index.log: marked corrupted
imap[5005]: (stadiumchair) Error: Corrupted transaction log file HOME/stadiumchair/.Drafts/dovecot.index.log seq 2: indexid changed 1418941056 -> 1500658549 (sync_offset=0)
imap[20243]: (martha) Error: Transaction log HOME/martha/dovecot.index.log: duplicate transaction log sequence (539)
imap[4665]: (emsspam) Error: Index file HOME/emsspam/dovecot.index: indexid changed: 1500658479 -> 1297175382
imap[4665]: (emsspam) Error: Corrupted transaction log file HOME/emsspam/dovecot.index.log seq 3: indexid changed: 1500658479 -> 1297175382 (sync_offset=316)
imap[22985]: (emsspam) Error: Corrupted transaction log file HOME/emsspam/dovecot.index.log seq 10742: Invalid transaction log size (9296 vs 9296): HOME/emsspam/dovecot.index.log (sync_offset=9296)
imap[3267]: (emsspam) Error: Failed to map view for HOME/emsspam/dovecot.index: Failed to map file seq=10742 offset=9052..18446744073709551615 (ret=0): corrupted, indexid=0
imap[3267]: (emsspam) Error: HOME/emsspam/dovecot.index view is inconsistent: uid=3062271 inserted in the middle of mailbox
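To see which accounts are hit most often, the error lines can be tallied per user. This is a minimal sketch, assuming Python 3 and the log prefix format shown above (`service[pid]: (user) Error: ...`); the function name `count_errors_by_user` is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as: imap[4665]: (emsspam) Error: Index file ...
LINE_RE = re.compile(
    r"^(?P<service>\w+)\[(?P<pid>\d+)\]: \((?P<user>[^)]*)\) Error: (?P<message>.*)$"
)


def count_errors_by_user(lines):
    """Count Error lines per user name; lines in other formats are ignored."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("user")] += 1
    return counts
```

Feeding it an open log file, for example `count_errors_by_user(open(logfile))` with `logfile` pointing at wherever stderr is captured, gives a per-account count that can be compared before and after a configuration change.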
The output of dovecot -n is pasted in below. Note that some of the boxes are running 4.9, some running 4.4, all have the same problems. Also note that I am using a custom authentication front end for our virtual mailboxes, but it just sets up the minimal environment variables and runs imap.
Is there anything I can change to eliminate these problems? Are there any other diagnostics I can provide to shed light on this?
# 2.2.31 (65cde28): /etc/dovecot/dovecot.conf
# OS: Linux 4.4.66 x86_64 Gentoo Base System release 2.3
log_path = /dev/stderr
mail_debug = yes
mail_fsync = always
mail_location = maildir:~/.maildir
mail_log_prefix = "%s[%p]: (%u) "
mmap_disable = yes
namespace inbox {
  inbox = yes
  location =
  mailbox Drafts {
    special_use = \Drafts
  }
  mailbox Junk {
    special_use = \Junk
  }
  mailbox Sent {
    special_use = \Sent
  }
  mailbox "Sent Messages" {
    special_use = \Sent
  }
  mailbox Trash {
    special_use = \Trash
  }
  prefix = INBOX
  separator =
  type = private
}
passdb {
  args = *
  driver = pam
}
passdb {
  args = /etc/dovecot/dovecot-sql.conf.ext
  driver = sql
}
plugin {
  mail_log_events = delete undelete expunge copy mailbox_delete mailbox_rename
}
ssl_cert =
I think you could make the corruption rarer by tuning the settings, for example:
mail_fsync = always
mail_nfs_storage = yes
mail_nfs_index = yes
mmap_disable = yes
and so on. But to fix it completely you may have to rethink your whole setup. The Dovecot gurus may be able to help; also search the list archive for cluster setups.
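To compare a server's `dovecot -n` output against the settings listed above, a short script can help. This is a minimal sketch, assuming Python 3 and the usual one-setting-per-line output; the names `RECOMMENDED`, `parse_top_level` and `differing` are hypothetical. Since `dovecot -n` prints only non-default settings, a setting reported as None here is simply absent from the output.

```python
# Settings suggested in this thread for shared or cluster storage.
RECOMMENDED = {
    "mail_fsync": "always",
    "mail_nfs_storage": "yes",
    "mail_nfs_index": "yes",
    "mmap_disable": "yes",
}


def parse_top_level(text):
    """Collect top-level 'key = value' lines, skipping anything inside { } blocks."""
    settings = {}
    depth = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("{"):
            depth += 1
        elif line == "}":
            depth -= 1
        elif depth == 0 and "=" in line:
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings


def differing(settings, recommended=RECOMMENDED):
    """Return the recommended keys whose current value differs (None if absent)."""
    return {
        key: settings.get(key)
        for key, wanted in recommended.items()
        if settings.get(key) != wanted
    }
```

Applied to the configuration quoted above, this would flag `mail_nfs_storage` and `mail_nfs_index`, since `mail_fsync` and `mmap_disable` are already set.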
Best regards,
Robert Schetterer
-- [*] sys4 AG
http://sys4.de, +49 (89) 30 90 46 64 Schleißheimer Straße 26/MG, 80333 München
Registered office: München, District Court München: HRB 199263
Executive Board: Patrick Ben Koetter, Marc Schiffbauer
Chairman of the Supervisory Board: Florian Kirstein