Inconsistency in map index with dovecot v2.3.21
Hello,
We have been running dovecot version 2.3.21 since July on a Debian bookworm system. After upgrading we have seen in the logs messages similar to the following:
2024-08-29T19:33:27.755399+03:00 server1 dovecot: lmtp(user1@modulus.gr)<481240><7TMeHNei0GYpeTkA1g8ogQ:P1>: Warning: mdbox /path/user1/mdbox/storage: Inconsistency in map index (2213,40 != 2213,56)
2024-08-29T19:33:27.755399+03:00 server1 dovecot: lmtp(user1@modulus.gr)<481240><7TMeHNei0GYpeTkA1g8ogQ:P1>: Warning: mdbox /path/user1/mdbox/storage: Inconsistency in map index (2213,40 != 2213,56)
2024-08-29T19:33:27.838990+03:00 server1 dovecot: lmtp(user1@modulus.gr)<481240><7TMeHNei0GYpeTkA1g8ogQ:P1>: sieve: msgid=39f9c444-0c08-4bda-bfc9-512c34584470@modulus.gr: fileinto action: stored mail into mailbox 'Replies'
2024-08-29T19:33:27.840222+03:00 server1 dovecot: lmtp(user1@modulus.gr)<481240><7TMeHNei0GYpeTkA1g8ogQ:P1>: Warning: fscking index file /path/user1/mdbox/storage/dovecot.map.index
2024-08-29T19:33:27.953281+03:00 server1 dovecot: lmtp(user1@modulus.gr)<481240><7TMeHNei0GYpeTkA1g8ogQ:P1>: Warning: mdbox /path/user1/mdbox/storage: rebuilding indexes
2024-08-29T19:37:27.989792+03:00 server1 dovecot: lmtp(user1@modulus.gr)<481249><vhzsLROj0GYQVgcAfAk26A:P1>: Error: Timeout (180s) while waiting for lock for transaction log file /path/user1/mdbox/storage/dovecot.map.index.log (WRITE lock held by pid 481240)
2024-08-29T19:37:27.995913+03:00 server1 dovecot: lmtp(user1@modulus.gr)<481249><vhzsLROj0GYQVgcAfAk26A:P1>: Error: sieve: msgid=39f9c444-0c08-4bda-bfc9-512c34584470@modulus.gr: fileinto action: failed to store into mailbox 'Mailbox': Timeout (180s) while waiting for lock for transaction log file /path/user1/mdbox/storage/dovecot.map.index.log (WRITE lock held by pid 481240)
This has happened a few times since we upgraded, always triggered by an lmtp process. It seems to happen randomly and we have not been able to reproduce it.
The mailboxes are stored on a GlusterFS server, and we use the dovecot director with the shared mailboxes architecture.
Usually this was mitigated automatically by the dovecot index rebuild, but the last time it happened on a mailbox with many gigabytes of mail data, the following log entries were produced for more than an hour:
2024-08-29T19:45:30.506694+03:00 server1 dovecot: lmtp(user1@modulus.gr)<481240><7TMeHNei0GYpeTkA1g8ogQ:P1>: Error: mmap_anon(/path/user/mdbox/mailboxes/Mailbox/dbox-Mails/dovecot.index.cache, 74076160) failed: Cannot allocate memory
During that period, segfaults started being generated by the imap processes accessing that mailbox, probably caused by timeouts in imap processes attempting to fetch emails from the affected mailbox:
[3912187.557029] imap[481783]: segfault at 38 ip 00007fb03248abf8 sp 00007ffc88b264f0 error 4 in libdovecot-storage.so.0.0.0[7fb0323c7000+d8000] likely on CPU 6 (core 3, socket 0)
What might be the reason that this map index inconsistency is introduced for random users? We have seen that some other users had similar issues and mentioned them on this mailing list (dovecot Digest, Vol 249, Issue 4; dovecot Digest, Vol 250, Issue 8).
Is there any update on a mitigation, or a confirmation of a bug in dovecot version 2.3.21 that might be responsible?
Thank you, Nikos Pyrgiotis
On 30/08/2024 15:32 EEST Nikolaos Pyrgiotis via dovecot dovecot@dovecot.org wrote:
> We have been running dovecot version 2.3.21 since July on a Debian bookworm system. After upgrading we have seen in the logs messages similar to the following:
> [...]
> What might be the reason that this map index inconsistency is introduced for random users?
Are you by chance accessing the user's mailbox from multiple different servers, like delivering via LMTP on one and the user accessing it on another?
Also, you probably want to hike up the vsize_limit; the default 256M is usually too little for mail processes. mmap'd files are counted against this limit.
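For example, it can be raised globally with default_vsz_limit or per service with vsz_limit; a minimal sketch, assuming a standard dovecot.conf (the 1G value is only illustrative):

# Raise the global default that applies to every service...
default_vsz_limit = 1G

# ...or override it only for the services that need it.
service lmtp {
  vsz_limit = 1G
}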
Aki
We are using the Shared Mailboxes in Dovecot Cluster architecture, so each user's mailbox should only be accessed by a single backend. We have been running with this architecture for about 2 years with dovecot version 2.3.19, and we had not seen any `Inconsistency in map index` messages in the logs.
We had already increased the default vsz_limit to 512M for the lmtp process, but maybe we have to increase it further if this error keeps occurring. This is the relevant configuration stanza for the lmtp service on all backends:

service lmtp {
  unix_listener lmtp {
    mode = 0666
  }
  inet_listener lmtp-mod {
    port = 24
    ssl = no
  }
  service_count = 0
  # Number of processes to always keep waiting for more connections.
  process_min_avail = 0
  process_limit = 0
  # If you set service_count=0, you probably need to grow this.
  vsz_limit = 512M
  client_limit = 1
}

But the question is what causes this inconsistency, and whether there is any change in version 2.3.21 that might have introduced it, because we did not get any similar logs on previous versions. When this occurs, is there a chance that a mailbox might get corrupted? Is the only way to safely mitigate this to wait for the index rebuild to finish?
Thank you, Nikos Pyrgiotis
On 30/08/2024 16:11 EEST Nikolaos Pyrgiotis via dovecot dovecot@dovecot.org wrote:
> We are using the Shared Mailboxes in Dovecot Cluster architecture, so each user's mailbox should only be accessed by a single backend.
> [...]
> But the question is what causes this inconsistency, and whether there is any change in version 2.3.21 that might have introduced it, because we did not get any similar logs on previous versions.
You can try running force-resync on the affected mailbox.
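For example, assuming the affected account is user1@modulus.gr from the logs above (the '*' mask resyncs all of the user's mailboxes; run it on the backend currently serving that user):

doveadm force-resync -u user1@modulus.gr '*'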
Also, I would recommend upping the limit to 2G for lmtp, indexer, indexer-worker etc., because otherwise the user's index cache might not fit into memory.
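A minimal sketch of what that could look like on the backends, assuming stock service names (adjust to your own configuration):

service lmtp {
  vsz_limit = 2G
}
service indexer {
  vsz_limit = 2G
}
service indexer-worker {
  vsz_limit = 2G
}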
Aki
Thank you for your recommendation. We will try to increase the vsz_limit for these services and see if it helps to mitigate the issue.
But is there a possible bug in dovecot version 2.3.21 linked with the mdbox format that is causing the inconsistency in the map index in the first place, or is it just a configuration error? Other users have also reported these error messages in this older thread: https://dovecot.org/mailman3/archives/list/dovecot@dovecot.org/thread/73CEPD...
Is there anything I can do to assist in identifying the cause of these errors?
Thank you, Nikos Pyrgiotis
On 30. Aug 2024, at 16.39, Nikolaos Pyrgiotis via dovecot dovecot@dovecot.org wrote:
> But is there a possible bug in dovecot version 2.3.21 linked with the mdbox format that is causing the inconsistency in the map index in the first place, or is it just a configuration error?
> [...]
What was the previous version you were running? 2.3.20? I don't think there are really many changes related to dbox or index file handling. We did do pretty large fixes to mdbox corruption handling, but it looks like they're still waiting for the v2.3.22 release. Maybe those would help with these inconsistency issues as well, or at least with fixing them.
Hello Timo,
We had been running version 2.3.20 previously, not 2.3.19. We will upgrade to version 2.3.22 when it is released and monitor whether these errors stop.
Thank you, Nikos
participants (4)
- Aki Tuomi
- Nikolaos Pyrgiotis
- npyrgiotis@modulus.gr
- Timo Sirainen