I managed to figure this out.
Just wanted to follow up in case anyone encounters the same issue in the future.
I am using HAProxy along with dovecot replication. When an email comes in, it is round-robined to one of the 2 dovecot/postfix servers. I have postfix running on each server, and it uses the local dovecot LMTP service to store the mails.
We sometimes get surges of emails (hundreds or thousands in a couple of seconds). Each alternate request hits a different mail server and should then be replicated to the other. The replication works and we don't encounter missing or duplicated emails.
However, when I purge expunged emails, the purge somehow does not fully remove them when it's set up in this fashion.
I ran 2 tests:
1) I send 1000 emails to the load balancer (round-robin), expunge on both servers & purge them on both servers, recreate the indexes... ~200 emails come back.
2) I send 1000 emails to one of the mail servers, expunge & purge them, recreate the indexes... 0 emails come back.
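The expunge, purge, and recreate-indexes part of both tests can be wrapped in a small helper. This is only a sketch built from the doveadm commands quoted further down in this thread; the function name count_after_resync is made up for illustration, and it should only be pointed at a throwaway test mailbox because force-resync can lose data with mdbox.

```shell
#!/bin/sh
# Sketch only: expunge everything, purge, force an index rebuild,
# then print how many messages are visible afterwards.
# count_after_resync is a hypothetical helper name, not a doveadm feature.
# WARNING: run this against a test mailbox only.
count_after_resync() {
    user="$1"
    # Mark every message in every mailbox as expunged.
    doveadm expunge -u "$user" mailbox '*' all || return 1
    # Remove expunged messages from the mdbox storage files.
    doveadm purge -u "$user" || return 1
    # Force the indexes to be rebuilt from the storage files.
    doveadm force-resync -u "$user" INBOX || return 1
    # Count whatever is visible after the rebuild (0 is the expected result).
    doveadm search -u "$user" all | wc -l | tr -d '[:space:]'
}
```

Calling count_after_resync email@domain on each server after a test run should print 0 if nothing came back.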
My fix is to remove the round-robin load balancing and use stick tables in HAProxy.
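For anyone looking for a starting point, a stick-table setup in HAProxy could look roughly like the fragment below. This is a minimal sketch and not my exact configuration: the section name, server names, addresses, table size, expiry, and the choice of sticking on the client source address are all assumptions to adapt.

```haproxy
# Sketch only: names, addresses, and sizes are placeholders.
listen smtp_in
    bind :25
    mode tcp
    balance roundrobin
    # Remember which server each client IP was sent to.
    stick-table type ip size 100k expire 30m
    # Send later connections from the same source IP to the same server.
    stick on src
    server mail1 192.0.2.11:25 check
    server mail2 192.0.2.12:25 check
```

With this, the first connection from a given source is still balanced, but later connections from that source go to the same mail server until the table entry expires or the server fails its health check.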
From: Zelic Bojan
Sent: Thursday, August 20, 2020 1:42 PM
To: dovecot@dovecot.org <dovecot@dovecot.org>
Subject: Expuning & Purging doesn't fully remove emails?
Hello, I'm facing an issue where deleted emails keep re-appearing after my mailbox index gets recreated. I'm running version 2.2.36 of dovecot, but I tested the same scenario under 2.3.10. I'm also using mdbox, autoexpunge, and dovecot replication.
I've had several instances now where some expunged emails show up again in a mailbox. I noticed this error:
doveadm: Error: Corrupted dbox file /var/mail/virtual/mailbox@domain.com/mdbox/storage/m.3228 (around offset=1988744): msg header has bad magic value
which caused the index to get rebuilt. However, several times now the indexes got rebuilt even though there doesn't seem to have been any error, so I'm not sure why that is.
lmtp(13910): Warning: fscking index file /var/mail/virtual/mailbox@domain/mdbox/storage/dovecot.map.index
lmtp(13910): Warning: fscking index file /var/mail/virtual/mailbox@domain/mdbox/storage/dovecot.map.index
lmtp(13910): Warning: mdbox /var/mail/virtual/mailbox@domain/mdbox/storage: rebuilding indexes
I'm not sure why these mails keep coming back though... or if there's anything that I can do to limit the number of emails that get restored.
I want to make sure expunged & purged emails stay expunged & purged. If I run a purge and then force index recreation... why would expunged emails come back? Shouldn't I expect them all to be deleted & purged? Does expunge not expunge all emails? (In production, I'm running autoexpunge, but the test below shows what happens when I attempt to expunge everything.)
doveadm search -u email@domain all | wc -l
# output 22096
doveadm expunge -u email@domain mailbox '*' all
doveadm search -u email@domain all | wc -l
# output: 0
doveadm purge -u email@domain
doveadm dump /var/mail/virtual/email@domain/mdbox/storage/ | grep -c 'ref.*\b0\b'
# output: 0
doveadm force-resync -u email@domain Inbox
# output:
# doveadm(email@domain): Warning: fscking index file /var/mail/virtual/email@domain/mdbox/storage/dovecot.map.index
# doveadm(email@domain): Warning: mdbox /var/mail/virtual/email@domain/mdbox/storage: rebuilding indexes
# doveadm(email@domain): Warning: fscking index file /var/mail/virtual/email@domain/mdbox/storage/dovecot.map.index
doveadm search -u email@domain all | wc -l
# output: 843
I would expect the output to be 0. Theoretically I deleted all emails and purged all emails. Nothing should be left on the disk? However... I can see there are still m.* files in mdbox/storage for the mailbox.
Overall, I'm not sure why the index got recreated... but I'm trying to limit the impact of mailbox corruption so that deleted emails do not come back if the index is somehow recreated again.
If I re-run expunge, purge, and force-resync a 2nd time, the mailbox does get emptied out, but I'm not looking to run force-resync intentionally since it causes data loss with mdbox, and re-running only expunge & purge doesn't seem to do anything.
Bojan Zelic
Sr. IT Infrastructure Engineer