Index Corruption Problem with new VM Host - But Only With Replication Enabled

Reuben Farrelly reuben-dovecot at reub.net
Fri Feb 18 04:59:16 UTC 2022


Hi,

I've recently migrated my two VMs from Linode (which uses KVM) to a
local VPS service (which also uses KVM).  Since doing so I have
started to see some strange problems with Dovecot relating to indexes
and replication.

I have copied the configuration files across from the old host to the
new host.  As this is Gentoo, everything was rebuilt and installed from
scratch, but with the same options (USE flags).  The Linux kernel is
the same version with the exact same options, as is Dovecot.  The
filesystem is ext4 with the same options too.

Here is what is logged:

Feb 18 15:20:41 tornado.reub.net dovecot: imap(reuben)<20031><d0yxIEPY3ROfxG3u>: Error: Mailbox INBOX: /home/reuben/Maildir/dovecot.index reset, view is now inconsistent
Feb 18 15:20:41 tornado.reub.net dovecot: imap(reuben)<20031><d0yxIEPY3ROfxG3u>: Disconnected: IMAP session state is inconsistent, please relogin. in=146 out=1398 deleted=0 expunged=0 trashed=0 hdr_count=0 hdr_bytes=0 body_count=0 body_bytes=0
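
To see how often this fires and for which mailboxes, a rough Python
sketch like the following could tally the errors from syslog.  It
assumes each log entry sits on a single line in the log file and has
the same shape as the entries above; the function name is only
illustrative and is not part of Dovecot.

```python
import re
from collections import Counter

# Matches Dovecot IMAP error lines shaped like the one logged above:
#   ... dovecot: imap(USER)<PID><SESSION>: Error: Mailbox NAME: PATH reset,
#   view is now inconsistent
# The pattern is an assumption based on that single sample line.
INCONSISTENT_RE = re.compile(
    r"dovecot: imap\((?P<user>[^)]+)\)<(?P<pid>\d+)><(?P<session>[^>]+)>: "
    r"Error: Mailbox (?P<mailbox>.+?): \S+ reset, view is now inconsistent"
)


def tally_index_resets(lines):
    """Count 'view is now inconsistent' errors per (user, mailbox)."""
    counts = Counter()
    for line in lines:
        match = INCONSISTENT_RE.search(line)
        if match:
            counts[(match.group("user"), match.group("mailbox"))] += 1
    return counts
```

Passing it an open log file (for example open("/var/log/messages"),
though the path depends on the syslog setup) and printing
most_common() on the result would show whether the resets are limited
to INBOX or spread across folders.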

The trigger is moving or deleting an email; just reading seems to be
OK.  As soon as you move or delete an email, the error above is logged.
The move fails on the first attempt (the error is logged and the client
refreshes), but if attempted again it succeeds.

This error disconnects the clients and forces the client view to
refresh, which seems to require a redownload of everything.  It is
extremely disruptive to client access.

The odd thing is that as soon as I disable replication by commenting out 
the mail_replica line, *all* of the index corruption messages and 
symptoms go away completely and I don't see this problem occur. As soon 
as I re-add it they come back.  So it appears that replication at least 
is triggering the problem.

Replication is imap1 -> imap2, and on the other host imap2 -> imap1.

The other odd thing is that this was never a problem while the system
was on Linode.

I have removed all files on the remote and let Dovecot re-replicate
everything, but this didn't help.

I can go back to the VPS host and talk to them about it, as I have an
open ticket, but from their perspective it looks like an
application-level problem, so there's not a lot for them to go on,
especially given everything else is working OK.  Could the underlying
host be playing a role in this?  It seems unlikely, given that turning
replication on or off causes the problem to appear or disappear.

All clients connect only to the master of the two replication
partners.  Many have multiple concurrent sessions (e.g. my phone and
Thunderbird at the same time), but all connect to only one instance at
a time.  The second instance is only there for a failover scenario
(update a DNS entry to fail over).

There is no NFS or any sort of sharing set up on the VMs.  They are 
dedicated virtual disks.

Can anyone give me any ideas how to troubleshoot this further or things 
to try?

Some system details:

# 2.3.18 (9dd8408c18): /etc/dovecot/dovecot.conf
# Pigeonhole version 0.5.18 (0bc28b32)
# OS: Linux 5.16.10-gentoo x86_64 Gentoo Base System release 2.8
# Hostname: tornado.reub.net
auth_mechanisms = plain login
auth_username_format = %Ln
doveadm_password = # hidden, use -P to show it
first_valid_uid = 1000
imap_client_workarounds = tb-lsub-flags tb-extra-mailbox-sep
last_valid_uid = 1099
login_log_format_elements = user=<%u> auth-method=%m remote=%r local=%l %c %k
mail_attribute_dict = file:%h/Maildir/dovecot-attributes
mail_location = maildir:~/Maildir
mail_plugins = notify replication
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext
namespace inbox {
   inbox = yes
   location =
   mailbox Drafts {
     special_use = \Drafts
   }
   mailbox Junk {
     special_use = \Junk
   }
   mailbox Sent {
     special_use = \Sent
   }
   mailbox "Sent Messages" {
     special_use = \Sent
   }
   mailbox Trash {
     special_use = \Trash
   }
   prefix =
}
passdb {
   args = failure_show_msg=yes %s
   driver = pam
}
plugin {
   mail_replica = tcp:imap2.reub.net:4814
   replication_full_sync_interval = 2 hours
   sieve = file:~/sieve;active=~/.dovecot.sieve
}
postmaster_address = postmaster at reub.net
protocols = imap lmtp sieve submission sieve
recipient_delimiter = -
service aggregator {
   fifo_listener replication-notify-fifo {
     mode = 0666
     user = root
   }
   unix_listener replication-notify {
     mode = 0666
     user = root
   }
}
service auth {
   unix_listener /var/spool/postfix/private/auth {
     group = postfix
     mode = 0666
     user = postfix
   }
   unix_listener auth-userdb {
     mode = 0777
   }
}
service doveadm {
   inet_listener {
     address = 2404:9400:2264:7200::143
     port = 4813
     ssl = yes
   }
   inet_listener {
     address = 2404:9400:2264:7200::143
     port = 4814
     ssl = no
   }
   user = root
}
service imap-login {
   inet_listener imap {
     port = 143
   }
}
service lmtp {
   inet_listener lmtp {
     address = ::1
     port = 24
   }
   unix_listener /var/spool/postfix/private/dovecot-lmtp {
     group = postfix
     mode = 0660
     user = postfix
   }
}
service managesieve-login {
   inet_listener sieve {
     address = 127.0.0.1
     port = 4190
   }
}
service replicator {
   process_min_avail = 1
   unix_listener replicator-doveadm {
     mode = 0666
   }
}
service submission-login {
   inet_listener submission {
     address = 103.4.234.81 2404:9400:2264:7200::587
     port = 587
   }
}
ssl_cert = </etc/ssl/dovecot/reub.net.pem
ssl_client_ca_dir = /etc/ssl/certs
ssl_dh = # hidden, use -P to show it
ssl_key = # hidden, use -P to show it
submission_client_workarounds = whitespace-before-path
submission_relay_host = localhost
submission_relay_port = 587
submission_relay_trusted = yes
userdb {
   driver = passwd
   result_success = continue-ok
}
protocol lmtp {
   mail_plugins = notify replication sieve
}
protocol lda {
   mail_plugins = notify replication sieve
}
protocol imap {
   imap_metadata = yes
   mail_max_userip_connections = 25
}

Thanks,
Reuben


