Hi,
For a few days I have been dealing with replication issues between two containerized Dovecot instances in a Kubernetes environment.
After some searching, I found this thread, which reports exactly the same symptoms I encountered:
https://dovecot.org/pipermail/dovecot/2019-October/117353.html
I can confirm that downgrading Dovecot to 2.3.7.2 fixes the issue.
On version 2.3.8, setting the dsync -T parameter to a very low value (10 seconds) reduces the perceptible impact of the issue (mails are replicated within the following five minutes).
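For reference, the workaround amounts to the setting below. This is a sketch based on my reading of the doveadm-sync man page, where -T is described as the number of seconds after which a stalled I/O is aborted, with a default of 600 seconds when unset. The commented-out line is what I understand to be the stock value of the setting.

```
# Stock value as I understand it: no -T, so dsync uses its built-in
# I/O stall timeout (documented as 600 seconds).
#replication_dsync_parameters = -d -N -l 30 -U

# Workaround tested on 2.3.8: abort a stalled dsync I/O after 10 seconds.
# This only shortens the stall; it does not remove its cause.
replication_dsync_parameters = -d -N -l 30 -U -T 10
```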
I’m not an expert in Dovecot architecture, but I had a look at the code and I’m wondering whether this commit could be creating a phantom stalled I/O timeout.
In particular, I was unable to find where the counter incremented by io_set_pending is decremented.
I hope this information helps.
Regards.
Fabien.
+--------+
Technical details:
The container images are based on the Arch Linux distro image.
Here is the configuration of the Dovecot server:
dovecot -n
# 2.3.7.2 (3c910f64b): /etc/dovecot/dovecot.conf
# OS: Linux 4.19.66-coreos x86_64 ext4
# Hostname: mailstore-0.mailstore.piafe.svc.cluster.local
disable_plaintext_auth = no
doveadm_password = # hidden, use -P to show it
doveadm_port = 12345
login_greeting = PIAFE Mail Server Ready
mail_location = mdbox:/var/mail/%Ld/%Ln
mail_plugins = " notify replication quota"
namespace inbox {
  hidden = no
  inbox = yes
  list = yes
  location =
  mailbox Brouillons {
    auto = subscribe
    special_use = \Drafts
  }
  mailbox Corbeille {
    auto = subscribe
    special_use = \Trash
  }
  mailbox "Courrier indésirable" {
    auto = subscribe
    special_use = \Junk
  }
  mailbox Envoyés {
    auto = subscribe
    special_use = \Sent
  }
  prefix =
  separator = /
  subscriptions = yes
  type = private
}
passdb {
  args = /etc/dovecot/ldap.conf.ext
  driver = ldap
}
plugin {
  mail_replica = tcp:mailstore-1.mailstore.piafe.svc.cluster.local
  quota = dirsize:User quota
  quota_grace = 10%%
  quota_max_mail_size = 20M
  quota_rule = *:storage=1G
  quota_rule2 = Trash:storage=+200M
  quota_status_access = DUNNO
  quota_status_nouser = DUNNO
  quota_status_overquota = 552 5.5.2 Mailbox is full
}
postmaster_address = postmaster@piafe.recouv.fr
replication_dsync_parameters = -d -N -l 30 -U -T 10
service aggregator {
  fifo_listener replication-notify-fifo {
    user = mailer
  }
  unix_listener replication-notify {
    user = mailer
  }
}
service doveadm {
  inet_listener {
    port = 12345
  }
  inet_listener http {
    port = 8080
  }
}
service lmtp {
  inet_listener lmtp {
    address = *
    port = 24
  }
}
service replicator {
  process_min_avail = 1
  unix_listener replicator-doveadm {
    group = mailer
    mode = 0660
  }
}
ssl = no
userdb {
  args = /etc/dovecot/ldap.conf.ext
  default_fields = uid=500 gid=500 home=/var/mail/%Ld/%Ln
  driver = ldap
}
protocol imap {
  mail_plugins = " notify replication quota quota"
}