Hi,

 

For the past few days I have been dealing with replication issues between two containerized Dovecot instances in a Kubernetes environment.

After some searching, I found this thread, which reports exactly the same symptoms I encountered: https://dovecot.org/pipermail/dovecot/2019-October/117353.html

I can confirm that downgrading Dovecot to 2.3.7.2 fixes the issue.

 

On version 2.3.8, setting the -T parameter (in replication_dsync_parameters) to a very low value (10 seconds) reduces the perceptible impact of the issue: mails are then replicated within the following five minutes.

 

I’m not an expert in Dovecot’s architecture, but I took a look at the code, and I’m wondering whether this commit might be creating a phantom stalled I/O timeout:

https://github.com/dovecot/core/commit/ec817bb2185bb21b34ba6bdd83b32af16dd0b4ad#diff-f0ef7f961c147b56c28b246b06eb5eb6

 

In particular, I was unable to see where the counter incremented by io_set_pending is decremented.
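To make the question concrete, here is a minimal Python sketch of the bookkeeping I would expect around such a counter. It is a toy model, not Dovecot's code: ToyIOLoop, ToyIO, set_pending and run_pending are hypothetical names, and I am only assuming that a pending counter is meant to be decremented once the pending handler is finally called.

```python
class ToyIOLoop:
    """Hypothetical stand-in for an event loop (not Dovecot's ioloop)."""

    def __init__(self):
        # Number of handlers currently flagged as pending.
        self.pending_count = 0


class ToyIO:
    """Hypothetical I/O handler registered on a ToyIOLoop."""

    def __init__(self, loop, callback):
        self.loop = loop
        self.callback = callback
        self.pending = False


def set_pending(io):
    # Flag the handler as pending; count it only once even if called twice.
    if not io.pending:
        io.pending = True
        io.loop.pending_count += 1


def run_pending(loop, ios):
    # Call every pending handler. The flag is cleared and the counter
    # decremented before the callback runs, so the callback may re-arm itself.
    for io in ios:
        if io.pending:
            io.pending = False
            loop.pending_count -= 1
            io.callback()


# Small demonstration of the expected balance between the two operations.
calls = []
loop = ToyIOLoop()
io = ToyIO(loop, lambda: calls.append("ran"))
set_pending(io)
set_pending(io)          # second call must not increment again
run_pending(loop, [io])  # handler runs once and the counter returns to 0
```

If the decrement in run_pending were missing, pending_count would stay above zero after the handler had already run, which is the kind of stale state I am asking about.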

 

I hope this information helps.

 

Regards.

Fabien.

+--------+

Technical details:

The container images are based on the Arch Linux image.

Here is the configuration of the Dovecot server:

dovecot -n
# 2.3.7.2 (3c910f64b): /etc/dovecot/dovecot.conf
# OS: Linux 4.19.66-coreos x86_64  ext4
# Hostname: mailstore-0.mailstore.piafe.svc.cluster.local
disable_plaintext_auth = no
doveadm_password = # hidden, use -P to show it
doveadm_port = 12345
login_greeting = PIAFE Mail Server Ready
mail_location = mdbox:/var/mail/%Ld/%Ln
mail_plugins = " notify replication quota"
namespace inbox {
  hidden = no
  inbox = yes
  list = yes
  location =
  mailbox Brouillons {
    auto = subscribe
    special_use = \Drafts
  }
  mailbox Corbeille {
    auto = subscribe
    special_use = \Trash
  }
  mailbox "Courrier indésirable" {
    auto = subscribe
    special_use = \Junk
  }
  mailbox Envoyés {
    auto = subscribe
    special_use = \Sent
  }
  prefix =
  separator = /
  subscriptions = yes
  type = private
}
passdb {
  args = /etc/dovecot/ldap.conf.ext
  driver = ldap
}
plugin {
  mail_replica = tcp:mailstore-1.mailstore.piafe.svc.cluster.local
  quota = dirsize:User quota
  quota_grace = 10%%
  quota_max_mail_size = 20M
  quota_rule = *:storage=1G
  quota_rule2 = Trash:storage=+200M
  quota_status_access = DUNNO
  quota_status_nouser = DUNNO
  quota_status_overquota = 552 5.5.2 Mailbox is full
}
postmaster_address = postmaster@piafe.recouv.fr
replication_dsync_parameters = -d -N -l 30 -U -T 10
service aggregator {
  fifo_listener replication-notify-fifo {
    user = mailer
  }
  unix_listener replication-notify {
    user = mailer
  }
}
service doveadm {
  inet_listener {
    port = 12345
  }
  inet_listener http {
    port = 8080
  }
}
service lmtp {
  inet_listener lmtp {
    address = *
    port = 24
  }
}
service replicator {
  process_min_avail = 1
  unix_listener replicator-doveadm {
    group = mailer
    mode = 0660
  }
}
ssl = no
userdb {
  args = /etc/dovecot/ldap.conf.ext
  default_fields = uid=500 gid=500 home=/var/mail/%Ld/%Ln
  driver = ldap
}
protocol imap {
  mail_plugins = " notify replication quota quota"
}