Hi
I'm currently debugging some replication issues between two Dovecot 2.3.9.2 servers, where one is live and the other is just a copy used for backup with no IMAP user access. After the initial alignment (which produced various error messages, such as the stalled I/O and fcntl lock messages below), I am seeing replication miss messages or stop altogether on mailboxes, even with no further error messages.
doveadm: Error: dsync(REMOTE_HOSTNAME): I/O has stalled, no activity for 600 seconds (last sent=mail_change (EOL), last recv=mailbox)
doveadm: Error: Couldn't lock /var/vmail/DOMAIN/USER//.dovecot-sync.lock: fcntl(/var/vmail/DOMAIN/USER//.dovecot-sync.lock, write-lock, F_SETLKW) locking failed: Timed out after 30 seconds (WRITE lock held by pid 30307)
I was surprised by this because, although I know there were replication issues in 2.3.8, I understood these were resolved in 2.3.9, provided both servers run 2.3.9.
I am still investigating and will post further if I get any useful insights.
However, I have a question which, despite my using Dovecot in this configuration for many years, has never occurred to me before. I configured Dovecot following the wiki (https://wiki.dovecot.org/Replication), using TCP and SSL. Both servers have an identical Dovecot configuration except for:
- different hostnames
- on the backup server I have removed the expire and quota plugins from the global mail_plugins
- in the configuration of mail_replica tcps://hostname:port, each server points to the other server's hostname
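For reference, the replication-specific part of such a setup looks roughly like the sketch below, which follows the general shape of the wiki example. The hostname, port 12345, password, and vmail user are placeholders, not values from the actual servers, and the wiki writes the replica target as tcps:host:port.

```
# Sketch only: hostname, port, password and user are assumed placeholders.
mail_plugins = $mail_plugins notify replication

service replicator {
  process_min_avail = 1
  unix_listener replicator-doveadm {
    mode = 0600
    user = vmail
  }
}

service aggregator {
  fifo_listener replication-notify-fifo {
    user = vmail
  }
  unix_listener replication-notify {
    user = vmail
  }
}

# Listener that accepts dsync connections from the other server.
service doveadm {
  inet_listener {
    port = 12345
    ssl = yes
  }
}

doveadm_port = 12345
doveadm_password = secret

plugin {
  # Each server names the other one here.
  mail_replica = tcps:other-server.example.com:12345
}
```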
What I just realized is that nowhere in the wiki does it state that both servers should be set up for replication. I had always assumed that was the logical thing to do. So the question is: for successful replication, is it sufficient to set up one master configuration and just have a replication process listening on the other master, or should both servers be set up for replication in an almost identical way (with the three exceptions above)?
Thanks for any insights.
John
On 12.1.2020 13.49, John wrote:
[...]
Did you check what the process 30307 is?
It is enough for the backup server to have only the doveadm server configured.
Aki
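As a minimal sketch of what "only the doveadm server configured" could look like on the backup side, assuming the same placeholder port and password as on the live server (neither value comes from this thread):

```
# Backup server sketch: accept incoming dsync connections, nothing more.
# No replication plugin, replicator/aggregator services or mail_replica here.
service doveadm {
  inet_listener {
    port = 12345   # assumed; must match the port in the live server's mail_replica
    ssl = yes      # relies on the global ssl_cert / ssl_key settings
  }
}

doveadm_password = secret   # placeholder; must match the live server
```

With this layout only the live server initiates syncs, while the backup just answers them.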
participants (2): Aki Tuomi, John