On 4 Apr 2018, at 01:34, Reuben Farrelly reuben-dovecot@reub.net wrote:
Hi,
Message: 2
Date: Mon, 2 Apr 2018 22:06:07 +0200
From: Michael Grimm trashcan@ellael.org
To: Dovecot Mailing List dovecot@dovecot.org
Subject: 2.3.1 Replication is throwing scary errors
Message-ID: 29998016-D62F-4348-93D1-613B13DA90DB@ellael.org
Content-Type: text/plain; charset=utf-8

Hi

[This is Dovecot 2.3.1 at FreeBSD STABLE-11.1 running in two jails at distinct servers.]

I did upgrade from 2.2.35 to 2.3.1 today, and I do become pounded by error messages at server1 (and vice versa at server2) as follows:

| Apr 2 17:12:18 server1.lan dovecot: doveadm: Error: dsync(server2.lan): I/O has stalled, no activity for 600 seconds (last sent=mail_change, last recv=mail_change (EOL))
| Apr 2 17:12:18 server1.lan dovecot: doveadm: Error: Timeout during state=sync_mails (send=changes recv=mail_requests)
[?]
| Apr 2 18:59:03 server1.lan dovecot: doveadm: Error: dsync(server2.lan): I/O has stalled, no activity for 600 seconds (last sent=mail, last recv=mail (EOL))
| Apr 2 18:59:03 server1.lan dovecot: doveadm: Error: Timeout during state=sync_mails (send=mails recv=recv_last_common)

I cannot see in my personal account any missing replications, *but* I haven't tested this thoroughly enough. I do have customers being serviced at these productive servers, *thus* I'm back to 2.2.35 until I do understand or have learned what is going on.

Any ideas/feedback?

FYI: I haven't seen such errors before. Replication has been working for years now, without any glitches at all.

Regards,
Michael

It's not just you. This issue hit me recently, and it was impacting replication noticeably. I am following git master-2.3.
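For anyone who wants to count how often this happens on their own servers, here is a minimal Python sketch that pulls the stalled-dsync entries out of a log. The regular expression is inferred only from the sample lines quoted above, not from any Dovecot documentation, and the name find_stalls is made up for illustration. It assumes each log entry sits on a single line, as syslog normally writes it.

```python
import re

# Pattern inferred from the sample log lines in this thread only
# (assumption: the real message format may vary between versions).
STALL_RE = re.compile(
    r"dsync\((?P<peer>[^)]+)\): I/O has stalled, "
    r"no activity for (?P<seconds>\d+) seconds "
    r"\(last sent=(?P<sent>[^,]+), last recv=(?P<recv>[^)]*?)(?: \(EOL\))?\)"
)


def find_stalls(lines):
    """Return one dict per stalled-dsync log entry found in `lines`."""
    stalls = []
    for line in lines:
        m = STALL_RE.search(line)
        if m:
            stalls.append({
                "peer": m.group("peer"),            # remote host named in dsync(...)
                "seconds": int(m.group("seconds")),  # reported inactivity period
                "last_sent": m.group("sent"),
                "last_recv": m.group("recv"),        # trailing " (EOL)" is dropped
            })
    return stalls
```

Feeding it the lines of a mail log (for example `find_stalls(open(path))`, with `path` pointing at wherever your syslog writes Dovecot messages) gives a list you can group by peer or by last_sent to see in which sync state the stalls cluster.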
I am seeing the same as Michael Grimm, also on FreeBSD-11. In the output of doveadm replicator status '*' you'll also notice that the failed flag is raised for those users, and that processes are left hanging forever once those log entries start to appear:
<user> 45949 0.0 0.0 47888 13276 - I 20:20 0:00.10 doveadm-server: [<user> Verwijderde items send:mail_requests recv:changes] (doveadm-server)
<user2> 45964 0.0 0.0 49860 11608 - I 20:20 0:00.05 doveadm-server: [IP6 <user2> INBOX import:1/3] (doveadm-server)
<user3> 45965 0.0 0.1 58256 19820 - I 20:20 0:00.11 doveadm-server: [IP6 <user3> INBOX import:16/18] (doveadm-server)
<user4> 46480 0.0 0.0 53536 16288 - I 20:22 0:00.08 doveadm-server: [IP6 <user4> INBOX import:4/6] (doveadm-server)
<user5> 46745 0.0 0.0 51496 14184 - I 20:22 0:00.07 doveadm-server: [IP6 <user5> INBOX import:5/6] (doveadm-server)
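If you want to spot these hung processes without reading the process list by eye, something like the following Python sketch might help. It assumes the column layout of the listing above (BSD ps aux style: USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND) and treats a STAT starting with "I" as idle. The function name idle_doveadm_servers is hypothetical, and the pattern is derived from these sample rows only.

```python
import re

# Column layout assumed from the listing in this thread:
# USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
PS_RE = re.compile(
    r"^(?P<user>\S+)\s+(?P<pid>\d+)\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+"
    r"(?P<stat>\S+)\s+\S+\s+\S+\s+"
    r"doveadm-server: \[(?P<title>[^\]]*)\]"
)


def idle_doveadm_servers(ps_lines):
    """Return (pid, user, title) for each idle doveadm-server row."""
    found = []
    for line in ps_lines:
        m = PS_RE.match(line.strip())
        # Only rows whose process state begins with "I" count as idle here.
        if m and m.group("stat").startswith("I"):
            found.append((int(m.group("pid")), m.group("user"), m.group("title")))
    return found
```

The title inside the square brackets (mailbox name plus the send/recv or import progress) is returned as-is, so you can see at which step each process stopped.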
I also reverted to 2.2.35 because I started to get complaints from my users that mail was missing.
Cheers
Remko