2.3.1 Replication is throwing scary errors
Michael Grimm
trashcan at ellael.org
Tue Apr 3 21:26:57 EEST 2018
Michael Grimm <trashcan at ellael.org> wrote:
> [This is Dovecot 2.3.1 at FreeBSD STABLE-11.1 running in two jails at distinct servers.]
>
> I did upgrade from 2.2.35 to 2.3.1 today, and I do become pounded by error messages at server1 (and vice versa at server2) as follows:
>
> | Apr 2 17:12:18 <mail.err> server1.lan dovecot: doveadm: Error: dsync(server2.lan): I/O has stalled, \
> no activity for 600 seconds (last sent=mail_change, last recv=mail_change (EOL))
> | Apr 2 17:12:18 <mail.err> server1.lan dovecot: doveadm: Error: Timeout during state=sync_mails \
> (send=changes recv=mail_requests)
[snip]
> FYI: I haven't seen such errors before. Replication has been working for years now, without any glitches at all.
That statement of mine was incorrect:
#) I investigated a bit further, and I see those errors on about 20 days spread over the last year.
#) What puzzles me even more is that only server2 reports those errors; there is not a single such line in server1's log files.
#) All of the error messages above are accompanied by messages like:
Apr 2 17:10:49 <mail.err> server2.lan dovecot: doveadm: Error: Couldn't lock /home/to/USER1/.dovecot-sync.lock: \
fcntl(/home/to/USER1/.dovecot-sync.lock, write-lock, F_SETLKW) locking failed: Timed out after 30 seconds \
(WRITE lock held by pid 51110)
#) I upgraded both servers to 2.3.1 a couple of hours ago and have not seen a single error yet.
I have to admit that I do not understand what is going on at server2, but I am quite sure it has nothing to do with Dovecot.
Sorry for the noise.
It has nothing to do with Dovecot 2.3.1.
Regards,
Michael