[2.3.8] possible replication issue
Piper Andreas
piper at hrz.uni-marburg.de
Thu Dec 5 08:34:50 EET 2019
Hello,
upgrading to 2.3.9 unfortunately does *not* solve this issue:
I upgraded one of my replicators from 2.3.7.2 to 2.3.9, and after a few
seconds replication stopped. The other replicator remained on 2.3.7.2.
After downgrading to 2.3.7.2, replication is working fine again.
I have not tried upgrading both replicators so far, as this is a live
production system. Is there a chance that upgrading both replicators
will solve the problem?
The machines are running Ubuntu 18.04.
Any help is appreciated.
Thanks,
Andreas
On 18.10.19 at 13:52, Carsten Rosenberg via dovecot wrote:
> Hi,
>
> some of our customers have discovered a replication issue after
> upgrading from 2.3.7.2 to 2.3.8.
>
> When running 2.3.8, several replication connections hang until the
> defined timeout. So after a few seconds there are $replication_max_conns
> hanging connections.
> Other replications run fast and complete successfully.
>
> Also, running a doveadm sync tcp:... works fine for all users.
>
> I can't tell for certain, but I haven't seen the same mailboxes timing
> out again and again. So I would assume it's not related to the mailbox.
>
> From the logs:
>
> server1:
> Oct 16 08:29:25 server1 dovecot[5715]:
> dsync-local(username1 at domain.com)<FXnVDW22pl0tGAAA1cwDxA>: Error:
> dsync(172.16.0.1): I/O has stalled, no activity for 600 seconds (version
> not received)
> Oct 16 08:29:25 server1 dovecot[5715]:
> dsync-local(username1 at domain.com)<FXnVDW22pl0tGAAA1cwDxA>: Error:
> Timeout during state=master_recv_handshake
>
> server2:
>
> Oct 16 08:29:25 server2 dovecot[8113]: doveadm: Error: read(server1)
> failed: EOF (last sent=handshake, last recv=handshake)
>
> There aren't any additional logs regarding the replication.
>
> I have tried increasing vsz_limit or reducing replication_max_conns.
> Nothing changed.
>
> --
>
> Both customers have 10k+ users. So far I have not been able to reproduce
> this on smaller test systems.
>
> Both installations were downgraded to 2.3.7.2 to fix the issue for now.
>
> --
>
> I've attached a tcpdump showing that the client stops sending any data
> after the mailbox_guid table headers.
>
>
>
> Any idea what could be wrong here, or how to debug this issue?
>
> Thanks.
>
> Carsten Rosenberg
>
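
To get a rough picture of how many replication sessions stall and for which users, the "I/O has stalled" lines quoted above can be tallied from the mail log. The following is only a minimal sketch: it assumes each log entry sits on a single line in the format shown in the quoted excerpt, and the names STALL_RE and count_stalls are hypothetical helpers, not part of Dovecot.

```python
import re
from collections import Counter

# Pattern derived from the log excerpt quoted above (an assumption about
# the exact format; adjust it if your syslog prefix or message differs).
STALL_RE = re.compile(
    r"dsync-local\((?P<user>[^)]+)\)<[^>]*>: Error: "
    r"dsync\((?P<peer>[^)]+)\): I/O has stalled, "
    r"no activity for (?P<secs>\d+) seconds"
)


def count_stalls(lines):
    """Return a Counter mapping user -> number of stalled dsync sessions."""
    counts = Counter()
    for line in lines:
        match = STALL_RE.search(line)
        if match:
            counts[match.group("user")] += 1
    return counts
```

Calling count_stalls(open(path)) on the relevant log file (the path depends on the local syslog setup) and printing counts.most_common() would show whether the same users time out repeatedly or whether the stalls are spread across many mailboxes.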
--
________________________________________________________________________
Dr. Andreas Piper, Hochschulrechenzentrum der Philipps-Univ. Marburg
Hans-Meerwein-Straße 6, 35032 Marburg, Germany
Phone: +49 6421 28-23521 Fax: -26994 E-Mail: piper at HRZ.Uni-Marburg.DE