There have definitely been fixes in this area since 2.2.13, and your issue is most likely fixed.


---
Aki Tuomi
Dovecot oy

-------- Original message --------
From: fauno <fauno@partidopirata.com.ar>
Date: 04/08/2018 18:44 (GMT+02:00)
To: dovecot@dovecot.org
Subject: replication fails and corrupts index with zlib enabled

Hi, I have two Debian Jessie servers running Dovecot 2.2.13 with TCP
replication.  They have worked fine for years, but now one of them is
running low on disk space, so I wanted to try enabling zlib.

I crafted a script following the description given in
https://wiki.dovecot.org/Plugins/Zlib and xz'ed some inboxes on the
stand-by server, the one with low disk space.  So every email in those
inboxes is xz'ed but the file name hasn't changed and contains the
original size.
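
To illustrate the kind of in-place compression described above, here is a
minimal sketch.  It is not the script that was attached to this message.
It assumes gzip compression, and the function name compress_in_place is
hypothetical.  It keeps the file name (including the S= value) and the
mtime unchanged, and it does no Maildir locking, so it would only be safe
on a mailbox that nothing else is touching.

```python
import gzip
import os
import shutil
import stat
import tempfile


def compress_in_place(path):
    """Gzip one Maildir message, keeping its file name and mtime.

    Hypothetical helper: no Maildir locking is done, and file ownership
    is not restored (only the permission bits are).
    Returns False if the file already looks gzip-compressed.
    """
    st = os.stat(path)
    with open(path, "rb") as src:
        if src.read(2) == b"\x1f\x8b":  # gzip magic bytes
            return False

    # Write the compressed copy in the same directory, then swap it in
    # with an atomic rename so readers never see a partial file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "wb") as raw, \
                open(path, "rb") as src, \
                gzip.GzipFile(fileobj=raw, mode="wb") as dst:
            shutil.copyfileobj(src, dst)
        os.chmod(tmp, stat.S_IMODE(st.st_mode))
        os.utime(tmp, (st.st_atime, st.st_mtime))
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return True
```

After this runs, the S= value in the file name still states the
uncompressed size, while the size on disk is the compressed one.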

This server is on stand-by so most of the email is replicated
unidirectionally to it.  But administrative emails like cronjobs and
monitoring are delivered locally, so it replicates those to the hot server.

The issue appears when this stand-by server receives such an email and
tries to replicate it to the other server.

I'm attaching the full snippet of the log from the hot server, because
it throws a longer backtrace.  The short version is like this:

dovecot[8438]: imap(redacted@address.org): Error: Cached message size
larger than expected (478 > 289)
dovecot[8438]: imap(redacted@address.org): Error: Maildir filename has
wrong S value, renamed the file from
/srv/email/address.org/redacted/Maildir/cur/1533393328.M502775P20341.standby_server,S=478:2,
to
/srv/email/address.org/redacted/Maildir/cur/1533393328.M502775P20341.standby_server,S=289:2,
dovecot[8438]: imap(redacted@address.org): Error: Corrupted index cache
file /srv/email/address.org/redacted/Maildir/dovecot.index.cache: Broken
physical size for mail UID 45123

After this there's an error and the replication fails.  The file is
there; it's gzipped and can be zcat'ed, but it appears as a blank email
on clients.

I've restored a backup but the issue persists.  I also changed from xz
to gz because the Debian package docs only mention gz and bzip2, but the
issue is the same.

From what I understand and tested, the stand-by server is receiving the
email and compressing it but keeping the original size in the file
name.  So that's ok, but when the hot server receives the copy, it
believes the size is wrong and changes it to the compressed size.  Then
for some reason the index gets corrupted.
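
The size mismatch described above can be checked by hand.  The following
is a rough sketch, not part of Dovecot: the function names plain_size and
check_s_value are hypothetical, and detecting compression by magic bytes
is an assumption made for this example.  It compares the S= value in a
Maildir file name with the uncompressed size and the on-disk size of the
file.

```python
import bz2
import gzip
import lzma
import os
import re

# Leading magic bytes of each compressed format and a matching opener.
_MAGIC = (
    (b"\x1f\x8b", gzip.open),
    (b"BZh", bz2.open),
    (b"\xfd7zXZ\x00", lzma.open),
)


def plain_size(path):
    """Return the uncompressed size of a message file in bytes."""
    with open(path, "rb") as f:
        head = f.read(6)
    for magic, opener in _MAGIC:
        if head.startswith(magic):
            size = 0
            with opener(path, "rb") as f:
                # Stream in chunks so large mails are not held in memory.
                for chunk in iter(lambda: f.read(65536), b""):
                    size += len(chunk)
            return size
    return os.path.getsize(path)  # not compressed


def check_s_value(path):
    """Return (S value, uncompressed size, on-disk size), or None
    if the file name carries no S= field."""
    m = re.search(r",S=(\d+)", os.path.basename(path))
    if m is None:
        return None
    return int(m.group(1)), plain_size(path), os.path.getsize(path)
```

If the first two numbers agree and the third is smaller, the file name
holds the uncompressed size of a compressed message.  If the S= value
equals the on-disk size of a compressed file instead, the name has been
rewritten to the compressed size, as in the log above.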

I'm attaching the doveconf for both servers.  They're mostly the same,
and the only changes introduced were the zlib plugin and its options.
Also the script(s) that I used to compress the inboxes.

Am I correct?  Is it an issue of the replicator not understanding that
the emails are compressed?  I couldn't find anything related to zlib
with replication.  Maybe it's something fixed in newer versions and I
should go down that rabbit hole?

Thanks! :)