On 10. Sep 2020, at 14.07, Robert Nowotny <rnowotny1966@gmail.com> wrote:
On 9. Sep 2020, at 11.14, Robert Nowotny <rnowotny1966@gmail.com> wrote:
Sep 3 08:33:25 lxc-imap dovecot: imap(mpaul)<48684><2/9E5mKuAezAqKjk>: Error: Mailbox Sent: UID=2171: read(zlib(/home/vmail/virtualmailboxes/mpaul/storage/m.119)) failed: read(/home/vmail/virtualmailboxes/mpaul/storage/m.119) failed: Broken pipe (FETCH BODY[])
Also, this way you can see whether the broken mail is actually xz, zstd or zlib. It would be nice to know if there are any zstd or zlib compressed mails that have problems. We did a lot of stress testing with zstd and also with xz, but haven't been able to reproduce any problems. It's also strange that the error says "Broken pipe": that doesn't indicate that the mail is corrupted, but that something stranger is going on. So perhaps you don't actually have any mails written as corrupted, and Dovecot is just somehow having trouble reading them.
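One way to tell the formats apart is to look at the first bytes of the compressed stream. The following is only a minimal sketch in Python: the function name sniff_compression is made up for illustration, the magic numbers are the ones from the public xz, zstd, gzip and bzip2 format descriptions, and the raw-zlib check is a heuristic. It also assumes you already have the compressed body of a single mail in hand; a storage file such as m.119 may hold more than one mail, so the start of the file is not necessarily the start of a compressed stream.

```python
import gzip
import lzma
import zlib

# Stream magic numbers from the public format descriptions.
MAGIC = [
    (b"\xfd\x37\x7a\x58\x5a\x00", "xz"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
]


def sniff_compression(blob: bytes) -> str:
    """Guess the compression format from the first bytes of a stream."""
    for magic, name in MAGIC:
        if blob.startswith(magic):
            return name
    # Raw zlib (RFC 1950) has no fixed magic: the low nibble of the first
    # byte is 8 (deflate) and the 16-bit header is a multiple of 31.
    # This is a heuristic and can misfire on arbitrary data.
    if len(blob) >= 2 and blob[0] & 0x0F == 8 and ((blob[0] << 8) | blob[1]) % 31 == 0:
        return "zlib"
    return "unknown"
```

An uncompressed mail normally starts with a header line, so it should come back as "unknown".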
I managed to reproduce this. The files aren't corrupted, it's just that reading is failing. The attached patch should fix the xz code and should make your files readable again.
Yes, it works! I have now switched to "zstd" compression and am currently re-compressing all user IMAP folders (one by one).
Actually, with further testing it looks like the mails were written truncated. My patch simply hides the truncation when reading the mail. But that's anyway the best that can be done about them.
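To illustrate the difference between a truncated and a corrupted xz stream, here is a small Python sketch using the standard lzma module. It is not how Dovecot checks this; xz_status is a hypothetical helper, and it assumes the input is one complete xz stream as it was stored for a single mail.

```python
import lzma


def xz_status(blob: bytes) -> str:
    """Classify an xz stream as 'ok', 'truncated' or 'corrupted'."""
    decomp = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    try:
        decomp.decompress(blob)
    except lzma.LZMAError:
        # The decoder saw bytes that cannot be part of a valid xz stream.
        return "corrupted"
    # No error, but the end-of-stream marker was never reached:
    # the input simply stops too early.
    return "ok" if decomp.eof else "truncated"
```

A truncated stream decodes cleanly up to the point where the input ends, which is why a reader that doesn't insist on the end-of-stream marker can still return the first part of the mail.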
zstd also has a bug in writing compressed output, but it should always crash instead of writing broken output.
We'll try to get fixes to these merged to git soon.
Currently I am re-compressing the mailboxes, and so far I can see only one (1) error after about 120 GB of recompressed emails:
Sep 10 11:32:24 lxc-imap dovecot: imap(ebay)<37070><1aDvPvKus+/AqKjk>: Panic: file ostream.c: line 287 (o_stream_sendv_int): assertion failed: (!stream->blocking)
This is fixed now.
Until now I only have one broken mailbox that I need to repair (I will try that later, I just want to finish all the other users first):
Sep 10 12:13:13 lxc-imap dovecot: imap(mtrenner)<41872>: Error: lzma.read(/home/vmail/virtualmailboxes/mtrenner/storage/m.349): corrupted data at 3763912
Sep 10 12:13:13 lxc-imap dovecot: imap(mtrenner)<41872>: Error: Mailbox Sent: UID=1480: read(zlib(/home/vmail/virtualmailboxes/mtrenner/storage/m.349)) failed: read(/home/vmail/virtualmailboxes/mtrenner/storage/m.349) failed: lzma.read(/home/vmail/virtualmailboxes/mtrenner/storage/m.349): corrupted data at 3763912 (FETCH BODY[])
For this you can use the method I mentioned in an earlier mail (doveadm fetch+expunge+save).
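The earlier mail is not quoted in this message, so the exact steps are not shown here. As a rough sketch only, a fetch + expunge + save cycle for one message could be planned like this in Python. The helper resave_commands and the temporary file path are hypothetical; the doveadm subcommands (fetch, expunge, save) and the "mailbox ... uid ..." search query are standard doveadm usage, but check the man pages of your version before running anything.

```python
import shlex


def resave_commands(user: str, mailbox: str, uid: int, tmp_path: str) -> list[list[str]]:
    """Return doveadm argv lists for a fetch + expunge + save cycle."""
    query = ["mailbox", mailbox, "uid", str(uid)]
    return [
        # 1. Dump the readable part of the mail; stdout must be redirected to tmp_path.
        ["doveadm", "fetch", "-u", user, "text", *query],
        # 2. Remove the broken copy from the mailbox.
        ["doveadm", "expunge", "-u", user, *query],
        # 3. Store the dumped copy back into the same mailbox.
        ["doveadm", "save", "-u", user, "-m", mailbox, tmp_path],
    ]


if __name__ == "__main__":
    # Print the plan instead of executing it, so it can be reviewed first.
    for argv in resave_commands("mtrenner", "Sent", 1480, "/tmp/uid1480.eml"):
        print(shlex.join(argv))
```

This only prints the commands. It is worth checking the dumped file before the expunge step, since the re-saved mail can only contain whatever Dovecot was still able to read.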
zstd also has a bug in writing compressed output, but it should always crash instead of writing broken output. We'll try to get fixes to these merged to git soon.
Uh-oh, that would be good ... please notify! How can such a thing slip through the tests? I just cannot remember, the last error with compression was like 10 years ago (I don't know if it was me who reported that).
We had a bunch of tests, but this required a large input buffer to be fed to the zstd compression code, which didn't happen all that easily. I only accidentally caught it with the same test that I wrote for reproducing the xz bug.
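The general idea of such a test, feeding the compressor buffers of very different sizes and checking that the output still decompresses to the input, can be sketched in Python as follows. This is not the Dovecot test: it uses the standard lzma module as a stand-in (Python's standard library has traditionally not shipped zstd), and roundtrip_in_chunks is a made-up name.

```python
import lzma
import os


def roundtrip_in_chunks(data: bytes, chunk_size: int) -> bool:
    """Compress data chunk by chunk and check it decompresses unchanged."""
    comp = lzma.LZMACompressor(preset=1)  # low preset keeps the test fast
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        out += comp.compress(data[start:start + chunk_size])
    out += comp.flush()
    return lzma.decompress(bytes(out)) == data


if __name__ == "__main__":
    # Mix incompressible and highly repetitive input, then vary the write size
    # from small pieces up to one single large buffer.
    payload = os.urandom(256 * 1024) + b"A" * (768 * 1024)
    for size in (4096, 65536, len(payload)):
        print(size, roundtrip_in_chunks(payload, size))
```

The last size passes the whole payload in a single call, which is the kind of large write that small, fixed-size test inputs never exercise.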
Here's the list of fixes related to both xz and zstd:
https://github.com/dovecot/core/commit/48083d9e7fdbe257b0be33043ecf0ca87489e...
https://github.com/dovecot/core/commit/a96e742047635ecb8df67c9dbb36b05e0b8fa...
https://github.com/dovecot/core/commit/a775fe3d066f1a2e12d0093fa52527270ad8a...
https://github.com/dovecot/core/commit/d559f587677377a34c1f32d321719aa4838cf...
https://github.com/dovecot/core/commit/3d0f6cf3e04da0d85bbb853fbf4c6dff1d08a...
Note that if you apply these patches, they'll now correctly detect the truncated xz emails and will cause dsync to fail when reading them. So you might want to use the previous patch to finish the dsync migration.