On 20 Oct 2019, at 11.37, Tom Sommer via dovecot <dovecot@dovecot.org> wrote:
This is mail_debug from one of the accounts in question:

Oct 18 13:39:37 imap(XXXX)<7552><ebbmyS2VPPFOxv4t>: Debug: Mailbox INBOX: Mailbox opened because: SELECT
Oct 18 13:39:37 imap(XXXX)<7552><ebbmyS2VPPFOxv4t>: Debug: Mailbox INBOX: UID 17854: Opened mail because: prefetch
Oct 18 13:39:37 imap(XXXX)<7552><ebbmyS2VPPFOxv4t>: Debug: Mailbox INBOX: UID 17854: Opened mail because: full mail
..
Oct 18 13:39:48 imap(XXXX)<7552><ebbmyS2VPPFOxv4t>: Debug: Mailbox INBOX: UID 17947: Opened mail because: full mail

Quite a lot of mail downloads for a single session. I wonder if the user really had that many new mails or if they were being redownloaded for some reason?
They might redownload because of UID FETCH failing?
The client successfully downloaded all but the last mail. So it should be redownloading only the latest one, not all of them. (I don't think there are any clients stupid enough to redownload everything..)
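One way to check whether mails really are being redownloaded is to count how often each UID shows up with "Opened mail because: full mail" across the log. Below is a minimal Python sketch, assuming the mail_debug line format quoted above; the function names are made up for illustration and are not part of Dovecot.

```python
import re
from collections import Counter

# Matches mail_debug lines such as:
#   ... imap(XXXX)<7552><ebbmyS2VPPFOxv4t>: Debug: Mailbox INBOX: UID 17854: Opened mail because: full mail
# The pattern is an assumption based on the lines quoted in this thread.
OPENED_RE = re.compile(
    r"imap\((?P<user>[^)]*)\)<(?P<pid>\d+)><(?P<session>[^>]+)>: "
    r"Debug: Mailbox (?P<mailbox>.+?): UID (?P<uid>\d+): "
    r"Opened mail because: (?P<reason>.+)$"
)

# Lines copied from the log excerpt in this thread.
SAMPLE = [
    "Oct 18 13:39:37 imap(XXXX)<7552><ebbmyS2VPPFOxv4t>: Debug: Mailbox INBOX: UID 17854: Opened mail because: prefetch",
    "Oct 18 13:39:37 imap(XXXX)<7552><ebbmyS2VPPFOxv4t>: Debug: Mailbox INBOX: UID 17854: Opened mail because: full mail",
    "Oct 18 13:39:48 imap(XXXX)<7552><ebbmyS2VPPFOxv4t>: Debug: Mailbox INBOX: UID 17947: Opened mail because: full mail",
]


def count_full_mail_opens(lines):
    """Count 'full mail' opens per (user, mailbox, uid) over all sessions."""
    counts = Counter()
    for line in lines:
        m = OPENED_RE.search(line.strip())
        if m and m.group("reason") == "full mail":
            key = (m.group("user"), m.group("mailbox"), int(m.group("uid")))
            counts[key] += 1
    return counts


def repeated_downloads(counts):
    """Return only the entries whose full body was opened more than once."""
    return {key: n for key, n in counts.items() if n > 1}
```

Feeding the whole log for one user through count_full_mail_opens() and then repeated_downloads() would show whether the same UIDs are fetched again after a reconnect, or whether each UID is downloaded only once.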
Oct 18 13:40:56 imap(XXXX)<7552><ebbmyS2VPPFOxv4t>: Debug: Mailbox Junk: Mailbox opened because: autoexpunge
Oct 18 13:40:56 imap(XXXX)<7552><ebbmyS2VPPFOxv4t>: Debug: Mailbox Junk E-mail: Mailbox opened because: autoexpunge
Oct 18 13:40:56 imap(XXXX)<7552><ebbmyS2VPPFOxv4t>: Info: Connection closed: read(size=7902) failed: Connection reset by peer (UID FETCH running for 0.542 + waiting input/output for 78.357 secs, 60 B in + 39221480+8192 B out, state=wait-output) in=290 out=39401283 deleted=0 expunged=0 trashed=0 hdr_count=0 hdr_bytes=0 body_count=94 body_bytes=39210315

state=wait-output means Dovecot was waiting for the client to read the data it was sending. In v2.3.7 there were some changes related to this, but were you previously running v2.3.7 successfully? I can't really think of any such changes in v2.3.8.
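To compare many such disconnects, the counters in the "Connection closed" line can be pulled out mechanically. This is a rough Python sketch that assumes the line looks like the one quoted above; the regular expressions and the parse_disconnect() name are illustrative only, not anything Dovecot provides.

```python
import re

# Log line copied from the session quoted in this thread.
LINE = (
    "Oct 18 13:40:56 imap(XXXX)<7552><ebbmyS2VPPFOxv4t>: Info: Connection closed: "
    "read(size=7902) failed: Connection reset by peer "
    "(UID FETCH running for 0.542 + waiting input/output for 78.357 secs, "
    "60 B in + 39221480+8192 B out, state=wait-output) "
    "in=290 out=39401283 deleted=0 expunged=0 trashed=0 "
    "hdr_count=0 hdr_bytes=0 body_count=94 body_bytes=39210315"
)

# key=value pairs such as state=wait-output or body_count=94
KV_RE = re.compile(r"(\w+)=([\w-]+)")
# "running for 0.542 + waiting input/output for 78.357 secs"
TIMING_RE = re.compile(
    r"running for ([\d.]+) \+ waiting input/output for ([\d.]+) secs"
)


def parse_disconnect(line):
    """Extract the key=value counters and the command timing from one line."""
    stats = {}
    for key, value in KV_RE.findall(line):
        # Numeric counters become ints; everything else stays a string.
        stats[key] = int(value) if value.isdigit() else value
    timing = TIMING_RE.search(line)
    if timing:
        stats["running_secs"] = float(timing.group(1))
        stats["waiting_secs"] = float(timing.group(2))
    return stats
```

For the line above this yields state "wait-output", body_count 94 and body_bytes 39210315, which works out to roughly 417 kB per message body on average, sent while Dovecot spent about 78 seconds waiting on the client.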
Yes, we were successfully running 2.3.7.2 before; the issue started just after the upgrade.
Could it be related to changes in the indexes, increasing I/O?
There were no input/output errors in the logs prior to 2.3.8.
How large are the IO latencies now and before? The IO wait% in e.g. iostat? And load average in general?
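If iostat is not at hand, the IO wait percentage and load average can be sampled directly on Linux. The following is only a sketch: it assumes the aggregate "cpu" line of /proc/stat with the usual field order (user, nice, system, idle, iowait, irq, softirq, steal), and the helper names are hypothetical.

```python
import os
import time

# Field order of the aggregate "cpu" line in /proc/stat (Linux).
# The trailing guest fields are skipped because they are already
# included in user/nice.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the first line of /proc/stat into a dict of jiffy counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples, in percent."""
    deltas = {key: after[key] - before[key] for key in before}
    total = sum(deltas.values())
    return 100.0 * deltas["iowait"] / total if total else 0.0


def read_cpu_line(path="/proc/stat"):
    with open(path) as f:
        return f.readline()


def sample(interval=5.0):
    """Return (iowait %, (1, 5, 15 minute load averages)) over one interval."""
    before = parse_cpu_line(read_cpu_line())
    time.sleep(interval)
    after = parse_cpu_line(read_cpu_line())
    return iowait_percent(before, after), os.getloadavg()
```

Running sample() periodically on the same host under v2.3.7.2 and v2.3.8 would give comparable numbers for the two versions.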
I can't see any reason for IO to be different in v2.3.8 than v2.3.7. The only thing even close to it is the one index file bugfix. I did some further testing with it and I can't see it doing any more work now than it used to.