Maybe I was a bit unclear: I get about 1000 error messages per day from random accounts (about 500 accounts in total so far) on all clusters. These are transparent to the user, so it's more like background noise at the moment.
Do you have ECC memory?
No VMs involved. All machines are bare-metal DRBD two-node clusters.
How old are your drives? Do you scrub the RAID arrays? How reliable is your DRBD setup? Does DRBD even sync RAID fixes? Do you have networking issues?
Connect an HDD directly and use it for a few accounts. Do you still have the problem? I bet not.
Why are you running this on bare metal? Do you need the performance? If not, switch to more reliable storage, even if it is a bit slower. These multiple DRBD setups will give you headaches.
As far as I can see, I cannot narrow it down to specific accounts, POP3 vs. IMAP, LMTP delivery vs. IMAP store, Sieve vs. non-Sieve, etc.
It is not Dovecot; otherwise it would be reported here more often.
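To back up the impression that the errors are not tied to particular accounts or protocols, it may help to tally them from the logs. The following is only a rough sketch: the function name `tally_errors` is made up for illustration, and the regular expression assumes log lines shaped like `service(user)<...>: Error: ...`, which may not match your log format or syslog prefix.

```python
import re
from collections import Counter

# Assumed line shape (adjust to your own logs):
#   "... dovecot: imap(user@example.com)<pid><session>: Error: message"
# The part between ")" and ": Error: " is optional and must not contain ":".
LINE_RE = re.compile(
    r"\b(?P<service>[a-z0-9-]+)\((?P<user>[^)]*)\)[^:]*: Error: "
)


def tally_errors(lines):
    """Count error lines per service (imap, pop3, lmtp, ...) and per user."""
    by_service = Counter()
    by_user = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not an error line in the assumed format
        by_service[match.group("service")] += 1
        by_user[match.group("user")] += 1
    return by_service, by_user
```

Pass it any iterable of log lines, for example an open log file, and look at `by_service.most_common()` and `by_user.most_common(20)`. If a handful of accounts or a single service dominates, the problem is less random than it looks; if the counts are flat, that points more toward storage or memory.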