Broken uidlist when using NFS on newer kernels

Aki Tuomi aki.tuomi at open-xchange.com
Wed Oct 13 07:20:17 EEST 2021


Hi!

LDA should work just fine as long as you follow the same rule as with LMTP: a user must only be accessed on one backend at a time. The problem usually arises when you accidentally access the user from another backend while the user is still active on the first one.

The fix you made might seem to work, but it's going to break something in the future. The \0 bytes are not introduced by Dovecot.

Aki

> On 12/10/2021 21:45 Jeremy Hanmer <jhanmer at gmail.com> wrote:
> 
> 
> I looked into LMTP, but reconfiguring our 1.5 million mailboxes just to work around what seems like an obvious bug in the code is a hard sell. I patched maildir-uidlist.c to strip out the leading null bytes and things seem to behave just fine, but it feels wrong and I was hoping to get input from someone more familiar with the codebase.
> 
> 
> On Tue, Oct 12, 2021 at 8:39 AM Alessio Cecchi <alessio at skye.it> wrote:
> > Hi Jeremy,
> > I had the same problem as you.
> > We run an email hosting service with Maildir on NetApp NFS, Dovecot Director and backend servers for POP/IMAP, and messages delivered via dovecot-lda by the MXs. After the upgrade from CentOS 6 to CentOS 7 I found the same issue as you (on dovecot-uidlist).
> > After many tests we decided to switch from LDA to LMTP, which was already on our roadmap, so that reading and delivery of messages always happen on the same backend. That solved the problem.
> > I haven't found any other workarounds.
> > Switching from LDA to LMTP was not so simple for us, since our MX wasn't able to talk LMTP, but we wrote some custom C++ code and got it done. You should also consider adding some directors, since incoming email will also pass through them.
> > 
> > If you would like to talk about how we solved this on the MX side, I will be happy to talk with you.
> > Ciao
> > 
> > On 08/10/21 21:01, Jeremy Hanmer wrote:
> > 
> > > I know this has been reported in the past, but I think I have some useful new information on the problem. After an OS upgrade from Ubuntu Xenial (4.4.0 kernel) to Ubuntu Focal (5.4.0 kernel) and the corresponding upgrade from Dovecot 2.2.27 to 2.3.7.2, we've started seeing broken uidlist files to an extent that makes larger mailboxes nearly unusable, because the file is constantly being regenerated. I've also tried the 2.3.16-2+ubuntu20.04 version distributed from dovecot.org and the behavior is unchanged. The environment consists of NFS mounts from a NetApp device, with a couple dozen MX servers receiving mail and about a hundred IMAP/POP servers.
> > > 
> > > 
> > > 
> > > This is the exact error (note the blank after "invalid data"):
> > > 
> > > Error: Mailbox INBOX: Broken file /mnt/morty/morty2/gravest/x15775549/Maildir/dovecot-uidlist line 373: Invalid data: 
> > > 
> > > 
> > > 
> > > I've been able to trigger the problem rather easily by piping an email to dovecot-lda in a loop and reading the resulting dovecot-uidlist file on a different server. What it shows is that the last line of the file is occasionally prepended with a number of null bytes equal to the length of the line being written (for example, if the entry is "35322 :1633719038.M516419P3623238.pdx1-sub0-mail-mx202,S=2777,W=2832", it is prepended by 69 null bytes). This then breaks the IMAP process's ability to read the file. My first thought was to extend the retry functionality so that the imap process makes more attempts to read the file when it detects a problem like this, but I would love input from someone more familiar with the codebase.
> > > 
> > -- 
> > Alessio Cecchi
> > Postmaster @ http://www.qboxmail.it
> > https://www.linkedin.com/in/alessice
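
The corruption pattern described in the original report (the last uidlist line preceded by as many null bytes as the line is long, 69 in the quoted example) can be checked for outside of Dovecot. What follows is a minimal diagnostic sketch in Python, not Dovecot code: the function names are made up for illustration, and the retry helper only mirrors the "make more attempts to read the file" idea floated in the report. It does not address the underlying cause, which the reply at the top of the thread attributes to the user being accessed from more than one backend at the same time.

```python
import time
from typing import Callable, List, Tuple


def leading_nul_count(line: bytes) -> int:
    """Count the NUL bytes that precede the first non-NUL byte of a line."""
    return len(line) - len(line.lstrip(b"\x00"))


def find_nul_padded_lines(data: bytes) -> List[Tuple[int, int, bytes]]:
    """Return (line_number, nul_count, remainder) for each line starting with NULs."""
    hits = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        nuls = leading_nul_count(line)
        if nuls:
            hits.append((number, nuls, line[nuls:]))
    return hits


def read_with_retry(read_bytes: Callable[[], bytes],
                    attempts: int = 5,
                    delay: float = 0.1) -> bytes:
    """Call read_bytes() until the result has no NUL-padded line, or give up.

    read_bytes is a caller-supplied function (hypothetical here), for example
    one that reopens the uidlist file and returns its contents on every call.
    """
    for attempt in range(attempts):
        data = read_bytes()
        if not find_nul_padded_lines(data):
            return data
        if attempt + 1 < attempts:
            # Wait before the next attempt; the delay value is an arbitrary choice.
            time.sleep(delay)
    raise RuntimeError(
        "uidlist still contains NUL-padded lines after %d attempts" % attempts
    )
```

Passing in a reader function keeps the helper independent of how the file is actually opened. Whether a re-read returns clean data on a given NFS client is not something this sketch can guarantee, so it is best treated as a way to observe the problem rather than as a fix.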

