[Dovecot] maildir and end-of-line encoding

Timo Sirainen tss at iki.fi
Fri Nov 23 08:29:10 EET 2012


On 8.11.2012, at 4.57, Christoph Anton Mitterer wrote:

> On Wed, 2012-11-07 at 17:33 +0200, Timo Sirainen wrote:
>> Dovecot automatically adds CRs where necessary. Even within the same file there can be mixed LF/CRLF lines.
> Can you detail this a bit, or point me to the specific code areas?
> 
> 1) Is only CR added? Or also LF?

A CR on its own isn't treated as a newline, so the only thing that can be added is a CR before an LF.

> 2) What happens e.g. when LFCR is found? Is that then "doubled" to
> CRLFCR or even CRLFCRLF?

CRLFCR
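
The rule described above (a CR is added before any LF that lacks one, and a lone CR is left alone) can be sketched in a few lines of Python. This is only an illustration of the rule, not Dovecot's actual code; the name add_missing_crs is hypothetical.

```python
import re

# Hypothetical helper, not taken from Dovecot's source: matches every LF
# that is not already preceded by a CR.
_BARE_LF = re.compile(rb"(?<!\r)\n")


def add_missing_crs(data: bytes) -> bytes:
    """Insert a CR before each bare LF; lone CRs are not touched."""
    return _BARE_LF.sub(b"\r\n", data)


if __name__ == "__main__":
    # Mixed input: bare LF, CRLF, and an LFCR pair.
    print(add_missing_crs(b"one\ntwo\r\nthree\n\rfour"))
```

With this rule an LFCR pair becomes CRLFCR: the LF gets its CR, and the trailing CR stays as it is because it is not a newline.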

> 3) When does it "add" these chars? Only when using dovecot-lda? Or also
> when some other MDA places files into e.g. a maildir?

When a mail is saved, CRs are either added or removed as it is written to disk, depending on the mail_save_crlf setting. When a mail is read and sent to an IMAP/POP3 client, CRs are always added. (I don't think doveadm fetch text adds or removes CRs.)
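
As a rough model of the two paths described above, here is a Python sketch. The function names and the boolean parameter are hypothetical stand-ins for the mail_save_crlf setting; the real implementation is stream-based C code and may treat edge cases (such as CRCRLF) differently.

```python
def normalize_for_save(data: bytes, save_crlf: bool) -> bytes:
    """Model of the save path (an assumption, not Dovecot's code).

    save_crlf=True  -> every LF newline ends up as CRLF
    save_crlf=False -> the CR of every CRLF pair is removed
    """
    # Collapse CRLF to LF first so that existing CRLFs are not doubled.
    lf_only = data.replace(b"\r\n", b"\n")
    if save_crlf:
        return lf_only.replace(b"\n", b"\r\n")
    return lf_only


def render_for_client(stored: bytes) -> bytes:
    """Model of the read path: CRs are always added for IMAP/POP3 clients."""
    return normalize_for_save(stored, save_crlf=True)
```

Whatever form the mail has on disk, the client-facing output in this model has CRLF line endings.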

> I did some reading on the RFC 5322 which says:
> 
> - new mails must not have single CR or LF, both may only occur as CRLF
> 
> - but from the previous RFCs, it allows existing messages to have CR and
> LF alone, in which case they are not newlines as CRLF, but rather the CR
> and LF characters in their meaning as control characters.
> 
> 
> 4) So from that point of view... automatic conversion may actually
> "corrupt" things in a strict sense.
> (One should hope of course, that only few people use(d) CR or LF alone
> to get their control character meaning... but rather that these are just
> cases of accidents.)

SMTP and IMAP are the only normal ways to get messages into a system, and both require CRLF newlines. So there's really no way for Dovecot to ever see valid LF-only newlines. One exception is Content-Transfer-Encoding: binary, but Dovecot doesn't really support that (nor do any commonly used SMTP servers, I think).

> 5) I agree with you that mails should be stored with CRLF, as this is
> their native format... and I found nothing in the maildir[++] standards
> that would forbid that (nor anything that would encourage it).
> But for mbox there are "definitions" saying that LF is _always_ used
> (AFAIU, even on non-UNIX platforms).

mbox isn't really standardized. Anyway, storing mails with CRLF allows some optimizations, but if the mails aren't stored compressed it wastes a bit of disk space.

> 6) I went through my mails and basically I found everything:
> CR, LF, CRLF and even LFCR.
> Now I have no real idea how to deal with that?
> Keep all as is? Make all LFs CRLFs and/or all CRs CRLFs? What about
> the LFCRs? Handle them as a group and perhaps swap them to CRLF? Or do
> the same as with single LFs and CRs?

Why do you need to do something about them? Dovecot should handle all of them fine.
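
For anyone who just wants to see which line endings a stored message contains, without changing anything, a small Python sketch follows. The function name is hypothetical. An LFCR pair is counted as one bare LF plus one bare CR, which matches the handling described earlier in the thread.

```python
import re
from collections import Counter

# CRLF is listed first in the alternation so that a CR followed by an LF
# is counted once as CRLF rather than as a separate CR and LF.
_EOL = re.compile(rb"\r\n|\r|\n")
_NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def count_line_endings(data: bytes) -> Counter:
    """Count CRLF pairs, bare CRs and bare LFs in raw message bytes."""
    return Counter(_NAMES[m.group()] for m in _EOL.finditer(data))
```

Reading a message file in binary mode and passing its bytes to count_line_endings gives a per-message tally; the file itself is only read, never modified.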



