On Wed, 2012-11-07 at 17:33 +0200, Timo Sirainen wrote:
> Dovecot automatically adds CRs where necessary. Even within the same file there can be mixed LF/CRLF lines.
Can you detail this a bit, or point me to the specific code areas?
Is only CR added? Or also LF?
What happens e.g. when LFCR is found? Is that then "doubled" to CRLFCR or even CRLFCRLF?
When does it "add" these chars? Only when using dovecot-lda? Or also when some other MDA places files into e.g. a maildir?
I did some reading of RFC 5322, which says:
new mails must not have a single CR or LF; both may only occur as CRLF
but, carried over from the previous RFCs, it allows existing messages to have CR and LF alone, in which case they are not newlines like CRLF, but rather the CR and LF characters in their meaning as control characters.
So from that point of view... automatic conversion may actually "corrupt" things in a strict sense. (One should hope, of course, that only a few people use(d) CR or LF alone to get their control character meaning... and that most of these are just accidents.)
I agree with you that mails should be stored with CRLF, as this is their native format... and I found nothing in the maildir[++] standards that would forbid that (nor anything that would encourage it). But for mbox there are "definitions" saying that LF is _always_ used (AFAIU, even on non-UNIX platforms).
I went through my mails and basically I found everything: CR, LF, CRLF and even LFCR. Now I have no real idea how to deal with that. Keep everything as is? Convert all LFs and/or all CRs to CRLFs? What about the LFCRs? Handle them as a group and perhaps swap them to CRLF? Or do the same as with single LFs and CRs?
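To make the options above concrete, here is a minimal Python sketch of how one might survey a message for the four cases and, if desired, rewrite them all to CRLF. This is only an illustration, not what Dovecot does internally: the function names are made up, and the choice to treat LFCR as a single line break is an assumption. Sequences such as LF CR LF are ambiguous (LFCR + LF or LF + CRLF); the pattern below simply scans left to right and, at each position, prefers CRLF over LFCR over a bare CR or LF.

```python
import re
from collections import Counter

# Hypothetical helper names; nothing here is taken from Dovecot.
# Alternation order matters: at each position CRLF is tried first,
# then LFCR, then a bare CR, then a bare LF.
_EOL = re.compile(rb"\r\n|\n\r|\r|\n")

_NAMES = {b"\r\n": "CRLF", b"\n\r": "LFCR", b"\r": "CR", b"\n": "LF"}


def count_line_endings(data: bytes) -> Counter:
    """Count each kind of line-ending sequence found in the raw message."""
    return Counter(_NAMES[m] for m in _EOL.findall(data))


def normalize_to_crlf(data: bytes) -> bytes:
    """Rewrite every CR, LF, CRLF and LFCR to a single CRLF.

    Assumption: every bare CR or LF was meant as a line break. Per the
    RFC reading above, this loses any intended control-character meaning.
    """
    return _EOL.sub(b"\r\n", data)


if __name__ == "__main__":
    sample = b"a\r\nb\nc\rd\n\re"
    print(count_line_endings(sample))
    print(normalize_to_crlf(sample))
```

Running the counting step over a whole mail store first, before rewriting anything, would at least show how common each case really is.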
Cheers, Chris.