Hi all
Is there an option in dovecot to remove the >From escaping in the body of mbox mails?
Thanks
-- Dean Earley AKA Dee (dean@earlsoft.co.uk)
irc: irc://irc.blitzed.org/ web: http://personal.earlsoft.co.uk phone: +44 (0)780 8369596
On 5.10.2004, at 13:02, Dean Earley wrote:
Is there an option in dovecot to remove the >From escaping in the body of mbox mails?
No. I guess it would be possible to create >From-removing stream on top of raw-mbox-stream, but it would make the code slower and I don't think it's worth the trouble. Also when Dovecot saves messages to mbox, it doesn't do From-quoting. If the >From-removing was done, Dovecot would also have to do From-quoting to keep the behaviour consistent..
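To illustrate why removal and quoting have to be done together, here is a minimal sketch of a reversible quoting pair. It follows the "mboxrd" convention (add one ">" to any line matching ^>*From, strip one on read), which is an assumption chosen for the example and not what Dovecot does according to this thread. The function names are hypothetical.

```python
import re

# A body line is quotable if it is "From " preceded by zero or more ">".
_QUOTABLE = re.compile(rb"^>*From ")
# A body line is quoted if it is "From " preceded by one or more ">".
_QUOTED = re.compile(rb"^>+From ")


def quote_from_lines(body: bytes) -> bytes:
    """Add one '>' in front of every line matching ^>*From (writer side)."""
    lines = body.split(b"\n")
    return b"\n".join(b">" + ln if _QUOTABLE.match(ln) else ln for ln in lines)


def unquote_from_lines(body: bytes) -> bytes:
    """Strip one '>' from every line matching ^>+From (reader side)."""
    lines = body.split(b"\n")
    return b"\n".join(ln[1:] if _QUOTED.match(ln) else ln for ln in lines)


if __name__ == "__main__":
    original = b"Hello\nFrom here on it gets harder\n>From an older mail\nbye\n"
    stored = quote_from_lines(original)
    print(stored.decode())
    # The round trip only works because both directions are applied.
    print(unquote_from_lines(stored) == original)
```

If only the reader-side function were applied to an mbox written without quoting, a body line that legitimately began with ">From " would lose its ">", which is the inconsistency described above.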
On Tue, Oct 05, 2004 at 01:50:35PM +0300, Timo Sirainen wrote:
No. I guess it would be possible to create >From-removing stream on top of raw-mbox-stream, but it would make the code slower and I don't think it's worth the trouble. Also when Dovecot saves messages to mbox, it doesn't do From-quoting. If the >From-removing was done, Dovecot would also have to do From-quoting to keep the behaviour consistent..
That's interesting-- I'd guess that email clients are not going to do the ">From" quoting unless they are reading/writing directly to the mbox themselves, and not when accessing IMAP/POP folders. This would mean that an email client would see a potentially incorrect message, and could corrupt an mbox by sending (for copy/append) a message with missing quoting. I would think that it would be the responsibility of the IMAP/POP server to maintain the mbox integrity (not to mention the correctness of the message). But I can see where it would be a pain, for sure.
mm
On 5.10.2004, at 16:59, Mark E. Mallett wrote:
That's interesting-- I'd guess that email clients are not going to do the ">From" quoting unless they are reading/writing directly to the mbox themselves, and not when accessing IMAP/POP folders. This would mean that an email client would see a potentially incorrect message, and could corrupt an mbox by sending (for copy/append) a message with missing quoting. I would think that it would be the responsibility of the IMAP/POP server to maintain the mbox integrity (not to mention the correctness of the message). But I can see where it would be a pain, for sure.
Dovecot (and eg. mutt) uses Content-Length header to figure out how large the message body is, so clients can't mess anything by sending From-lines. Also Dovecot requires that From-line has correct syntax and valid timestamp or it's not treated as From-line.
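As a rough illustration of "correct syntax and valid timestamp", the sketch below accepts a line only if it has the shape "From <sender> <asctime-style date>". The exact rules Dovecot applies are not spelled out in this thread, so the accepted format here (exactly seven whitespace-separated fields, no timezone) and the name is_from_line are assumptions made for the example.

```python
from datetime import datetime


def is_from_line(line: str) -> bool:
    """Return True if line looks like 'From <sender> <asctime date>'."""
    if not line.startswith("From "):
        return False
    fields = line.split()
    # Expected fields: From, sender, weekday, month, day, time, year.
    if len(fields) != 7:
        return False
    try:
        # Joining with single spaces normalises padding such as "Oct  5".
        datetime.strptime(" ".join(fields[2:]), "%a %b %d %H:%M:%S %Y")
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(is_from_line("From dean@earlsoft.co.uk Tue Oct  5 11:02:00 2004"))
    print(is_from_line("From here on it gets harder"))
```

A body line such as "From here on it gets harder" fails this check, so a parser that applies it would not treat that line as a message separator. Real mbox files contain other From-line variants (timezones, missing seconds) that this sketch would reject.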
On Tue, Oct 05, 2004 at 05:13:36PM +0300, Timo Sirainen wrote:
Dovecot (and eg. mutt) uses Content-Length header to figure out how large the message body is, so clients can't mess anything by sending From-lines. Also Dovecot requires that From-line has correct syntax and valid timestamp or it's not treated as From-line.
Cool.. Just a little devil's advocation here:
That's great as long as dovecot is the only one that will ever touch the mbox. There are a lot of different "correct" formats of "From " lines, including some homegrown ones, and there are various code bases out there that recognize "From " separators in different ways. That's one reason that intelligent "From " quoting/recognition is not always better than being dumb about it (IMHO at least). I favor being aware of the fact that you might be maintaining an mbox that has to be compatible with all manners of access.
As an aside, mutt uses content-length only if available, and calculates it if not (I would assume dovecot generates it always, if it's going to rely on it?). (and in fact mutt, up to 1.4.2.1 at least, is broken in the way it recognizes "From " lines if the content-length is missing: specifically, it would recognize "From " lines even if not preceded by a blank line.. I patched it here.)
mm
On 5.10.2004, at 17:29, Mark E. Mallett wrote:
Cool.. Just a little devil's advocation here:
That's great as long as dovecot is the only one that will ever touch the mbox.
Right. Perhaps it should be optional. Although I'm not sure how hard I want Dovecot to try to be backwards compatible with all kinds of mbox software.
There are a lot of different "correct" formats of "From " lines, including some homegrown ones, and there are various code bases out there that recognize "From " separators in different ways. That's one reason that intelligent "From " quoting/recognition is not always better than being dumb about it (IMHO at least). I favor being aware of the fact that you might be maintaining an mbox that has to be compatible with all manners of access.
I think Dovecot has pretty similar From-line requirements to UW-IMAP, or at least I once did some changes to make sure it parsed all the same ones.
As an aside, mutt uses content-length only if available, and calculates it if not (I would assume dovecot generates it always, if it's going to rely on it?).
I meant when Dovecot or mutt saves mails, it always adds the Content-Length header. When parsing mails without Content-Length, it's added only if doing so doesn't cause much extra disk I/O and if the message size is >= 1024 bytes (ie. when jumping over the message might actually reduce disk reads).
(and in fact mutt, up to 1.4.2.1 at least, is broken in the way it recognizes "From " lines if the content-length is missing: specifically, it would recognize "From " lines even if not preceded by a blank line.. I patched it here.)
I don't think From-line has to be preceded by empty line? My understanding is that mbox works like:
From ... LF message text LF
If message text doesn't end with LF, there's no empty line before From-line.
On Tue, Oct 05, 2004 at 05:55:27PM +0300, Timo Sirainen wrote:
I don't think From-line has to be preceded by empty line? My understanding is that mbox works like:
From ... LF message text LF
Depends on your semantics I think :-) An entire message in an mbox ends with a blank line, so anything but the first "From " line in a mbox will be preceded by a blank line. Thus a naive way to find the start of a new message is to look for "From " lines following a blank line, and this is one of the things that ">From " quoting is all about.
At any rate, I ran across cases where there would be:
some non-blank text in the middle of a message body
From <valid-looking From line syntax>
and mutt, when there was no Content-length and it was trying to find the next message boundary, would see this as the start of a new message. Every other mail program that I tried would get it right; mutt did not. The unquoted "From " does occur in the wild when not following a blank line; it's fairly standard *not* to quote a "From " line that does not follow a blank line. For example, see:
http://www.ietf.org/internet-drafts/draft-hall-mime-app-mbox-02.txt
which attempts to summarize or reference some of the accepted variations on mbox formats.
Also, I think mutt's (and others) use of Content-length is an optimization that lets it avoid scanning every mbox line to skip from one message to the next, but doesn't relieve it of the necessity of "From " line quoting.
mm
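The "naive" boundary rule described above (a "From " line counts only at the start of the file or right after a blank line) can be sketched as follows. This is an illustration of that rule only, not mutt's or Dovecot's parser, and split_mbox is a hypothetical name.

```python
from typing import List


def split_mbox(text: str) -> List[str]:
    """Split mbox text on 'From ' lines that follow a blank line."""
    messages: List[List[str]] = []
    prev_blank = True  # the start of the file counts as a boundary
    for line in text.split("\n"):
        if prev_blank and line.startswith("From "):
            messages.append([])  # start a new message
        if messages:
            messages[-1].append(line)
        prev_blank = line == ""
    return ["\n".join(m) for m in messages]


if __name__ == "__main__":
    mbox = (
        "From a@example.org Tue Oct  5 11:02:00 2004\n"
        "Subject: one\n"
        "\n"
        "some non-blank text in the middle of a message body\n"
        "From the start of this line, nothing is quoted\n"
        "\n"
        "From c@example.org Tue Oct  5 12:00:00 2004\n"
        "Subject: two\n"
        "\n"
        "body\n"
    )
    print(len(split_mbox(mbox)))
```

With this rule the unquoted "From " line in the middle of the first body stays inside that message, because the line before it is not blank. A body line starting with "From " directly after a blank line would still be taken as a boundary, which is the case ">From " quoting exists to prevent.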
On 5.10.2004, at 18:19, Mark E. Mallett wrote:
At any rate, I ran across cases where there would be:
some non-blank text in the middle of a message body
From <valid-looking From line syntax>
and mutt, when there was no Content-length and it was trying to find the next message boundary, would see this as the start of a new message.
As would Dovecot with its current parser. Dovecot also doesn't write the empty line if the appended message didn't end with LF. Hmm. The parser would be annoying to fix to work correctly..
On Tue, 5 Oct 2004, Timo Sirainen wrote:
Dovecot (and eg. mutt) uses Content-Length header to figure out how large the message body is, so clients can't mess anything by sending From-lines. Also Dovecot requires that From-line has correct syntax and valid timestamp or it's not treated as From-line.
What happens if the message content is altered by some means, outside of the mail domain (for example, by manually editing a message [yes, I know it shouldn't happen, but sometimes it has been necessary]), changing the length? How does dovecot handle a discrepancy between actual length and that in Content-Length:?
Jethro.
Jethro R Binks, Computing Officer, IT Services, University Of Strathclyde, Glasgow, UK
On 5.10.2004, at 19:18, Jethro R Binks wrote:
What happens if the message content is altered by some means, outside of the mail domain (for example, by manually editing a message [yes, I know it shouldn't happen, but sometimes it has been necessary]), changing the length? How does dovecot handle a discrepancy between actual length and that in Content-Length:?
If there is no valid From-line after Content-Length bytes, the header is not used and is replaced with the real length.
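A minimal sketch of that fallback, under stated assumptions: the Content-Length value is trusted only if the body then ends at end-of-file or is followed by a From-line (optionally after the blank separator line); otherwise the length is recomputed by scanning for the next From-line. The function names are hypothetical, and a plain "From " prefix check stands in for real From-line validation.

```python
from typing import Optional


def next_from_offset(data: bytes, start: int) -> int:
    """Offset of the next line starting with 'From ' at or after start."""
    if data.startswith(b"From ", start):
        return start
    pos = data.find(b"\nFrom ", start)
    return len(data) if pos == -1 else pos + 1


def body_length(data: bytes, body_start: int,
                content_length: Optional[int]) -> int:
    """Return the body length, ignoring a Content-Length that does not fit."""
    if content_length is not None and content_length >= 0:
        end = body_start + content_length
        if end == len(data):
            return content_length  # body runs exactly to end of file
        if end < len(data):
            rest = data[end:]
            # Simplified check; a real parser would validate the From-line.
            if rest.startswith(b"From ") or rest.startswith(b"\nFrom "):
                return content_length
    # Header missing or stale: fall back to scanning for the real boundary.
    return next_from_offset(data, body_start) - body_start


if __name__ == "__main__":
    data = (b"From a@example.org Tue Oct  5 11:02:00 2004\n"
            b"Content-Length: 3\n\n"
            b"hello\n"
            b"From c@example.org Tue Oct  5 12:00:00 2004\n\nbye\n")
    start = data.index(b"\n\n") + 2
    print(body_length(data, start, 3))
```

In the usage example the header claims 3 bytes, but no From-line follows at that offset, so the stale value is discarded and the scanned length is returned instead.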
--On Tuesday, October 05, 2004 11:02 AM +0100 Dean Earley dean@earlsoft.co.uk wrote:
Is there an option in dovecot to remove the >From escaping in the body of mbox mails?
I recall reading a discussion somewhere about why this can be a bad idea, but now I can't seem to find it. Google turns up this: