[Dovecot] CATENATE/literal8 issue
Using 2.2.2, I see this:
C: 6 APPEND "INBOX" (\seen) "16-May-2013 22:05:14 -0600" CATENATE (URL
"/INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=HEADER" TEXT ~{40}
S: 6 NO [UNKNOWN-CTE] Binary input allowed only when the first part is binary.
Why is there this limitation? It seems to me that CATENATE is
confusing the content-type encoding of the data/part itself with the
encoding of the IMAP literal.
A literal8 is nothing more than a series of octets that *may*
contain NULs, but not necessarily. For example, in the exchange above
the 40 octets of data are US-ASCII text, which is perfectly acceptable
to send as a literal8. (Client rationale: If BINARY exists on the
server, we don't bother to scan IMAP literals for NUL bytes -- we
just send them as literal8s. It's an optimization that I would hate
to get rid of.)
michael
On 21.5.2013, at 9.40, Michael M Slusarz slusarz@curecanti.org wrote:
Using 2.2.2, I see this:
C: 6 APPEND "INBOX" (\seen) "16-May-2013 22:05:14 -0600" CATENATE (URL "/INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=HEADER" TEXT ~{40}
S: 6 NO [UNKNOWN-CTE] Binary input allowed only when the first part is binary.
Why is there this limitation? It seems to me that CATENATE is confusing the content-type encoding of the data/part itself with the encoding of the IMAP literal.
A literal8 is nothing more than a series of octets that *may* contain NULs, but not necessarily. For example, in the exchange above the 40 octets of data are US-ASCII text, which is perfectly acceptable to send as a literal8. (Client rationale: If BINARY exists on the server, we don't bother to scan IMAP literals for NUL bytes -- we just send them as literal8s. It's an optimization that I would hate to get rid of.)
Well, the problem is that if it does contain NULs, the MIME part needs to be converted to something that doesn't. And to do that it needs to modify the previous header, which with the current code has already been read. So to fix that it would need to read the whole message into a temporary file before actually saving it, which makes performance worse for the normal case.
Or are you saying that the error is fine if the text contains NULs, but simply should be allowed as long as it doesn't?
Quoting Timo Sirainen tss@iki.fi:
On 21.5.2013, at 9.40, Michael M Slusarz slusarz@curecanti.org wrote:
Using 2.2.2, I see this:
C: 6 APPEND "INBOX" (\seen) "16-May-2013 22:05:14 -0600" CATENATE
(URL "/INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=HEADER"
TEXT ~{40}
S: 6 NO [UNKNOWN-CTE] Binary input allowed only when the first part
is binary.
Why is there this limitation? It seems to me that CATENATE is
confusing the content-type encoding of the data/part itself with
the encoding of the IMAP literal.
A literal8 is nothing more than a series of octets that *may*
contain NULs, but not necessarily. For example, in the exchange above
the 40 octets of data are US-ASCII text, which is perfectly acceptable
to send as a literal8. (Client rationale: If BINARY exists on the
server, we don't bother to scan IMAP literals for NUL bytes -- we
just send them as literal8s. It's an optimization that I would
hate to get rid of.)
Well, the problem is that if it does contain NULs, the MIME part
needs to be converted to something that doesn't. And to do that it
needs to modify the previous header, which with the current code has
already been read.
Is altering the header something that BINARY/CATENATE is allowed to
do? Especially regarding the header. I know there is language about
the server changing the CTE, but this is potentially troubling since
cryptographic signatures may rely on the header text. Changing things
will break the message.
I can see the server altering the body text to match the header. But
I think the reverse is bothersome.
Or are you saying that the error is fine if the text contains NULs,
but simply should be allowed as long as it doesn't?
This. As mentioned before, it seems the code is simply assuming that
the text part contains NULs without ever checking it. My reading of
the literal8 is that there is no requirement that NULs MUST exist in
the string.
In our code, the append data is often from code that the IMAP library
doesn't have access to. So at APPEND time, it is unaware whether the
data contains NUL or not - it just has a blob of data and a length.
If BINARY exists, it is much easier for us to simply send as literal8
and stream the data - no extra overhead is needed on our side. Since
each individual byte needs to be handled by the server as it comes in,
it seems much more efficient to do NUL checking there.
michael
On 21.5.2013, at 21.24, Michael M Slusarz slusarz@curecanti.org wrote:
Or are you saying that the error is fine if the text contains NULs, but simply should be allowed as long as it doesn't?
This. As mentioned before, it seems the code is simply assuming that the text part contains NULs without ever checking it. My reading of the literal8 is that there is no requirement that NULs MUST exist in the string.
In our code, the append data is often from code that the IMAP library doesn't have access to. So at APPEND time, it is unaware whether the data contains NUL or not - it just has a blob of data and a length. If BINARY exists, it is much easier for us to simply send as literal8 and stream the data - no extra overhead is needed on our side. Since each individual byte needs to be handled by the server as it comes in, it seems much more efficient to do NUL checking there.
It's not just about NUL. It's also about if plain LFs can be converted to CRLFs.
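The two checks mentioned so far (NUL bytes and bare LFs) can both be done in a single pass over the literal as it streams in. The following is only a minimal sketch of that idea in Python, not Dovecot's code; the class name `LiteralScanner` and its attributes are hypothetical.

```python
class LiteralScanner:
    """Track NUL bytes and bare LFs in data that arrives in chunks.

    Hypothetical helper for illustration; not part of Dovecot or any
    IMAP library.
    """

    def __init__(self) -> None:
        self.saw_nul = False
        self.saw_bare_lf = False
        self._prev_cr = False  # was the last byte of the previous chunk a CR?

    def feed(self, chunk: bytes) -> None:
        if b"\x00" in chunk:
            self.saw_nul = True
        prev_cr = self._prev_cr
        for byte in chunk:
            # An LF is "bare" when it is not immediately preceded by CR,
            # including across chunk boundaries.
            if byte == 0x0A and not prev_cr:
                self.saw_bare_lf = True
            prev_cr = byte == 0x0D
        self._prev_cr = prev_cr


scanner = LiteralScanner()
for chunk in (b"line one\r", b"\nline two\n"):
    scanner.feed(chunk)
print(scanner.saw_nul, scanner.saw_bare_lf)
```

The state carried between calls is what lets a CRLF split across two reads be recognized as a proper line ending rather than a bare LF.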
Anyway .. the BINARY APPEND converts only the MIME parts that you send with "Content-Transfer-Encoding: binary". Are you sending such header to Dovecot? If not, there's actually no difference to a regular APPEND from Dovecot's point of view (I think). If a non-binary MIME part contains NUL, what is Dovecot supposed to do? Change it to some other character? Fail the APPEND? Should there be a difference between how literal vs literal8 is handled in such case?
Quoting Timo Sirainen tss@iki.fi:
Anyway .. the BINARY APPEND converts only the MIME parts that you
send with "Content-Transfer-Encoding: binary". Are you sending such
header to Dovecot?
I don't think so. I noticed the CATENATE error when I was stripping a
simple text/html part out of a multipart/alternative message. The
"master" message header has a single MIME header:
Content-Type: multipart/alternative;
boundary="----WPFVNCCY4GPWDK6HNJXHWWE7J94BSS"
For the record, here's the entire transaction, along with the fallback
APPEND w/out using literal8 that was successful on the identical data:
C: 6 APPEND "INBOX" (\seen) "16-May-2013 22:05:14 -0600" CATENATE (URL
"/INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=HEADER" TEXT ~{40}
S: 6 NO [UNKNOWN-CTE] Binary input allowed only when the first part is binary.
C: 8 APPEND "INBOX" (\seen) "16-May-2013 22:05:14 -0600" CATENATE (URL
"/INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=HEADER" TEXT {40+}
C: [LITERAL DATA: 40 bytes]
C: URL "/INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=1.MIME" URL
"/INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=1" TEXT {40+}
C: [LITERAL DATA: 40 bytes]
C: TEXT {113+}
C: [LITERAL DATA: 113 bytes]
C: TEXT {42+}
C: [LITERAL DATA: 42 bytes]
C: )
S: 8 OK [APPENDUID 1255685337 48885] Append completed.
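For readers who want to reproduce the shape of the successful command, here is a rough Python sketch of how a client might serialize CATENATE parts using non-synchronizing literals. The function `build_catenate` and the `(kind, value)` part representation are hypothetical, flags and the date-time are omitted, and the mailbox name and URLs are assumed to need no escaping.

```python
from typing import Sequence, Tuple, Union

Part = Tuple[str, Union[str, bytes]]


def build_catenate(tag: str, mailbox: str, parts: Sequence[Part]) -> bytes:
    """Serialize an APPEND ... CATENATE command (illustrative only).

    Each part is ("URL", str) or ("TEXT", bytes). TEXT parts are sent as
    plain non-synchronizing literals ({n+}), never as literal8.
    """
    out = bytearray()
    out += f'{tag} APPEND "{mailbox}" CATENATE ('.encode("ascii")
    for index, (kind, value) in enumerate(parts):
        if index:
            out += b" "
        if kind == "URL" and isinstance(value, str):
            out += b'URL "' + value.encode("ascii") + b'"'
        elif kind == "TEXT" and isinstance(value, bytes):
            # The literal header is followed by CRLF, then exactly len(value) octets.
            out += b"TEXT {%d+}\r\n" % len(value)
            out += value
        else:
            raise ValueError(f"unsupported part: {kind!r}")
    out += b")\r\n"
    return bytes(out)


command = build_catenate(
    "8",
    "INBOX",
    [
        ("URL", "/INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=HEADER"),
        ("TEXT", b"\r\n"),
    ],
)
print(command)
```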
If a non-binary MIME part contains NUL, what is Dovecot supposed to
do? Change it to some other character? Fail the APPEND? Should there
be a difference between how literal vs literal8 is handled in such
case?
I would say there is no doubt: fail the APPEND. It should be the
client's responsibility to correctly format the data.
I appreciate that Dovecot does its best to try to Do The Right Thing
(Cyrus is much stricter about input, for example). But at some point
us client authors have to be at least somewhat competent, and it is
not asking too much for us to accept that GIGO.
michael
Quoting Michael M Slusarz slusarz@curecanti.org:
Quoting Timo Sirainen tss@iki.fi:
Anyway .. the BINARY APPEND converts only the MIME parts that you
send with "Content-Transfer-Encoding: binary". Are you sending such
header to Dovecot?
I can verify this isn't working as you described above:
1 APPEND "INBOX" CATENATE (TEXT {49+}
Content-Type: multipart/alternative; boundary="A" TEXT ~{1}
1 NO [UNKNOWN-CTE] Binary input allowed only when the first part is binary.
michael
On Wed, 2013-05-22 at 09:38 -0600, Michael M Slusarz wrote:
Quoting Michael M Slusarz slusarz@curecanti.org:
Quoting Timo Sirainen tss@iki.fi:
Anyway .. the BINARY APPEND converts only the MIME parts that you
send with "Content-Transfer-Encoding: binary". Are you sending such
header to Dovecot?
I can verify this isn't working as you described above:
1 APPEND "INBOX" CATENATE (TEXT {49+}
Content-Type: multipart/alternative; boundary="A" TEXT ~{1}
1 NO [UNKNOWN-CTE] Binary input allowed only when the first part is binary.
What do you do then if server advertises CATENATE but not BINARY?
Anyway for the other possibilities Dovecot could:
a) Put all CATENATEd messages through the istream-binary-converter, but just not do any actual C-T-E:binary conversion until the first ~{binary} part is found.
b) Just treat ~{n} exactly the same as {n}, unless it's the first part of CATENATE.
Maybe this should be asked about in IMAP mailing list .. (Didn't I already ask something about CATENATE+BINARY combination?)
Quoting Timo Sirainen tss@iki.fi:
On Wed, 2013-05-22 at 09:38 -0600, Michael M Slusarz wrote:
Quoting Michael M Slusarz slusarz@curecanti.org:
Quoting Timo Sirainen tss@iki.fi:
Anyway .. the BINARY APPEND converts only the MIME parts that you send with "Content-Transfer-Encoding: binary". Are you sending such header to Dovecot?
I can verify this isn't working as you described above:
1 APPEND "INBOX" CATENATE (TEXT {49+}
Content-Type: multipart/alternative; boundary="A" TEXT ~{1}
1 NO [UNKNOWN-CTE] Binary input allowed only when the first part is binary.
What do you do then if server advertises CATENATE but not BINARY?
Send as a regular literal. If there truly are nulls in the output,
there's not much we can do so we send as-is and hope for the best.
Anyway for the other possibilities Dovecot could:
a) Put all CATENATEd messages through the istream-binary-converter, but just not do any actual C-T-E:binary conversion until the first ~{binary} part is found.
b) Just treat ~{n} exactly the same as {n}, unless it's the first part of CATENATE.
Maybe this should be asked about in IMAP mailing list .. (Didn't I already ask something about CATENATE+BINARY combination?)
Yeah:
http://mailman2.u.washington.edu/pipermail/imap-protocol/2012-June/001787.ht...
No responses :)
It is concerning because RFC 4466 indicates that literal8's are
allowed for both APPEND and MULTIAPPEND, which is essentially an
extended APPEND. But RFC 4469 defines CATENATE TEXT as literal only:
RFC 4466:
  append-data = literal / literal8 / append-data-ext
RFC 4469:
  append-data =/ "CATENATE" SP "(" cat-part *(SP cat-part) ")"
  cat-part = text-literal / url
  text-literal = "TEXT" SP literal
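To make the strict-ABNF reading concrete, here is a small Python sketch that checks whether a cat-part header matches the RFC 4469 `text-literal` rule quoted above. The helper name is hypothetical, and accepting the non-synchronizing `{n+}` form is an assumption based on the LITERAL+ extension rather than anything in RFC 4469 itself.

```python
import re

# "TEXT" SP literal, where the literal header is "{" number "}".
# The optional "+" (LITERAL+) is an assumption added for illustration.
_TEXT_LITERAL_HEADER = re.compile(rb"TEXT \{\d+\+?\}")


def is_rfc4469_text_header(header: bytes) -> bool:
    """Return True if header looks like 'TEXT {n}' or 'TEXT {n+}'."""
    return _TEXT_LITERAL_HEADER.fullmatch(header) is not None


for candidate in (b"TEXT {40+}", b"TEXT {40}", b"TEXT ~{40}"):
    print(candidate, is_rfc4469_text_header(candidate))
```

Under this reading the `TEXT ~{40}` form from the failing command is simply outside the grammar, independent of what the 40 octets contain.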
To me CATENATE =~ MULTIAPPEND - it is just another form of an extended
APPEND. Not sure why it shouldn't be allowed there. But from a
strict ABNF standpoint, you are correct that I shouldn't be sending
literal8's. I'll ask myself on the IMAP list why this design choice
was made.
For the record... given the varying levels of BINARY support in
different IMAP servers (UW IMAP is flat-out broken), I've gone ahead
and bit the bullet and we now pre-scan outgoing append literals for
null characters and only use literal8's when absolutely necessary. I
was probably being too clever for my own good in assuming that I can
just send and assume the server will handle all issues.
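The pre-scan described here amounts to a one-line check per literal. A minimal sketch in Python, assuming a hypothetical helper `literal_header` and a boolean that records whether the server advertised BINARY (this is not our actual library code):

```python
def literal_header(data: bytes, server_has_binary: bool) -> bytes:
    """Pick the IMAP literal header for an outgoing APPEND literal.

    Uses literal8 (~{n}) only when the data really contains a NUL byte
    and the server advertises BINARY; otherwise a regular literal ({n}).
    Without BINARY, NUL-containing data is still sent as a regular
    literal, as-is.
    """
    if b"\x00" in data and server_has_binary:
        return b"~{%d}\r\n" % len(data)
    return b"{%d}\r\n" % len(data)


print(literal_header(b"plain US-ASCII text", server_has_binary=True))
print(literal_header(b"binary\x00payload", server_has_binary=True))
```

The cost is one extra pass over the data on the client, which is the overhead the original always-literal8 optimization was trying to avoid.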
With that being said... I was able to reliably reproduce a parsing
issue in Dovecot 2.2.x when doing a MULTIAPPEND w/literal8's. I need
to track down if this is a single message causing the issue or some
sort of cumulative bug that only appears once you've done something
like 200-300 sequential appends. I can verify that a switch from
literal8 -> literal fixes the issue. I'll try to create a
reproducible test case.
michael
Quoting Michael M Slusarz slusarz@curecanti.org:
It is concerning because RFC 4466 indicates that literal8's are
allowed for both APPEND and MULTIAPPEND, which is essentially an
extended APPEND. But RFC 4469 defines CATENATE TEXT as literal only:
RFC 4466:
  append-data = literal / literal8 / append-data-ext
RFC 4469:
  append-data =/ "CATENATE" SP "(" cat-part *(SP cat-part) ")"
  cat-part = text-literal / url
  text-literal = "TEXT" SP literal
To me CATENATE =~ MULTIAPPEND - it is just another form of an
extended APPEND. Not sure why it shouldn't be allowed there.
Answered my own question here - sure enough, it was an oversight:
http://osdir.com/ml/ietf.imapext/2006-03/msg00030.html
michael
participants (2): Michael M Slusarz, Timo Sirainen