[Dovecot] Sieve rule issue with certain character sets
I have a global sieve rule in place to filter mailing-lists. This has worked well so far. Recently however one subscriber on a list seems to create strange character set encodings in the 'From' and 'To' headers. This leads to unprocessed/unfiltered mails (no errors thrown). Is this a configuration or Pigeonhole issue (latest HG used)?
Headers from failing mail:
From: "=?UTF-8?B?VG9yaW50aGllbA==?=" <user@domain.tld>
To: "=?UTF-8?B?ImJpbmQtdXNlcnNAbGlzdHMuaXNjLm9yZyI=?=" <bind-users@lists.isc.org>
Relevant sieve rule:
if allof (address :is ["To","CC"] ["bind-users@lists.isc.org","bind-users@isc.org","comp-protocols-dns-bind@isc.org"],
          header :contains "List-Id" "bind-users.lists.isc.org") {
    fileinto "Public/Mailing-Lists/Bind-Users";
}
Regards Thomas
On 12/31/2010 12:25 PM, Thomas Leuxner wrote:
I executed the following to investigate this issue:
sieve-test -t - -Tlevel=matching ~/frop.sieve ~/frop.eml
## Started executing script 'frop'
3: address test
3: starting `:is' match with `i;ascii-casemap' comparator:
3: extracting `To' headers from message
3: parsing address header value `""bind-users@lists.isc.org"" <bind-users@lists.isc.org>'
3: extracting `all' part from non-address value `""bind-users@lists.isc.org"" <bind-users@lists.isc.org>'
3: matching value `""bind-users@lists.isc.org"" <bind-users@lists.isc.org>'
3: with key `bind-users@lists.isc.org' => 0
3: with key `bind-users@isc.org' => 0
3: with key `comp-protocols-dns-bind@isc.org' => 0
3: extracting `CC' headers from message
3: finishing match with result: not matched
3: jump if result is false
3: jumping to line 5
## Finished executing script 'frop'
Performed actions:
(none)
Implicit keep:
- store message in folder: INBOX
sieve-test(stephan): Info: final result: success
Apparently, the MIME-encoded part of those address headers includes double quotes, duplicating the ones surrounding the encoded part already. As can be seen from the above trace, this decodes into an invalid address representation, causing Pigeonhole to handle it as opaque text.
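To see where the doubled quotes in the trace come from, the To header can be decoded by hand. What follows is a minimal Python sketch, not Pigeonhole's code: decode_b_words is a hypothetical helper that handles only the B (base64) encoding used in this message, and the final comparison merely imitates a case-insensitive :is match against the whole decoded value.

```python
import base64
import re

# Matches an RFC 2047 encoded-word that uses the B (base64) encoding.
# Q-encoded words are deliberately ignored in this sketch.
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[bB]\?([^?]*)\?=")


def decode_b_words(value):
    """Replace every B-encoded word in value with its decoded text."""
    def _decode(match):
        charset, payload = match.group(1), match.group(2)
        return base64.b64decode(payload).decode(charset)
    return ENCODED_WORD.sub(_decode, value)


raw_to = '"=?UTF-8?B?ImJpbmQtdXNlcnNAbGlzdHMuaXNjLm9yZyI=?=" <bind-users@lists.isc.org>'

# Decoding the whole header before parsing puts the encoded quotes
# directly next to the quotes that were already in the header.
decoded_to = decode_b_words(raw_to)
print(decoded_to)  # ""bind-users@lists.isc.org"" <bind-users@lists.isc.org>

# Treated as opaque text, the decoded value equals none of the rule's keys.
keys = [
    "bind-users@lists.isc.org",
    "bind-users@isc.org",
    "comp-protocols-dns-bind@isc.org",
]
matches = [decoded_to.lower() == key.lower() for key in keys]
print(matches)  # [False, False, False]
```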
If those quotes are really supposed to be part of the 'phrase' part of the e-mail address, I think those should have been escaped somehow. I don't think that encoding can be an alternative for that. Timo, any thoughts?
Regards,
Stephan.
On Sat, 2011-01-01 at 11:35 +0100, Stephan Bosch wrote:
To: "=?UTF-8?B?ImJpbmQtdXNlcnNAbGlzdHMuaXNjLm9yZyI=?=" <bind-users@lists.isc.org>
I think this is a valid address..
I think we need to change the parsing code here. Don't use mail_get_first_header_utf8() or mail_get_headers_utf8() if you intend to parse the value. First parse the addresses, then convert the display-names to UTF8 if necessary. I'll change the sorting code to do this too.
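The ordering suggested here (parse first, decode the display name afterwards) can be illustrated outside Dovecot. The sketch below uses Python's standard email package as a stand-in for Dovecot's address parser, so it only demonstrates the idea and says nothing about the actual C code; decode_display_name is a hypothetical helper.

```python
from email.header import decode_header
from email.utils import parseaddr

raw_to = '"=?UTF-8?B?ImJpbmQtdXNlcnNAbGlzdHMuaXNjLm9yZyI=?=" <bind-users@lists.isc.org>'

# Step 1: parse the raw (still encoded) header. The surrounding
# quoted-string delimits the display name, so the address is found.
raw_name, addr = parseaddr(raw_to)


def decode_display_name(name):
    """Decode RFC 2047 encoded-words in an already extracted display name."""
    parts = []
    for chunk, charset in decode_header(name):
        if isinstance(chunk, bytes):
            parts.append(chunk.decode(charset or "ascii"))
        else:
            parts.append(chunk)
    return "".join(parts)


# Step 2: decode only the display name, after parsing is finished.
display_name = decode_display_name(raw_name)

print(addr)          # bind-users@lists.isc.org
print(display_name)  # "bind-users@lists.isc.org" (quotes are part of the name)
```

With this ordering the extra quotes end up inside the display name, where they are harmless, and the address itself is available for matching.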
On 1/1/2011 12:21 PM, Timo Sirainen wrote:
I gave RFC2047 and RFC822 a quick read. The phrase part is a sequence of one or more `word' syntax items. In Section 5 of RFC2047,
http://tools.ietf.org/html/rfc2047#section-5
at point (3), the `encoded-word' syntax is mentioned as a replacement for the word syntax within a phrase part. In RFC822 the word syntax is either an atom or a quoted-string. In the situation above, obviously, a quoted-string is used. In the item list that follows in Section 5 of RFC2047, however, the above situation is explicitly forbidden:
- An 'encoded-word' MUST NOT appear within a 'quoted-string'.
So, there does seem to be a bug in the mailer used by the person sending the message.
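The RFC2047 rule quoted above can be checked mechanically. This is a rough Python sketch under simplifying assumptions: the regular expressions are approximations of the `quoted-string' and `encoded-word' grammars rather than full implementations, and has_encoded_word_in_quotes is a hypothetical helper name.

```python
import re

# Approximation of an RFC 822 quoted-string, allowing backslash escapes.
QUOTED_STRING = re.compile(r'"(?:[^"\\]|\\.)*"')
# Approximation of an RFC 2047 encoded-word (B or Q encoding).
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def has_encoded_word_in_quotes(header_value):
    """Return True if any quoted-string in the value contains an encoded-word."""
    return any(
        ENCODED_WORD.search(quoted.group(0))
        for quoted in QUOTED_STRING.finditer(header_value)
    )


bad_to = '"=?UTF-8?B?ImJpbmQtdXNlcnNAbGlzdHMuaXNjLm9yZyI=?=" <bind-users@lists.isc.org>'
ok_from = "=?UTF-8?B?VG9yaW50aGllbA==?= <user@domain.tld>"

print(has_encoded_word_in_quotes(bad_to))   # True
print(has_encoded_word_in_quotes(ok_from))  # False
```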
Regards,
Stephan.
On 01/01/2011 01:53 PM, Timo Sirainen wrote:
Fixed:
http://hg.rename-it.nl/dovecot-2.0-pigeonhole/rev/99f8dc1e246a
Regards,
Stephan.
Am 01.01.2011 um 14:12 schrieb Stephan Bosch:
Thanks to both of you. Will patch and report back when I see mitigation on the ML. (Depending on the "strange poster" though :) )
Regards Thomas
On Sat, 2011-01-01 at 14:12 +0100, Stephan Bosch wrote:
I think there are other places that need fixing too:
src/lib-sieve/plugins/enotify/mailto/ntfy-mailto.c: if ( mail_get_headers_utf8
src/lib-sieve/plugins/notify/ext-notify-common.c: if ( mail_get_headers_utf8(msgdata->mail, "from", &header) >= 0 )
src/lib-sieve/plugins/vacation/cmd-vacation.c: if ( mail_get_headers_utf8
On 1/1/2011 3:03 PM, Timo Sirainen wrote:
Check.
Fixed:
http://hg.rename-it.nl/dovecot-2.0-pigeonhole/rev/146a2a9d5cb0
Regards,
Stephan
On Sat, 2011-01-01 at 15:24 +0100, Stephan Bosch wrote:
src/lib-sieve/plugins/notify/ext-notify-common.c: if ( mail_get_headers_utf8(msgdata->mail, "from", &header) >= 0 )
Check.
Actually I'm now less sure about this :) It's inserted into message body and intended to be human readable? Then the _utf8() would have been right I guess.
On 01/01/2011 03:33 PM, Timo Sirainen wrote:
Uh, you are right. Fixed:
http://hg.rename-it.nl/dovecot-2.0-pigeonhole/rev/442a5fb51d76
Regards,
Stephan.
D'oh. Now why didn't I reply to the second part?
On 1/1/2011 12:21 PM, Timo Sirainen wrote:
In light of my previous e-mail, I think it suffices to avoid the _utf8() functions whenever the address needs to be parsed. The phrase part is not used, and encoded-words are not allowed in the actual address itself.
Regards,
Stephan
participants (3): Stephan Bosch, Thomas Leuxner, Timo Sirainen