sieve filtering utf 8 strings

Sergey Schwartz sergey.schwartz at bgoperator.com
Thu Sep 3 12:32:55 UTC 2015


Stephan,

You rock!!!
The extra space is the bad guy :c)

It looks like RoundCube webmail strips extra spaces from the subject in 
the UI.
If I copy/paste the subject from RC, the second space is missing.
Thunderbird shows all spaces as they are in the message source, and 
the filter works just fine now.
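
To illustrate why the pasted text could not match, here is a minimal Python sketch. It assumes the webmail UI collapses each run of whitespace into a single space (the exact mechanism inside RoundCube is not confirmed here), and it uses a plain substring check as a stand-in for what `:contains` does. The helper name `collapse_spaces` is made up for this example.

```python
import re


def collapse_spaces(text: str) -> str:
    """Replace each run of whitespace with one space (assumed UI behaviour)."""
    return re.sub(r"\s+", " ", text)


# Subject as it is in the message source: two spaces before the last word.
actual = "LDS (robot): Лист бронирования  отправлен"

# What a copy/paste from the UI would give under the assumption above.
shown = collapse_spaces(actual)

# A substring test only succeeds when the whitespace agrees exactly.
print(repr(shown))
print("pasted key found in real subject:", shown in actual)
print("exact key found in real subject: ", actual in actual)
```

The key copied from the UI has one space where the real header has two, so a substring match against the real header cannot succeed until the key carries both spaces.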

Regarding the subject itself: I didn't copy the complete source, sorry, 
I didn't think you'd need it.

Here is the complete example:

Subject: =?UTF-8?Q?LDS_(robot):_=D0=9B=D0=B8=D1=81=D1=82?=
  =?UTF-8?Q?_=D0=B1=D1=80=D0=BE=D0=BD=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=BD?=
  =?UTF-8?Q?=D0=B8=D1=8F__=D0=BE=D1=82=D0=BF=D1=80=D0=B0=D0=B2=D0=BB=D0=B5?=
  =?UTF-8?Q?=D0=BD__id=3D12120000443610341?=
  =?UTF-8?Q?8_=D0=B7=D0=B0=D1=8F=D0=B2=D0=BA=D0=B0_845522195<br>=0A?=
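
For anyone who wants to check a header like this outside of Dovecot, here is a small sketch using Python's standard `email.header` module. It is only a way to look at the decoded text, not how Pigeonhole decodes it internally. `RAW_SUBJECT` is the header value above without the `Subject:` prefix, and `decode_subject` is a helper name made up for this example.

```python
from email.header import decode_header, make_header

# Raw Subject value from the message source, folded over five lines.
RAW_SUBJECT = (
    "=?UTF-8?Q?LDS_(robot):_=D0=9B=D0=B8=D1=81=D1=82?=\n"
    "  =?UTF-8?Q?_=D0=B1=D1=80=D0=BE=D0=BD=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=BD?=\n"
    "  =?UTF-8?Q?=D0=B8=D1=8F__=D0=BE=D1=82=D0=BF=D1=80=D0=B0=D0=B2=D0=BB=D0=B5?=\n"
    "  =?UTF-8?Q?=D0=BD__id=3D12120000443610341?=\n"
    "  =?UTF-8?Q?8_=D0=B7=D0=B0=D1=8F=D0=B2=D0=BA=D0=B0_845522195<br>=0A?="
)


def decode_subject(raw: str) -> str:
    """Decode RFC 2047 encoded-words and join them into one string."""
    return str(make_header(decode_header(raw)))


decoded = decode_subject(RAW_SUBJECT)

# repr() makes doubled spaces and the trailing newline visible.
print(repr(decoded))

# Compare the key with one space against the key with two spaces.
print("one space: ", "Лист бронирования отправлен" in decoded)
print("two spaces:", "Лист бронирования  отправлен" in decoded)
```

Printing with `repr()` rather than plain `print()` is deliberate: doubled spaces are easy to miss in normal output.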


dovecot --version
2.2.18 (e157d13efac9)


Best regards,
Sergey Schwartz

Senior System Administrator
Biblio Globus Tour Operator
www.bgoperator.ru

T: +7 495 5042500 ext 1532
E: sergey.schwartz at bgoperator.com

03.09.2015 03:20, Stephan Bosch wrote:
> On 9/2/2015 at 5:03 PM, Sergey Schwartz wrote:
>> Guys,
>>
>> I'm completely stuck, so asking for advice.
>> My user has a sieve script which checks whether a message header contains
>> words in Russian like 'Лист бронирования отправлен'.
>>
>> Pretty simple script:
>>
>> # rule:[Отправлено]
>> if allof (header :contains "subject" "LDS (robot): Лист бронирования отправлен",
>>           header :contains "from" "noreply at bgoperator.com")
>> {
>>      fileinto "Отправлено";
>> }
>>
>> I get no errors when compiling the script or executing it via LMTP, but
>> it doesn't work.
>> Normally the user receives messages from the robot with the subject
>> encoded as quoted-printable:
>>
>> Subject: =?UTF-8?Q?LDS_(robot):_=D0=9B=D0=B8=D1=81=D1=82?=
>>   =?UTF-8?Q?_=D0=B1=D1=80=D0=BE=D0=BD=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=BD?=
>>   =?UTF-8?Q?=D0=B8=D1=8F__=D0=BE=D1=82=D0=BF=D1=80=D0=B0=D0=B2=D0=BB=D0=B5?=
>>
>> When I send a test message with the required words via Thunderbird, sieve
>> works fine and the subject is encoded in base64:
>>
>> Subject:
>> =?UTF-8?B?0JvQuNGB0YIg0LHRgNC+0L3QuNGA0L7QstCw0L3QuNGPINC+0YLQv9GA?=
>>   =?UTF-8?B?0LDQstC70LXQvQ==?=
>>
>> It is the same text, but the encoding is different: base64 works fine
>> and quoted-printable does not.
>> Is it possible to have both supported in sieve?
> Both should be supported. I checked your encoded text using a test suite
> script (see below for a long answer) and it seems that your encoding is
> not what you expect.
>
> This:
>
> Subject: =?UTF-8?Q?LDS_(robot):_=D0=9B=D0=B8=D1=81=D1=82?=
>   =?UTF-8?Q?_=D0=B1=D1=80=D0=BE=D0=BD=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=BD?=
>   =?UTF-8?Q?=D0=B8=D1=8F__=D0=BE=D1=82=D0=BF=D1=80=D0=B0=D0=B2=D0=BB=D0=B5?=
>
> Yields:
>
> "LDS (robot): Лист бронирования  отправле"
>
> Notice the two spaces before отправле and the missing Cyrillic N at the
> end. The two spaces are caused by the double '__' in the third line of
> the encoded subject. The final N in the subject is just not encoded.
>
> This:
>
> Subject: =?UTF-8?Q?LDS_(robot):_=D0=9B=D0=B8=D1=81=D1=82?=
>   =?UTF-8?Q?_=D0=B1=D1=80=D0=BE=D0=BD=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=BD?=
>   =?UTF-8?Q?=D0=B8=D1=8F_=D0=BE=D1=82=D0=BF=D1=80=D0=B0=D0=B2=D0=BB=D0=B5?=
>   =?UTF-8?Q?=D0=BD?=
>
> Yields:
>
> "LDS (robot): Лист бронирования отправлен"
>
> Which is obviously OK.
>
> So, to me, it seems as though the program that creates these messages is
> encoding the wrong text or is messing up the encoding itself.
>
> Regards,
>
> Stephan.
>
>
> LONG ANSWER:
>
> I wrote a little test suite script like this:
>
> <SCRIPT>
> require "vnd.dovecot.testsuite";
>
> test_set "message" text:
> Subject: =?UTF-8?Q?LDS_(robot):_=D0=9B=D0=B8=D1=81=D1=82?=
>   =?UTF-8?Q?_=D0=B1=D1=80=D0=BE=D0=BD=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=BD?=
>   =?UTF-8?Q?=D0=B8=D1=8F__=D0=BE=D1=82=D0=BF=D1=80=D0=B0=D0=B2=D0=BB=D0=B5?=
> From: noreply at bgoperator.com
> To: friep at example.net
>
> Frop!
> .
> ;
>
> test "Test original" {
>      # rule:[Отправлено]
>      if not allof (
>          header :contains "subject" "LDS (robot): Лист бронирования отправлен",
>          header :contains "from" "noreply at bgoperator.com")
>      {
>          test_fail "Failed";
>      }
> }
>
> test_set "message" text:
> Subject: =?UTF-8?Q?LDS_(robot):_=D0=9B=D0=B8=D1=81=D1=82?=
>   =?UTF-8?Q?_=D0=B1=D1=80=D0=BE=D0=BD=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=BD?=
>   =?UTF-8?Q?=D0=B8=D1=8F_=D0=BE=D1=82=D0=BF=D1=80=D0=B0=D0=B2=D0=BB=D0=B5?=
>   =?UTF-8?Q?=D0=BD?=
> From: noreply at bgoperator.com
> To: friep at example.net
>
> Frop!
> .
> ;
>
> test "Test mended" {
>      # rule:[Отправлено]
>      if not allof (
>          header :contains "subject" "LDS (robot): Лист бронирования отправлен",
>          header :contains "from" "noreply at bgoperator.com")
>      {
>          test_fail "Failed";
>      }
> }
> </SCRIPT>
>
> I executed it from the source directory:
>
> $ src/testsuite/testsuite -Tlevel=matching -t - ~/frop.svtest
>
> <OUTPUT>
> Test case: /home/stephan/frop.svtest:
>
>
>        ## Started executing script 'frop.svtest'
>     3: testsuite: test_set command
>     3:   set test parameter 'message' = "Subject:
> =?UTF-8?Q?LDS_(robot):_=D0=9B=D0=B8=D1=81=D1=82?=
>   =?UTF-8?Q?_=D0=B1=D1=80=D0=BE=D0=BD=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=BD?=
>   =?UTF-8?Q?=D0=B8=D1=8F__=D0=BE=D1=82=D0=BF=D1=80=D0=B0=D0=B2=D0=BB=D0=B5?=
> From: noreply at bgoperator.com
> To: friep at example.net
>
> Frop!
> "
>
>    14: ** Testsuite test start: "Test original"
>    16: header test
>    16:   starting `:contains' match with `i;ascii-casemap' comparator:
>    16:   extracting `subject' headers from message
>    16:   matching value `LDS (robot): Лист бронирования  отправле'
>    16:     with key `LDS (robot): Лист бронирования отправлен' => 0
>    16:   finishing match with result: not matched
>    17: jump if result is false
>    17:   jumping to line 20
>    20: testsuite: test_fail command; FAIL current test
>   1: Test 'Test original' FAILED: Failed
>    20: jumping to line 24
>    24: testsuite: test_set command
>    24:   set test parameter 'message' = "Subject:
> =?UTF-8?Q?LDS_(robot):_=D0=9B=D0=B8=D1=81=D1=82?=
>   =?UTF-8?Q?_=D0=B1=D1=80=D0=BE=D0=BD=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=BD?=
>   =?UTF-8?Q?=D0=B8=D1=8F_=D0=BE=D1=82=D0=BF=D1=80=D0=B0=D0=B2=D0=BB=D0=B5?=
>   =?UTF-8?Q?=D0=BD?=
> From: noreply at bgoperator.com
> To: friep at example.net
>
> Frop!
> "
>
>    36: ** Testsuite test start: "Test mended"
>    38: header test
>    38:   starting `:contains' match with `i;ascii-casemap' comparator:
>    38:   extracting `subject' headers from message
>    38:   matching value `LDS (robot): Лист бронирования отправлен'
>    38:     with key `LDS (robot): Лист бронирования отправлен' => 1
>    38:   finishing match with result: matched
>    39: jump if result is false
>    39:   not jumping
>    40: header test
>    40:   starting `:contains' match with `i;ascii-casemap' comparator:
>    40:   extracting `from' headers from message
>    40:   matching value `noreply at bgoperator.com'
>    40:     with key `noreply at bgoperator.com' => 1
>    40:   finishing match with result: matched
>    40: jump if result is false
>    40:   not jumping
>    40: jumping to line 42
>    42: ** Testsuite test end
>
>   2: Test 'Test mended' SUCCEEDED
>        ## Finished executing script 'frop.svtest'
>
> FAIL: 1 of 2 tests failed.
> </OUTPUT>
>
> Regards,
>
> Stephan.
>
>


