[Dovecot] Encoding dovecot keywords
Hello.
If a mailbox name includes non-Latin characters, Dovecot stores it encoded in Modified UTF-7. For example, "текст" is converted to "&BEIENQQ6BEEEQg-". If I replace & with +, I can decode it:
<?php $str = '+BEIENQQ6BEEEQg-'; echo iconv('UTF-7', 'UTF-8', $str); ?>
But I can't figure out which encoding Dovecot uses for keywords. I set "текст" as a keyword (number 5):
cat dovecot-keywords
0 &bcienqrbbeiepgqybdaetw-_&bdwenqrcbdoema-
1 test
2 $label1
3 $label2
4 $label3
5 &beienqq6beeeqg-
but "&beienqq6beeeqg-" does not convert back to "текст". Why?
Quoting Алексей Сундуков public-mail@alekciy.ru:
> But I can't figure out which encoding Dovecot uses for keywords. I set "текст" as a keyword (number 5):
> cat dovecot-keywords
> 0 &bcienqrbbeiepgqybdaetw-_&bdwenqrcbdoema-
> 1 test
> 2 $label1
> 3 $label2
> 4 $label3
> 5 &beienqq6beeeqg-
> but "&beienqq6beeeqg-" does not convert back to "текст". Why?
Because only mailboxes are encoded in Modified UTF-7. There is no
equivalent encoding for keywords. Keywords are *not* meant to be
directly viewable by an end-user. Converting between a keyword and
the representation displayed to the user is the job of the MUA.
michael
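
For readers who want to check the encoding themselves, here is a minimal Python sketch of Modified UTF-7 decoding as used for mailbox names (RFC 3501, section 5.1.3). The function name `decode_mutf7` is illustrative and not part of Dovecot. The sketch also shows one observable reason the stored keyword cannot be decoded: the entry in dovecot-keywords is all lowercase, and the base64 part of Modified UTF-7 is case-sensitive. Whether the client or the server folded the case is not stated in this thread.

```python
import base64


def decode_mutf7(name: str) -> str:
    """Decode an IMAP Modified UTF-7 string (RFC 3501, section 5.1.3)."""
    out = []
    i = 0
    while i < len(name):
        if name[i] != "&":
            out.append(name[i])           # printable ASCII is stored literally
            i += 1
            continue
        end = name.index("-", i)          # a shifted run ends at the next "-"
        chunk = name[i + 1:end]
        if chunk == "":
            out.append("&")               # "&-" is an escaped ampersand
        else:
            b64 = chunk.replace(",", "/")  # "," stands in for "/" in this variant
            b64 += "=" * (-len(b64) % 4)   # restore the stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


mailbox = "&BEIENQQ6BEEEQg-"
print(decode_mutf7(mailbox))              # prints: текст

# The same string with its case folded, as it appears in dovecot-keywords.
folded = decode_mutf7(mailbox.lower())
print(folded == decode_mutf7(mailbox))    # prints: False
```

The lowercased string still decodes without an error, but to unrelated characters, because upper- and lowercase letters are different base64 digits.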
On 9.4.2011, at 0.50, Michael M Slusarz wrote:
> Because only mailboxes are encoded in Modified UTF-7. There is no equivalent encoding for keywords.
Right.
> Keywords are *not* meant to be directly viewable by an end-user. Converting between a keyword and the representation displayed to the user is the job of the MUA.
I don't really agree with that though. It's kind of stupid to have server-side keywords if user still has to configure each client to specify what they mean. It would have been nice if keywords had been strings instead of atoms.
I've been thinking about some day maybe trying to create some kind of a standard for keywordUTF7 <-> UTF8 translation.
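
To make the idea concrete, here is one purely hypothetical shape such a translation could take, sketched in Python. Nothing here is a standard or something Dovecot implements: the `x-u8-` prefix, the choice of base32, and both function names are assumptions made for illustration. Base32 is used instead of the base64 of Modified UTF-7 because its alphabet has a single letter case, so an encoded keyword would survive the kind of case folding visible in the dovecot-keywords listing earlier in the thread.

```python
import base64
import re

PREFIX = "x-u8-"                          # hypothetical marker, not standardized
PLAIN = re.compile(r"[A-Za-z0-9$_.-]+")   # conservative subset of atom characters


def label_to_keyword(label: str) -> str:
    """Turn a display label into an ASCII keyword that tolerates case folding."""
    if PLAIN.fullmatch(label) and not label.lower().startswith(PREFIX):
        return label                      # already usable as a keyword
    b32 = base64.b32encode(label.encode("utf-8")).decode("ascii")
    return PREFIX + b32.rstrip("=").lower()


def keyword_to_label(keyword: str) -> str:
    """Reverse label_to_keyword(); other keywords are returned unchanged."""
    if not keyword.lower().startswith(PREFIX):
        return keyword
    b32 = keyword[len(PREFIX):].upper()
    b32 += "=" * (-len(b32) % 8)          # restore the stripped padding
    return base64.b32decode(b32).decode("utf-8")


keyword = label_to_keyword("тест")
print(keyword)
print(keyword_to_label(keyword.upper()))  # still decodes after case folding
```

The trade-off is length: base32 produces longer keywords than base64, which matters if a server limits keyword length.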
Quoting Timo Sirainen tss@iki.fi:
>> Keywords are *not* meant to be directly viewable by an end-user. Converting between a keyword and the representation displayed to the user is the job of the MUA.
> I don't really agree with that though. It's kind of stupid to have
> server-side keywords if user still has to configure each client to
> specify what they mean. It would have been nice if keywords had been
> strings instead of atoms.
I totally agree with this statement. It's just that there isn't any
IMAP standard in place to do this kind of thing. Theoretically you
could do some mapping via the METADATA extension but this won't work
cross-MUA due to the lack of a standard.
Example: in the recently released IMP 5 (shameless plug), we ignore
all non-standard flags and unknown keywords by default, since much of
the time the bare keywords aren't all that useful (e.g. $Label1).
There's an option for advanced users to show these raw keywords, but
determination of what keywords go with what label necessarily needs to
be done by the user.
> I've been thinking about some day maybe trying to create some kind
> of a standard for keywordUTF7 <-> UTF8 translation.
I am all for this. I would be more than willing to help you draft the
standard.
michael
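
The client-side approach described in the message above (hide unmapped keywords by default, let the user decide which keyword goes with which label) can be sketched as a lookup table kept by the MUA. This is not IMP's code. The names `USER_LABELS`, `visible_labels`, and the `show_raw` option are hypothetical, and the mapping itself is invented example data.

```python
# Per-user mapping from raw keyword to display label (invented example data).
USER_LABELS = {
    "$label1": "Important",
    "$label2": "Work",
}


def visible_labels(flags, user_labels, show_raw=False):
    """Return the labels a client would display for one message's flags."""
    shown = []
    for flag in flags:
        if flag.startswith("\\"):
            continue                      # system flags such as \Seen are not labels
        label = user_labels.get(flag.lower())
        if label is not None:
            shown.append(label)
        elif show_raw:
            shown.append(flag)            # advanced option: expose unmapped keywords
    return shown


flags = ["\\Seen", "$Label1", "$label3", "test"]
print(visible_labels(flags, USER_LABELS))                 # ['Important']
print(visible_labels(flags, USER_LABELS, show_raw=True))  # ['Important', '$label3', 'test']
```

Because the table lives in each client, a second MUA connecting to the same account sees only the raw keywords, which is the cross-MUA limitation raised in this thread.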
OK, but what if I write my own MUA? If every MUA uses its own encoding format, I think that is a very, very bad approach, because one application can't decode keywords written by another.
Does anyone know a solution to this problem?
On 9.4.2011, at 13.41, Алексей Сундуков wrote:
> OK, but what if I write my own MUA? If every MUA uses its own encoding format, I think that is a very, very bad approach, because one application can't decode keywords written by another.
Yeah, it sucks.
> Does anyone know a solution to this problem?
No. You could try proposing a standard though. http://www.washington.edu/imap/lists/imap-protocol.html
Hi.
2011/4/10 Timo Sirainen tss@iki.fi
> On 9.4.2011, at 13.41, Алексей Сундуков wrote:
>> OK, but what if I write my own MUA? If every MUA uses its own encoding format, I think that is a very, very bad approach, because one application can't decode keywords written by another.
> Yeah, it sucks.
Well, it could be worse. Dovecot's max length is configurable. We just need atom-compatible keywords. For languages written in Latin script, this could be doable. For other languages, it sucks. It would be better if keywords used that modified UTF-7 thing, just like folder names, but that is not the standard :-(
>> Does anyone know a solution to this problem?
> No. You could try proposing a standard though. http://www.washington.edu/imap/lists/imap-protocol.html
This is cool!
Well, changing IMAP keywords to modified UTF-7, plus specifying a minimum size that IMAP servers must support and advertising the maximum size in capabilities, would be a not too intrusive change, right?
Regards. Erny
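
As a reference point for what "atom compatible" means: RFC 3501 defines a keyword as an atom, that is, one or more 7-bit characters excluding controls, space, and a handful of special characters. The check below is a minimal Python sketch of that rule. `is_imap_atom` is an illustrative name, and server-specific limits such as a maximum keyword length are not modeled.

```python
# Characters RFC 3501 excludes from atoms, besides controls and non-ASCII.
ATOM_SPECIALS = set('(){ %*"\\]')


def is_imap_atom(text: str) -> bool:
    """Return True if text matches the RFC 3501 `atom` production."""
    if not text:
        return False
    for ch in text:
        code = ord(ch)
        if code <= 0x1F or code >= 0x7F:  # CTL, DEL, and anything non-ASCII
            return False
        if ch in ATOM_SPECIALS:
            return False
    return True


print(is_imap_atom("$label1"))            # True
print(is_imap_atom("&beienqq6beeeqg-"))   # True: pure ASCII, so a legal keyword
print(is_imap_atom("тест"))               # False: non-ASCII cannot appear in an atom
print(is_imap_atom("two words"))          # False: space is an atom-special
```

This is why a client that wants a non-Latin keyword has to pick some ASCII encoding of its own, as the keyword listing in the first message shows.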
participants (4): Ernesto Revilla Derksen, Michael M Slusarz, Timo Sirainen, Алексей Сундуков