[Dovecot] Unique message IDs?
Hello,
I'm working on indexing my mailbox. What I need is an index that records which messages contain a specified header string. I access the messages via IMAP. I know it sounds like FTS could help me, but no: I don't want to index whole messages, nor do I want to patch the FTS source to make it index headers only.
I need a way to identify a message across my mailbox. The idea is that I can move messages between IMAP folders and the index should still be able to identify them.
What I want to know: will UIDs be unique across all my messages no matter which IMAP folder a message currently belongs to, or can the UID change if I move a message? If I remove dovecot-uidlist from a folder, will the UIDs of the messages in that folder change? And what happens if I put a message into a given IMAP folder (so dovecot-uidlist records a UID for it), then shut down Dovecot and (given the Maildir storage model) move the message's file into another folder in the same mailbox: after I start Dovecot again, will the UID for the message be the same or not?
Thank you in advance, Alexander Chekalin
On 21.11.2011, at 22.02, Alexander Chekalin wrote:
I need a way to identify a message across my mailbox. The idea is that I can move messages between IMAP folders and the index should still be able to identify them.
Message GUIDs are pretty good for that.
What I want to know: will UIDs be unique across all my messages no matter which IMAP folder a message currently belongs to, or can the UID change if I move a message? If I remove dovecot-uidlist from a folder, will the UIDs of the messages in that folder change? And what happens if I put a message into a given IMAP folder (so dovecot-uidlist records a UID for it), then shut down Dovecot and (given the Maildir storage model) move the message's file into another folder in the same mailbox: after I start Dovecot again, will the UID for the message be the same or not?
With Maildir the message GUID is typically the same as the Maildir base filename (i.e. everything before the ':' character). Assuming you're using Dovecot v2.x, when a mail is copied to another mailbox its filename is preserved. So deleting the dovecot* files won't lose the GUID.
The only problem is that if you copy the same mail twice to another mailbox, it of course can't have the same filename twice, so Dovecot will assign it a new filename. But in a new enough version (probably v2.0.something) it still preserves the GUID by writing it to the dovecot-uidlist file. In this situation, if you delete the uidlist, the GUID changes to the new filename.
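As a rough illustration of the filename rule described above, the following shell sketch derives the expected GUID from a Maildir file path. The function name maildir_guid and the sample filenames are made up for this example, and it assumes the GUID has not been overridden by an entry in dovecot-uidlist.

```shell
# Hypothetical helper: derive the GUID Dovecot typically reports for a
# Maildir message, i.e. the base filename up to the first ':'.
maildir_guid() {
  name=${1##*/}                # strip the directory part
  printf '%s\n' "${name%%:*}"  # strip ':' and the flags that follow it
}

# The same (made-up) message before and after a move to another folder
# and a flag change; both calls print 1321912345.M20046P2137.host
maildir_guid 'Maildir/cur/1321912345.M20046P2137.host:2,'
maildir_guid 'Maildir/.Archive/cur/1321912345.M20046P2137.host:2,S'
```

Because only the directory and the flag suffix differ, an index keyed on this value should survive such moves, at least as long as the filename itself is preserved.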
Message GUIDs are pretty good for that.
Oh, thank you! Nice news!
With Maildir the message GUID is typically the same as the Maildir base filename (i.e. everything before ':' character).
But what if I one day decide to convert my Maildirs to mboxes? I really plan to do such a conversion in a while (as soon as I finish the indexing system).
Yours, Alexander
On Tue, 2011-11-22 at 00:47 +0300, Alexander Chekalin wrote:
With Maildir the message GUID is typically the same as the Maildir base filename (i.e. everything before ':' character).
But what if I one day decide to convert my Maildirs to mboxes? I really plan to do such a conversion in a while (as soon as I finish the indexing system).
mbox? or mdbox? With mbox there are no proper GUIDs, but Dovecot kind of fakes it by returning MD5 of specific headers as GUIDs (so not 100% reliable). With mdbox GUIDs work even better than with Maildir, the GUID is always stored in the message's metadata.
I'd like to use the best-optimized one (mdbox), but there is a reason not to: with mbox or Maildir I can see where a given IMAP folder's mails are stored, so if, say, I want to copy only one IMAP folder to some remote site, I just copy a known directory or file. With mdbox this is different; I simply cannot guess where exactly my messages are.
If it were possible to have per-IMAP-folder mdboxes, I'd love to use them. But from what I know and have tried, this is not the way mdbox works, right?
Yours, Alexander Chekalin
On 22.11.2011, at 7.39, Alexander Chekalin wrote:
I'd like to use the best-optimized one (mdbox), but there is a reason not to: with mbox or Maildir I can see where a given IMAP folder's mails are stored, so if, say, I want to copy only one IMAP folder to some remote site, I just copy a known directory or file. With mdbox this is different; I simply cannot guess where exactly my messages are.
If it were possible to have per-IMAP-folder mdboxes, I'd love to use them. But from what I know and have tried, this is not the way mdbox works, right?
sdbox would work like that. The reason mdbox doesn't work like that is that copying messages would then be rather slow.
The idea with mdbox is anyway that you'd use Dovecot's tools to manage the mailboxes rather than access them directly through the filesystem. So if you want to copy one IMAP folder, you'd use either dsync or doveadm import to do it.
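A minimal sketch of the doveadm import route, assuming the usual "doveadm import [-u user] source_location dest_parent search_query" form. The helper name import_folder_cmd, the user alex, and the source path are hypothetical; the function only prints the command so it can be reviewed before being run on the destination server.

```shell
# Hypothetical helper: print (do not run) a doveadm import command that
# copies one folder from another mail location into the user's mailbox.
# Arguments: user, source mail location, folder name.
import_folder_cmd() {
  user=$1 source=$2 folder=$3
  # An empty dest_parent ("") keeps the folder name as it is in the source.
  printf 'doveadm import -u %s %s "" mailbox %s\n' "$user" "$source" "$folder"
}

# Made-up example values:
import_folder_cmd alex mdbox:/backup/alex/mdbox Spam
```

Printing instead of executing is a deliberate choice here: the source location and folder name are easy to get wrong, and the printed line can be pasted into a shell once it looks right.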
Quoting Alexander Chekalin <achekalin@lazurit.com>:
With Maildir the message GUID is typically the same as the Maildir base filename (i.e. everything before ':' character).
But what if I one day decide to convert my Maildirs to mboxes? I really plan to do such a conversion in a while (as soon as I finish the indexing system).
mbox? or mdbox? With mbox there are no proper GUIDs, but Dovecot kind of fakes it by returning MD5 of specific headers as GUIDs (so not 100% reliable). With mdbox GUIDs work even better than with Maildir, the GUID is always stored in the message's metadata.
I'd like to use the best-optimized one (mdbox), but there is a reason not to: with mbox or Maildir I can see where a given IMAP folder's mails are stored, so if, say, I want to copy only one IMAP folder to some remote site, I just copy a known directory or file. With mdbox this is different; I simply cannot guess where exactly my messages are.
If it were possible to have per-IMAP-folder mdboxes, I'd love to use them. But from what I know and have tried, this is not the way mdbox works, right?
You can always use the info from the wiki. I took some code from it to
create this little script, which dumps my spam folder so I can use it
for learning. It basically dumps a mail folder back into one file per
message, much like Maildir.
# Dump every message in the Spam folder into its own file, msg.<uid>.
doveadm search -u "$useraccount" mailbox Spam | while read -r guid uid; do
  doveadm fetch -u "$useraccount" text mailbox-guid "$guid" uid "$uid" > "msg.$uid"
done
If you're using mdbox on the other end, you could reimport them, I
suppose. I haven't looked into doing that, since I haven't needed it
yet.
participants (3)
- Alexander Chekalin
- Patrick Domack
- Timo Sirainen