Re: [Dovecot] dbox redesign
On Sat, May 12, 2007 9:10 am, Timo Sirainen <tss@iki.fi> said:
Fast copying
Would be nice if copying a message from one mailbox to another wouldn't require actually reading+writing the whole message contents. But I can't really figure out how to implement this without requiring that there is only a single dbox storage which contains the mails for all the mailboxes, and the mailboxes themselves are just Dovecot's index files containing pointers to the dbox storage.
The problem with having everything in one storage is that if the index files are broken, the messages can't be placed into correct mailboxes anymore.
Although one possibility would be to treat mailboxes a bit like keywords, so that when a message is copied to another mailbox, the message in the dbox file is updated to contain information that it exists in such and such mailboxes. Hmm. Perhaps that would be good enough, yes.
Yes, I think treating mailboxes similarly to keywords is ideal. There really is no reason to physically separate mailboxes on disk. All that is needed is this logical separation, if it can be done in a reliable way.
Or maybe track this in mailbox-specific index files, and also have a corresponding text file that stores a list of the messages contained in that mailbox... similar to maildir's dovecot-uidlist file. Then if you lose the index, you can rebuild it from the text file.
Bill
On Wed, 2007-05-16 at 06:40 -0400, Bill Boebel wrote:
Although one possibility would be to treat mailboxes a bit like keywords, so that when a message is copied to another mailbox, the message in the dbox file is updated to contain information that it exists in such and such mailboxes. Hmm. Perhaps that would be good enough, yes.
Yes, I think treating mailboxes similarly to keywords is ideal. There really is no reason to physically separate mailboxes on disk. All that is needed is this logical separation, if it can be done in a reliable way.
Except that if you want to handle some mailboxes in a special way, it's easier if they're separated on disk. For example, renaming or deleting mailboxes is a lot easier.
Or maybe track this in mailbox-specific index files, and also have a corresponding text file that stores a list of the messages contained in that mailbox... similar to maildir's dovecot-uidlist file. Then if you lose the index, you can rebuild it from the text file.
Except that such a mailbox message-list file could also be counted as an "index file", and losing it again loses the messages :) That's why I thought saving the mailbox name in the message file's headers would be better. If you then lose the mailbox name, you have most likely lost the message itself as well. It also makes it easier to restore individual messages from backups.
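As a rough illustration of the idea above, here is a minimal Python sketch in which each stored message carries its own set of mailbox names, much like keywords. The `SharedStore` and `StoredMessage` classes and their methods are hypothetical and are not Dovecot's dbox format; the sketch only shows that a copy can become a metadata update and that the mailbox index can be rebuilt from the messages alone.

```python
from dataclasses import dataclass, field


@dataclass
class StoredMessage:
    """One message in a single shared store (hypothetical layout, not dbox's)."""
    body: bytes
    # Mailbox names recorded alongside the message, like keywords.
    mailboxes: set[str] = field(default_factory=set)


class SharedStore:
    def __init__(self) -> None:
        self.messages: dict[int, StoredMessage] = {}
        self._next_id = 1

    def save(self, mailbox: str, body: bytes) -> int:
        msg_id = self._next_id
        self._next_id += 1
        self.messages[msg_id] = StoredMessage(body, {mailbox})
        return msg_id

    def copy(self, msg_id: int, dest_mailbox: str) -> None:
        # "Copying" only adds a mailbox entry; the body is not read or rewritten.
        self.messages[msg_id].mailboxes.add(dest_mailbox)

    def rebuild_index(self) -> dict[str, list[int]]:
        # Recover the mailbox -> messages index purely from the stored messages.
        index: dict[str, list[int]] = {}
        for msg_id in sorted(self.messages):
            for name in self.messages[msg_id].mailboxes:
                index.setdefault(name, []).append(msg_id)
        return index


store = SharedStore()
first = store.save("INBOX", b"Subject: hello\r\n\r\nhi")
second = store.save("INBOX", b"Subject: other\r\n\r\nbye")
store.copy(first, "Archive")
index = store.rebuild_index()
```

Because the membership travels with the message, restoring a single message from backup would also restore the information about which mailboxes it belongs to.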
On Wednesday, 16 May 2007, Timo Sirainen wrote:
Yes, I think treating mailboxes similarly to keywords is ideal. There
Except that if you want to handle some mailboxes in a special way, it's easier if they're separated on disk. For example, renaming or deleting mailboxes is a lot easier.They're based on filtering rules. I don't think they support "copying" messages. So the virtual folders are easily rebuilt by just re-applying the filters to all the messages.
Not necessarily, if you add one level of indirection: simply number the mailboxes by index numbers internally and provide a number/name mapping somewhere. This way, a mailbox can be renamed simply by updating the map, and deleted by removing the map entry. Stale index numbers may be left in the messages and could be cleaned up the next time a message's folder list is updated or messages are expunged.
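A small Python sketch of this indirection, under the assumption that each message stores only a set of numeric mailbox IDs. `MailboxMap` and `clean_stale_ids` are made-up names for illustration and do not correspond to any Dovecot API.

```python
class MailboxMap:
    """Hypothetical mailbox ID <-> name map; messages store only the IDs."""

    def __init__(self) -> None:
        self.names: dict[int, str] = {}
        self._next_id = 1

    def create(self, name: str) -> int:
        mailbox_id = self._next_id
        self._next_id += 1
        self.names[mailbox_id] = name
        return mailbox_id

    def rename(self, mailbox_id: int, new_name: str) -> None:
        # Renaming touches only the map, never the messages.
        self.names[mailbox_id] = new_name

    def delete(self, mailbox_id: int) -> None:
        # Deleting drops the map entry; message records keep a stale ID for now.
        del self.names[mailbox_id]


def clean_stale_ids(message_mailbox_ids: set[int], mailbox_map: MailboxMap) -> set[int]:
    # Lazy cleanup, run when a message's folder list is next updated.
    return {i for i in message_mailbox_ids if i in mailbox_map.names}


mailbox_map = MailboxMap()
inbox = mailbox_map.create("INBOX")
work = mailbox_map.create("Work")
message_ids = {inbox, work}  # this message is in both mailboxes

mailbox_map.rename(work, "Projects")
mailbox_map.delete(inbox)
cleaned = clean_stale_ids(message_ids, mailbox_map)
```

In this sketch IDs are never reused, which is what makes it safe to leave stale numbers in the messages until the next cleanup.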
Greetings,
Gunter
On Wednesday, 16 May 2007, Gunter Ohrner wrote:
mailboxes is a lot easier.They're based on filtering rules. I don't think they support "copying" messages. So the virtual folders are easily rebuilt by just re-applying the filters to all the messages.
Whoops, this junk should not have been in the message... Looks as if I accidentally middle-clicked somehow... :-/
Greetings,
Gunter
On Wed, 2007-05-16 at 20:27 +0200, Gunter Ohrner wrote:
Not necessarily, if you add one level of indirection: simply number the mailboxes by index numbers internally and provide a number/name mapping somewhere. This way, a mailbox can be renamed simply by updating the map, and deleted by removing the map entry. Stale index numbers may be left in the messages and could be cleaned up the next time a message's folder list is updated or messages are expunged.
Right. This would also make it use less space inside the dbox files. There already exists a mailbox list index in v1.1 which contains mailbox ID <-> name mappings. But I'm still a bit concerned about its stability. There are two things that could be done:
- Have another human readable mailbox ID <-> name mapping file which is used if the binary index is corrupted. If mailboxes are created/deleted/renamed often, this would just slow things down. Might be a good idea optionally though.
- If the ID <-> name mapping is lost, the mailboxes could be created using those IDs as their names. That would be a lot better than just having all the mails merged into a single mailbox. As additional help, there could be a couple of built-in mailbox IDs for INBOX, Trash and Drafts. Perhaps that could be admin-configurable, but then again adding new IDs could make it conflict with existing ones. Perhaps just a single 1=INBOX would be enough..
The mailbox IDs could have a validity number as well, similar to UIDVALIDITY for message UIDs. That would make sure that it's safe to use the validity+ID combination to uniquely and permanently identify a mailbox, even if the mailbox list mapping was completely rebuilt (in that case it would get a new validity).
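The fallback and the validity number could fit together roughly as in the Python sketch below. The `mailbox-<ID>` naming scheme, the `MailboxKey` type, and the choice to make the validity part of the lookup key are all assumptions for illustration; the thread does not specify how either would be stored.

```python
from typing import NamedTuple


class MailboxKey(NamedTuple):
    validity: int  # changes whenever the mapping is rebuilt from scratch
    mailbox_id: int


BUILTIN_NAMES = {1: "INBOX"}  # the single built-in ID suggested in the thread


def fallback_names(ids_found_in_messages: set[int], new_validity: int) -> dict[MailboxKey, str]:
    """Synthesize a mapping when the real ID <-> name mapping is lost."""
    mapping = {}
    for mailbox_id in sorted(ids_found_in_messages):
        # Unknown mailboxes are named after their ID (naming scheme is an assumption).
        name = BUILTIN_NAMES.get(mailbox_id, f"mailbox-{mailbox_id}")
        mapping[MailboxKey(new_validity, mailbox_id)] = name
    return mapping


# IDs 1, 7 and 12 were found in message records; the old mapping had validity 1.
rebuilt = fallback_names({1, 7, 12}, new_validity=2)
```

Anything that cached a mailbox under the old validity would fail to find it under the new key, which is the signal that the mapping was rebuilt and cached names can no longer be trusted.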
On Saturday, 19 May 2007, Timo Sirainen wrote:
- Have another human readable mailbox ID <-> name mapping file which is used if the binary index is corrupted. If mailboxes are created/deleted/renamed often, this would just slow things down. Might be a good idea optionally though.
The mailbox structure usually does not change that often. Maybe just provide a way to dump/export the current mapping to a specially formatted text file, and a way to manually load/import a provided dump file.
This way, administrators can configure daily cron jobs to dump the current mailbox state and if a mapping really gets lost, a "pretty good" mapping could be reconstructed without any runtime penalty.
- If the ID <-> name mapping is lost, the mailboxes could be created using those IDs as their names.
Yes, for example, with the option to overwrite this synthesized mapping with the latest dump.
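A dump/import pair of this kind might look like the following Python sketch. The tab-separated text format and the function names are assumptions, not an existing Dovecot tool; names containing tabs or newlines would need escaping, which is left out here.

```python
import io


def dump_mapping(mapping: dict[int, str], out) -> None:
    # One "ID<TAB>name" line per mailbox; this text format is an assumption.
    for mailbox_id in sorted(mapping):
        out.write(f"{mailbox_id}\t{mapping[mailbox_id]}\n")


def load_mapping(src) -> dict[int, str]:
    mapping = {}
    for line in src:
        mailbox_id, name = line.rstrip("\n").split("\t", 1)
        mapping[int(mailbox_id)] = name
    return mapping


# A nightly job would write the dump to a file; StringIO stands in for it here.
dump = io.StringIO()
dump_mapping({1: "INBOX", 7: "Lists/dovecot"}, dump)

# After losing the mapping, names are synthesized from IDs, then overwritten
# by whatever the latest dump still knows about.
synthesized = {1: "INBOX", 7: "mailbox-7", 9: "mailbox-9"}
dump.seek(0)
restored = {**synthesized, **load_mapping(dump)}
```

Mailbox 9 was created after the last dump, so it keeps its synthesized name, while mailbox 7 gets its real name back.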
Greetings,
Gunter
Would be nice if copying a message from one mailbox to another wouldn't require actually reading+writing the whole message contents. But I can't really figure out how to implement this without requiring that there is only a single dbox storage which contains the mails for all the mailboxes, and the mailboxes themselves are just Dovecot's index files containing pointers to the dbox storage.
The problem with having everything in one storage is that if the index files are broken, the messages can't be placed into correct mailboxes anymore.
Although one possibility would be to treat mailboxes a bit like keywords, so that when a message is copied to another mailbox, the message in the dbox file is updated to contain information that it exists in such and such mailboxes. Hmm. Perhaps that would be good enough, yes.
Yes, I think treating mailboxes similarly to keywords is ideal. There really is no reason to physically separate mailboxes on disk. All that is needed is this logical separation, if it can be done in a reliable way.
Or maybe track this in mailbox-specific index files, and also have a corresponding text file that stores a list of the messages contained in that mailbox... similar to maildir's dovecot-uidlist file. Then if you lose the index, you can rebuild it from the text file.
This sounds suspiciously like 'virtual folders', that are supported by both Evolution and Thunderbird... how do they do it?
--
Best regards,
Charles
On Wed, 2007-05-16 at 07:47 -0400, Charles Marcus wrote:
Although one possibility would be to treat mailboxes a bit like keywords, so that when a message is copied to another mailbox, the message in the dbox file is updated to contain information that it exists in such and such mailboxes. Hmm. Perhaps that would be good enough, yes.
Yes, I think treating mailboxes similarly to keywords is ideal. There really is no reason to physically separate mailboxes on disk. All that is needed is this logical separation, if it can be done in a reliable way.
Or maybe track this in mailbox-specific index files, and also have a corresponding text file that stores a list of the messages contained in that mailbox... similar to maildir's dovecot-uidlist file. Then if you lose the index, you can rebuild it from the text file.
This sounds suspiciously like 'virtual folders', that are supported by both Evolution and Thunderbird... how do they do it?
They're based on filtering rules. I don't think they support "copying" messages. So the virtual folders are easily rebuilt by just re-applying the filters to all the messages.
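To make the contrast concrete, here is a tiny Python sketch of rule-based virtual folders. The rules and message fields are invented for illustration and say nothing about how Evolution or Thunderbird actually implement the feature; the point is only that folder contents are derived entirely from the rules, so nothing is lost if they have to be recomputed.

```python
from typing import Callable

Message = dict[str, str]
Rule = Callable[[Message], bool]

messages: list[Message] = [
    {"from": "tss@iki.fi", "subject": "dbox redesign"},
    {"from": "someone@example.com", "subject": "lunch"},
    {"from": "someone@example.com", "subject": "Re: dbox redesign"},
]

# Each virtual folder is just a named predicate over messages (hypothetical rules).
rules: dict[str, Rule] = {
    "dbox-thread": lambda m: "dbox" in m["subject"],
    "from-timo": lambda m: m["from"] == "tss@iki.fi",
}


def build_virtual_folders(msgs: list[Message], folder_rules: dict[str, Rule]) -> dict[str, list[int]]:
    # Re-applying every rule to every message fully determines folder contents.
    return {
        name: [i for i, m in enumerate(msgs) if rule(m)]
        for name, rule in folder_rules.items()
    }


folders = build_virtual_folders(messages, rules)
```

A message explicitly copied into a mailbox has no such rule to regenerate it from, which is why the copy case needs the membership stored somewhere durable.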
participants (4)
- Bill Boebel
- Charles Marcus
- Gunter Ohrner
- Timo Sirainen