[Dovecot] Documentation for "altpath" / "altmove" / ?"Alternate storage"
I was thinking about documentation for alternate storage.
We have a few mentions at:
http://wiki2.dovecot.org/MailboxFormat/dbox http://wiki2.dovecot.org/Tools/Doveadm http://wiki2.dovecot.org/Tools/Doveadm/Altmove
But I was thinking it would be helpful if there could be a page about alternate storage.
I could write the page, but I currently have so many unknowns that I think it would be better to gather some more information before writing the first draft.
Note that I don't necessarily fully understand all the surrounding concepts, so I could have written something below which is based on a (vague) belief. Please correct me if I am wrong, or even if it's sort-of right but doesn't quite hit the point squarely.
We might want to have a name for the wiki page. Perhaps "AlternateStorage"? Of course if it only applies to dbox, then another idea is to make it a section on the page "MailboxFormat/dbox". All suggestions gratefully received.
I would be interested in an overview of what alternate storage is. It seems to be a way of transparently moving message data to alternate storage. The idea is that the alternate storage may be on a different filesystem, one which may be cheaper and/or slower than the main storage, making it an economical way to store messages for which the fastest access is less important. It is transparent inasmuch as IMAP/POP users will not normally be able to tell whether any given message has been moved to alternate storage. A single mail folder can end up containing a mixture of messages stored in main storage and alternate storage.
(This raises another question: what consistent term should be used to refer to the main storage? It could be "main storage", "normal storage", or "ordinary storage".)
Also I would be interested to know the development/stability status of alternate storage: experimental / alpha / beta / stable.
Also I would be interested to know the applicability against mailbox formats: mbox / maildir / sdbox / mdbox / cydir.
Also I would be interested to know which data gets moved to the alternate storage, and which data stays in the main storage: message-data / control-data / index-data / combined-control-and-index-data.
Also I would be interested to know how data can be moved to alternate storage. It seems that this is only done by invoking "doveadm altmove". There is a page for that already, so we can link to it.
Also it would be interesting to have some idea of how it works. For example there might be an "alternate storage" flag in the indexes, or Dovecot tries the main location first and if not found there then it tries the alternate storage. Just a brief overview, and any pertinent ramifications of that.
Bill
On Fri, 2010-09-03 at 11:13 +0100, William Blunn wrote:
We might want to have a name for the wiki page. Perhaps "AlternateStorage"? Of course if it only applies to dbox, then another idea is to make it a section on the page "MailboxFormat/dbox". All suggestions gratefully received.
I think it should be either in the dbox page or a subpage under dbox; other storages are very unlikely to get support for it.
I would be interested in an overview of what alternate storage is. It seems to be a way of transparently moving message data to alternate storage. The idea is that the alternate storage may be on a different filesystem, one which may be cheaper and/or slower than the main storage, making it an economical way to store messages for which the fastest access is less important. It is transparent inasmuch as IMAP/POP users will not normally be able to tell whether any given message has been moved to alternate storage. A single mail folder can end up containing a mixture of messages stored in main storage and alternate storage.
Right. Another way to use it is to store all new mails to a fast but small SSD and then keep moving the mails to HDDs as often as necessary.
(This raises another question: what consistent term should be used to refer to the main storage? It could be "main storage", "normal storage", or "ordinary storage".)
I think I have been calling it "primary storage".
Also I would be interested to know the development/stability status of alternate storage: experimental / alpha / beta / stable.
It's being used for the SSD -> HDD moves by one installation with probably some hundreds of users if not more (I don't know).
Also I would be interested to know the applicability against mailbox formats: mbox / maildir / sdbox / mdbox / cydir.
For cydir it could be implemented, but there's not much point. For mbox/maildir it would be complex.
Also I would be interested to know which data gets moved to the alternate storage, and which data stays in the main storage: message-data / control-data / index-data / combined-control-and-index-data.
Only the message texts get moved to alt storage.
Also I would be interested to know how data can be moved to alternate storage. It seems that this is only done by invoking "doveadm altmove". There is a page for that already, so we can link to it.
With single-dbox the move operation can even be done manually by simply running:
mv /home/user/dbox/mailboxes/INBOX/u.123 /altstorage/user/dbox/mailboxes/INBOX/
I guess it should work just fine, but I still don't recommend that anyone actually do that. It might not work as easily some day in the future, or there may be some race conditions or something else I can't think of right now.
With multi-dbox the moving is more complex, but still it only copies the data in m.* files.
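The single-dbox "mv" above can be sketched in Python to make the path mapping explicit: the file keeps the same relative path under the alternate root. This is only an illustration of the manual move, which the reply above advises against doing on a live system. The function names are hypothetical, nothing here is Dovecot code, and it does no locking, so the race conditions mentioned above still apply.

```python
import shutil
from pathlib import Path


def alt_path_for(primary_root: Path, alt_root: Path, mail_file: Path) -> Path:
    """Return where mail_file would live under alt_root, keeping its relative path."""
    return alt_root / mail_file.relative_to(primary_root)


def move_to_alt(primary_root: Path, alt_root: Path, mail_file: Path) -> Path:
    """Move one u.* file to the mirrored location in alternate storage.

    Hypothetical helper: no locking is done, so this is not safe
    against a running mail server touching the same mailbox.
    """
    target = alt_path_for(primary_root, alt_root, mail_file)
    if target.exists():
        # A file must not end up in both storages, so refuse to overwrite.
        raise FileExistsError(f"{target} already exists in alternate storage")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(mail_file), str(target))
    return target
```

With primary_root set to /home/user/dbox and alt_root set to /altstorage/user/dbox, the file mailboxes/INBOX/u.123 maps to the same destination as the "mv" command above.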
Also it would be interesting to have some idea of how it works. For example there might be an "alternate storage" flag in the indexes, or Dovecot tries the main location first and if not found there then it tries the alternate storage. Just a brief overview, and any pertinent ramifications of that.
Dovecot always looks for the u.123 or m.123 file in primary storage first, and if it's not found there, it looks for it in alternate storage. The state isn't stored in indexes. Really simple.
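The lookup order described above can be sketched as follows. This is a minimal illustration of "primary first, then alternate", not Dovecot's actual implementation; the function name and parameters are hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_mail_file(primary_dir: Path, alt_dir: Optional[Path], name: str) -> Optional[Path]:
    """Look for a mail file (e.g. "u.123" or "m.123") in primary storage first,
    then fall back to alternate storage. No index flag is consulted."""
    candidate = primary_dir / name
    if candidate.is_file():
        return candidate
    if alt_dir is not None:
        # Alternate storage may not be configured at all, hence the None check.
        candidate = alt_dir / name
        if candidate.is_file():
            return candidate
    return None
```

Because the location is found by probing the filesystem rather than by reading a flag, moving a file between the two storages needs no index update. The cost is one failed lookup in primary storage for each file that lives in alternate storage.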
On 03/09/2010 15:01, Timo Sirainen wrote:
With multi-dbox the moving is more complex, but still it only copies the data in m.* files.
Dovecot always looks for the u.123 or m.123 file in primary storage first, and if it's not found there, it looks for it in alternate storage. The state isn't stored in indexes. Really simple.
Thank you for all your other comments, by the way. That is all good stuff and appreciated.
I think the only thing I had further questions on was how alternate storage works with mdbox.
In http://wiki2.dovecot.org/MailboxFormat/dbox it says we have
- dovecot.map.index* files contain the "map index"
- m.* files contain the mail data
I think I am not clear on the "dovecot.map.index* files". Is that really just one file, "dovecot.map.index", with the globby asterisk just indicating that there may be old versions if the file has been updated by being rewritten?
"dovecot.map.index*" is always stored in the primary storage? (Sorry if it sounds like I am asking questions you've already answered. I want to make sure I understand it properly.)
So if we are moving mail to alternate storage, we write them into a new "m.*" file in the alternate storage folder containing the messages we want to move. So that would mean it only really makes sense for any given numbered "m.*" file to exist in either the primary storage or the alternate storage but not both?
So when Dovecot is wanting to find the next unused "m.*" file number, it needs to consider files in both the primary storage and alternate storage?
Bill
On Fri, 2010-09-03 at 17:09 +0100, William Blunn wrote:
I think the only thing I had further questions on was how alternate storage works with mdbox.
In http://wiki2.dovecot.org/MailboxFormat/dbox it says we have
- dovecot.map.index* files contain the "map index"
- m.* files contain the mail data
I think I am not clear on the "dovecot.map.index* files". Is that really just one file, "dovecot.map.index", with the globby asterisk just indicating that there may be old versions if the file has been updated by being rewritten?
It's the same as the dovecot.index files that you see for mailboxes. There are:
- dovecot.index: updated once in a while (doesn't exist initially)
- dovecot.index.log: always exists, and is always updated first
- dovecot.index.log.2: the .log is rotated to this for a while
"dovecot.map.index*" is always stored in the primary storage?
Yes.
So if we are moving mail to alternate storage, we write them into a new "m.*" file in the alternate storage folder containing the messages we want to move.
Either a new m.* file or an existing m.* file in there.
So that would mean it only really makes sense for any given numbered "m.*" file to exist in either the primary storage or the alternate storage but not both?
Right, it's a bug if it exists in both.
So when Dovecot is wanting to find the next unused "m.*" file number, it needs to consider files in both the primary storage and alternate storage?
Yes, but it looks up the next number from the index file, not by scanning which files exist.
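Since a given m.* file existing in both storages is a bug, an administrator might want to check for that condition. The following is a rough sketch of such a check, assuming the two directories that hold the m.* files are passed in directly. The function names are hypothetical and this is not a Dovecot tool.

```python
from pathlib import Path
from typing import List, Set


def mail_file_names(storage_dir: Path) -> Set[str]:
    """Collect the names of the m.* files directly inside storage_dir."""
    return {p.name for p in storage_dir.glob("m.*") if p.is_file()}


def files_in_both(primary_dir: Path, alt_dir: Path) -> List[str]:
    """Return the m.* file names present in both primary and alternate storage.

    An empty list means the "either primary or alternate, never both"
    rule holds for these two directories.
    """
    return sorted(mail_file_names(primary_dir) & mail_file_names(alt_dir))
```

This only scans file names. It does not read the map index, so it says nothing about which file number Dovecot will allocate next.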
On 03/09/2010 11:13, William Blunn wrote:
I was thinking about documentation for alternate storage.
I have added a new section
http://wiki2.dovecot.org/MailboxFormat/dbox#Alternate_storage
and made various changes elsewhere
http://wiki2.dovecot.org/MailLocation/dbox#Alternate_storage http://wiki2.dovecot.org/MailLocation#Format
which should make it a bit easier to discover Alternate storage and have a go with it.
Bill
participants (2): Timo Sirainen, William Blunn