On Thu, Jun 28, 2012 at 1:21 PM, Timo Sirainen <tss@iki.fi> wrote:
On 28.6.2012, at 20.14, Timo Sirainen wrote:
"An upshot of the way alternate storage works is that any given storage file (mailboxes/<folder>/dbox-Mails/u.* (sdbox) or storage/m.* (mdbox)) can only appear *either* in the primary storage area *or* the alternate storage area but not both — if the corresponding file appears in both areas then there is an inconsistency."
Whoever wrote that wasn't exactly correct (or clear). There's no problem having the same file in both primary and alt storage. There's a problem only if the two files differ, but that shouldn't happen.
Hmm. Although looking at the mdbox index rebuilding code:
/* duplicate file. either readdir() returned it twice (unlikely) or it exists in both alt and primary storage. to make sure we don't lose any mails from either of the files, give this file a new ID and rename it. */
It probably shouldn't be doing that. sdbox isn't doing that:
/* we were supposed to open the file in alt storage, but it exists in primary storage as well. skip it to avoid
adding it twice. */
That's probably due to the different structures they use. sdbox can safely use either because each email message has a unique filename, and if it exists in both places it doesn't matter.
mdbox is different, though: multiple messages are stored in a single file, and the index records which file each message is in. When the data is moved to alt storage, the filename can change, in which case the index is updated. For example:
Primary/Msg06282012 -- contains Msg007, Msg008, Msg009
Primary/Msg06272012 -- contains Msg004, Msg005, Msg006
Primary/Msg06262012 -- contains Msg001, Msg002, Msg003
Along comes archiving and the new format is:
Primary/Msg06292012 -- contains Msg010, Msg011, Msg012
Primary/Msg06282012 -- contains Msg007, Msg009
Primary/Msg06272012 -- contains Msg004, Msg006
Primary/Msg06262012 -- contains Msg003
Alt/Msg06292012 -- contains Msg001, Msg002, Msg005, Msg008
Since the archive rules can be based on a lot of different scenarios (and a message can even be archived from the command line), the filenames in Primary and Alternate are not the same, and in fact the same filename in each place could hold different messages. This happens, for example, if messages are archived when a user sets an IMAP flag on them.
So with the way it's written now, it's not possible to have a simple fallback by filename.
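To make the problem concrete, here is a rough Python sketch using the post-archiving layout above. It is only a toy model, not Dovecot code: the dictionaries, the filename-only `index`, and `naive_lookup` are all hypothetical names made up for illustration.

```python
# Toy model of the layout after archiving (file names from the example above).
primary = {
    "Msg06292012": {"Msg010", "Msg011", "Msg012"},
    "Msg06282012": {"Msg007", "Msg009"},
    "Msg06272012": {"Msg004", "Msg006"},
    "Msg06262012": {"Msg003"},
}
alt = {
    "Msg06292012": {"Msg001", "Msg002", "Msg005", "Msg008"},
}

# Hypothetical index that records only the filename, not the storage area.
index = {"Msg010": "Msg06292012", "Msg001": "Msg06292012"}


def naive_lookup(msg):
    """Pick the first area that has a file with the indexed name.

    Returns (area_name, message_is_really_in_that_file).
    """
    filename = index[msg]
    for area_name, area in (("primary", primary), ("alt", alt)):
        if filename in area:
            # The first matching filename wins, whether or not the
            # message is actually stored inside that file.
            return area_name, msg in area[filename]
    return None, False


print(naive_lookup("Msg010"))  # file found in primary, message is there
print(naive_lookup("Msg001"))  # file found in primary, message is NOT there
```

Both lookups stop at Primary/Msg06292012, but only Msg010 is actually in it; Msg001 lives in the alt file with the same name. That is why the index has to know which storage area a file is in.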
It would be possible if the naming convention was strictly enforced, i.e. after archiving you have:
Primary/Msg06292012 -- contains Msg010, Msg011, Msg012
Primary/Msg06282012 -- contains Msg007, Msg009
Primary/Msg06272012 -- contains Msg004, Msg006
Primary/Msg06262012 -- contains Msg003
Alt/Msg06282012 -- contains Msg008
Alt/Msg06272012 -- contains Msg005
Alt/Msg06262012 -- contains Msg001, Msg002
Now the index only has to say which file a message is in and doesn't have to specify primary or alternate: the primary file with that name can be checked first, and if the message is not there, the alternate file is checked.
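A minimal sketch of that fallback, again in Python and again only a toy model: the dictionaries mirror the enforced-naming layout above, and `index` and `locate` are hypothetical names, not anything from the mdbox code.

```python
# Toy model of the layout with strictly enforced file names.
primary = {
    "Msg06292012": {"Msg010", "Msg011", "Msg012"},
    "Msg06282012": {"Msg007", "Msg009"},
    "Msg06272012": {"Msg004", "Msg006"},
    "Msg06262012": {"Msg003"},
}
alt = {
    "Msg06282012": {"Msg008"},
    "Msg06272012": {"Msg005"},
    "Msg06262012": {"Msg001", "Msg002"},
}

# Hypothetical index: message -> filename only, with no storage area recorded.
index = {
    msg: filename
    for area in (primary, alt)
    for filename, msgs in area.items()
    for msg in msgs
}


def locate(msg):
    """Check the primary file with the indexed name first, then the alt one."""
    filename = index[msg]
    for area_name, area in (("primary", primary), ("alt", alt)):
        if msg in area.get(filename, set()):
            return area_name, filename
    return None  # the index points at a file that does not hold the message


print(locate("Msg003"))  # still in primary
print(locate("Msg001"))  # moved to alt, same file name
```

Because a message keeps its file name when it is archived, the index entry never has to change; only the order in which the two areas are checked matters.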