On Fri, 23 Sep 2005, Timo Sirainen wrote:
> The point is to have a mailbox format where the mailbox can consist of one or more files. Grouping multiple mails in a single file makes it faster to read, but it's slower to expunge mails from the beginning of the file. So this format would allow the sysadmin to specify rules on how large the files would be allowed to grow.
for about a decade now i've set up all my inbound mail to deliver to two mboxes -- one is an "inbox", the other is an "archive". the inbox is what i look at with my MUA, and i delete things from it as soon as i'm done reading or dealing with them. the archive is there so i never have to think about whether i want to save something (and the disk space is totally manageable)... similar to how gmail works.
my "archive" is a collection of mbox files. one named "current" which is where new deliveries occur, and the others named YYYYMMDD.bz2, which are compressed/read-only archived mboxes. the rotation/compression occurs in a cronjob depending on the size of the current file.
it's a bit of a kludge, because the file boundaries are very obvious if you need to find a thread that's spread across a few of them. but it's all just mbox so it's easy to grep and concat a few files into a temporary mbox and extract a thread with any MUA.
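for what it's worth, the rotation step could look roughly like the sketch below. this is not my actual cronjob: the function name, the size threshold and the ".rotating" staging name are all made up for illustration, and it skips the mbox locking that a real setup needs around the rename.

```python
import bz2
import os
import shutil
import time


def rotate_if_large(archive_dir, max_bytes, now=None):
    """Compress 'current' to YYYYMMDD.bz2 once it grows past max_bytes.

    Returns the path of the new archive, or None if nothing was rotated.
    The names and the threshold are illustrative assumptions.
    """
    current = os.path.join(archive_dir, "current")
    if not os.path.exists(current) or os.path.getsize(current) <= max_bytes:
        return None

    stamp = time.strftime("%Y%m%d", time.localtime(now))
    target = os.path.join(archive_dir, stamp + ".bz2")
    if os.path.exists(target):
        return None  # already rotated today; leave it for the next run

    # Move 'current' aside first so new deliveries start a fresh file.
    # A real setup must hold the delivery agent's mbox lock around this.
    staging = current + ".rotating"
    os.rename(current, staging)

    with open(staging, "rb") as src, bz2.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)

    os.chmod(target, 0o444)  # archived mboxes are read-only
    os.remove(staging)
    return target
```

run from cron as something like rotate_if_large("/path/to/archive", 50 * 1024 * 1024); the path and the 50MiB figure are placeholders.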
i've wanted to turn this into a "real" format supported by dovecot for a while but i never seem to get to it... it sounds like you're headed in a similar direction.
> This format is mostly designed to work nicely with separate index files such as Dovecot has. The flags are stored in the mailbox files mostly as a backup in case the index gets lost or corrupted.
here's one point where my thinking has differed -- i'd treat the mailbox files as read-only (plus one file which is append-only) and include an append-only modification log for recovery purposes... read-only mailbox files permit compression, and don't have the nightly cleanup lockout problems you mention later.
the log is affordable in terms of disk space. the mailbox files plus the log are sufficient to recover the current state of the mailbox.
in my case i'd probably grow the log forever... because i'd also never be doing expunges... but the process of doing an expunge can also update any info embedded into the (compressed|uncompressed) mailbox files.
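to make the recovery claim concrete, here's a minimal sketch of replaying a modification log over whatever flags are embedded in the mailbox files. the text record format ("<uid> +flag", "<uid> -flag", "<uid> expunge") is purely hypothetical; i haven't designed a real one.

```python
def replay_log(initial_flags, log_lines):
    """Rebuild per-message flags from a baseline plus an append-only log.

    initial_flags: dict of uid -> set of flags found in the mailbox files
    log_lines: iterable of "<uid> +<flag>", "<uid> -<flag>" or
               "<uid> expunge" records (a made-up format for illustration)
    """
    state = {uid: set(flags) for uid, flags in initial_flags.items()}
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # blank line or a torn record at the end of the log
        try:
            uid = int(parts[0])
        except ValueError:
            continue  # garbage record; skip it rather than abort recovery
        op = parts[1]
        if op == "expunge":
            state.pop(uid, None)
        elif op.startswith("+"):
            state.setdefault(uid, set()).add(op[1:])
        elif op.startswith("-"):
            state.get(uid, set()).discard(op[1:])
    return state
```

because entries are only ever appended, replaying them in file order is enough; later records simply override earlier ones.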
the cost of a log in terms of disk writes for updates is probably better than updates to the mailbox files themselves -- assuming a log entry is compact enough that dozens fit in a 4KiB filesystem block, you can amortize several updates into one synchronous disk write rather than having dozens of synchronous writes to separate blocks.
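a rough sketch of the amortization, assuming a made-up 9-byte fixed-size record (uid, timestamp, flag bits), which would put 455 records in one 4KiB block. the class and field layout are illustrative only; the point is one write() plus one fsync() per batch instead of one per update.

```python
import os
import struct

# Hypothetical record layout: uid, unix timestamp, flag bits -> 9 bytes.
RECORD = struct.Struct("<IIB")
BLOCK = 4096  # typical filesystem block size


class BatchedLog:
    """Append-only log that syncs once per batch of updates."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
        self.pending = []
        self.syncs = 0  # number of synchronous flushes issued so far

    def add(self, uid, when, flags):
        # Queue the update in memory; nothing touches the disk yet.
        self.pending.append(RECORD.pack(uid, when, flags))

    def commit(self):
        # One write and one fsync cover every queued update.
        if not self.pending:
            return
        os.write(self.fd, b"".join(self.pending))
        os.fsync(self.fd)
        self.syncs += 1
        self.pending.clear()

    def close(self):
        self.commit()
        os.close(self.fd)
```

so marking two dozen messages as seen in one session costs a single synchronous write, as long as the caller can wait until commit() before treating the changes as durable.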
> When the file is opened, it must be shared locked with flock(). Expunging mails requires exclusive flock(), so expunges can't happen while someone is still reading the file.
with read-only mailbox files there's no sharing restriction -- an expunge can create a new mbox file, update the index, and unlink the old one. (there's a minor, easy-to-solve race if a reader gets a filename from the index and the file is unlinked before the reader opens it... just loop... expunges should be infrequent enough that this has no livelock potential.)
but i guess that's not very quota friendly... oops.
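here's a sketch of the expunge-by-rewrite idea and the reader's retry loop. everything about the layout is an assumption for illustration: a one-line "index" file naming the live mbox, files named mail.N, and a naive "From " splitter that ignores mbox quoting subtleties.

```python
import os


def split_messages(data):
    """Split mbox bytes at lines starting with 'From ' (naive splitter)."""
    messages, start = [], 0
    while True:
        nxt = data.find(b"\nFrom ", start)
        if nxt == -1:
            if start < len(data):
                messages.append(data[start:])
            return messages
        messages.append(data[start:nxt + 1])
        start = nxt + 1


def read_index(index_path):
    with open(index_path) as f:
        return f.read().strip()


def open_mailbox(dirpath, retries=10):
    """Open the file the index names, looping if an expunge swapped it."""
    index_path = os.path.join(dirpath, "index")
    for _ in range(retries):
        name = read_index(index_path)
        try:
            return open(os.path.join(dirpath, name), "rb")
        except FileNotFoundError:
            continue  # lost the race with an expunge; re-read the index
    raise RuntimeError("mailbox kept changing underneath us")


def expunge(dirpath, keep):
    """Copy surviving messages to a new file, repoint the index, unlink old."""
    index_path = os.path.join(dirpath, "index")
    old = read_index(index_path)
    new = "mail.%d" % (int(old.rsplit(".", 1)[1]) + 1)
    with open(os.path.join(dirpath, old), "rb") as src, \
            open(os.path.join(dirpath, new), "wb") as dst:
        for message in split_messages(src.read()):
            if keep(message):
                dst.write(message)
        dst.flush()
        os.fsync(dst.fileno())
    # Write the new index beside the old one, then swap it in atomically.
    tmp = index_path + ".tmp"
    with open(tmp, "w") as f:
        f.write(new + "\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, index_path)
    os.remove(os.path.join(dirpath, old))
    return new
```

on a POSIX filesystem a reader that already has the old file open keeps reading it after the unlink, so nobody needs a lock. the quota problem shows up here too: the old and new files both exist until the unlink.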
> Compatibility
> If needed, it would be possible to create new/ and tmp/ directories under the mailbox to allow mail deliveries to be in maildir format. The maildir files would then either just be moved or maybe appended to existing mail files.
yeah i've definitely wanted delivery to require no changes -- in my case it just happens to the mbox file named "current".
i think it's best not to deal with indices and other fancy things during delivery because there's no opportunity to amortize synchronous disk writes... and i know in the case of large ISP mail sites there tend to be a lot of users who never read their mail. you'll support more users on less hardware if the synchronous disk writes are kept to a minimum.
-dean