On Fri, 23 Sep 2005, dean gaudet wrote:
here's one point where my thinking has differed -- i'd treat the mailbox files as read-only (plus one file which is append-only) and include an append-only modification log for recovery purposes... read-only mailbox files permit compression,
Compressed mboxes have to be read sequentially to be parsed, though. Think about reading a bunch of messages from the end of the mbox: one full decompression for indexing, then very nearly a full decompression for every message retrieval in the batch. You'd think that retrieving a sequential block via IMAP might help, but a lot of MUAs prefer single-message random access.
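A rough Python sketch of that cost, assuming a gzip-compressed mbox (the toy mbox builder, the helper names and the example.invalid address are made up for illustration). Reaching the last message means inflating and discarding everything in front of it:

```python
import gzip
import io

def build_mbox(n):
    """Build a toy mbox of n messages; return (mbox bytes, start offset of each message)."""
    buf = io.BytesIO()
    offsets = []
    for i in range(n):
        offsets.append(buf.tell())
        buf.write(b"From sender@example.invalid Fri Sep 23 00:00:00 2005\n")
        buf.write(b"Subject: message %d\n\n" % i)
        buf.write(b"body of message %d\n\n" % i)
    return buf.getvalue(), offsets

def fetch_from_gzip_mbox(compressed, start, end):
    """Return (bytes [start, end) of the uncompressed mbox, bytes inflated and thrown away)."""
    with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as gz:
        # A gzip stream has no random access: everything before `start`
        # must be decompressed just to be discarded.
        discarded = len(gz.read(start))
        return gz.read(end - start), discarded

mbox, offsets = build_mbox(1000)
compressed = gzip.compress(mbox)

# Fetch only the last message; `wasted` counts the bytes inflated to get there.
last, wasted = fetch_from_gzip_mbox(compressed, offsets[-1], len(mbox))
```

Here `wasted` equals the offset of the last message, i.e. nearly the whole mbox, and that price is paid again for each single-message fetch unless the decompressed data is cached somewhere.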
For the situation you want, read-only archival, my vision would be a compressed maildir (or equivalent), with each mail stored as a separately compressed entry. Zip is a pretty good format for this purpose. Per-file compression is typically only about half as efficient as whole-mbox compression, but search and retrieval would be much faster because each entry has its own compression dictionary and can be decompressed without touching the others.
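A minimal sketch of the per-message layout using Python's standard zipfile module. The entry naming scheme and the helper functions are hypothetical, not a proposed on-disk format. The zip central directory records where each entry starts, so fetching one message inflates only that entry:

```python
import io
import zipfile

def build_archive(messages):
    """Store each message as its own deflate-compressed zip entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in messages.items():
            zf.writestr(name, data)
    return buf.getvalue()

def fetch_message(archive_bytes, name):
    """Look the entry up in the central directory and decompress only that entry."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.read(name)

# Toy messages keyed by a made-up entry name.
messages = {
    "msg%04d" % i: b"Subject: message %d\n\nbody of message %d\n" % (i, i)
    for i in range(1000)
}
archive = build_archive(messages)

# Random access to the last message without inflating the other 999.
one = fetch_message(archive, "msg0999")
```

The tradeoff is the one described above: every entry starts with an empty dictionary, so redundancy shared between messages (common headers, quoted text) is not exploited and the archive comes out larger than a single compressed stream.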
-- 
-- Todd Vierling  tv@duh.org  tv@pobox.com  todd@vierling.name