Timo Sirainen wrote:
The point is to have a mailbox format where the mailbox can consist of one or more files. Grouping multiple files in a single file makes it faster to read, but it's slower to expunge mails from the beginning of the file. So this format would allow sysadmin to specify rules on how large the files would be allowed to grow.
This seems like a lot of complexity for an unknown amount of performance. Sure, it is going to be loads faster than multi-megabyte mbox mailboxes, but you can color me unconvinced that this will be a significant win over maildir. The primary advantage of maildir is the utter simplicity of all operations: at no time do you need to completely rewrite any file, and every operation is 100% atomic. The index format under maildir is also very simple, since you only need to keep track of the filename and flags rather than the filename, offset, and flags. And with modern filesystems, disk access is intelligently cached.
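To make the "simple and atomic" point concrete, here is a minimal sketch of the three basic maildir operations in Python. It assumes a POSIX filesystem where rename() within one filesystem is atomic. The function names (deliver, mark_seen, expunge) are made up for illustration, and the unique-name scheme is simplified compared to what real delivery agents generate.

```python
import itertools
import os
import socket
import time

_counter = itertools.count()  # per-process delivery counter (illustrative)


def deliver(maildir, message_bytes):
    """Write the message into tmp/, then rename it into new/."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)
    # Simplified unique name: time, pid, counter, hostname.
    host = socket.gethostname().replace("/", "_").replace(":", "_")
    unique = "%d.P%dQ%d.%s" % (time.time(), os.getpid(), next(_counter), host)
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)
    with open(tmp_path, "xb") as f:  # "x" fails if the name already exists
        f.write(message_bytes)
        f.flush()
        os.fsync(f.fileno())
    # Readers never see a partial message: it appears in new/ all at once.
    os.rename(tmp_path, new_path)
    return new_path


def mark_seen(new_path):
    """Move a message from new/ to cur/ with the Seen flag in its name."""
    maildir = os.path.dirname(os.path.dirname(new_path))
    cur_path = os.path.join(maildir, "cur", os.path.basename(new_path) + ":2,S")
    os.rename(new_path, cur_path)  # a flag change is just a rename
    return cur_path


def expunge(path):
    """Expunging a message is a single unlink; no other file is touched."""
    os.unlink(path)
```

Note that no operation rewrites an existing file, and the flags live in the filename, so an index only has to remember the filename.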
If you are trying to tune for the case where there are significant numbers of very small (< 2 KB) files, well below the typical block size of the underlying filesystem, you may be aiming too small. It looks like the median file size in my maildir folders is about 3100 bytes. What sizes were you thinking the typical admin would set as the limit?
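For anyone who wants to check their own numbers, a rough sketch of how the median could be measured, along with the space lost to block rounding. The 4096-byte block size is only an assumed default, not a measurement of any particular filesystem, and the helper names are hypothetical.

```python
import os
import statistics


def message_sizes(maildir):
    """Collect the sizes in bytes of all messages in new/ and cur/."""
    sizes = []
    for sub in ("new", "cur"):
        directory = os.path.join(maildir, sub)
        if not os.path.isdir(directory):
            continue
        for entry in os.scandir(directory):
            if entry.is_file():
                sizes.append(entry.stat().st_size)
    return sizes


def slack_bytes(size, block_size=4096):
    """Bytes left unused when a file is stored in whole blocks.

    block_size is an assumption; check the actual filesystem.
    """
    blocks = -(-size // block_size)  # ceiling division
    return blocks * block_size - size


def median_size(maildir):
    """Median message size in bytes, or None for an empty maildir."""
    sizes = message_sizes(maildir)
    return statistics.median(sizes) if sizes else None
```

Under that 4096-byte assumption, a 3100-byte message fills one block and leaves 996 bytes unused, so slack_bytes(3100) returns 996.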
Personally, I think your time would be better spent integrating a database message store and letting the database engine deal with storage and indexing issues. YMMV. ;-)
John
--
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4501 Forbes Boulevard
Suite H
Lanham, MD 20706
301-459-3366 x.5010
fax 301-429-5748