On Fri, 23 Sep 2005, John Peacock wrote:
Timo Sirainen wrote:
The point is to have a mailbox format where the mailbox can consist of one or more files. Grouping multiple mails in a single file makes it faster to read, but it's slower to expunge mails from the beginning of the file. So this format would allow the sysadmin to specify rules on how large the files would be allowed to grow.
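To make the size rule concrete, here is a minimal sketch of how a delivery agent might pick the file to append to. Nothing here is from the proposed format itself: the function name `pick_target_file`, the `m.NNNN` file naming, and the `max_file_size` parameter are all hypothetical.

```python
import os


def pick_target_file(mailbox_dir, message_size, max_file_size):
    """Return the path of the data file a new message should be appended to.

    Hypothetical layout: data files are named m.0001, m.0002, ... and only
    the highest-numbered one is still open for appends.
    """
    # Zero-padded names sort correctly as strings (up to m.9999).
    existing = sorted(n for n in os.listdir(mailbox_dir) if n.startswith("m."))
    if existing:
        last = os.path.join(mailbox_dir, existing[-1])
        # Keep appending while the admin-configured limit is respected.
        if os.path.getsize(last) + message_size <= max_file_size:
            return last
        next_index = int(existing[-1].split(".", 1)[1]) + 1
    else:
        next_index = 1
    return os.path.join(mailbox_dir, "m.%04d" % next_index)
```

With a very small `max_file_size` every message lands in its own file, which behaves much like maildir; with a very large limit everything goes into one file, which behaves much like mbox.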
This seems like a lot of complexity for an unknown amount of performance. Sure, it is going to be loads faster than multi-megabyte mbox mailboxes, but you can color me unconvinced that this will be a significant win over maildir. The primary advantage to maildir is the utter simplicity of all operations; at no time do you need to completely rewrite any files and all operations are 100% atomic. The index format under maildir is also very simple, since you only need to keep track of the filename (and flags) rather than filename and offset and flags. And with modern filesystems, disk access is intelligently cached.
the problem with maildir is that it's fine for a small system but it sucks terribly for large systems in several ways.
the vast number of inodes required is a kernel memory hog. they blow away most backup solutions, and they increase the number of disk seeks required to do anything on the mailbox.
a maildir delivery involves generally at least 4 synchronous writes (and more if your filesystem forces the tmp/ directory changes to be synchronous), compared to a minimum of 2 for mbox.
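for illustration, a rough sketch of a maildir-style delivery with every durability point spelled out as an explicit fsync. this is one plausible accounting, not a statement of what any particular MDA or filesystem does: the function names, the file-name shape and the choice of which steps to sync are assumptions, and on filesystems with synchronous directory updates the syncs happen implicitly instead.

```python
import os
import socket
import time


def _fsync_dir(path):
    # Flush directory metadata so a created, renamed or removed entry
    # survives a crash (works on typical Unix systems).
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def deliver_maildir(maildir, message, seq=0):
    """Deliver one message (bytes); return (final_path, sync_points)."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)

    # Assumed unique name in the common time.pid_seq.host shape.
    name = "%d.%d_%d.%s" % (time.time(), os.getpid(), seq, socket.gethostname())
    tmp_dir = os.path.join(maildir, "tmp")
    new_dir = os.path.join(maildir, "new")
    tmp_path = os.path.join(tmp_dir, name)
    new_path = os.path.join(new_dir, name)
    syncs = 0

    # 1. Write the message into tmp/ and force its data to disk.
    with open(tmp_path, "xb") as f:
        f.write(message)
        f.flush()
        os.fsync(f.fileno())
    syncs += 1

    # 2. Make the new tmp/ directory entry durable.
    _fsync_dir(tmp_dir)
    syncs += 1

    # 3. Move the file into new/ and make that entry durable.
    os.rename(tmp_path, new_path)
    _fsync_dir(new_dir)
    syncs += 1

    # 4. Make the removal from tmp/ durable.
    _fsync_dir(tmp_dir)
    syncs += 1

    return new_path, syncs
```

an mbox append, by comparison, only has to get the appended data and the updated file metadata onto disk, since no directory entries change.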
unless NFS is involved i think maildir is a bad idea... (and i think NFS is a bad idea too, so draw your own conclusions :)
If you are trying to tune for where there are significant numbers of very small (< 2k) files (well smaller than the typical block size in the underlying filesystem), you may be aiming too small. It looks like the median file size in my maildir folders is about 3100 bytes. What sizes were you thinking the typical admin would set as the limit?
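For anyone who wants to check their own numbers, a minimal sketch of how the message-size distribution of a maildir could be measured. The function names and the 4096-byte default block size are assumptions; substitute the block size of your own filesystem.

```python
import os
import statistics


def maildir_message_sizes(maildir):
    """Collect the size in bytes of every message file in new/ and cur/."""
    sizes = []
    for sub in ("new", "cur"):
        directory = os.path.join(maildir, sub)
        if not os.path.isdir(directory):
            continue
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_file(follow_symlinks=False):
                    sizes.append(entry.stat(follow_symlinks=False).st_size)
    return sizes


def summarize(sizes, block_size=4096):
    """Return count, median, mean, and how many files fit in one block."""
    if not sizes:
        return None
    return {
        "count": len(sizes),
        "median": statistics.median(sizes),
        "mean": statistics.mean(sizes),
        # Assumed block size; files below it still occupy a whole block.
        "below_block": sum(1 for s in sizes if s < block_size),
    }
```

Calling `summarize(maildir_message_sizes(path))` on a folder reports the median next to the mean, which matters because a few large attachments can pull the mean well above the median.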
the average tends to be even higher when your system includes lots of general users with lots of word/excel/etc. documents attached to their messages. istr a number in the 10KB range when i was working at a place with ~10M mailboxes.
-dean