[Dovecot] New mailbox format

Timo Sirainen tss at iki.fi
Fri Sep 23 17:43:53 EEST 2005


On Fri, 2005-09-23 at 10:01 -0400, John Peacock wrote:
> Timo Sirainen wrote:
> > The point is to have a mailbox format where the mailbox can consist of one
> > or more files. Grouping multiple files in a single file makes it faster to
> > read, but it's slower to expunge mails from the beginning of the file. So
> > this format would allow sysadmin to specify rules on how large the files
> > would be allowed to grow.
> 
> This seems like a lot of complexity for an unknown amount of 
> performance.  Sure, it is going to be loads faster than multi-megabyte 
> mbox mailboxes, but you can color me unconvinced that this will be a 
> significant win over maildir.  The primary advantage to maildir is the 
> utter simplicity of all operations; at no time do you need to completely 
> rewrite any files and all operations are 100% atomic.  The index format 
> under maildir is also very simple, since you only need to keep track of 
> the filename (and flags) rather than filename and offset and flags.  And 
> with modern filesystems, disk access is intelligently cached.

I think it'd need some benchmarking :) It depends quite a lot on the
filesystem, but opening and reading a single file is still a lot faster
on all filesystems than opening and reading thousands of files. Reading
the whole mailbox is probably not a common operation for IMAP clients,
but with POP3 the usual behavior is to just read all new mails and
delete the existing ones.
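
A minimal sketch of such a benchmark, in Python. Everything in it is an
assumption made for illustration: the function names, the message count,
and the use of the 3100-byte median message size quoted above. It only
compares reading N small files against reading one file with the same
bytes. The files have just been written, so they will most likely come
from the page cache; the numbers then mostly reflect per-file open/read
overhead rather than disk seeks.

```python
import os
import tempfile
import time


def write_test_data(root, count, size):
    """Create `count` small files plus one file holding the same bytes."""
    many_dir = os.path.join(root, "many")
    os.mkdir(many_dir)
    payload = b"x" * size
    for i in range(count):
        with open(os.path.join(many_dir, "%06d" % i), "wb") as f:
            f.write(payload)
    single_path = os.path.join(root, "single")
    with open(single_path, "wb") as f:
        for _ in range(count):
            f.write(payload)
    return many_dir, single_path


def read_many(many_dir):
    """Open and read every file in the directory; return total bytes read."""
    total = 0
    for name in sorted(os.listdir(many_dir)):
        with open(os.path.join(many_dir, name), "rb") as f:
            total += len(f.read())
    return total


def read_single(single_path):
    """Read the one large file; return total bytes read."""
    with open(single_path, "rb") as f:
        return len(f.read())


def timed(func, arg):
    """Run func(arg) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(arg)
    return result, time.perf_counter() - start


def run_benchmark(count=2000, size=3100):
    """Return (bytes, seconds) for the many-files and single-file reads."""
    with tempfile.TemporaryDirectory() as root:
        many_dir, single_path = write_test_data(root, count, size)
        many_bytes, many_secs = timed(read_many, many_dir)
        single_bytes, single_secs = timed(read_single, single_path)
    return many_bytes, many_secs, single_bytes, single_secs


if __name__ == "__main__":
    mb, ms, sb, ss = run_benchmark()
    print("many files:  %d bytes in %.4f s" % (mb, ms))
    print("single file: %d bytes in %.4f s" % (sb, ss))
```

Results will differ between filesystems and between cold and warm
caches, so any single run says little on its own.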

Besides just raw throughput, the new format would allow higher
concurrency with multiple clients reading/writing the mailbox. While
maildir theoretically doesn't need any locks, in practice it needs them
with all existing filesystems, or mails get temporarily lost and Dovecot
starts giving errors.

While the maildir format itself is simple, it's actually really
difficult to handle correctly when the maildir is changing under us.
Files can be renamed at any time so you'll have to be prepared to look
for the file's new name at any time. Filesystems also don't work the way
maildir assumes they do, so you have to work around their limitations
too.
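
To show what "look for the file's new name" means in practice, here is
a minimal Python sketch. It assumes the usual maildir naming, where a
message in cur/ is called <unique part>:2,<flags> and a flag change
renames the file. The function names and the retry count are
hypothetical and this is not Dovecot's actual code.

```python
import os


def base_name(filename):
    """Return the part of a maildir filename before the ':2,' flag suffix."""
    return filename.split(":", 1)[0]


def open_message(cur_dir, last_known_name, max_retries=3):
    """Open a message whose flags (and thus filename) may have changed.

    Returns (current_name, file_object), or (None, None) if the message
    could not be found.
    """
    name = last_known_name
    for _ in range(max_retries):
        try:
            return name, open(os.path.join(cur_dir, name), "rb")
        except FileNotFoundError:
            # Renamed or expunged since we last listed the directory:
            # rescan and look for a file with the same unique part.
            wanted = base_name(last_known_name)
            matches = [n for n in os.listdir(cur_dir)
                       if base_name(n) == wanted]
            if not matches:
                return None, None  # not found in this scan
            name = matches[0]
    return None, None
```

Even this is not airtight: the file can be renamed again between the
directory scan and the open(), which is why the sketch retries, and a
scan that finds nothing is treated only as "not found in this scan"
rather than as proof that the message was expunged.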

There are probably also other reasons why people don't like maildir,
which I don't really remember now.

> If you are trying to tune for where there are significant numbers of 
> very small (< 2k) files (well smaller than the typical block size in the 
> underlying filesystem), you may be aiming too small.  It looks like the 
> median file size in my maildir folders is about 3100 bytes.  What sizes 
> were you thinking the typical admin would set as the limit?

I thought a few megabytes per file might be good. Large enough for full
mailbox reads to be fast but not so large that expunging messages from
the middle would cause too much I/O.
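
The trade-off can be put in numbers with a small Python sketch. It
assumes that an expunge is done by shifting everything after the removed
message down within the same file; the message does not say how the
format would actually do it, and the function name is hypothetical.

```python
def bytes_rewritten_by_expunge(message_sizes, expunged_index):
    """Bytes that must be moved when the message at `expunged_index` is
    removed and all later messages in the same file are shifted down."""
    return sum(message_sizes[expunged_index + 1:])


# A file of 1000 messages at the 3100-byte median size quoted above
# is about 3 MB.
sizes = [3100] * 1000
first = bytes_rewritten_by_expunge(sizes, 0)     # 999 * 3100 bytes
middle = bytes_rewritten_by_expunge(sizes, 500)  # 499 * 3100 bytes
last = bytes_rewritten_by_expunge(sizes, 999)    # nothing to move
```

Under that assumption the cost of one expunge is bounded by the file
size limit, which is the reason for capping it at a few megabytes.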

This could possibly also be set automatically per mailbox. If a user
always expunges all mails at once (POP3) or never expunges (mailing
list archives), there would be only a single file.
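
A minimal Python sketch of such a size rule follows. The file naming
(mails.1, mails.2, ...), the function name and the 2 MB default are all
assumptions for illustration; nothing here describes the real on-disk
layout. Passing max_file_size=None models the single-file case.

```python
import os


def file_index(name):
    """Return N for a hypothetical data file named 'mails.N'."""
    return int(name.split(".", 1)[1])


def pick_append_file(mailbox_dir, message_size,
                     max_file_size=2 * 1024 * 1024):
    """Return the path of the file a new message should be appended to.

    The newest file is reused while it stays within max_file_size;
    otherwise a new file is started. max_file_size=None means no limit.
    """
    existing = sorted(
        (n for n in os.listdir(mailbox_dir) if n.startswith("mails.")),
        key=file_index,
    )
    if not existing:
        return os.path.join(mailbox_dir, "mails.1")
    newest = os.path.join(mailbox_dir, existing[-1])
    if (max_file_size is None
            or os.path.getsize(newest) + message_size <= max_file_size):
        return newest
    next_index = file_index(existing[-1]) + 1
    return os.path.join(mailbox_dir, "mails.%d" % next_index)
```

A per-mailbox setting would then only need to choose max_file_size,
for example None for a POP3-only user or a mailing list archive.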

> Personally, I think your time would be better spent integrating a 
> database message store and let the database engine deal with storage and 
> indexing issues.  YMMV. ;-)

I think an SQL database as a mail store would have much worse
performance than any filesystem-based mail store.

Anyway, I wouldn't mind having Dovecot support SQL databases (or other
kinds of databases). If you really want it, you can always pay for it to
get implemented on my work time, which is what's happening with this
mailbox format :)