[Dovecot] New mailbox format

Geo Carncross geocar-dovecot at internetconnection.net
Tue Oct 4 17:38:59 EEST 2005


On Fri, 2005-09-23 at 19:18 +0300, Timo Sirainen wrote:
> > I thought that was the point of dovecot-lda, so that you know that 
> > dovecot is the only process touching the message files and can update 
> > the indexes as things change, rather than after the fact.
> 
> Dovecot still can't know that it's the only one, but I had been 
> thinking about adding an option to make Dovecot assume it's the only 
> program modifying the maildir. That should avoid scanning the cur/ 
> directory constantly. Except that probably still requires some changes 
> since up-to-date maildir filenames aren't stored anywhere, but maybe 
> they could be guessed by looking at the base filename and mail's 
> current flags from indexes and generating the most-likely filename 
> based on them (or did I already have that code..? :)

If Dovecot handles both ends of the message store (delivery in and
retrieval out), you should modify the store itself.

Maildir is a great system that solves very real problems (that most of
its opponents seem to believe don't exist).

That doesn't change the fact that it's not terribly easy to model IMAP
on it.

I'd recommend a slightly modified maildir: incompatible, but easy to
convert to/from a compatible maildir as needs change:

DMaildir/tmp
DMaildir/use
DMaildir/fl

That's it.

use/ contains the message IDs and flags and serves the same purpose as
cur/, but there is no new/.

To convert a maildir to DMaildir:
	mark the home directory as sticky to halt delivery
	[wait]
	move messages from new/ to cur/
	rename cur/ to use/
	mkdir fl/
	unstick the home directory to resume delivery.
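
The conversion steps above can be sketched roughly as follows. This is
only an illustration, not tested against a real delivery agent: the
function name and the wait_seconds parameter are made up, it assumes the
delivery agent defers mail while the home directory is sticky, and it
assumes the emptied new/ is removed (the steps don't say so, but "there
is no new" and "mkdir new" to convert back imply it).

```python
import os
import stat
import time


def maildir_to_dmaildir(home, maildir, wait_seconds=0):
    """Convert a maildir to the proposed DMaildir layout (hypothetical helper)."""
    mode = stat.S_IMODE(os.stat(home).st_mode)
    # Sticky home directory: assumed to make the delivery agent defer mail.
    os.chmod(home, mode | stat.S_ISVTX)
    try:
        time.sleep(wait_seconds)  # [wait] for in-flight deliveries to finish
        new_dir = os.path.join(maildir, "new")
        cur_dir = os.path.join(maildir, "cur")
        # Move messages from new/ to cur/, keeping their names.
        for name in os.listdir(new_dir):
            os.rename(os.path.join(new_dir, name), os.path.join(cur_dir, name))
        os.rmdir(new_dir)  # assumption: a DMaildir has no new/ at all
        os.rename(cur_dir, os.path.join(maildir, "use"))
        os.mkdir(os.path.join(maildir, "fl"))
    finally:
        # Unstick the home directory so delivery resumes, even on failure.
        os.chmod(home, mode & ~stat.S_ISVTX)
```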

To convert back:
	mkdir new

When opening a mailbox:

	If new/ exists, use it.
	If use/ exists, use that.

	If cur/ doesn't exist but use/ and new/ both exist, rename use/ to
	cur/.
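
A rough sketch of that open logic, under my reading of the rules: new/
existing means the mailbox is (or is being turned back into) a plain
maildir, so the use/ to cur/ rename finishes a "mkdir new" conversion.
The function name and the returned tuple are invented for illustration;
what happens to fl/ on the way back is not specified, so the sketch
leaves it alone.

```python
import os


def open_mailbox(path):
    """Return (layout, message_dir) for the mailbox at path (hypothetical helper)."""
    new_dir = os.path.join(path, "new")
    cur_dir = os.path.join(path, "cur")
    use_dir = os.path.join(path, "use")

    if os.path.isdir(new_dir):
        # new/ exists: treat as a regular maildir. If someone only ran
        # "mkdir new", finish converting back by renaming use/ to cur/.
        if os.path.isdir(use_dir) and not os.path.isdir(cur_dir):
            os.rename(use_dir, cur_dir)
        return ("maildir", cur_dir)
    if os.path.isdir(use_dir):
        # No new/, but use/ exists: this is a DMaildir.
        return ("dmaildir", use_dir)
    raise FileNotFoundError("neither new/ nor use/ found in " + path)
```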

Flags should NOT be encoded into the message filename (so that messages
are never rename()d within use/).

fl/ contains a set of symbolic links. Each message in use/ MAY have a
symlink with the same name in fl/ containing all of the metadata as the
link target (systems without symlinks can use actual files, I suppose,
but do any of those still exist?).

readlink is cheaper than open/read/close, and we already know the
message "identifier".
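
To make the symlink idea concrete, here is a minimal sketch. The flag
encoding (maildir-style letters such as "RS") and the use of tmp/ to
stage the new link are my assumptions for illustration, as are the
function names; nothing here is existing Dovecot code.

```python
import os


def set_flags(dmaildir, msg_id, flags):
    """Store flags as the target of fl/<msg_id>; use/<msg_id> is never renamed."""
    link = os.path.join(dmaildir, "fl", msg_id)
    if not flags:
        # No metadata: a message MAY have a link, so just drop it.
        try:
            os.unlink(link)
        except FileNotFoundError:
            pass
        return
    tmp = os.path.join(dmaildir, "tmp", msg_id + ".fl")
    os.symlink(flags, tmp)  # the target is just a string; it need not exist
    os.rename(tmp, link)    # rename() replaces any old link atomically


def get_flags(dmaildir, msg_id):
    """Read flags with one readlink() call; no open/read/close needed."""
    try:
        return os.readlink(os.path.join(dmaildir, "fl", msg_id))
    except FileNotFoundError:
        return ""  # no link means no flags set
```

Because the update is symlink-then-rename, a reader sees either the old
flags or the new ones, never a half-written value.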

Really, it might be just as slow as bringing stat() back into the
equation, but because you'd be making other metadata available in the
DMaildir, you could make up for it by making the information ACTUALLY
NEEDED more accessible.

> >> This could possibly be also automatically set per mailbox. If user
> >> always expunges all mails at a time (POP3) or never expunges (mailing
> >> list archives), there would be only a single file.
> >
> > If you are thinking that way, it might be useful to mark mailboxes as 
> > either:
> >
> > 	a) sequential access (most POP3 and archives);
> > 	b) random access (IMAP).
> >
> > and provide different tuning based on usage expectations, rather than 
> > trying to actively tune.  Then the system admin could choose which 
> > fits their needs best; we only have a few people using POP3 and 
> > everyone else is IMAP, and I frankly don't care what kind of 
> > performance the POP3 people have.
> 
> I haven't really thought about this yet, but a global setting wouldn't 
> usually be enough for all users and I'm not sure how it can be set 
> per-user or per-mailbox in any useful way.

Worse still, it may be per-time. My access pattern for new messages is
largely delete, but once a message is a few days old, I don't delete
anything anymore.

Many people don't move things out of INBOX... ever.

> >> I think SQL database as a mail store would have much worse performance
> >> than with any filesystem based mail store.
> >
> > Except that with clustering, you could have multiple DB servers 
> > getting hit by multiple dovecot servers.  SQL could potentially scale 
> > to much larger environments than filesystem support.

No it cannot.

The AMOEBA filesystem (bullet) exhibits nearly perfect (linear)
scalability.

Transactions require locks, and locks are slow. Better to look for a
lock-free distributed mailbox.

Even UID generation can be performed without locks, although only in a
very IMAP-specific way.

[[this was covered on the dbmail mailing list a while back]]

I can pull up the references if anyone's interested.

> If you thought about some kind of database that is distributed across 
> multiple computers.. Well, that would probably work, but I haven't 
> heard much of them being in any actual use. Probably because they all 
> cost too much. I don't think PostgreSQL or MySQL had support for these 
> kind of clusters yet?

SQL is never faster than domain-specific data structures, and
distributed SQL is very slow.

Email is constant. Replicating constant blobs is downright trivial
(consider running rsync twice) provided you name them uniquely (easy
enough).
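
A minimal sketch of why this is easy, assuming each message is an
immutable file with a unique name. This is not rsync, just a stand-in
for the same idea; the function name and the ".part" staging suffix are
invented for the example.

```python
import os
import shutil


def replicate(src_dir, dst_dir):
    """Copy messages missing from dst_dir; return the names copied."""
    os.makedirs(dst_dir, exist_ok=True)
    existing = set(os.listdir(dst_dir))
    copied = []
    for name in sorted(os.listdir(src_dir)):
        if name in existing or name.startswith("."):
            continue  # already replicated (content never changes) or a temp file
        tmp = os.path.join(dst_dir, "." + name + ".part")
        shutil.copyfile(os.path.join(src_dir, name), tmp)
        # Publish under the final name only once the copy is complete.
        os.rename(tmp, os.path.join(dst_dir, name))
        copied.append(name)
    return copied
```

Since nothing is ever modified in place, running it a second time picks
up only what arrived during the first pass, and there are no conflicts
to resolve.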

Flags and message location, however, are mutable, and distributing a
directory is hard work.

Nevertheless, I'm a big proponent of split filesystems; make storage one
service and metadata (directory) another.

I am presently using Fedora Directory Server with its multimaster
replication system as a means to have per-site tolerance. Replication is
fairly sluggish, but an IMAP user wouldn't need to notice flag/keyword
replication on the order of seconds (compare with SQL replication which
has been measured in minutes).

There's absolutely no reason LDAP couldn't be used (initially) as a
directory for flags and deletion status.

Furthermore, LDAP's search model is compatible with the IMAP search
model.
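
To illustrate the correspondence: IMAP SEARCH keys are implicitly ANDed
and can be combined with OR and NOT, which maps directly onto LDAP's
(&...), (|...) and (!...) filters. The sketch below assumes a search
that has already been parsed into nested tuples, and the attribute names
(mailFlag, mailFrom, mailSubject) are hypothetical, not from any real
schema.

```python
# Hypothetical attribute names for a per-message directory entry.
ATTRS = {"FROM": "mailFrom", "SUBJECT": "mailSubject"}


def escape(value):
    """Escape the characters that are special in an LDAP filter value."""
    out = []
    for ch in value:
        if ch in "\\*()\0":
            out.append("\\%02x" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


def to_filter(node):
    """Translate a parsed IMAP-style search tree into an LDAP filter string."""
    op = node[0]
    if op == "AND":
        return "(&" + "".join(to_filter(n) for n in node[1:]) + ")"
    if op == "OR":
        return "(|" + "".join(to_filter(n) for n in node[1:]) + ")"
    if op == "NOT":
        return "(!" + to_filter(node[1]) + ")"
    if op == "FLAG":
        return "(mailFlag=" + escape(node[1]) + ")"
    if op in ATTRS:
        # IMAP header keys are substring matches.
        return "(%s=*%s*)" % (ATTRS[op], escape(node[1]))
    raise ValueError("unsupported search key: %r" % (op,))
```

For example, "SEEN NOT DELETED OR FROM timo SUBJECT maildir" would be
passed in as ("AND", ("FLAG", "Seen"), ("NOT", ("FLAG", "Deleted")),
("OR", ("FROM", "timo"), ("SUBJECT", "maildir"))).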

At which point, profiling the beast would answer exactly what could be
sped up.


> There are also clustered filesystems that are distributed across 
> multiple computers. Dovecot already works with them.

Almost all distributed clustered filesystems exhibit lazy replication.
Performance is completely unacceptable if you need UNIX filesystem
semantics. These things are even less forgiving than NFS.

The few that don't (GFS) require specialized hardware or infrastructure,
and really, are more suitable to sharing a disk array with multiple
machines, instead of distributing the content (i.e. local area only).


> > Plus, think of all the neat fulltext indexing possibilities.
> 
> Unfortunately that doesn't help with IMAP at all, and might make it 
> actually more difficult and slower to implement the kind of indexing 
> that would benefit IMAP's SEARCH command.

Agreed.

-- 
Internet Connection High Quality Web Hosting
http://www.internetconnection.net/


