[dovecot] Re: Architectural questions

Thomas Wouters thomas at xs4all.net
Sat Oct 19 18:10:26 EEST 2002


On Sat, Oct 19, 2002 at 05:01:55PM +0300, Timo Sirainen wrote:
> On Sat, 2002-10-19 at 15:38, Thomas Wouters wrote:

> > We briefly played with modifying sendmail and the pop server to avoid the
> > full copy in the common case (only status changes) by doing in-place edits
> > of a pre-generated Status line,

> UW-imapd does this as well, creating "X-Keywords:          " line for
> each mail. I had thought about this first with dovecot too, but since
> mutt rewrote the whole mailbox always I figured I might as well. But
> with larger mailboxes this is really slow, so I think I'll support the
> X-keywords trick myself too.

Well, for POP3 servers the story is a bit different than for IMAP. The typical
use we were seeing was "user", "pass", "list", "retr <new mail>", "quit",
sometimes (for some users) every few minutes. In that case, having to write
an 'RO' at a specific location in a large mbox is oodles more efficient than
copying the whole thing to local disk and back again (which is what the
popserver would do). I'm not sure if it matters much with typical IMAP
usage.
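
To illustrate the in-place edit described above, here is a minimal sketch in
Python. It is not the code from the modified sendmail or pop server: it
assumes the delivery agent has already written a "Status: " header padded
with two spaces, and the function names and the two-byte field width are
made up for the example.

```python
def find_status_field(mbox_path, prefix=b"Status: "):
    """Return the byte offset just past the first 'Status: ' header, or -1.

    Reading the whole file is only acceptable in a sketch; a real server
    would remember the offset from an earlier pass over the headers.
    """
    with open(mbox_path, "rb") as f:
        data = f.read()
    pos = data.find(b"\n" + prefix)
    return -1 if pos < 0 else pos + 1 + len(prefix)


def write_status_in_place(mbox_path, offset, flags=b"RO", width=2):
    """Overwrite the reserved field at `offset` without changing the file size."""
    if len(flags) > width:
        raise ValueError("flags do not fit in the reserved field")
    with open(mbox_path, "r+b") as f:
        f.seek(offset)
        # Pad with spaces so exactly `width` bytes are replaced.
        f.write(flags.ljust(width, b" "))
```

The point is that only `width` bytes are written, however large the mbox is,
instead of the whole file being copied out and back.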

> > as well as avoid full scanning of the mbox file by creating special
> > headers to mark the 'real' length of an email.

> For each mail? Content-Length? With my tests that didn't seem to help
> much, rather made it just slower.. Could be that I just did something
> badly, have to look into it more when I begin optimizing mbox handling
> more. Have to get it at least as fast as UW-imapd :)

Well, if I recall correctly, we added an 'X-Offset' header which pointed to
the exact (relative) byte offset for the next 'From ' line. It made our
pop3d (a modified qpopper 2.3 by the way) a much happier puppy. I'm not sure
what the difference with Content-Length was. I could find the sources, I
suppose; since we disabled mbox-inbox support we aren't using that code
anymore.
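
For illustration only, this is roughly how a reader could use such a header
to hop from message to message without scanning the bodies. It is a sketch,
not the qpopper modification: it assumes the X-Offset value is a decimal byte
count measured from the start of the current message's 'From ' line, which I
am not certain matches what we actually did, and the function name is
hypothetical.

```python
def message_starts(f):
    """Yield the byte offset of each message in a binary mbox file object
    by following X-Offset headers instead of scanning message bodies."""
    pos = 0
    while True:
        f.seek(pos)
        if not f.readline().startswith(b"From "):
            return  # end of file, or the hint did not land on a 'From ' line
        yield pos
        offset = None
        # Read only the header block of this message.
        for line in iter(f.readline, b""):
            if line in (b"\n", b"\r\n"):
                break
            if line.lower().startswith(b"x-offset:"):
                value = line.split(b":", 1)[1].strip()
                if value.isdigit():
                    offset = int(value.decode("ascii"))
        if not offset:
            return  # no usable hint; a real server would fall back to a full scan
        pos += offset
```

Only the headers of each message are read, so a body line that happens to
start with 'From ' is never looked at.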

> > We still have support for mbox mailboxes in a user's homedirectory though,
> > by using procmail and such. So when we needed an IMAP server for use with
> > our webmail (based on SquirrelMail), we were forced to go with the UW-IMAP
> > server, with the maildir patch that's been scattered around the 'net. This

> Hm. Squirrelmail requires SORT extension which Dovecot doesn't support
> yet.

Ah, that's a shame. It means we can't use dovecot for our internal
SquirrelMail+IMAP testing yet :) We likely wouldn't start using dovecot for
production SquirrelMail anytime soon anyway, so it's not a big issue right
now... We'll have to see if our other uses of IMAP require it or not.

> Dovecot doesn't care [about maildir-message filenames] as long as the file
> name stays same before the ':' character.

They do stay the same.

> It would be possible to just assume that there's always someone else
> using the modify log, but each flag change or expunge would always write
> a few bytes to it then, and when log file is switched (there's .log and
> .log.2) it wouldn't be truncated after last process is finished with it
> which is not too bad since after the next switch it will be truncated.

> Also it would be possible not to use index files at all but just keep
> them in memory. I've been fixing code to make this possible and somewhat
> fast.

Hmm. I'd have to look at the code to say for sure, but I think we could live
with keeping them in memory. Accessing the same mailbox from two different
clients at the same time is not something we're too worried about, at the
moment.

> >  - Every user's incoming mailbox is /var/spool/u/s/username.

> Are maildir inboxes also in /var/spool? 

Yes. We don't use the ~/Maildir structure at all. We've always simply used
maildir mailboxes as a direct replacement of mbox mailboxes; a directory
instead of a file, and no sub-boxes :) I guess it's a philosophical
difference. To me, and to my colleagues, everything can be a mailbox, not
just something stored in an arbitrary directory somewhere. I guess we could
change that position, if necessary, but so far it hasn't proven to be.

> With mbox sub-inboxes wouldn't be even possible because dir structure ==
> mailbox structure, and since inbox file exists there can't be inbox-dir
> (except maybe with different case but that's kludgy).

Yes... don't worry, we don't even want to consider mbox-subboxes :)

> I've also thought I might as well make it possible to read the mbox
> inbox from /var/mail or wherever it is. Pretty easy to do, but .lock
> file is problematic if new files can't be added to the /var/mail
> directory.

Our /var/spool/mail subdirectories are mode 01733 (drwx-wx-wt) owned by
root, so creating files and removing them is not an issue, but reading the
directory is. You can of course still check for existence of specific
filenames.
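
As a rough sketch of what that allows: in a directory with write and search
permission but no read permission, a process can still create a
"<mailbox>.lock" file by name and test for it by name, it just cannot list
the directory. The helper names below are hypothetical, and exclusive create
was historically unreliable on older NFS versions, so a production locker
would need more care than this.

```python
import os


def try_dotlock(mailbox_path):
    """Try to create '<mailbox>.lock'; return True if we now hold the lock."""
    try:
        # O_EXCL makes the create fail if the lock file already exists.
        fd = os.open(mailbox_path + ".lock",
                     os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def is_locked(mailbox_path):
    """Check for a specific filename; this needs no directory listing."""
    return os.path.exists(mailbox_path + ".lock")


def release_dotlock(mailbox_path):
    """Remove the lock file created by try_dotlock()."""
    os.unlink(mailbox_path + ".lock")
```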

> Am I right in that CPU usage still isn't any problem but rather the I/O?

Yes. As I said, we use several netapp filers (currently two for /home and
two for /var/spool/mail, with several hundred gigabytes filespace each) and
though they're great boxes, their performance does tend to drop off when they
get flooded with I/O requests :) And they're used by a lot of machines, so
if they are slow to respond, a lot of our services are too.

> >  - The UW-IMAP maildir patch stores UIDs in the individual filenames, using
> >    a 'U' flag. Will this interfere with dovecot ? We don't really need
> >    dovecot and UW-IMAP to share UIDs, but we would like to have an as
> >    painless transition as possible, without having to rename millions of
> >    files to remove the U flag and other flags :P It would also be nice to
> >    keep pine using the existing maildir patch, even though very few
> >    IMAP-users would use pine.

> How exactly does the U flag work? I hope it's before the ':' character
> like Courier's S=filesize? Otherwise U=1234 would be thought of as 6
> different flags which isn't very good since Dovecot reorders them as
> 1234=U.

No, it can't be before the :, because the UID is generated by UW-IMAP, and
the maildir spec says you can't change the unique part of the name, just the
info :) Here are some examples. The ',U*' is the UID.

_k2,6NtZ9.maildrop4.xs4all.nl:2,S,U1030712092
_fmT,O63l8.maildrop8.xs4all.nl:2,RS,U1026644784
990612135.16312.000000002.maildrop2.xs4all.nl:2,S,U991994304
993058841.maildrop7.49267:2,S,U993058888

(In case you're wondering, the first two files were created by standard
procmail, the third by our modified procmail which tries to allow for the
pine/uw-imap maildir patch, and the last is our mail.local's format.)

As long as dovecot doesn't read a different meaning into those flags
(ignoring them is just fine) we should be fine. I don't think we'll have
many customers switching back and forth between dovecot and UW-IMAP, just
people switching from UW-IMAP to dovecot.
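
To make the layout of those names concrete, here is a small sketch that
splits one into its unique part, the standard flags and the ',U*' field. It
is neither dovecot's nor UW-IMAP's parser, the function name is made up, and
it only assumes the shape visible in the examples above: everything after
':2,' is comma-separated, with the standard flag letters first.

```python
def parse_maildir_name(filename):
    """Split a maildir filename into (unique_part, flags, uid).

    `uid` is None when no ',U<digits>' field is present.
    """
    # The unique part may itself contain commas, so split on ':' first.
    unique, sep, info = filename.partition(":")
    flags, uid = "", None
    if sep and info.startswith("2,"):
        fields = info[2:].split(",")
        flags = fields[0]
        for extra in fields[1:]:
            if extra.startswith("U") and extra[1:].isdigit():
                uid = int(extra[1:])
    return unique, flags, uid
```

A reader that only looks at the first field after ':2,' sees 'S' or 'RS' and
can leave the rest of the name untouched.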

> If your clusters access the files through NFS, there should be no
> problem. Except I've never tried Dovecot through NFS, and I'm not sure
> how well mmap()ing works through NFS. I know there's been problems
> before but hopefully they've been fixed already.

I'm not too worried about bugs. I've yet to see a piece of software that we
don't find oodles of small and large bugs in just by installing it and trying
to run it on our client base. That's what testing is for :) But I wouldn't mind
being happily surprised by dovecot, we'll see :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


