[Dovecot] Big sites using Dovecot

Joshua Goodall joshua at roughtrade.net
Sat Sep 23 05:50:18 EEST 2006

On Fri, Sep 22, 2006 at 03:28:03PM +0100, David Lee wrote:
> 1. Performance has been sluggish: high load average, probably caused by
> NFS stat activity (itself because of "noac"?).
> 2. Although older Linuxes (e.g. Redhat 9, 2.4.20-43.9) have been OK, more
> recent releases (e.g. FC5, 2.6.16-1) introduced some nasty deadlocking,
> requiring machine reboot every day.  (Unacceptable!)
> We hope dovecot will improve matters.

It will (it should).

> Any advice or comments or experiences?

Use maildir on NFS, not mbox, especially for random-access IMAP
services; it isn't shabby in the POP3 slurpdown scenarios either.
The performance difference is an order of magnitude, and converting
mailboxes between these formats is trivially scriptable.  The only
practical downside to maildir I've experienced is that backing up
lots of tiny files (maildir) is more expensive than backing up one
big file (mbox).
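
To give an idea of what "trivially scriptable" can look like, here is a
minimal sketch using Python's standard mailbox module.  The function name
and paths are made up for illustration, and it assumes nothing is
delivering to the mbox while it runs (it takes no locks), so treat it as
a starting point rather than a finished migration tool.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a maildir.

    Hypothetical helper: assumes the mbox is quiescent (no locking is
    done here). Returns the number of messages copied.
    """
    src = mailbox.mbox(mbox_path, create=False)
    dst = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for msg in src:
            # MaildirMessage() carries over flags it can map
            # (e.g. an mbox "read" status becomes the maildir S flag).
            dst.add(mailbox.MaildirMessage(msg))
            copied += 1
    finally:
        src.close()
        dst.close()
    return copied
```

Run it once per mailbox, e.g. mbox_to_maildir("/path/to/inbox", "/path/to/Maildir"),
and compare message counts before switching users over.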

> Also, in such a set-up (multiple IMAP/Linux NFS-mounting from NetApp) where
> should the dovecot index files be?  NFS from the NetApp?  Or on each Linux
> machine (if so, on disk or ramdisk?)?

I put indices on the filer. Access serializes on dotlocking of
dovecot-uidlist anyway.
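
For anyone unfamiliar with dotlocking, the sketch below shows the classic
NFS-friendly variant of the technique: create a uniquely named temp file,
hard-link it to "<file>.lock", and trust the link count rather than the
return value of link().  This is a generic illustration, not Dovecot's
actual code; the function names and timeout values are my own assumptions.

```python
import os
import socket
import time


def acquire_dotlock(path, timeout=5.0, poll=0.1):
    """Try to take '<path>.lock'; return the lock path, or None on timeout."""
    lock_path = path + ".lock"
    # Unique per host/process/moment so two clients never share a temp name.
    tmp_path = "%s.%s.%d.%d" % (
        lock_path, socket.gethostname(), os.getpid(), time.time_ns())
    fd = os.open(tmp_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    os.close(fd)
    deadline = time.monotonic() + timeout
    try:
        while True:
            try:
                os.link(tmp_path, lock_path)
            except OSError:
                pass  # success is judged by the link count below
            if os.stat(tmp_path).st_nlink == 2:
                return lock_path  # our temp file is now also the lock file
            if time.monotonic() >= deadline:
                return None
            time.sleep(poll)
    finally:
        os.unlink(tmp_path)


def release_dotlock(lock_path):
    """Drop a lock previously returned by acquire_dotlock()."""
    os.unlink(lock_path)
```

Because only one holder of the lock can proceed at a time, every client
touching the same mailbox queues up behind it, which is the serialization
I mean above.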

> > Note: without Trond M's NFS client patchset we see comedy VFS
> > out-of-sync errors after a few days uptime, resulting in mistaken
> > deadlocking of dovecot indices (usually only one user, but always
> > a high-volume user) that needs a node reboot to fix, and some
> > intervention with the "lock status / break" command on the OnTAP
> > command line. With the patches it's been rock solid.
> Could you provide more details?  (I wonder if these are related to
> deadlock problems we see with Washington/IMAP on 2.6.16?)  Are these
> patches in the process of being pushed into the relevant source trees so
> that they will ultimately be unnecessary?

They could well be related issues.  I'm no expert on the Linux NFS
client code itself and I wouldn't wish such a role on anyone :)

Unpatched, we see (after a day or two of uptime) the following kernel
gripe: "do_vfs_lock: VFS is out of sync with lock manager!".  This repeats
a few times, and shortly afterwards the box locks up hard.
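
Since the gripe shows up a little before the lockup, it is worth watching
for.  A rough sketch of a checker follows; only the message text comes
from our logs, while the function name and however you feed it log lines
(a syslog file, dmesg output, ...) are assumptions to adapt to your setup.

```python
GRIPE = "do_vfs_lock: VFS is out of sync with lock manager!"


def count_gripes(lines):
    """Count log lines containing the kernel's out-of-sync complaint."""
    return sum(1 for line in lines if GRIPE in line)
```

For example, count_gripes(open(logfile)) on a kernel log, with a nonzero
result treated as a cue to schedule a reboot before the box hangs.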

Patched, our mailstores are currently at more than two months of uptime,
and the last reboot was for routine maintenance.  The only problem with
this is that nfsstat's counters (e.g. for getattr) have reached 2^31-1
and stopped turning :)
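
As a back-of-envelope check on what a pegged counter implies, the sketch
below computes the average rate needed to reach 2^31-1 within a given
uptime.  The 60-day figure in the usage note is my own round number
standing in for "two months", not a measurement.

```python
COUNTER_MAX = 2**31 - 1  # 2147483647, where the counter stopped


def min_average_rate(uptime_days):
    """Average operations per second needed to hit COUNTER_MAX in this uptime."""
    seconds = uptime_days * 24 * 60 * 60
    return COUNTER_MAX / seconds
```

min_average_rate(60) comes out at roughly 414 per second, so a counter that
pegs within two months means the node averaged at least that many getattr
calls per second.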

Trond's code is often incorporated into the kernel.  I have always
tracked his patchsets on production platforms that rely on NFS,
and for years I have found that his code makes a difference
in stability.

Also note that he works for NetApp, so you can guess what his
interoperability testing might include.


Josh "Koshua" Goodall                      "as modern as tomorrow afternoon"
joshua at roughtrade.net                                       - FW109
