[Dovecot] Big sites using Dovecot

Joshua Goodall joshua at roughtrade.net
Sat Sep 23 05:50:18 EEST 2006


On Fri, Sep 22, 2006 at 03:28:03PM +0100, David Lee wrote:
> 1. Performance has been sluggish: high load average, probably caused by
> NFS stat activity (itself because of "noac"?).
> 
> 2. Although older Linuxes (e.g. Redhat 9, 2.4.20-43.9) have been OK, more
> recent releases (e.g. FC5, 2.6.16-1) introduced some nasty deadlocking,
> requiring machine reboot every day.  (Unacceptable!)
> 
> We hope dovecot will improve matters.

It will (it should).

> Any advice or comments or experiences?

Use maildir on NFS, not mbox.  It pays off most for random-access IMAP
services, and it isn't shabby in POP3 slurpdown scenarios either.
The performance difference is an order of magnitude, and converting
mailboxes between the two formats is trivially scriptable.  The only
practical downside to maildir I've experienced is that backing up
lots-of-tiny-files (maildir) is more expensive than one-big-file (mbox).
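
To give an idea of how little scripting the conversion needs, here is a
minimal sketch using Python's standard mailbox module.  The function
name and paths are made up for illustration.  It only copies messages
and their status flags; it does not try to preserve IMAP UIDs or
keywords, and it assumes nothing is delivering to the mbox while it
runs.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a maildir.

    Returns the number of messages copied. The maildir is created
    if it does not exist yet. (Hypothetical helper, not a Dovecot tool.)
    """
    src = mailbox.mbox(mbox_path, create=False)
    dst = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for message in src:
            # MaildirMessage() carries over the mbox Status/X-Status
            # flags (read, replied, flagged, ...) where they map.
            dst.add(mailbox.MaildirMessage(message))
            copied += 1
    finally:
        dst.close()
        src.close()
    return copied
```

Looping this over a list of users' mbox paths is about all a bulk
migration script needs, plus whatever locking your delivery agent
expects.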

> Also, in such a set-up (multiple IMAP/Linux NFS-mounting from NetApp) where
> should the dovecot index files be?  NFS from the NetApp?  Or on each Linux
> machine (if so, on disk or ramdisk?)?

I put indices on the filer.  Access serializes on the dotlock for
dovecot-uidlist anyway, so node-local indices wouldn't buy much.
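
For anyone unfamiliar with dotlocking, here is a rough sketch of the
classic NFS-safe variant: create a uniquely named temp file, hard-link
it to "<file>.lock", and trust the link count rather than the return
code.  This is the generic technique, not Dovecot's actual code; the
function names, timeout and poll interval are my own assumptions.

```python
import os
import socket
import time


def acquire_dotlock(target, timeout=10.0, poll=0.1):
    """Take "<target>.lock" using the link()-based dotlock trick.

    Returns the lock path on success, raises TimeoutError otherwise.
    (Illustrative helper; names and defaults are assumptions.)
    """
    lock_path = target + ".lock"
    # Unique per host/process/time so two nodes never share a temp file.
    tmp_path = "%s.%s.%d.%d" % (
        lock_path, socket.gethostname(), os.getpid(), time.time_ns())
    fd = os.open(tmp_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    os.close(fd)
    deadline = time.monotonic() + timeout
    try:
        while True:
            try:
                os.link(tmp_path, lock_path)
            except OSError:
                pass  # real failure or a lost NFS reply; decide below
            # Two links on our temp file means the link() really
            # succeeded, whatever the client was told.
            if os.stat(tmp_path).st_nlink == 2:
                return lock_path
            if time.monotonic() >= deadline:
                raise TimeoutError("could not lock " + target)
            time.sleep(poll)
    finally:
        os.unlink(tmp_path)  # the lock itself survives as lock_path


def release_dotlock(lock_path):
    """Drop the lock so the next waiter can link its own temp file."""
    os.unlink(lock_path)
```

Whoever holds the lock file is the only writer; everyone else polls
until it goes away, which is why access ends up serialized no matter
where the index files live.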

> > Note: without Trond M's NFS client patchset we see comedy VFS
> > out-of-sync errors after a few days uptime, resulting in mistaken
> > deadlocking of dovecot indices (usually only one user, but always
> > a high-volume user) that needs a node reboot to fix, and some
> > intervention with the "lock status / break" command on the OnTAP
> > command line. With the patches it's been rock solid.
> 
> Could you provide more details?  (I wonder if these are related to
> deadlock problems we see with Washington/IMAP on 2.6.16?)  Are these
> patches in the process of being pushed into the relevant source code so
> that they will ultimately be unnecessary?

They could well be related issues.  I'm no expert on the Linux NFS
client code itself and I wouldn't wish such a role on anyone :)

Unpatched, we see (after a day or two of uptime) the following kernel
gripe: "do_vfs_lock: VFS is out of sync with lock manager!".  This repeats
a few times, and shortly afterwards the box locks up hard.
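
If you want early warning before the lockup, something like the
following could be pointed at the kernel log.  It is only a sketch: the
message text is the one quoted above, while the function name and the
idea of feeding it lines from your syslog file are assumptions about
your setup.

```python
# Kernel message seen on unpatched nodes, as quoted above.
KERNEL_GRIPE = "do_vfs_lock: VFS is out of sync with lock manager!"


def count_vfs_lock_gripes(lines):
    """Count log lines that contain the out-of-sync kernel message.

    `lines` is any iterable of strings, e.g. an open log file.
    """
    return sum(1 for line in lines if KERNEL_GRIPE in line)
```

Run it from cron against wherever your kernel messages end up and
alert on a non-zero count; that should leave time to drain the node
before it hangs.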

Patched, our mailstores are currently at more than two months of
uptime, and the last reboot was for routine maintenance.  The only
problem with this is that nfsstat's counters (e.g. for getattr) have
reached 2^31-1 and stopped turning :)
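
For a sense of scale, a back-of-envelope sketch: a signed 32-bit counter
tops out at 2^31-1 = 2147483647.  Assuming "two months" means roughly
60 days and that the counter started from zero at boot (both
assumptions, not measurements), the implied average call rate works out
as below.

```python
INT32_MAX = 2**31 - 1  # where a signed 32-bit counter stops
SECONDS_PER_DAY = 86400


def average_rate_to_saturate(days):
    """Average calls/second needed to reach INT32_MAX in `days` days."""
    return INT32_MAX / (days * SECONDS_PER_DAY)


# Roughly 414 calls per second sustained over an assumed 60 days.
rate = average_rate_to_saturate(60)
```

Since the counters had already stopped before the two-month mark, the
real average is at least that.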

Trond's code is often incorporated into the kernel.  I have always
tracked his patchsets on production platforms that rely on NFS, and
for years I have found that his code makes a difference in stability.

Also note that he works for NetApp, so you can guess what his
interoperability testing might include.

JG


-- 
Josh "Koshua" Goodall                      "as modern as tomorrow afternoon"
joshua at roughtrade.net                                       - FW109

