Awfully slow dovecot

Marc Stürmer mail at marc-stuermer.de
Fri Dec 26 17:43:00 UTC 2014


On 26.12.2014 at 17:21, Nick Edwards wrote:

>> Maildir is fine as long as you don't have too much mail on your
>> storage, but there comes a point when you are getting big enough where
>> Maildir really isn't going to behave really nicely anymore, because
>> too many files and way too many seeks. Mdbox is quite different and
>> scales beyond the point of Maildir with ease.
>>
> that may only be true depending on your filesystem
>
> it may be no more supported for obvious reasons, but reiserfs will
> never be beaten for maildir
> and maildir is time tested and proven in very large environments with
> millions of users, with many GB quotas runs perfect with deduping on
> netapps too.

Wrong. ReiserFS is something you should really avoid on servers
nowadays. V3, while stable, is not getting much love or new features
anymore and has some serious known quirks. There's a reason why Reiser4
was invented back then.

Reiser4 will almost certainly never find its way into the kernel and
doesn't get much love either, especially since its author went to jail
for murder.

Because Reiser4 is maintained outside the kernel, you either have to
compile the kernels yourself or get community kernels. Aside from that,
Reiser4 is still experimental and not stable.

Aside from some still raging fanboys who didn't get the message,
ReiserFS doesn't really matter anymore. It had some nice ideas, but
people moved on to other file systems and are happy with those.

The thing is that Maildir means every mail is one file, and you've got
to maintain some index files to get decent speed out of it. It is also
robust, yes, because all data is in the file system itself.

This means that on every modern file system, regardless of whether you
are using XFS, ext4 or Reiser, you need to look up that file in the
folder and read/write it. All modern file systems use some kind of
tree-based index (B-tree or hashed-tree variants) for that lookup.

Some older file systems tend to slow down drastically if you put enough
mails in one folder. Just consider one mail folder with the contents of
one year of the LKML: this folder alone would hold around 100,000 files
or even more.

And if your machines are getting big enough, Maildir plainly sucks and
making backups takes more and more time: reading 100,000 files for a
backup means many seeks and lots of work for the HDD, and of course it
means much more protocol overhead, e.g. with rsync.
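
To get a feeling for how many per-file operations a backup has to
perform, a minimal Python sketch like the following can count the
entries and bytes below a mail directory. The function name
`mail_store_stats` is made up for this illustration and is not part of
dovecot or rsync.

```python
import os

def mail_store_stats(root):
    """Return (file_count, total_bytes) for all non-directory entries below root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat: do not follow symlinks, just measure the entry itself.
                total_bytes += os.lstat(path).st_size
            except OSError:
                # The file vanished between listing and stat (e.g. a mail was moved).
                continue
            file_count += 1
    return file_count, total_bytes
```

Calling `mail_store_stats(os.path.expanduser("~/Maildir"))` (the path is
only an example) gives a rough idea of how many files a file-level
backup has to open, read and close.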

For example, my mailbox had an LKML archive with over 30,000 mails, and I
switched to mdbox. On the pro side, you get much more speed out of your
hardware, because you can do more with fewer I/O operations. On the con
side, you cannot read mails at the file system level anymore and need to
learn the dovecot tool chain (which IMHO is absolutely worth it),
backups need proper preparation, and losing the mapping files would be a
disaster.

Now, instead of the maybe 60,000 files taking over 800 MB that I had
before, my mail storage consists of 347 files and takes 485 MB, and I
didn't throw any mail away.
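
As a quick sanity check of these figures, the following Python snippet
works out the savings. The "before" values are the rough estimates
given above ("around 60000 files", "over 800 MB"), so the results are
approximate as well.

```python
# Approximate figures from the paragraph above.
files_before, files_after = 60000, 347
mb_before, mb_after = 800, 485

space_saved = 1 - mb_after / mb_before   # fraction of disk space saved
file_ratio = files_before / files_after  # how many times fewer files

print(f"space saved: {space_saved:.0%}, about {file_ratio:.0f}x fewer files")
# prints "space saved: 39%, about 173x fewer files"
```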

How come? Simple: I enabled compressed saving with gzip. CPU power is
cheap; getting more memory and HDD speed most of the time is not.
Because all my mails are now compressed into far fewer files on the HDD,
looking them up in that folder is blazingly fast, since only quite a
small tree is needed to maintain them.

It also means that I can use the same amount of memory to cache more
mails in the file system cache (roughly 30-40% more), because the data
itself is compressed. That's the beauty of it, and backups are much
faster again.
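
To illustrate why gzip helps with plain-text mail, here is a small
Python sketch that compresses a synthetic message and compares the
sizes. The message is made up and very repetitive, so it compresses far
better than real mail would. This is not how dovecot stores mail
internally; it only shows the general effect.

```python
import gzip

# Synthetic, made-up message; real mail will compress differently.
headers = (
    "From: alice@example.org\r\n"
    "To: list@example.org\r\n"
    "Subject: [PATCH] example\r\n"
    "\r\n"
)
body = "This is a line of plain text as found in a typical list mail.\r\n" * 200
message = (headers + body).encode("utf-8")

compressed = gzip.compress(message, compresslevel=6)

# Compression is lossless: decompressing returns the original bytes.
assert gzip.decompress(compressed) == message

ratio = len(compressed) / len(message)
print(f"{len(message)} bytes -> {len(compressed)} bytes ({ratio:.1%} of original)")
```

The smaller the stored data, the more mails fit into the same amount of
file system cache, at the cost of some CPU time for decompression.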

If you really run millions of accounts, you wouldn't want Maildir 
anymore when you can have mdbox.

If you want to build a really large IMAP server on Linux, take either
ext4 or XFS as the file system.

