[Dovecot] Migration questions...

Timo Sirainen tss at iki.fi
Tue May 12 18:01:09 EEST 2009


On May 12, 2009, at 6:41 AM, Richard Hobbs wrote:

>>> Single-dbox is the highest performing, but note that it's not as much
>>> tested as mbox and Maildir code. I think it should work ok, but I'm not
>>> aware of any larger installations using dbox currently. So in case you
>>> find a problem, you might have to upgrade/patch Dovecot to get it fixed
>>> and that would require compiling from sources.
>>
>> In that case (and with a little further investigation which I've just
>> done) we've decided to go with Maildir! That is still going to be
>> significantly better performing than mbox, right?

Depends on the usage, but Dovecot+Maildir performs significantly better
than UW-IMAP. Dovecot+mbox is also significantly faster than UW-IMAP+mbox.

>> Also... do you know how uw-imapd & maildir compares to dovecot & maildir
>> in terms of performance?

Maildir support in UW-IMAP is a patch on top of the official
distribution. I don't know how well it performs, but it doesn't use any
indexes, and indexes are what make Dovecot fast.

>> Does dovecot still use indices with maildir?

Yes.

>>> mbox -> Maildir conversion can preserve both IMAP and POP3 UIDLs using
>>> an external script. Maildir -> dbox conversion can also preserve both,
>>> but that causes Dovecot to use this "hybrid Maildir-dbox format", which
>>> is slower than the full native dbox.
>>
>> That's good to know. Do you happen to know where I can get a copy of
>> this "external script" you speak of? Will it simply be included in the
>> debian package (probably)?

http://wiki.dovecot.org/Migration/MailFormat -> mb2md.py

>> Also, given that I'm going to have to test this, I will obviously be
>> running the conversion on a copy of the live data, and then I'll have to
>> run the conversion again during the migration outage - will I need to
>> delete all the data and basically start again, or is it incremental?

I don't think it can do incremental conversion, but I've never looked at
the script myself.

>>> My guess is that two RAID-1s would be faster, but I haven't really done
>>> any benchmarking. Anyway index files are 10-30% of the mailbox size, so
>>> the index-disks would be using a lot less disk space.
>>
>> I assume you are talking about dovecot with maildir here, right?

The same applies to all mailbox formats Dovecot supports.
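
As a worked example of the 10-30% figure quoted above, here is a small sketch for sizing an index array. The 500 GiB mailbox total and the function name are hypothetical, chosen only for illustration; the percentages are the rough range given in this thread, not measured values.

```python
def estimate_index_size(mailbox_bytes, low_pct=10, high_pct=30):
    """Return a (low, high) estimate of index size in bytes.

    The 10-30% defaults are the rough range quoted in this thread,
    not a guarantee; real ratios depend on clients and usage.
    """
    low = mailbox_bytes * low_pct // 100
    high = mailbox_bytes * high_pct // 100
    return low, high


GIB = 1024 ** 3

# Hypothetical total mailbox size of 500 GiB.
low, high = estimate_index_size(500 * GIB)
print(f"index disks: roughly {low // GIB}-{high // GIB} GiB")
```

So under these assumptions the index array could be considerably smaller than the array holding the mail data.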

>> Also, what would we put on each array? Are the inboxes still stored
>> separately to the IMAP folders when using dovecot and maildir?

Inboxes are stored inside Maildir like all other mailboxes.
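
To illustrate the layout, here is a minimal sketch using Python's standard mailbox module rather than Dovecot itself. It assumes the common Maildir++ convention, where the INBOX is the cur/new/tmp triple at the top of the Maildir and other folders are dot-prefixed subdirectories next to it. The folder name "Sent" and the temporary path are made up for the demo.

```python
import mailbox
import os
import tempfile


def build_demo_maildir(root):
    """Create a Maildir at `root` with an INBOX message and a 'Sent' folder.

    Returns the sorted top-level directory listing of the Maildir.
    """
    inbox = mailbox.Maildir(root, create=True)
    inbox.add("Subject: inbox message\n\nhello\n")

    # Subfolders live inside the same Maildir as dot-prefixed directories.
    sent = inbox.add_folder("Sent")
    sent.add("Subject: sent message\n\nbye\n")

    return sorted(os.listdir(root))


# Hypothetical throwaway location for the demo.
demo_root = os.path.join(tempfile.mkdtemp(), "Maildir")
layout = build_demo_maildir(demo_root)
print(layout)
```

The listing shows the INBOX directories (cur, new, tmp) and the .Sent folder side by side under one Maildir, so there is no separate inbox spool to place on a different array.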

>> Would it be best to put all data on one array, and the indices on the
>> other? We're basically after the fastest way to distribute the data! :-)

Last I heard it was faster to keep index files on a separate disk from
the mailbox data. I've never verified this myself, but it sounds reasonable.

> My colleague has mentioned something of interest... can dovecot keep the
> index files in RAM? If so, the performance will obviously be *so* much
> better than running them off the hard disks.

Last I heard it didn't really help much. Assuming your OS works
properly, it already keeps the necessary indexes in memory anyway. Also
I wrote a patch that tries to tell the OS to do that by dropping message
files' data from cache after reading the messages:

http://dovecot.org/patches/1.1/fadvise.diff

But no one has told me whether that helps or makes things worse.
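
The general idea of dropping a file's data from the page cache after reading it can be sketched as follows. This is not the patch itself, and it assumes the mechanism is the POSIX posix_fadvise() call with POSIX_FADV_DONTNEED (suggested by the patch's file name but not confirmed here). The function name is hypothetical, and the hint is purely advisory: the kernel is free to ignore it.

```python
import os
import tempfile


def read_and_drop_cache(path, chunk_size=65536):
    """Read a file fully, then hint that its cached pages are not needed.

    Returns the number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
        # Offset 0 with length 0 covers the whole file. The call is
        # Unix-only and advisory, so skip it where it is unavailable
        # and ignore failures from filesystems that reject it.
        if hasattr(os, "posix_fadvise"):
            try:
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass
    finally:
        os.close(fd)
    return total


# Demo on a throwaway file standing in for a message file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 200000)
    demo_path = tmp.name

bytes_read = read_and_drop_cache(demo_path)
os.unlink(demo_path)
print(bytes_read)
```

The intent is that message bodies, which are usually read once, do not push the frequently reused index pages out of memory. Whether that is a net win would need measuring on a real workload.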

> This also raises questions about what happens if the machine is powered
> off etc... but it's UPSd etc... so if it were to rebuild its indexes
> every time it was booted up, that wouldn't be the end of the world.

Well, there are two parts to index rebuilding: dovecot.index files,
which are quick to rebuild, and dovecot.index.cache files, which contain
the useful fields that clients want. The cache file is especially
useful with webmails, and if it's gone it could mean opening all of the
user's messages and reading their headers and perhaps even bodies.

So it depends on which clients your users use, but in some cases it could
be 10-100x slower to open the mailbox if the cache file is gone.

