fast doveadm search results

Vu Ngoc VU vvu at mcra.fr
Mon Mar 23 08:36:08 UTC 2015


Hello Timo, thank you so much for answering.

> Date: Fri, 20 Mar 2015 19:16:55
> From: Timo Sirainen <tss at iki.fi>
> 
> On 20 Mar 2015, at 10:37, Vu Ngoc VU <vvu at mcra.fr> wrote:
>>
>> But to answer your questions, I'm not really interested in purging the cache data.
>> I just think that cache has expiration delay.
>> The only point is to get this "doveadm search" answering me in minutes instead of hours.
>> Writing my original post, I didn't get if it was slow because of:
>> - data I'm searching are not cached at all? (headers like From, Date, Message-ID...)
>> => I wanted to know if dovecot allows to add some headers.
>> Stephen answered to that question.
>
> By default all headers are added to cache the first time they're accessed in the folder (e.g. via FETCH or SEARCH). Also mails that are newly delivered by Dovecot will add those headers to cache immediately.

Wow, I wonder how Dovecot knows whether it is the first access.
So it makes no difference whether the access comes from a MUA or from Dovecot's administrative commands like doveadm search/fetch?
Yes, delivery is done by Dovecot LMTP on a separate server; I plan to move this service onto the other servers that do IMAP/POP3.

>> - these data a cache, but for an extremely short time, like the user session
>> => that's why I asked if it is possible to extend cache validity to at least 48h.
>> But for sure, if these data remain forever, it'll be better ! :)
>
> Dovecot automatically figures out if the data should stay in cache for 1 week or forever. It sounds like something's wrong in your system if it's not already automatically performing fast searches. The first time a search on a header is done it might be slow if the data isn't in cache, but all subsequent times should be very fast. Not hours or minutes but seconds. No need to modify the mail_cache_* settings. It might be helpful if you posted your whole doveconf -n output.

I've read on the website, and you confirmed here, that Dovecot tries to be smart about how it manages the cache.
But are there settings so I can ask Dovecot to never remove cache entries?
I don't want Dovecot to try to evaluate whether the MUA needs/asks for this data often.
As I wrote before, I host mail for nearly 10 companies inside the same "group" (sorry, English is not my native language).
And mail is an overused tool here, so even if it is rare, they sometimes ask me to delete some mails quickly.
So I prefer to spend some I/O and disk space on the cache (or indexes, whatever it is called) to have doveadm search answer fast.

If there is no "never_purge_cache" or "dont_try_to_be_smart" setting, is there some command I could run every night from cron to update/refill the cache?
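
To make the nightly idea concrete, here is a rough sketch of what such a cron job could look like. It assumes that reading the headers with `doveadm fetch` is enough to get them into the cache (as Timo describes for FETCH/SEARCH), which I have not verified on 2.2.9. The user list, the mailbox name and the helper names are only placeholders for illustration.

```python
import shlex
import subprocess
from typing import List

# Headers named earlier in this thread; adjust to whatever you search on.
HEADERS = ["From", "Date", "Message-ID"]


def build_warmup_command(user: str, mailbox: str, headers: List[str]) -> List[str]:
    """Build a `doveadm fetch` argv that reads the given headers of every mail."""
    fields = " ".join("hdr." + h.lower() for h in headers)
    return ["doveadm", "fetch", "-u", user, fields, "mailbox", mailbox, "all"]


def warm_cache(users: List[str], mailbox: str = "INBOX",
               dry_run: bool = True) -> List[List[str]]:
    """Return the commands for all users; execute them unless dry_run is set."""
    commands = [build_warmup_command(u, mailbox, HEADERS) for u in users]
    if not dry_run:
        for argv in commands:
            # Output is discarded: only the side effect on the cache matters.
            subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical user list; a real job would read it from the user database.
    for argv in warm_cache(["user@example.com"], dry_run=True):
        print(shlex.join(argv))
```

With `dry_run=True` it only prints the commands, so they can be checked by hand before anything is put in cron.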

>> - NFS limitations
>> => do I have to re-install dovecot on my NFS servers? I prefer not.
>
> What do you mean by this? You're using NFS now to store emails but with one Dovecot server? That should work fine, although NFS of course always adds some extra overhead.

I can paste the configuration somewhere, but since I have several servers, maybe we should decide which ones to run `doveconf -n` on.
My setup looks like this (I know there are some design errors):
- storage bays from Dell (MD1220) with 24 HDDs, directly SAS-attached to some physical servers
- these servers attached to the storage bays are NFS servers
- 2 OpenVZ containers running Dovecot as director proxy + Postfix submission (I call these mailhubs)
- 2 OpenVZ containers running Dovecot as IMAP/POP3 backends; they are NFS clients to access the mailboxes
- 1 OpenVZ container acting as MX with Postfix, the only one that has public + private IP addresses
this server only receives mail for our domains and then "transports" it to the LMTP servers.
It is not part of the Director setup.
- 1 OpenVZ container called lmtp; this one is an NFS client too and only delivers mail
It is not part of the Director setup either, and I plan to remove it from the architecture.

All systems are Ubuntu 14.04 (so Dovecot 2.2.9 from packages), and the filesystem on the storage servers is ext4 on hardware RAID-6.

I need to migrate other domains hosted on an older setup (with Courier). But before that, I'd like my "new setup" to work as expected.
I also want to understand the parts that are important for me, like the caching of messages.
To improve search performance, I can consider many solutions, like adding SSDs or storing indexes/caches in something like memcached (we can put 1 TB of RAM in our servers).
But for now, I need to know whether I can configure Dovecot so that it adds the headers I need to the cache and never removes them.

I'll do further tests. But for the moment, I've tried this:
- deleted the dovecot.index.cache file for a mailbox
- ran doveadm search on it and checked the contents of the new cache file with the strings command
=> only the headers I asked for in the search command are stored
- ran another search with a different header as the criterion
=> the cache file now also stores the newly requested headers.
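
The check I did by eye with strings could be scripted roughly like this. It is only a sketch: it does not parse the real dovecot.index.cache format (which is internal to Dovecot), it just mimics `strings` and looks for the header names in the printable runs. The function names and the sample bytes are made up for illustration.

```python
import re
from typing import List


def printable_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII of at least min_len bytes, like `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def headers_seen(data: bytes, names: List[str]) -> List[str]:
    """Return the header names that appear somewhere in the printable runs."""
    runs = [s.lower() for s in printable_strings(data)]
    return [n for n in names if any(n.lower() in s for s in runs)]


if __name__ == "__main__":
    # Made-up bytes standing in for a cache file; a real check would read
    # the dovecot.index.cache file of the mailbox in binary mode instead.
    sample = b"\x00\x01hdr.From\x00\x02\x03hdr.Date\x00ab\x00"
    print(headers_seen(sample, ["From", "Date", "Message-ID"]))
```

Running it before and after each doveadm search would show which of the header names have newly appeared in the file.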

Since you wrote that on delivery "all" headers should be stored in the cache, I'll run some tests on this today.
But for now, running doveadm on the same mailboxes as on Friday is fast.
So I still don't know how this caching works; it is "too smart" for me :)

Have a nice day/week, and sorry for this long message, I think it's a bit of a mess :/

