Searching 30 GB mailbox

Einar Bjarni Halldórsson einar at isnic.is
Thu Dec 2 08:57:06 UTC 2021


>>> You can inspect the index files with doveadm dump to check what is 
>>> cached. Not sure how it went with mdbox storage driver.
>> According to `man doveadm-dump` it just seems to dump index files. I 
>> tried dumping the cache file but it complains that it can't auto 
>> detect the file type.
>
>
> Try this:
>
> [root at ketola .INBOX]# pwd
> /vmail/sami at ketola.io/index/.INBOX
> [root at ketola .INBOX]# doveadm dump . | grep -c hdr.message-id
> 11007
>

root at ht-mailstore01:/data/mail/hostmasterlog/mdbox/mailboxes/INBOX/dbox-Mails 
# doveadm dump . | grep -c hdr.MESSAGE-ID
4464736
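
To see which other headers are in the cache, and not only Message-ID, the same dump output can be tallied per field. This is only a rough sketch: it assumes cached header fields show up in the dump text as `hdr.<name>`, which is all the grep patterns above rely on, and the function name `count_cached_headers` is made up for illustration.

```python
import re
from collections import Counter

# Assumption: cached header fields appear in the dump text as "hdr.<name>",
# which is what the grep patterns in this thread match on.
HDR_FIELD = re.compile(r"hdr\.([A-Za-z0-9-]+)")


def count_cached_headers(lines):
    """Count, per header name, how many lines mention it.

    Mirrors `grep -ci hdr.<name>`: names are lower-cased, and a line is
    counted once per header even if the name occurs on it several times.
    """
    counts = Counter()
    for line in lines:
        for name in {m.lower() for m in HDR_FIELD.findall(line)}:
            counts[name] += 1
    return counts
```

Passing it the lines of `doveadm dump .` (for example by iterating over `sys.stdin`) and printing `counts.most_common()` would list the cached headers by frequency.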

In the Dovecot config I have `fts_enforced = yes`. Once the search for 
Message-ID finishes and returns a UID, a FETCH .. ENVELOPE on that UID 
responds immediately.
I was under the impression that fts_enforced forces all searches, on 
headers and body alike, to go to Solr, so that all Dovecot would have to 
do is pass on the UID that Solr returns.
Or does Solr not return the UID, so that Dovecot has to take the Solr 
result and look up the UID itself, and with a full 1 GB cache file it 
always has to scan the whole index?

Nothing seems to be completely broken: I always receive a result. It's 
just that it takes 30 seconds when I really want it to take ~5 seconds 
at most.
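
To put numbers on the two steps separately, something like the following could be used. It is a minimal sketch with Python's standard `imaplib`, not the client code we actually run; the helper names (`quote_imap`, `timed`, `time_message_id_lookup`) are hypothetical, and `conn` is assumed to be an already logged-in connection.

```python
import imaplib
import time


def quote_imap(value: str) -> str:
    """Wrap a value as an IMAP quoted string, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def time_message_id_lookup(conn: imaplib.IMAP4, mailbox: str, message_id: str):
    """Time UID SEARCH by Message-ID, then UID FETCH ENVELOPE on the first hit.

    Returns (uid, search_seconds, fetch_seconds); uid and fetch_seconds
    are None when the search matches nothing.
    """
    conn.select(mailbox, readonly=True)
    (typ, data), search_s = timed(
        conn.uid, "SEARCH", "HEADER", "Message-ID", quote_imap(message_id)
    )
    # imaplib returns the matching UIDs as one space-separated bytes value.
    uids = data[0].split() if typ == "OK" and data and data[0] else []
    if not uids:
        return None, search_s, None
    uid = uids[0].decode()
    _, fetch_s = timed(conn.uid, "FETCH", uid, "(ENVELOPE)")
    return uid, search_s, fetch_s
```

Keeping the two timings apart shows whether the delay sits in the SEARCH itself or in the later FETCH.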

I guess if we can't find a solution and 30 seconds becomes a real 
problem, we'll split the mailbox up by year, which should help with the 
size of the cache. It makes the searching code a little more 
complicated, since it has to figure out the sent date before it can 
search, but it's doable.
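
The extra step for a per-year split could look roughly like this. It is a sketch under assumptions: the `Archive/<year>` naming scheme and both function names are hypothetical, and the sent date is taken from the message's Date header.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical naming scheme: one mailbox per year, e.g. "Archive/2021".
MAILBOX_PREFIX = "Archive"
SEPARATOR = "/"


def mailbox_for_sent_date(sent: datetime) -> str:
    """Return the per-year mailbox that would hold a message sent at `sent`."""
    # Normalise aware datetimes to UTC so that filing and searching agree
    # on the year for messages sent close to midnight on 31 December.
    if sent.tzinfo is not None:
        sent = sent.astimezone(timezone.utc)
    return f"{MAILBOX_PREFIX}{SEPARATOR}{sent.year}"


def mailbox_for_date_header(date_header: str) -> str:
    """Map an RFC 5322 Date header value to its per-year mailbox."""
    return mailbox_for_sent_date(parsedate_to_datetime(date_header))
```

The search code would then select the returned mailbox before issuing the Message-ID search. Whatever rule picks the year has to be the same one used when the mail is filed, otherwise messages near a year boundary are looked for in the wrong mailbox.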

.einar