Searching 30 GB mailbox
Einar Bjarni Halldórsson
einar at isnic.is
Thu Dec 2 08:57:06 UTC 2021
>>> You can inspect the index files with doveadm dump to check what is
>>> cached. Not sure how it went with mdbox storage driver.
>> According to `man doveadm-dump` it just seems to dump index files. I
>> tried dumping the cache file but it complains that it can't auto
>> detect the file type.
>
>
> Try this:
>
> [root at ketola .INBOX]# pwd
> /vmail/sami at ketola.io/index/.INBOX
> [root at ketola .INBOX]# doveadm dump . | grep -c hdr.message-id
> 11007
>
root at ht-mailstore01:/data/mail/hostmasterlog/mdbox/mailboxes/INBOX/dbox-Mails# doveadm dump . | grep -c hdr.MESSAGE-ID
4464736
In the Dovecot config I have `fts_enforced = yes`, and once the search for
Message-ID finishes and returns a UID, I can do a FETCH .. ENVELOPE on it
and it responds immediately.
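To show where the time goes, the two steps can be timed separately. This is only a rough sketch using Python's imaplib; the function name `timed_message_id_lookup` is made up for illustration, and it assumes `conn` is an `imaplib.IMAP4` (or `IMAP4_SSL`) connection that has already logged in and selected the mailbox.

```python
import time


def timed_message_id_lookup(conn, message_id):
    """Search by Message-ID, then fetch the ENVELOPE, timing each step.

    Returns (uid, search_seconds, fetch_seconds); uid is None when the
    search finds nothing. `conn` is assumed to behave like an
    imaplib.IMAP4 connection with a mailbox already selected.
    """
    start = time.monotonic()
    # Sends: UID SEARCH HEADER Message-ID "<...>"
    status, data = conn.uid("SEARCH", "HEADER", "Message-ID", f'"{message_id}"')
    search_seconds = time.monotonic() - start

    # imaplib returns the matching UIDs as one space-separated bytes string.
    uids = data[0].split() if status == "OK" and data and data[0] else []
    if not uids:
        return None, search_seconds, 0.0

    uid = uids[0].decode()
    start = time.monotonic()
    # Sends: UID FETCH <uid> (ENVELOPE)
    conn.uid("FETCH", uid, "(ENVELOPE)")
    fetch_seconds = time.monotonic() - start
    return uid, search_seconds, fetch_seconds
```

Comparing `search_seconds` with `fetch_seconds` makes it easy to confirm that the delay is entirely in the SEARCH step rather than in the FETCH.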
I was under the impression that fts_enforced forces all searches,
headers and body, to go to Solr. Then all Dovecot would have to do is
return the UID returned by Solr.
Unless Solr doesn't return the UID, and Dovecot has to take the result
from Solr and look up the UID, and with a full 1 GB cache file it always
has to scan the whole index?
Nothing seems to be completely broken; I always receive a result. It's
just that it takes 30 seconds when I really want it to take ~5 seconds
at most.
I guess if we can't find a solution and 30 seconds becomes a real
problem, we'll split the mailbox up by years. It should help with the
size of the cache. It makes the searching code a little more complicated
since it has to figure out the sent date before it can search, but it's
doable.
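A minimal sketch of what that extra step could look like, assuming one mailbox per year. The `Archive.<year>` naming, the `.` separator, and both function names are hypothetical and would depend on the actual namespace configuration. Because a sent date close to New Year can land in either year depending on time zone, the sketch also returns the neighbouring year in that case.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

# Hypothetical naming scheme: one mailbox per year, e.g. "Archive.2021".
MAILBOX_PREFIX = "Archive"
SEPARATOR = "."  # assumed hierarchy separator; depends on the namespace config


def mailbox_for_year(year: int) -> str:
    """Build the (hypothetical) per-year mailbox name."""
    return f"{MAILBOX_PREFIX}{SEPARATOR}{year}"


def candidate_mailboxes(
    sent: datetime, slack: timedelta = timedelta(days=1)
) -> list[str]:
    """Return the per-year mailboxes to search for a message sent at `sent`.

    `sent` should be timezone-aware. The mailbox for the sent year comes
    first; a neighbouring year is appended only when `sent` is within
    `slack` of a year boundary.
    """
    sent_utc = sent.astimezone(timezone.utc)
    years = [sent_utc.year]
    for neighbour in ((sent_utc - slack).year, (sent_utc + slack).year):
        if neighbour not in years:
            years.append(neighbour)
    return [mailbox_for_year(year) for year in years]
```

The search code would then try each returned mailbox in order and stop at the first hit, so most lookups still touch only one (much smaller) cache file.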
.einar