[Dovecot] When are search indexes updated?
Dave Abrahams
dave at boostpro.com
Tue Dec 4 18:57:41 EET 2012
Here's a further experience report with questions inline:
1. "doveadm index '*'" crashes in clucene (for me), so it must be trying
to update the FTS indexes, somehow. Is that (the updating, not the
crashing) intended behavior, and if so, should it be documented?
2. "doveadm search text SOMETHINGthatWONTbeFOUND" takes a long time and
finds stuff without crashing, but doesn't seem to create the
lucene-indexes/ directory in my mdbox, and it takes a long time the
next time around. Is "doveadm search" intended to update the FTS
indexes if they're missing?
3. Performing a search on a large mailbox through IMAP takes a long time
the first time around, during which "top" shows the big cpu hog is
/opt/local/libexec/dovecot/indexer-worker, lucene-indexes/ is
created, and subsequent searches go quickly. Nice! Is there a
reasonably easy way to start such a search on all my mailboxes from
the command line by using doveadm or the preauth tunnel?
on Thu Nov 01 2012, Dave Abrahams <dave-AT-boostpro.com> wrote:
> on Sat Oct 27 2012, Stan Hoeppner <stan-AT-hardwarefreak.com> wrote:
>
>> On 10/27/2012 3:00 PM, David Abrahams wrote:
>>>
>>> I noticed that occasionally searching in my huge archive mailbox can be
>>> really slow, so I tried doveadm index on it and it seemed to do a lot of
>>> work, which seemed strange given, for example, that dovecot-lda says it
>>> keeps Dovecot index files up-to-date. Then I thought, "maybe these are
>>> different files than the search indices." If so, that's not entirely
>>> clear from the docs and Wiki. So, questions:
>>
>> Mailbox and search indexes are separate.
>
> If so, I hereby request that they be properly and explicitly
> distinguished from one another, every place "index" is mentioned on the
> wiki.
>
>> Look in your mailbox directory and you'll see them, such as on 1.2.x
>> with mbox:
>
> I'm on 2.x with mdbox, FWIW.
>
>> $ la /home/stan/mail/.imap/1-Dovecot
>> total 3.4M
>> drwx------ 2 stan stan 135 Oct 25 21:39 .
>> drwx------ 51 stan stan 4.0K Apr 13 2012 ..
>> -rw------- 1 stan stan 44K Oct 27 13:28 dovecot.index
>> -rw------- 1 stan stan 1.2M Oct 27 21:23 dovecot.index.cache
>> -rw------- 1 stan stan 18K Oct 27 21:23 dovecot.index.log
>> -rw------- 1 stan stan 1.1M May 20 06:32 dovecot.index.search
>> -rw------- 1 stan stan 1.1M May 20 06:32 dovecot.index.search.uids
>>
>> I've not full text searched this folder for quite some time, thus the
>> search indexes are not current, and the next FTS of this mail folder
>> will take much more time than if the FTS indexes were current.
>>
>>> * When are search indexes updated?
>>
>> When the index is stale.
>
> That's pretty vague :-)
>
>>> * Are they updated incrementally?
>>> * If not, why not?
>>> * If so, why would a mailbox's index drift out-of-date, as mine had?
>>
>> When a sufficient number of messages are added to an IMAP folder the FTS
>> index becomes stale.
>
> That's a little less vague, thanks :-)
>
>> This index is not updated in real time. This is why Timo and others
>> recommend cron'ing a script to index folders regularly that are
>> searched regularly.
>
> And how does one index the folders for search? Is that "doveadm
> index" or "doveadm fts rescan" (which I see at
> http://wiki2.dovecot.org/Plugins/FTS but NOT in the manpage), or...?
>
>> This keeps the indexes up to date and keeps searches fast. If you
>> don't do this or search often, your indexes become stale. Then each
>> time you do an FTS search the first thing that happens is an FTS
>> re-indexing of the mail folder. Only then does it display the search
>> results.
>>
>>> BTW, I'm using the clucene search backend.
>>
>> I've not used Lucene, but I believe the default behavior is similar to
>> the Dovecot 1.2.x FTS indexer.
>
> Not sure what conclusion to draw from that, thanks.
--
Dave Abrahams
BoostPro Computing        Software Development    Training
http://www.boostpro.com   Clang/LLVM/EDG Compilers  C++  Boost