Indexing parallelism

Timo Sirainen tss at iki.fi
Mon Jan 14 02:30:08 EET 2019


On 13 Jan 2019, at 19.23, Joan Moreau via dovecot <dovecot at dovecot.org> wrote:
> 
> Hi
> 
> Observing the processes of FTS, I observe the following:
> 
> 
> 
> 1 - For one user, indexer-worker does not start several threads for each request. On the contrary, it waits for the first request to finish before starting the second. How can I make sure all requests (or a limited number of them, for instance tied to the number of CPUs on the machine) are started as soon as possible?

I guess there are two answers:

1) Dovecot doesn't use threads anywhere. It uses only processes. So a single indexer-worker couldn't start multiple parallel threads.

2) It's intentional that there is never more than one indexer-worker per user. This is especially because fts-lucene would break if there were more. Also, in larger installations it's generally better that the workers don't all get stuck processing a single user, blocking other users' indexing.
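
To illustrate the second point, here is a minimal sketch of a scheduler that allows at most one in-flight indexing request per user while still using several workers across users. This is not Dovecot's code; the class `PerUserScheduler`, its methods, and the `max_workers` parameter are all hypothetical names made up for the example.

```python
from collections import deque


class PerUserScheduler:
    """Toy model (hypothetical): one in-flight request per user,
    at most max_workers requests running in total."""

    def __init__(self, max_workers):
        self.max_workers = max_workers
        self.pending = deque()    # (user, mailbox) in arrival order
        self.busy_users = set()   # users that currently have a running request

    def submit(self, user, mailbox):
        self.pending.append((user, mailbox))

    def dispatch(self):
        """Return the requests that may start now, in arrival order."""
        started = []
        waiting = deque()
        while self.pending:
            user, mailbox = self.pending.popleft()
            has_free_worker = len(self.busy_users) < self.max_workers
            if has_free_worker and user not in self.busy_users:
                self.busy_users.add(user)
                started.append((user, mailbox))
            else:
                # Either no worker is free or this user is already being indexed.
                waiting.append((user, mailbox))
        self.pending = waiting
        return started

    def finish(self, user):
        """Mark the user's running request as done."""
        self.busy_users.discard(user)
```

With two workers and requests for alice/INBOX, alice/Sent and bob/INBOX, the first `dispatch()` starts alice/INBOX and bob/INBOX; alice/Sent only starts after `finish("alice")`, so one user's backlog never occupies every worker.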

> 2 - If an IMAP query is received, Dovecot checks the last UID from FTS and launches an indexing request to finish the index *before* running the search query. This creates timeouts (and can take a while if many requests are pending, see point 1). How can this be prevented, i.e. the search request is launched (read-only) no matter what, and the completion of missing UIDs is launched in a separate thread?

It would violate the IMAP protocol if the SEARCH response didn't include the latest mails. Generally I haven't noticed this being a big practical problem. The initial indexing can be done before FTS searches are enabled for the user, and afterwards the indexing shouldn't have especially long queues. Note that indexing triggered by a SEARCH has a higher priority than indexing triggered by new mail deliveries. So unless all the indexer-workers are busy indexing mails from large folders without any indexes, this shouldn't normally be a huge problem.
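
The priority rule can be pictured with a small sketch. Again, this is only an illustration under assumptions, not how the indexer is actually implemented: `IndexQueue`, `SEARCH_PRIORITY` and `DELIVERY_PRIORITY` are hypothetical names, and the only behaviour taken from the above is that SEARCH-triggered requests are served before delivery-triggered ones.

```python
import heapq
import itertools

# Assumed convention for this sketch: a lower number is served first.
SEARCH_PRIORITY = 0
DELIVERY_PRIORITY = 1


class IndexQueue:
    """Toy priority queue (hypothetical) for indexing requests."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps arrival order among equal priorities.
        self._counter = itertools.count()

    def push(self, priority, user, mailbox):
        heapq.heappush(self._heap, (priority, next(self._counter), user, mailbox))

    def pop(self):
        """Return the next (user, mailbox) to index."""
        _priority, _seq, user, mailbox = heapq.heappop(self._heap)
        return user, mailbox

    def __len__(self):
        return len(self._heap)
```

If two delivery-triggered requests are queued and a SEARCH-triggered one arrives afterwards, `pop()` returns the SEARCH-triggered request first and then the deliveries in their arrival order.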


