[Dovecot] xapian search plugin

Timo Sirainen tss at iki.fi
Thu Dec 16 12:52:30 EET 2010


On Thu, 2010-12-16 at 13:00 +0300, Vasiliy G Tolstov wrote:
> Hello. Has anybody written an fts xapian plugin for dovecot 2?

Not as far as I know.

> If nobody has,
> can you suggest what I need to read, and which source files I need to
> read, to write my own xapian fts plugin?

src/plugins/fts/fts-api*.h
src/plugins/fts-solr/*
http://wiki2.dovecot.org/Design

There are some stupid things in the FTS API, like having lookup(),
lookup2() and filter() methods. I don't remember clearly what the point
was, but just do it like fts-solr does :)

There are some things where you probably shouldn't follow fts-solr,
though. There are two related items in my TODO list:

1. fts-solr: crashes if expunge is done while search is indexing
2. fts-solr: handle DELETE, RENAME. use mailbox GUIDs (optionally)

The first happens because Dovecot supports processing multiple
commands in parallel. So if a client issues a SEARCH that starts
indexing messages, and then issues an EXPUNGE command while the
indexing is still going on, the Solr code's expunge() method is called,
but that breaks because the search indexing is still in progress and
the code can't handle that. I haven't yet decided what to do about
that. It's a fairly rare problem, though.

The second item has two parts. Not handling DELETE at all is probably
the FTS API's problem: it should probably just issue expunge() for all
of the messages in the mailbox while it's being deleted.
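To make the DELETE idea concrete, here is a rough sketch in Python
rather than Dovecot's C. ToyFtsBackend and delete_mailbox are made-up
names for illustration, not part of the real FTS API; the only point is
that deleting a mailbox can be reduced to calling expunge() for each of
its messages.

```python
# Hypothetical stand-in for an FTS backend; none of these names are Dovecot's.
class ToyFtsBackend:
    def __init__(self):
        # (mailbox name, uid) -> indexed text
        self.docs = {}

    def index(self, mailbox, uid, text):
        self.docs[(mailbox, uid)] = text

    def expunge(self, mailbox, uid):
        # Expunging a message that was never indexed is a no-op.
        self.docs.pop((mailbox, uid), None)


def delete_mailbox(backend, mailbox, uids):
    """Drop a mailbox from the index by expunging each of its messages."""
    for uid in uids:
        backend.expunge(mailbox, uid)


backend = ToyFtsBackend()
backend.index("INBOX", 1, "hello")
backend.index("Trash", 1, "old mail")
backend.index("Trash", 2, "older mail")
delete_mailbox(backend, "Trash", [1, 2])
remaining = sorted(backend.docs)
```

After the call, only the INBOX entry is left in the toy index.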

RENAME is also kind of annoying, because renaming a mailbox requires
all messages in it to be reindexed (Solr doesn't support modifying
existing fields). I was thinking that perhaps instead of using mailbox
names in the FTS indexes, it could use mailbox GUIDs (globally unique
IDs). The GUIDs never change, so RENAME wouldn't require any action.
Filtering based on mailbox names would get a bit trickier then, though,
because the names would always have to be translated to GUIDs first.
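A small Python sketch of the mailbox GUID idea, with all class and
method names invented for illustration (nothing here is Dovecot or Solr
API). Documents are keyed by mailbox GUID, so a rename only touches the
name -> GUID mapping, and a search filtered by mailbox name translates
the names to GUIDs first.

```python
class GuidKeyedIndex:
    """Toy FTS index keyed by mailbox GUID instead of mailbox name."""

    def __init__(self):
        self.name_to_guid = {}  # mailbox name -> mailbox GUID
        self.docs = {}          # (mailbox GUID, uid) -> indexed text

    def create_mailbox(self, name, guid):
        self.name_to_guid[name] = guid

    def rename_mailbox(self, old_name, new_name):
        # Only the name mapping changes; indexed documents are untouched.
        self.name_to_guid[new_name] = self.name_to_guid.pop(old_name)

    def index(self, name, uid, text):
        self.docs[(self.name_to_guid[name], uid)] = text

    def search(self, word, mailbox_names):
        # The extra step: translate names to GUIDs before filtering.
        guids = {self.name_to_guid[n] for n in mailbox_names}
        return sorted(
            (guid, uid)
            for (guid, uid), text in self.docs.items()
            if guid in guids and word in text.split()
        )


idx = GuidKeyedIndex()
idx.create_mailbox("Work", "guid-aaaa")
idx.index("Work", 7, "quarterly report draft")
docs_before = dict(idx.docs)
idx.rename_mailbox("Work", "Archive/Work")
hits = idx.search("report", ["Archive/Work"])
```

The indexed documents are identical before and after the rename, and
the search under the new name still finds UID 7.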

Yet another possibility would be to not save the mailbox name or GUID
at all, and just store message GUIDs. Then COPYing a message to another
mailbox wouldn't require reindexing it. The main problem then is of
course that you'd need to have a GUID -> array of {mailbox, uid} index
somewhere, and Dovecot doesn't yet have one.
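As a sketch of what that missing index would have to do, here is a toy
version in Python. MessageGuidIndex and its methods are hypothetical
names, not something Dovecot provides. Text is indexed once per message
GUID, and a separate GUID -> list of (mailbox, uid) map resolves search
hits back to concrete messages.

```python
from collections import defaultdict


class MessageGuidIndex:
    """Toy FTS index keyed by message GUID, plus a GUID -> locations map."""

    def __init__(self):
        self.text_by_guid = {}              # message GUID -> indexed text
        self.locations = defaultdict(list)  # message GUID -> [(mailbox, uid)]
        self.index_calls = 0                # how many times text was indexed

    def save(self, guid, mailbox, uid, text):
        if guid not in self.text_by_guid:
            self.text_by_guid[guid] = text
            self.index_calls += 1
        self.locations[guid].append((mailbox, uid))

    def copy(self, guid, dest_mailbox, dest_uid):
        # COPY only records a new location; nothing is reindexed.
        self.locations[guid].append((dest_mailbox, dest_uid))

    def search(self, word, mailbox):
        hits = []
        for guid, text in self.text_by_guid.items():
            if word in text.split():
                hits.extend(uid for mb, uid in self.locations[guid] if mb == mailbox)
        return sorted(hits)


idx = MessageGuidIndex()
idx.save("msg-1", "INBOX", 10, "invoice for december")
idx.copy("msg-1", "Billing", 3)
inbox_hits = idx.search("invoice", "INBOX")
billing_hits = idx.search("invoice", "Billing")
```

The message shows up in both mailboxes even though its text was indexed
only once.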


