On Thu, 2010-12-16 at 13:00 +0300, Vasiliy G Tolstov wrote:
> Hello. Has anybody written an FTS Xapian plugin for Dovecot 2?
Not as far as I know.
> If nobody has, can you suggest what I need to read, and which source files I need to read, to write my own Xapian FTS plugin?
src/plugins/fts/fts-api*.h
src/plugins/fts-solr/*
http://wiki2.dovecot.org/Design
There are some stupid things in the FTS API, like having lookup(), lookup2() and filter() methods. I don't remember clearly what the point was, but just do it like fts-solr does :)
There are some things you probably shouldn't copy from fts-solr, though. There are two related items on my TODO list:
- fts-solr: crashes if expunge is done while search is indexing
- fts-solr: handle DELETE, RENAME. use mailbox GUIDs (optionally)
The first happens because Dovecot supports processing multiple commands in parallel. If a client issues a SEARCH that starts indexing messages, and then issues an EXPUNGE command while the indexing is still going on, the Solr code's expunge() method is called. That breaks, because the search indexing hasn't finished and the code can't handle both at once. I haven't yet decided what to do about that. It's a fairly rare problem, though.
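To make the race concrete, here is a toy Python sketch of one possible way around it: queue any expunge that arrives during a build and replay it once the build ends. This is only an illustration, not Dovecot code and not a decided fix; the class and method names (ToyFtsBackend, build_begin, build_more, build_end) are made up for the example.

```python
class ToyFtsBackend:
    """Hypothetical stand-in for an FTS backend; not Dovecot's API."""

    def __init__(self):
        self.indexed = set()    # UIDs currently present in the index
        self.indexing = False   # True while a SEARCH-triggered build runs
        self.pending = []       # expunges that arrived during the build

    def build_begin(self):
        self.indexing = True

    def build_more(self, uid):
        self.indexed.add(uid)

    def build_end(self):
        self.indexing = False
        # Replay the expunges that were deferred during the build.
        for uid in self.pending:
            self.indexed.discard(uid)
        self.pending.clear()

    def expunge(self, uid):
        if self.indexing:
            # Defer instead of touching an index that is mid-build.
            self.pending.append(uid)
        else:
            self.indexed.discard(uid)


backend = ToyFtsBackend()
backend.build_begin()
backend.build_more(1)
backend.expunge(1)      # arrives mid-build, so it is queued
backend.build_more(2)
backend.build_end()     # queued expunge is applied here
```

After build_end() only UID 2 remains indexed. A real plugin would also have to decide what a search sees while expunges are still queued.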
The second item has two separate issues. Not handling DELETE at all is probably the FTS API's problem: it should probably just issue expunge() for all of the messages in the mailbox while it's being deleted.
RENAME is also kind of annoying, because renaming a mailbox requires all messages in it to be reindexed (Solr doesn't support modifying existing fields). I was thinking that instead of using mailbox names in the FTS indexes, it could use mailbox GUIDs (globally unique IDs). The GUIDs never change, so RENAME wouldn't require any action. Filtering based on mailbox names would get a bit trickier, though, because the names would always have to be translated to GUIDs first.
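A toy Python sketch of the mailbox GUID idea, assuming the index stores a GUID with each document and a separate name -> GUID table exists. Everything here (GuidKeyedIndex, the GUID string, the mailbox names) is hypothetical and only illustrates the trade-off: RENAME touches one table entry, while every name filter pays for a lookup first.

```python
class GuidKeyedIndex:
    """Hypothetical index keyed by mailbox GUID instead of mailbox name."""

    def __init__(self):
        self.name_to_guid = {}   # mailbox name -> mailbox GUID
        self.docs = []           # (mailbox_guid, uid, text) tuples

    def create_mailbox(self, name, guid):
        self.name_to_guid[name] = guid

    def rename_mailbox(self, old_name, new_name):
        # Only the name table changes; indexed documents are untouched.
        self.name_to_guid[new_name] = self.name_to_guid.pop(old_name)

    def index(self, name, uid, text):
        self.docs.append((self.name_to_guid[name], uid, text))

    def search(self, name, word):
        # The extra step: translate the mailbox name to its GUID first.
        guid = self.name_to_guid[name]
        return [uid for g, uid, text in self.docs
                if g == guid and word in text.split()]


idx = GuidKeyedIndex()
idx.create_mailbox("INBOX/work", "guid-a")
idx.index("INBOX/work", 1, "quarterly report attached")
docs_before = list(idx.docs)

idx.rename_mailbox("INBOX/work", "Archive/work")
hits = idx.search("Archive/work", "report")
```

The search under the new name still finds UID 1, and idx.docs is identical before and after the rename.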
Yet another possibility would be to not save the mailbox name or GUID at all, and just store message GUIDs. Then COPYing a message to another mailbox wouldn't require reindexing it. The main problem is of course that you'd need to have a GUID -> array of {mailbox, uid} index somewhere, and Dovecot doesn't yet have one.
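A toy Python sketch of that last layout, assuming message text is indexed once per message GUID and a separate GUID -> list of (mailbox, uid) map exists. The map is exactly the piece Dovecot doesn't have, so the names below (MessageGuidIndex, save, copy) are invented for illustration only.

```python
from collections import defaultdict


class MessageGuidIndex:
    """Hypothetical index keyed by message GUID only."""

    def __init__(self):
        self.text_by_guid = {}               # message GUID -> indexed text
        self.locations = defaultdict(list)   # message GUID -> [(mailbox, uid)]
        self.index_count = 0                 # how many times text was indexed

    def save(self, msg_guid, mailbox, uid, text):
        if msg_guid not in self.text_by_guid:
            self.text_by_guid[msg_guid] = text
            self.index_count += 1
        self.locations[msg_guid].append((mailbox, uid))

    def copy(self, msg_guid, dest_mailbox, dest_uid):
        # COPY only records a new location; the text is not reindexed.
        self.locations[msg_guid].append((dest_mailbox, dest_uid))

    def search(self, word):
        hits = []
        for guid, text in self.text_by_guid.items():
            if word in text.split():
                hits.extend(self.locations[guid])
        return sorted(hits)


idx = MessageGuidIndex()
idx.save("g1", "INBOX", 7, "quarterly report")
idx.copy("g1", "Archive", 3)
hits = idx.search("report")
```

Here the search returns both locations of the message, while the text was indexed only once.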