Additionally, my logic is that the backend stores one database per mailbox in /xapian-indexes (in the "root" dir of the user); the name of the database is the GUID of the mailbox.
For INBOX, that works perfectly: the database is properly created, and the backend starts indexing all emails.
For other folders, somehow, the process cannot access that (root) folder.
Am I missing something ?
On 2019-01-12 17:37, Joan Moreau wrote:
Thank you
Now, for the results
I see the member of fts_result is :
ARRAY_TYPE(seq_range) definite_uids;
I have the UIDs as an array (uint32_t *).
How do I put my UIDs into this "definite_uids"? Obviously this is not a simple array/pointer. How do I say something similar to result->definite_uids[1]=my_uid?
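For reference, a seq_range array holds sorted, inclusive UID ranges rather than individual values, which is why it cannot be indexed like a plain array. In Dovecot itself the usual route is to call seq_range_array_add(&result->definite_uids, uid) from lib/seq-range-array.h once per UID. The sketch below is only a standalone model of that idea, not Dovecot's implementation: `uid_range`, `uid_range_list`, `uid_range_list_add` and `uid_range_list_add_all` are hypothetical names, the fixed capacity is arbitrary, and it assumes the UIDs arrive in ascending order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Dovecot's struct seq_range: an inclusive range. */
struct uid_range {
	uint32_t seq1, seq2;
};

#define UID_RANGE_MAX 64 /* arbitrary capacity for this sketch */

struct uid_range_list {
	struct uid_range ranges[UID_RANGE_MAX];
	size_t count;
};

/* Append one UID. Assumes ascending input: the UID must not lie before
   the last stored range. Returns 0 on success, -1 if the list is full. */
static int uid_range_list_add(struct uid_range_list *list, uint32_t uid)
{
	if (list->count > 0) {
		struct uid_range *last = &list->ranges[list->count - 1];

		assert(uid >= last->seq1);
		if (uid <= last->seq2)
			return 0; /* already covered by the last range */
		if (uid == last->seq2 + 1) {
			last->seq2 = uid; /* extend the last range by one */
			return 0;
		}
	}
	if (list->count == UID_RANGE_MAX)
		return -1;
	/* Start a new single-UID range. */
	list->ranges[list->count].seq1 = uid;
	list->ranges[list->count].seq2 = uid;
	list->count++;
	return 0;
}

/* Copy every UID from a plain uint32_t array into the range list. */
static int uid_range_list_add_all(struct uid_range_list *list,
				  const uint32_t *uids, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (uid_range_list_add(list, uids[i]) < 0)
			return -1;
	}
	return 0;
}
```

With the UIDs 1, 2, 3, 7, 9, 10 this model ends up with three ranges: 1..3, 7..7 and 9..10.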
On 2019-01-12 10:25, Timo Sirainen wrote:
On 11 Jan 2019, at 21.23, Joan Moreau via dovecot <dovecot@dovecot.org> wrote:
The below patch resolves the compilation error
$ diff -p compat.h compat.h.joan
*** compat.h 2019-01-11 20:21:00.726625427 +0100
--- compat.h.joan 2019-01-11 20:14:41.729109919 +0100
*************** struct iovec;
*** 202,207 ****
--- 202,211 ----
ssize_t i_my_writev(int fd, const struct iovec *iov, int iov_len);
#endif
- #ifdef __cplusplus
- extern "C" {
- #endif
You should put this extern "C" into the C++ file you're creating. See for example how fts-lucene/lucene-wrapper.cc does this.
1 - What does "subargs" represent in mail_search_args?

It's set only for SEARCH_OR and SEARCH_SUB. So for example:
SEARCH TEXT foo TEXT bar TEXT baz
results in:
type=SEARCH_SUB
value.subargs = (
  { type=SEARCH, value.str="foo" },
  { type=SEARCH, value.str="bar" },
  { type=SEARCH, value.str="baz" },
)
Or similarly if there's SEARCH OR foo OR TEXT bar TEXT baz or some other combination of OR/ANDs.

2 - For rescan: who is responsible for passing the emails again? Does the Dovecot core send all the emails to index again, or shall the fts somehow access the mailbox and read all emails? Wouldn't just saying "delete all indexes and get_last_uid is now 0" be the easy way? Or must the fts process all emails (and block the current thread, as a mailbox may be quite large)?

The next indexing run is responsible for it. If you return get_last_uid=0, then the indexer starts feeding you all mails. So the fts backend doesn't have to know about it.
3 - For get_last_uid: this uncertainty is very unclear. "If there is a gap, then indexer first indexes all the missing" -> this means that at a certain point the indexer may be rebuilding a previous email, so the *last* uid is something different from the max. And how does the indexer know whether there is a gap without calling the fts backend (which it does not, as there is no function for that)?

I mean if get_last_uid() returns for example 100, it means that UIDs 1..100 have been indexed by the FTS backend. It's possible that at this point there are already mails with UIDs 101..200 in the folder. So when UID=201 is delivered, the indexer notices that the FTS backend has only UIDs 1..100 indexed so far, and starts feeding it UIDs 101..201 in that order.
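The catch-up rule described here can be modelled in a few lines. This is only an illustration of the rule (feed every mail whose UID is greater than what get_last_uid() reported, in ascending order), not Dovecot's indexer code; `first_unindexed` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Given the mailbox's UIDs in ascending order and the value the backend
   reported from get_last_uid(), return the index of the first mail that
   still has to be fed to the backend. Every mail from that index to the
   end is then fed in order. Returns uid_count when nothing is missing. */
static size_t first_unindexed(const uint32_t *mailbox_uids, size_t uid_count,
			      uint32_t last_indexed_uid)
{
	size_t i = 0;

	assert(mailbox_uids != NULL || uid_count == 0);
	/* Skip everything the backend claims to have indexed already. */
	while (i < uid_count && mailbox_uids[i] <= last_indexed_uid)
		i++;
	return i;
}
```

In this model a backend that reports 0 (for example after a rescan) gets index 0 back, so every mail in the folder is fed again.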
You can implement get_last_uid() simply by keeping track of it in dovecot.index* files, similar to how Lucene and Solr already do it with fts_index_get_header() / fts_index_set_header(). They also have a fallback that if the index doesn't have the last_uid value, they do a slower search from the Lucene/Solr index to find the last UID.
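A rough standalone model of that "cached value with slow fallback" pattern is shown below. It does not use the Dovecot API: a plain struct stands in for the header that fts_index_get_header() / fts_index_set_header() would read and write, an in-memory array stands in for the backend's own index, and all names (`cached_header`, `slow_scan_last_uid`, `backend_get_last_uid`) are hypothetical. Writing the scanned value back into the cache is a design choice of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the last-UID value kept in dovecot.index*. */
struct cached_header {
	bool have_last_uid;
	uint32_t last_uid;
};

/* Slow path: scan everything the backend has indexed and take the
   highest UID. Returns 0 when nothing has been indexed. */
static uint32_t slow_scan_last_uid(const uint32_t *indexed_uids, size_t n)
{
	uint32_t max = 0;

	for (size_t i = 0; i < n; i++) {
		if (indexed_uids[i] > max)
			max = indexed_uids[i];
	}
	return max;
}

/* Fast path first; fall back to the slow scan and remember the result. */
static uint32_t backend_get_last_uid(struct cached_header *hdr,
				     const uint32_t *indexed_uids, size_t n)
{
	assert(hdr != NULL);
	if (!hdr->have_last_uid) {
		hdr->last_uid = slow_scan_last_uid(indexed_uids, n);
		hdr->have_last_uid = true;
	}
	return hdr->last_uid;
}
```

After the first call the cached value is returned directly, so the backend's own index is only scanned when the header has no value yet.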