Hi
Could anyone answer these questions specifically?
Q1 : get_last_uid -> Is this the last UID indexed (which may not be the greatest value), or the greatest value (which may not be the latest)? The code of the existing plugins is unclear about this; Solr looks for the greatest, for instance.
Q2 : When indexing an email, the data is not passed through "build_key". Why so? What is the link with "build_more"?
Q3 : Searching/Lookup : The header in which to look (it must be at least one of "cc, to, from, subject, body") does not appear in the 'struct' data. Where can I find it?
Q4 : Refresh : this is very unclear. How could the index not already show the "latest" view? What is the real meaning of this function?
Q5 : Rescan : is it just about removing all indexes for a specific mailbox?
Q6 : lookup_multi : isn't the function the same for all plugins (see below)?
Thank you
On 2019-01-06 16:50, Joan Moreau via dovecot wrote:
And finally, for fts_backend_xxxx_lookup_multi, why is that backend dependent?
Wouldn't the function below be the same for any backend?
Waiting for your feedback on all those questions
Thank you
JM
static int
fts_backend_xapian_lookup_multi(struct fts_backend *_backend,
				struct mailbox *const boxes[],
				struct mail_search_arg *args,
				enum fts_lookup_flags flags,
				struct fts_multi_result *result)
{
	unsigned int i;

	/* boxes[] is NULL-terminated: run the single-mailbox lookup on each. */
	for (i = 0; boxes[i] != NULL; i++) {
		/* assumes box_results[] already holds one entry per mailbox */
		if (fts_backend_xapian_lookup(_backend, boxes[i], args, flags,
					      &result->box_results[i]) < 0)
			return -1;
	}
	return 0;
}
On 2019-01-06 16:31, Joan Moreau via dovecot wrote:
For fts_backend_xxx_lookup, where is it specified in which field (to, cc, subject, body, from, all) to look?
On 2019-01-06 16:03, Joan Moreau wrote:
For "rescan" and "optimize", wouldn't it be the Dovecot core that indicates which mails are to be dismissed (expunged), or asks again for indexing of a particular (or all) UID? Why would the backend be aware of the transactions on the mailbox?
There is already "fts_backend_xxx_update_expunge", so I believe the management of the expunged messages is *NOT* in the backend, right?
On 2019-01-06 15:41, Joan Moreau wrote:
Also, for fts_backend_solr_update_set_build_key -> where is the data (of the hdr_name or the body)?
On 2019-01-06 14:10, Joan Moreau wrote:
For the "last uid" -> this is not the last added, but the maximum of the UIDs in the indexed emails, right?
On 2019-01-06 11:53, Joan Moreau via dovecot wrote:
Thank you
I still don't get the "build_key" function. The email (body, headers, ... and the UID) is the one (and only) thing to index. What "key" is that function referring to? Or is the "key" here the actual email?
On 2019-01-06 08:43, Stephan Bosch wrote:
On 06/01/2019 at 01:00, Joan Moreau wrote:
Anyone willing to explain those functions?
Most notably "get_last_uid"
From src/plugins/fts/fts-api.h:
/* Get the last_uid for the mailbox. */
int fts_backend_get_last_uid(struct fts_backend *backend, struct mailbox *box,
			     uint32_t *last_uid_r);
The solr sources (src/plugins/fts-solr/fts-backend-solr.c:213) tell me this returns the last UID added to the index for the given mailbox and FTS index.
"set_build_key"
From src/plugins/fts/fts-api.h:
/* Switch to building index for specified key. If backend doesn't want to
   index this key, it can return FALSE and caller will skip to next key. */
bool fts_backend_update_set_build_key(struct fts_backend_update_context *ctx,
				      const struct fts_backend_build_key *key);
The same file provides an outline of what a build_key is.
"build_more"
/* Add more content to the index for the currently specified build key.
   Non-BODY_PART_BINARY data must contain only full valid UTF-8 characters,
   but it doesn't need to be NUL-terminated. size contains the data size in
   bytes, not characters. This function may be called many times and the data
   block sizes may be small. Backend returns 0 if ok, -1 if build should be
   aborted. */
int fts_backend_update_build_more(struct fts_backend_update_context *ctx,
				  const unsigned char *data, size_t size);
You should look at the sources of a few backends like squat and solr to get a feel of what exactly this is doing.
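As a rough illustration of how the two calls quoted above fit together: the build key only says what is about to be indexed (which message, which header or body part), and the text itself arrives afterwards through repeated build_more calls. The sketch below is a self-contained toy, not Dovecot code. Every demo_* name, and the simplified key struct with its three fields, is a hypothetical stand-in invented for this example; a real backend would use the types from fts-api.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only, NOT the Dovecot types. */
enum demo_key_type { DEMO_KEY_HDR, DEMO_KEY_BODY };

struct demo_build_key {
	unsigned int uid;          /* message being indexed */
	enum demo_key_type type;   /* header or body part */
	const char *hdr_name;      /* meaningful only for DEMO_KEY_HDR */
};

struct demo_update_ctx {
	struct demo_build_key cur; /* key selected by demo_set_build_key() */
	bool key_active;
	char *buf;                 /* text collected for the current key */
	size_t len;
};

/* Select what the following data chunks belong to. No text is passed here. */
static bool demo_set_build_key(struct demo_update_ctx *ctx,
			       const struct demo_build_key *key)
{
	assert(!ctx->key_active);
	ctx->cur = *key;
	ctx->key_active = true;
	ctx->len = 0;
	return true; /* returning false would ask the caller to skip this key */
}

/* Append one chunk; may be called many times with small blocks. */
static int demo_build_more(struct demo_update_ctx *ctx,
			   const unsigned char *data, size_t size)
{
	assert(ctx->key_active);
	char *nbuf = realloc(ctx->buf, ctx->len + size + 1);
	if (nbuf == NULL)
		return -1; /* abort the build */
	memcpy(nbuf + ctx->len, data, size);
	ctx->len += size;
	nbuf[ctx->len] = '\0';
	ctx->buf = nbuf;
	return 0;
}

/* Key finished: a real backend would now hand the text to its index. */
static void demo_unset_build_key(struct demo_update_ctx *ctx)
{
	assert(ctx->key_active);
	ctx->key_active = false;
}
```

With this shape, indexing a Subject header would be one demo_set_build_key() call, one or more demo_build_more() calls carrying the header text, then demo_unset_build_key(). The buffer is freed by the caller once the context is no longer needed.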
What is refresh versus rescan?
From fts-api.h:
/* Refresh index to make sure we see latest changes from lookups.
   Returns 0 if ok, -1 if error. */
int fts_backend_refresh(struct fts_backend *backend);
/* Go through the entire index and make sure all mails are indexed,
   and delete any extra mails in the index. */
int fts_backend_rescan(struct fts_backend *backend);
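To make the rescan comment above more concrete, here is a toy reconciliation between the UIDs present in an index and the UIDs present in a mailbox. It only illustrates the idea stated in the header comment (delete extra mails from the index, notice mails that are not indexed yet). The demo_* functions and the flat UID arrays are assumptions made up for this sketch; an actual backend would work on its own index format and obtain the mailbox contents from Dovecot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if uid occurs in uids[0..n). */
static bool demo_contains(const uint32_t *uids, size_t n, uint32_t uid)
{
	for (size_t i = 0; i < n; i++) {
		if (uids[i] == uid)
			return true;
	}
	return false;
}

/* Toy rescan (hypothetical helper, not a Dovecot function):
   - drops indexed UIDs that no longer exist in the mailbox (in place),
   - counts mailbox UIDs that are missing from the index.
   Returns the new number of indexed UIDs. */
static size_t demo_rescan(uint32_t *indexed, size_t n_indexed,
			  const uint32_t *mailbox, size_t n_mailbox,
			  size_t *missing_r)
{
	assert(missing_r != NULL);
	size_t kept = 0;
	for (size_t i = 0; i < n_indexed; i++) {
		if (demo_contains(mailbox, n_mailbox, indexed[i]))
			indexed[kept++] = indexed[i];
	}
	*missing_r = 0;
	for (size_t i = 0; i < n_mailbox; i++) {
		if (!demo_contains(indexed, kept, mailbox[i]))
			(*missing_r)++;
	}
	return kept;
}
```

The quadratic search keeps the sketch short; sorted UID lists or ranges would be the obvious choice for real mailboxes.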
Regards,
Stephan
On January 5, 2019 14:23:10 Joan Moreau via dovecot dovecot@dovecot.org wrote:
Thanks Stephan
I basically need to know the role/description of each of the functions of the fts_backend:
struct fts_backend fts_backend_xapian = {
	.name = "xapian",
	.flags = FTS_BACKEND_FLAG_NORMALIZE_INPUT, /* -> what other flags? */

	{
		fts_backend_xapian_alloc,
		fts_backend_xapian_init,
		fts_backend_xapian_deinit,
		fts_backend_xapian_get_last_uid,
		fts_backend_xapian_update_init,
		fts_backend_xapian_update_deinit,
		fts_backend_xapian_update_set_mailbox,
		fts_backend_xapian_update_expunge,
		fts_backend_xapian_update_set_build_key,
		fts_backend_xapian_update_unset_build_key,
		fts_backend_xapian_update_build_more,
		fts_backend_xapian_refresh,
		fts_backend_xapian_rescan,
		fts_backend_xapian_optimize,
		fts_backend_default_can_lookup,
		fts_backend_xapian_lookup,
		fts_backend_xapian_lookup_multi,
		fts_backend_xapian_lookup_done
	}
};
Thank you
On 2019-01-05 08:49, Stephan Bosch wrote:
On 04/01/2019 at 11:17, Joan Moreau via dovecot wrote:
Why not, but please guide me about the core structure (mandatory functions, etc.) of a typical Dovecot FTS plugin.
The Dovecot API documentation is not exhaustive everywhere, but the basics are documented. The remaining questions can be answered by looking at examples found in similar plugins or the relevant API sources.
I know of one FTS plugin not written by Dovecot developers:
https://github.com/atkinsj/fts-elasticsearch
If you really wish to do something like this, just go ahead. It will not be a small effort though. As soon as you have concrete questions, we can help you (don't expect rapid responses though).
Regards,
Stephan.