Hi,
It looks like I've found something strange: dovecot rebuilds the fts-lucene index every time I open a virtual folder that contains an FTS query:
indexer-worker(dion): Warning: fts-lucene: Settings have changed, rebuilding index for mailbox
dovecot-virtual is pretty simple:
archive/INBOX
  BODY "test"
First I perform an FTS search in archive/INBOX itself, then I open the virtual folder.
Both the default namespace and 'archive' are private namespaces with mdbox storage.
plugin {
  fts = lucene
  fts_lucene = whitespace_chars=@.
  fts_autoindex = no
}
Any suggestions?
-- WBR, Dmitry
On Tue, Apr 12, 2016 at 11:05:08AM +0300, Dmitry Nezhevenko wrote:
> Hi,
> It looks like I've found something strange: dovecot rebuilds the fts-lucene index every time I open a virtual folder that contains an FTS query:
> indexer-worker(dion): Warning: fts-lucene: Settings have changed, rebuilding index for mailbox
Ok. It seems that this is unrelated to virtual folders at all. It's enough to use any folder from a non-INBOX private namespace and perform FTS multiple times.
I've added a few debug prints around fts_lucene_settings_checksum, fts_index_have_compatible_settings and fts_index_set_header and found that the value returned by fts_lucene_settings_checksum is always the same.
The root issue is that fts_lucene_settings_checksum reads the checksum from the mailbox with an empty name (probably the namespace root mailbox). At the same time, fts_index_set_header is called for all mailboxes in the namespace except this 'root' mailbox.
That's actually why I'm always getting 'Settings have changed' warning.
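The mismatch described above can be illustrated with a small toy model. This is only a sketch: `toy_box`, `toy_rescan`, `toy_settings_compatible` and `toy_count_rebuilds` are hypothetical names invented for illustration and are not part of Dovecot. It assumes, as the debug prints suggest, that the checksum is written to every listed mailbox but read back only from the root.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: every name here is a hypothetical stand-in,
   not Dovecot's real API. */
struct toy_box {
	const char *name;	/* "" stands for the namespace root */
	uint32_t checksum;	/* 0 means "never written" */
};

/* Rescan: store the checksum in every mailbox the listing returns.
   The root is skipped unless write_root is set, as in the report. */
static void toy_rescan(struct toy_box *boxes, size_t n,
		       uint32_t checksum, int write_root)
{
	for (size_t i = 0; i < n; i++) {
		if (boxes[i].name[0] != '\0' || write_root)
			boxes[i].checksum = checksum;
	}
}

/* Compatibility check: only the root mailbox's checksum is read. */
static int toy_settings_compatible(const struct toy_box *boxes, size_t n,
				   uint32_t checksum)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(boxes[i].name, "") == 0)
			return boxes[i].checksum == checksum;
	}
	return 0;
}

/* Run several searches and count how many of them trigger a rebuild. */
static int toy_count_rebuilds(int searches, int write_root)
{
	struct toy_box boxes[] = { { "", 0 }, { "INBOX", 0 }, { "Sent", 0 } };
	const size_t n = sizeof(boxes) / sizeof(boxes[0]);
	const uint32_t checksum = 0x1234;	/* constant across searches */
	int rebuilds = 0;

	assert(searches >= 0);
	for (int i = 0; i < searches; i++) {
		if (!toy_settings_compatible(boxes, n, checksum)) {
			rebuilds++;	/* "Settings have changed, rebuilding index" */
			toy_rescan(boxes, n, checksum, write_root);
		}
	}
	return rebuilds;
}
```

In this model `toy_count_rebuilds(3, 0)` returns 3 (every search rebuilds because the root never receives the checksum), while `toy_count_rebuilds(3, 1)` returns 1 (only the first search rebuilds).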
I've tried to create an 'archive' mailbox (same name as the namespace) and got a call to fts_index_set_header() for it during the scan, but with a zero settings_checksum.
Any suggestions how to fix it?
-- WBR, Dmitry
On Tue, Apr 12, 2016 at 11:26:05AM +0300, Dmitry Nezhevenko wrote:
> On Tue, Apr 12, 2016 at 11:05:08AM +0300, Dmitry Nezhevenko wrote:
> > indexer-worker(dion): Warning: fts-lucene: Settings have changed, rebuilding index for mailbox
> The root issue is that fts_lucene_settings_checksum reads the checksum from the mailbox with an empty name (probably the namespace root mailbox). At the same time, fts_index_set_header is called for all mailboxes in the namespace except this 'root' mailbox.
This proof-of-concept patch fixes issue for me. I don't think that this is right way to fix it. I've copied vname calculation code from fts_index_have_compatible_settings. Maybe it's better to create something like fts_index_write_settings_checksum() in fts-api.

Index: dovecot-2.2.22/src/plugins/fts-lucene/lucene-wrapper.cc
===================================================================
--- dovecot-2.2.22.orig/src/plugins/fts-lucene/lucene-wrapper.cc
+++ dovecot-2.2.22/src/plugins/fts-lucene/lucene-wrapper.cc
@@ -832,6 +832,11 @@ static void rescan_clear_unseen_mailboxe
 	struct mailbox_metadata metadata;
 	struct fts_index_header hdr;
 
+	struct mail_namespace *ns;
+	const char* vname;
+	struct fts_index_header hdr_root;
+	unsigned int len;
+
 	memset(&hdr, 0, sizeof(hdr));
 	hdr.settings_checksum = fts_lucene_settings_checksum(&index->set);
 
@@ -852,6 +857,26 @@ static void rescan_clear_unseen_mailboxe
 		mailbox_free(&box);
 	}
 	(void)mailbox_list_iter_deinit(&iter);
+
+	// Make sure we've stored settings checksum for non-INBOX namespaces
+	ns = mailbox_list_get_namespace(index->list);
+	if ((ns->flags & NAMESPACE_FLAG_INBOX_USER) == 0) {
+		len = strlen(ns->prefix);
+		if (len > 0 && ns->prefix[len-1] == mail_namespace_get_sep(ns))
+			len--;
+		vname = t_strndup(ns->prefix, len);
+
+		box = mailbox_alloc(index->list, vname,
+				    (enum mailbox_flags)0);
+		if (mailbox_open(box) == 0 &&
+		    fts_index_get_header(box, &hdr_root)) {
+			if (hdr_root.settings_checksum != hdr.settings_checksum) {
+				hdr_root.settings_checksum = hdr.settings_checksum;
+				(void)fts_index_set_header(box, &hdr_root);
+			}
+		}
+		mailbox_free(&box);
+	}
 }
 
 int lucene_index_rescan(struct lucene_index *index)

-- WBR, Dmitry
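For readers following the vname calculation in the patch, the trimming step can be tried in isolation. The sketch below is a standalone approximation: `ns_prefix_to_vname` is a hypothetical helper name, and it uses plain `malloc` in place of Dovecot's `t_strndup`, so the caller must free the result. It only shows how a trailing namespace separator is dropped from the prefix.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical standalone helper, not a Dovecot function: returns a
   newly allocated copy of prefix without one trailing separator.
   The caller frees the result. Returns NULL if allocation fails. */
static char *ns_prefix_to_vname(const char *prefix, char sep)
{
	size_t len;
	char *vname;

	assert(prefix != NULL);
	len = strlen(prefix);
	/* Same test as in the patch: drop the separator only if present. */
	if (len > 0 && prefix[len - 1] == sep)
		len--;

	vname = malloc(len + 1);
	if (vname == NULL)
		return NULL;
	memcpy(vname, prefix, len);
	vname[len] = '\0';
	return vname;
}
```

With a prefix of "archive/" and separator '/', the helper yields "archive", which is the mailbox name the patch then opens to store the checksum. An empty prefix is returned unchanged.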
On Tue, Apr 12, 2016 at 12:40:55PM +0300, Dmitry Nezhevenko wrote:
> This proof-of-concept patch fixes issue for me. I don't think that this is right way to fix it. I've copied vname calculation code from fts_index_have_compatible_settings. Maybe it's better to create something like fts_index_write_settings_checksum() in fts-api.
It looks like dovecot is pretty stable with this patch. I've successfully indexed ~7 GB of mail and got a ~4 GB index.
In any case, any comments/suggestions? Maybe there is another solution?
-- WBR, Dmitry