I have spent some time trying to understand the logic (if any!) of the FTS part.
Honestly, whoever created this mess should be the one to fix it, or someone should refactor it entirely.
Basically, the FTS "core" should be able to:
- select the backend according to the conf file
- send new emails/mailboxes to the backend
- send the IDs of the emails to be removed
- resend an entire mailbox ('rescan')
- send the search parameters (from the client) to the backend and return the emails to the front end based on the backend results (and NOTHING more)
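To make the list above concrete, here is a minimal sketch of such a thin core. It is an illustration only, not Dovecot's actual FTS API: the names FtsBackend, InMemoryBackend, FtsCore and BACKENDS, and the "fts" configuration key, are all hypothetical.

```python
from abc import ABC, abstractmethod


class FtsBackend(ABC):
    """Hypothetical backend contract (not Dovecot's real plugin interface)."""

    @abstractmethod
    def index(self, mailbox: str, uid: int, text: str) -> None: ...

    @abstractmethod
    def expunge(self, mailbox: str, uids: list[int]) -> None: ...

    @abstractmethod
    def search(self, mailbox: str, query: str) -> list[int]: ...


class InMemoryBackend(FtsBackend):
    """Toy backend: case-insensitive substring match over stored text."""

    def __init__(self) -> None:
        self.docs: dict[tuple[str, int], str] = {}

    def index(self, mailbox: str, uid: int, text: str) -> None:
        self.docs[(mailbox, uid)] = text.lower()

    def expunge(self, mailbox: str, uids: list[int]) -> None:
        for uid in uids:
            self.docs.pop((mailbox, uid), None)

    def search(self, mailbox: str, query: str) -> list[int]:
        q = query.lower()
        return sorted(
            uid for (mb, uid), text in self.docs.items()
            if mb == mailbox and q in text
        )


# Hypothetical registry mapping a configuration value to a backend class.
BACKENDS = {"memory": InMemoryBackend}


class FtsCore:
    """Thin core: it only forwards work to the configured backend."""

    def __init__(self, conf: dict[str, str]) -> None:
        self.backend = BACKENDS[conf["fts"]]()  # select backend from conf

    def new_mail(self, mailbox: str, uid: int, text: str) -> None:
        self.backend.index(mailbox, uid, text)

    def remove(self, mailbox: str, uids: list[int]) -> None:
        self.backend.expunge(mailbox, uids)

    def rescan(self, mailbox: str, mails: dict[int, str]) -> None:
        # The core resends every mail; the backend does no guessing.
        for uid, text in mails.items():
            self.backend.index(mailbox, uid, text)

    def search(self, mailbox: str, query: str) -> list[int]:
        # Return the backend's answer unchanged (and nothing more).
        return self.backend.search(mailbox, query)
```

The point of the sketch is that search() returns exactly what the backend returned, with no second filtering pass in the core.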
Today, the FTS part is plain wrong and must be totally reviewed.
I do not have the time, but I can participate in testing if someone is ready to roll up their sleeves on the matter.
The "loop" part seems the most urgent: it breaks everything (searches time out 100% of the time).
On 2019-04-06 09:56, Joan Moreau via dovecot wrote:
For point 1, this is not "suboptimal", it is plain wrong (the results are damn wrong! And this is not related to the backend, but to the FTS logic in Dovecot core).
For point 2, this has already been discussed numerous times, but without action. The Dovecot core should be the one re-submitting the emails to scan, rather than the backend trying to figure out where and which emails need to be re-scanned.
For point 3, I will do a bit of research in the existing code and will get back to you.
For point 4, this is random. The FTS backend (xapian, lucene, solr, whatever...) returns X, then Dovecot core chooses to select only Y emails. This is a clear bug.
On Fri, Apr 05, 2019 at 19:33:57 +0800, Joan Moreau via dovecot wrote:
Hi
If you plan to fix the FTS part of Dovecot, I will be very grateful.
I'm trying to figure out what is causing the 3rd issue you listed, so we can
decide how severe it is and therefore how quickly it needs to be fixed. At
the moment we are unable to reproduce it, and therefore we cannot fix it.
Not sure this is related to any specific commit, but rather to the overall
design
Ok.
The list of bugs so far
1 - Double call to fts plugins with inconsistent parameters (first call
different from second call for the same request)
Understood. It is my understanding that this is simply suboptimal rather
than causing crashes/etc.
2 - The "Rescan" feature for now consists of deleting indexes. It should
instead resend the emails to rescan to the fts plugin
I'm not sure I follow. The rescan operation is invoked on the fts backend
and it is up to the implementation to somehow ensure that after it is done
the fts index is up to date. The easiest way to implement it is to simply
delete the fts index and re-index all the mails. That is what currently
happens in the solr backend.
The lucene fts backend does a more complicated matching of the fts index
with the emails. Finally, the deprecated squat backend seems to ignore the
rescan requests (its rescan vfunc is NULL).
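The two rescan strategies described above can be sketched roughly as follows. This is a toy model, not the solr or lucene backend code: the index and the mailbox are both plain dicts mapping a mail UID to its text, and the function names are hypothetical.

```python
def rescan_rebuild(index: dict[int, str], mailbox: dict[int, str]) -> int:
    """Delete the whole index and re-index every mail.

    Returns the number of mails (re)indexed.
    """
    index.clear()
    index.update(mailbox)
    return len(mailbox)


def rescan_reconcile(index: dict[int, str], mailbox: dict[int, str]) -> int:
    """Match the index against the mailbox and fix only the differences.

    Returns the number of index entries removed or added.
    """
    # Entries in the index whose mail no longer exists.
    stale = [uid for uid in index if uid not in mailbox]
    for uid in stale:
        del index[uid]
    # Mails that exist but were never indexed.
    missing = [uid for uid in mailbox if uid not in index]
    for uid in missing:
        index[uid] = mailbox[uid]
    return len(stale) + len(missing)
```

Both leave the index matching the mailbox; the first is simpler but redoes all the work, the second touches only what differs.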
3 - the loop during body search (just do a "doveadm search -u user@domain
mailbox inbox text whatevertexte")
Refer to my email to Timo on 2019-04-03 18:30 on the same thread for bug
details
(especially the loop)
This seems to be the most important of the 4 issues you listed, so I'd like
to focus on this one for now.
As I mentioned, we cannot reproduce this ourselves. So, we need your help
to narrow things down. Therefore, can you give us the commit hashes of
revisions that you know are good and which are bad? You can use git-bisect
to narrow the range down.
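For readers unfamiliar with bisecting, the idea is a binary search over the commit history between a known-good and a known-bad revision. The sketch below only simulates that search; the commit names and the is_bad callback are hypothetical, and in practice "is_bad" would mean building that revision and running the failing doveadm search.

```python
from typing import Callable


def first_bad(commits: list[str], is_bad: Callable[[str], bool]) -> str:
    """Find the first bad commit by binary search.

    Assumes commits are ordered oldest to newest, commits[0] is known
    good, commits[-1] is known bad, and every commit after the first
    bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the breakage is at mid or earlier
        else:
            good = mid  # the breakage is after mid
    return commits[bad]
```

With N commits in the range this needs about log2(N) build-and-test rounds, which is why it is practical even for a month of changes.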
4 - Most notably, I notice that header search usually does not care
about the fts plugin (even with fts_enforced) and relies on some internal
search, which is total nonsense
You're right, that doesn't seem to make sense. Can you provide a test case?
Jeff.
Let me know how I can help on those 4 points
On 2019-04-05 18:37, Josef 'Jeff' Sipek wrote:
On Fri, Apr 05, 2019 at 17:45:36 +0800, Joan Moreau wrote:
I am on master (very latest)
No clue exactly when this problem appears, but
1 - the "request twice the fts plugin instead of once" issue has always
been there (since my first RC release of fts-xapian)
Ok, good to know.
2 - the body/text loop has appeared recently (maybe during the month of
March)
Our testing doesn't seem to be able to reproduce this. Can you try to
git-bisect this to find which commit broke it?
Thanks,
Jeff.
On 2019-04-05 16:36, Josef 'Jeff' Sipek via dovecot wrote:
On Wed, Apr 03, 2019 at 19:02:52 +0800, Joan Moreau via dovecot wrote:
The issue seems to be in the Git version:
Which git revision?
Before you updated to the broken revision, which revision/version were you
running?
Can you try it with 5f6e39c50ec79ba8847b2fdb571a9152c71cd1b6 (the commit
just before the fts_enforced=body introduction)? That's the only recent fts
change.
Thanks,
Jeff.
On 2019-04-03 18:58, @lbutlr via dovecot wrote:
On 3 Apr 2019, at 04:30, Joan Moreau via dovecot <dovecot@dovecot.org> wrote:
doveadm search -u jom@grosjo.net mailbox inbox text milan
Did that search over my list mail and got 83 results, not able to duplicate your issue.
What version of dovecot and have you tried to reindex?
dovecot-2.3.5.1 here.