[Dovecot] [BUG] Lucene plugin breaks header substring search
According to the IMAP spec if I do a search for "TO isocpp.org" it should find all the messages whose To: field contains the string "isocpp.org", but dovecot is returning me an empty list. However, a search for "TO tm@isocpp.org" produces a long list of messages. This behavior is present if I *even load* the lucene fts plugin. Note that lucene isn't in use (fts = squat); it's merely loaded. This behavior goes away if I don't load fts_lucene.
Dovecot configuration with dovecot -n:
--8<---------------cut here---------------start------------->8---
# 2.1.6: /usr/local/stow/dovecot-2.1.6/etc/dovecot/dovecot.conf
# OS: Darwin 11.4.2 x86_64 hfs
default_internal_user = _dovecot
default_login_user = _dovenull
mail_gid = 20
mail_location = mdbox:/Users/dave/Library/Data/LocalIMAP/mdbox
mail_plugin_dir = /usr/local/lib/dovecot
mail_plugins = fts fts_squat fts_lucene zlib
mail_uid = 501
maildir_very_dirty_syncs = yes
namespace {
  inbox = yes
  location =
  prefix =
  separator = .
  subscriptions = yes
  type = private
}
passdb {
  args = uid=501 gid=20 home=/Users/dave nopassword=y
  driver = static
}
plugin {
  fts = squat
  zlib_save = gz
  zlib_save_level = 6
}
protocols = imap
ssl = no
protocol imap {
  mail_plugins = fts fts_squat fts_lucene zlib
}
--8<---------------cut here---------------end--------------->8---
Dovecot version: 2.1.6
Operating system or Linux distribution name: MacOS X 10.7, 10.8
CPU architecture (x86 or something else?): x86_64
Filesystem you used (especially if you use NFS or not): Mac
Some kind of description of what you were doing and with what IMAP client.: Searching
-- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost
On 16.10.2012, at 1.35, Dave Abrahams wrote:
> According to the IMAP spec if I do a search for "TO isocpp.org" it should find all the messages whose To: field contains the string "isocpp.org", but dovecot is returning me an empty list. However, a search for "TO tm@isocpp.org" produces a long list of messages.
This specific problem can be solved by:
plugin {
  fts_lucene = whitespace_chars=@.
}
> This behavior is present if I *even load* the lucene fts plugin. Note that lucene isn't in use (fts = squat); it's merely loaded. This behavior goes away if I don't load fts_lucene.
I don't really see how that's possible. Although a quick test shows me that fts_squat seems to be completely broken with me for some reason.
on Mon Oct 15 2012, Timo Sirainen
> On 16.10.2012, at 1.35, Dave Abrahams wrote:
>> According to the IMAP spec if I do a search for "TO isocpp.org" it should find all the messages whose To: field contains the string "isocpp.org", but dovecot is returning me an empty list. However, a search for "TO tm@isocpp.org" produces a long list of messages.
> This specific problem can be solved by:
> plugin {
>   fts_lucene = whitespace_chars=@.
> }
Wow; OK, Google tells me that's documented at http://wiki2.dovecot.org/Plugins/FTS/Lucene but I only found it now because I knew what to look for.
This might be good enough for me, but still doesn't make it conforming to the IMAP spec, right? IIUC the spec says you can search for arbitrary strings without regard to word boundaries.
>> This behavior is present if I *even load* the lucene fts plugin. Note that lucene isn't in use (fts = squat); it's merely loaded. This behavior goes away if I don't load fts_lucene.
> I don't really see how that's possible. Although a quick test shows me that fts_squat seems to be completely broken with me for some reason.
I don't know what to tell ya. Tests confirm it for me.
-- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost
on Mon Oct 15 2012, Timo Sirainen
> On 16.10.2012, at 1.35, Dave Abrahams wrote:
>> According to the IMAP spec if I do a search for "TO isocpp.org" it should find all the messages whose To: field contains the string "isocpp.org", but dovecot is returning me an empty list. However, a search for "TO tm@isocpp.org" produces a long list of messages.
> This specific problem can be solved by:
> plugin {
>   fts_lucene = whitespace_chars=@.
> }
OK, Google tells me that's documented at http://wiki2.dovecot.org/Plugins/FTS/Lucene but I only found it now because I knew what to look for. I suggest doing something to make that more discoverable.
This might be good enough for me, but still doesn't make it conforming to the IMAP spec, right? IIUC the spec says you can search for arbitrary strings without regard to word boundaries.
>> This behavior is present if I *even load* the lucene fts plugin. Note that lucene isn't in use (fts = squat); it's merely loaded. This behavior goes away if I don't load fts_lucene.
> I don't really see how that's possible. Although a quick test shows me that fts_squat seems to be completely broken with me for some reason.
I don't know what to tell ya. Tests confirm it for me.
-- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost
On 16.10.2012, at 6.45, Dave Abrahams wrote:
>>> According to the IMAP spec if I do a search for "TO isocpp.org" it should find all the messages whose To: field contains the string "isocpp.org", but dovecot is returning me an empty list. However, a search for "TO tm@isocpp.org" produces a long list of messages.
>> This specific problem can be solved by:
>> plugin {
>>   fts_lucene = whitespace_chars=@.
>> }
> OK, Google tells me that's documented at http://wiki2.dovecot.org/Plugins/FTS/Lucene but I only found it now because I knew what to look for. I suggest doing something to make that more discoverable.
That is the only page where there is any information about fts-lucene. I made it a bit clearer in that page now that whitespace_chars should be used as default.
> This might be good enough for me, but still doesn't make it conforming to the IMAP spec, right? IIUC the spec says you can search for arbitrary strings without regard to word boundaries.
It doesn't conform to the IMAP spec, correct. But nobody cares about that anymore. Everyone violates it.
on Mon Oct 15 2012, Timo Sirainen
> On 16.10.2012, at 1.35, Dave Abrahams wrote:
>> According to the IMAP spec if I do a search for "TO isocpp.org" it should find all the messages whose To: field contains the string "isocpp.org", but dovecot is returning me an empty list. However, a search for "TO tm@isocpp.org" produces a long list of messages.
> This specific problem can be solved by:
> plugin {
>   fts_lucene = whitespace_chars=@.
> }
Do I also need
plugin {
  fts = lucene
}
or are these mutually exclusive, or...? It's not clear from http://wiki2.dovecot.org/Plugins/FTS/Lucene
-- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost
On 16.10.2012, at 6.51, Dave Abrahams wrote:
>> plugin {
>>   fts_lucene = whitespace_chars=@.
>> }
> Do I also need
> plugin {
>   fts = lucene
> }
> or are these mutually exclusive, or...? It's not clear from http://wiki2.dovecot.org/Plugins/FTS/Lucene
fts setting selects which backend to use. fts_lucene gives settings to that backend.
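Put together, the two settings would sit in the same plugin block. The following is only a sketch of how that might look for the configuration posted at the top of the thread, not a tested setup: dropping fts_squat from mail_plugins is an assumption on my part, and the zlib lines are simply carried over from the original dovecot -n output.

```
# Sketch under the assumptions stated above; not a verified configuration.
# Assumption: fts_squat is no longer loaded once Lucene is the backend.
mail_plugins = fts fts_lucene zlib

plugin {
  # "fts" selects which backend is used
  fts = lucene
  # "fts_lucene" passes settings to that backend; here "@" and "."
  # are treated as whitespace, so "isocpp.org" can match inside
  # "tm@isocpp.org"
  fts_lucene = whitespace_chars=@.
  zlib_save = gz
  zlib_save_level = 6
}
```

With fts = squat left in place, as in the original configuration, the fts_lucene line would be configuring a backend that is not the selected one.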