[Dovecot] Full text search indexing

Rob Middleton robm-dovecot at centenary.org.au
Wed Apr 12 12:19:31 EEST 2006


>>
>> Did you test what the rate was approximately with some somewhat sensible
>> search strings?
> 
> I'll try. Sensible search strings are not that easy to come up with though.
> The data indexed are the body parts of a UML (linux) archive.
> 
> First line is nr of candidates. Second line is result of a grep -i in 
> the raw data. (Out of a total of almost 13000 maps).

grep -c -i may be more useful (then the two numbers are directly comparable).

I certainly think that orders of magnitude are more important than O() in 
the mailbox search case, where messages come and go regularly ... and 
this type of index is actually appendable.
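
To illustrate what "appendable" means here, a minimal Python sketch. It
assumes the index maps each character pair to a list of message ids; the
class and method names are hypothetical and not taken from jelindex or
Dovecot. Adding a message only appends to the lists, nothing already stored
is rewritten.

```python
from collections import defaultdict


class PairIndex:
    """Hypothetical in-memory index: character pair -> list of message ids."""

    def __init__(self):
        self.postings = defaultdict(list)

    def add_message(self, msg_id, text):
        # Lowercase so that lookups can be case-insensitive (an assumption).
        lowered = text.lower()
        # Each distinct pair in the message gets msg_id appended once.
        for pair in {lowered[i:i + 2] for i in range(len(lowered) - 1)}:
            self.postings[pair].append(msg_id)


index = PairIndex()
index.add_message(1, "Memory management in UML")
index.add_message(2, "Re: Memory management in UML")
print(index.postings["me"])  # ids appear in arrival order
```

Removing expunged messages is not covered by this sketch.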

> Some worse cases:
> jensl:~/project/jelindex> ./search.sh "management"
> 5098
> 341
> jensl:~/project/jelindex> ./search.sh "Timo Sirainen"
> 494
> 0

??? What
"Timo Sirainen" requires all of the pairs:
"ti", "im", "mo", "o ", " S", "Si" ... and so on.
Longer strings should (all else being equal) result in greater accuracy.
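
A small Python sketch of the pair test, assuming case-insensitive matching
(as with grep -i). The function names and the sample text are made up for
illustration. It shows how a text can contain every pair of the search
string, and so count as a candidate, without containing the string itself.

```python
def pairs(s):
    """Return the set of adjacent character pairs in s, lowercased."""
    lowered = s.lower()
    return {lowered[i:i + 2] for i in range(len(lowered) - 1)}


def is_candidate(query, text):
    # A text is a candidate when it contains every pair of the query.
    return pairs(query) <= pairs(text)


def is_match(query, text):
    # The real check, comparable to grep -i: case-insensitive substring test.
    return query.lower() in text.lower()


query = "Timo Sirainen"
print(sorted(pairs(query)))

# Made-up text holding every pair of the query but not the string itself.
text = "Sirainen timo S"
print(is_candidate(query, text), is_match(query, text))
```

So a nonzero candidate count next to a grep count of 0 is not contradictory
in itself: the pair test only narrows the set that still has to be checked.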

I understand spaces are treated exactly the same way as any other character 
in IMAP search.


Rob.

