Dovecot v2.2 FTS is not indexing "text/html" emails...

Akash akbwiz+dovecot at gmail.com
Mon Aug 4 17:16:23 UTC 2014


Hi,

I am not sure whether this is intended or a fault in the newest Dovecot
versions. I have been using Dovecot v1.2.15 on Debian squeeze, where FTS
works as expected. When I search for the quoted string "very good", I get
107 results, including both plain-text and HTML emails that contain this
phrase.

To compare the benefits of lucene over squat, I recently started testing
Dovecot v2.2.13 on Debian Sid with the same maildir content. There, the
same search for "very good" yielded just 8 results. I thought it could be
a problem with lucene, so I switched to squat and got 107 results again.
I then deleted the old squat search index files created by v1.2.15 and
re-indexed the mailbox using the doveadm index command. Now the same squat
search gives 8 results, just like lucene. So the problem is not specific
to lucene: FTS in newer Dovecot isn't indexing emails whose Content-Type
is text/html.

For example, if a mail looks like this:

Content-Type: text/html

<b>He is very good.</b>

it is not returned by searches against the squat indexes created by
Dovecot v2.2.13. Further testing on some sample emails confirmed this
behavior.
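
For anyone who wants to reproduce this, here is a minimal sketch of how
such a pair of sample emails could be generated with Python's standard
library. The addresses, subjects, the build_sample and deliver helper
names, and the maildir path are all made up for illustration; they are not
from my actual setup. The idea is to deliver one text/plain and one
text/html message with the same phrase, re-index, and see which of the two
the search finds.

```python
import mailbox
from email.message import EmailMessage


def build_sample(subtype: str, body: str) -> EmailMessage:
    """Build a single-part text/<subtype> message (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"        # placeholder address
    msg["To"] = "recipient@example.com"       # placeholder address
    msg["Subject"] = f"FTS test ({subtype})"  # placeholder subject
    # Sets Content-Type to text/<subtype> and stores the body as given.
    msg.set_content(body, subtype=subtype)
    return msg


def deliver(maildir_path: str, messages) -> None:
    """Add the messages to a maildir, creating it if it does not exist."""
    box = mailbox.Maildir(maildir_path, create=True)
    for msg in messages:
        box.add(msg)


# Same phrase, once as plain text and once as HTML.
plain = build_sample("plain", "He is very good.\n")
html = build_sample("html", "<b>He is very good.</b>\n")

# Example call; the path is an assumption, adjust it to a test mailbox:
# deliver("/tmp/fts-test/Maildir", [plain, html])
```

After delivering both messages and re-indexing, a search for "very good"
would be expected to match both if text/html parts are being indexed.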

Why is this so?

-Regards,
Akash


