[Dovecot] Can dovecot+solr search on attachments?
I remember reading that Dovecot+Solr supports searching mail attachments.
Is this true?
Or does the basic Solr schema need to be extended to do this?
-- Antonio Pérez-Aranda Alcaide aperezaranda@yaco.es
Yaco Sistemas S.L. http://www.yaco.es/ C/ Rioja 5, 41001 Sevilla Phone +34 954 50 00 57 Fax +34 954 50 09 29
On 28.12.2010, at 10.45, Antonio Perez-Aranda wrote:
> I remember reading that Dovecot+Solr supports searching mail attachments.
> Is this true?
Almost, but not quite. I have a 1.5-year-old patch from Rui that implements it for Solr. Since then I've updated the fts code a bit to make this easier to support for all backends, and Rui said he was going to work on it again, but I don't know whether he finished it.
But does Dovecot send the entire content of a message to Solr? Are attachments included?
If so, it should be possible to index them and return them in search results with only some changes to the Solr schema.
On 28.12.2010, at 11.30, Antonio Perez-Aranda wrote:
> But does Dovecot send the entire content of a message to Solr? Are attachments included?
> If so, it should be possible to index them and return them in search results with only some changes to the Solr schema.
If you modify fts_backend_default_can_index() to always return TRUE, it'll index everything. But are you asking whether Solr can somehow figure out that a Word document is coming and index only its plain text? If it can do that, I'd think it would require the Word document to be sent as its own separate Solr document, whereas Dovecot just sends it in the middle of the other message text.
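To make the point about fts_backend_default_can_index() concrete, here is a rough Python model of the decision it makes. This is not Dovecot's code: the function names, the text-only default rule, and the sample message are all assumptions made up for illustration. It only shows how an "always return TRUE" check would let a binary attachment part through alongside the text parts.

```python
from email.message import EmailMessage


def can_index_text_only(content_type: str) -> bool:
    """Hypothetical default: accept only text-like parts (an assumption)."""
    ct = content_type.lower()
    return ct.startswith("text/") or ct == "message/rfc822"


def can_index_everything(content_type: str) -> bool:
    """Model of the suggested change: always answer yes."""
    return True


def indexed_parts(msg, can_index):
    """List the content types of the leaf MIME parts the check lets through."""
    return [
        part.get_content_type()
        for part in msg.walk()
        if not part.is_multipart() and can_index(part.get_content_type())
    ]


def build_sample() -> EmailMessage:
    """Build a made-up message with a plain-text body and one binary attachment."""
    msg = EmailMessage()
    msg["Subject"] = "report"
    msg.set_content("See the attached document.")
    msg.add_attachment(
        b"\xd0\xcf\x11\xe0 placeholder bytes, not a real document",
        maintype="application",
        subtype="msword",
        filename="report.doc",
    )
    return msg


sample = build_sample()
print(indexed_parts(sample, can_index_text_only))
print(indexed_parts(sample, can_index_everything))
```

With the permissive check the attachment's raw bytes end up in the same stream as the body text, which is why plain-text extraction would still need the attachment to be handled as a separate document.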
Well, I'll test it in the next three or four weeks.
Participants (2): Antonio Perez-Aranda, Timo Sirainen