[Dovecot] Full text search in attachments
Hello,
is it possible to use the Solr full text search plugin for indexing mail attachments? I found a very old patch and some hints regarding a fts_decoder script that I don't understand.
Getting Solr to index PDF or Office files shouldn't be that difficult, but how can I make the plugin pass the attachments to Solr?
Best regards,
Sebastian
On 11.8.2012, at 5.28, Mailing wrote:
> is it possible to use the Solr full text search plugin for indexing mail attachments? I found a very old patch and some hints regarding a fts_decoder script that I don't understand.
> Getting Solr to index PDF or Office files shouldn't be that difficult, but how can I make the plugin pass the attachments to Solr?
I updated the wiki with:
- See the decode2text.sh script included in Dovecot for how to use this.
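For readers who, like the original poster, find the decoder script hard to follow, here is a minimal sketch of the kind of filter that decode2text.sh is. It assumes the contract described in that script: the attachment's content type arrives as the first argument, the raw attachment comes in on stdin, and UTF-8 text is expected on stdout. The function names are hypothetical, and only text/plain and text/html are handled so that the example stays within the Python standard library; PDF or Office files would need an external converter, which is not shown.

```python
import sys
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the character data of an HTML document, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def decode_to_text(content_type, raw):
    """Return plain text for the given attachment bytes.

    Unsupported content types yield an empty string, so nothing from
    that attachment would be indexed.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    ctype = content_type.split(";", 1)[0].strip().lower()
    text = raw.decode("utf-8", errors="replace")
    if ctype == "text/plain":
        return text
    if ctype == "text/html":
        collector = _TextCollector()
        collector.feed(text)
        collector.close()
        # Collapse runs of whitespace left behind by the removed tags.
        return " ".join(" ".join(collector.chunks).split())
    return ""


def main(argv=None, stdin=None, stdout=None):
    """Assumed decoder contract: content type in argv[1], data on stdin."""
    argv = sys.argv if argv is None else argv
    stdin = sys.stdin.buffer if stdin is None else stdin
    stdout = sys.stdout.buffer if stdout is None else stdout
    content_type = argv[1] if len(argv) > 1 else ""
    stdout.write(decode_to_text(content_type, stdin.read()).encode("utf-8"))
```

Calling `main()` at the bottom of the file would turn this into a standalone filter. Note that such a script only ever sees one attachment and its content type, nothing about the message it belongs to.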
Hi Timo,
On 19.08.2012 17:26, Timo Sirainen wrote:
> On 11.8.2012, at 5.28, Mailing wrote:
> I updated the wiki with:
> - See the decode2text.sh script included in Dovecot for how to use this.
I want to send the complete (unparsed) attachments to the Solr server and let Solr do the parsing work. Would it be possible to use a decode2text-like script together with curl to send the attachments to the Solr server?
But in that case the script would need additional information, such as the message ID and the mailbox name.
Best regards,
Sebastian
On 19.8.2012, at 21.57, Mailing wrote:
> On 19.08.2012 17:26, Timo Sirainen wrote:
> > On 11.8.2012, at 5.28, Mailing wrote:
> > I updated the wiki with:
> > - See the decode2text.sh script included in Dovecot for how to use this.
> I want to send the complete (unparsed) attachments to the Solr server and let Solr do the parsing work. Would it be possible to use a decode2text-like script together with curl to send the attachments to the Solr server?
> But in that case the script would need additional information, such as the message ID and the mailbox name.
It can't work like that, because all the text from one message (text & all attachments) has to go to one document.
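To illustrate the constraint, here is a rough Python sketch of what "one document per message" means: the body text and the text extracted from every attachment are concatenated into a single record. This is not Dovecot's code. The function names, the field names (`id`, `box`, `uid`, `body`) and the `extract_text` callback are all assumptions made for the example.

```python
from email import message_from_bytes, policy


def no_extraction(content_type, data):
    """Placeholder attachment decoder: extracts nothing."""
    return ""


def build_document(raw_message, mailbox, uid, extract_text=no_extraction):
    """Merge the text of all parts of one message into one document.

    extract_text(content_type, data) is a hypothetical callback that
    turns attachment bytes into plain text (the role a decoder script
    plays); the field names below are illustrative only.
    """
    msg = message_from_bytes(raw_message, policy=policy.default)
    texts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no text of their own
        if part.get_content_maintype() == "text":
            texts.append(part.get_content().strip())
        else:
            data = part.get_payload(decode=True) or b""
            texts.append(extract_text(part.get_content_type(), data).strip())
    return {
        "id": f"{mailbox}/{uid}",
        "box": mailbox,
        "uid": uid,
        # Body and attachment text end up in the same field of the
        # same document, so one search hit maps to one message.
        "body": "\n".join(t for t in texts if t),
    }
```

A script that posts each attachment to Solr separately would instead create one document per attachment, which is the situation the reply above rules out.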
participants (2)
- Mailing
- Timo Sirainen