-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, 30 Nov 2013, Timo Sirainen wrote:
> - Don't index non-text data? For example, if there is a large block of base64 data or something else that definitely doesn't look like text, it's pretty useless to index it. Then again, we do want to index all kinds of IDs that someone might want to search for. This could be a bit difficult to implement well.
> - Attachments can already be translated to indexable UTF-8 text with the fts_decoder setting, via a conversion script. This could also support an Apache Tika server directly.

Does this mean some kind of "... to UTF-8 text" converter script, selected by MIME type (or by a file type guesser)? Some users would find that very^n nice. Several programs of this kind are already in use in the CMS field.
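
To make the question concrete, here is a rough sketch of the kind of dispatcher I have in mind. It is only an illustration in Python using the standard library, not how fts_decoder or its conversion script actually works; the names CONVERTERS and to_utf8_text are made up for this example, and the way the script would receive the MIME type and the attachment data is left open.

```python
import mimetypes
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def html_to_text(data: bytes) -> str:
    parser = _TextExtractor()
    parser.feed(data.decode("utf-8", errors="replace"))
    parser.close()
    return " ".join(parser.chunks)


def plain_to_text(data: bytes) -> str:
    return data.decode("utf-8", errors="replace")


# Hypothetical dispatch table: MIME type -> converter function.
# Converters for PDF, office formats etc. would be registered here too.
CONVERTERS = {
    "text/plain": plain_to_text,
    "text/html": html_to_text,
}


def to_utf8_text(data, mime_type=None, filename=None):
    """Return indexable text, or None if the type is unknown (do not index)."""
    # Fall back to guessing from the file name when the declared type is
    # missing or is the generic catch-all.
    if mime_type in (None, "application/octet-stream") and filename:
        mime_type, _ = mimetypes.guess_type(filename)
    converter = CONVERTERS.get(mime_type)
    if converter is None:
        return None
    return converter(data)
```

Returning None for unknown types would also cover the first point above: data that no converter claims is simply not indexed.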

Steffen Kaiser
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQEVAwUBUqA8A13r2wJMiz2NAQLQYwf/bAyrg080/i2khM/XGXLlhjlcPcyxGHym
KgoFFBhh2sgfl+ecRHCM4BP+WX/c5coxAScyXhSy9JjwcQz8MXUHzkbGL4d8kwa4
pgdhaD4hFhPqpOJGf1ULwBSIBEsJfZeHaOkJHlMqDgd3yKY5APoJPKJtG2z+lI+7
vqR/Pe8n8EhCcWcLC1CfEGKxcci09XYj09Sai96VGbCO2coVCm+xIKRSCW6pasoQ
NTqpJBTCe2gCD3KdVA5jUNqFeEj2AQF5+nkujtSF4B1G/xrpfoABLkJ+lyQ8F5hc
DTJFiHhlvJKRIIKbhuyQukeqDSzeln2UtSRce3q59fek4foFzDrhTw==
=l3mf
-----END PGP SIGNATURE-----