[Dovecot] fts solr : out of memory

Timo Sirainen tss at iki.fi
Tue Jul 31 23:42:28 EEST 2012

On 31.7.2012, at 22.46, tonio at starbridge.org wrote:

>>> 21500/59363doveadm(clxx at spamguard.fr): Error: fts_solr: Invalid XML
>>> input at line 1: mismatched tag
>> No idea. You can reproduce this? What does it log with this patch? http://hg.dovecot.org/dovecot-2.1/rev/817b69b2b21f
> It happens every time on the same mailboxes (very few) around the same
> uid number (I think I can find the exact uid with strace and send the
> email message to you if it helps)
> catalina.out shows this at that time:
> INFO: {} 0 1
> 31 juil. 2012 21:19:56 org.apache.solr.common.SolrException log
> GRAVE: org.apache.solr.common.SolrException: Illegal character
> ((CTRL-CHAR, code 4))
> After a quick Google search, it seems to be related to an invalid control
> character being sent to Solr.

So it seems, but Dovecot already has code to filter out all control characters when sending data to Solr. I just looked through the source and did a few tests and I couldn't get it to send a control char to Solr.
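
For readers unfamiliar with this kind of filtering, here is a minimal sketch of the general idea in Python. It is not Dovecot's actual code (Dovecot is written in C, and its filter may work differently). The function name `strip_invalid_xml_chars` and the choice to substitute a space are assumptions made for illustration; the character ranges are those the XML 1.0 specification allows in a document.

```python
import re

# Anything outside the XML 1.0 "Char" production: tab, LF, CR,
# U+0020-U+D7FF, U+E000-U+FFFD and U+10000-U+10FFFF are allowed,
# everything else (including control character 0x04) is not.
_INVALID_XML_CHARS = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str, replacement: str = " ") -> str:
    """Replace characters that XML 1.0 forbids (hypothetical helper).

    Replacing rather than deleting keeps the words on either side of
    the control character from being joined into one token.
    """
    return _INVALID_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    # The EOT character (code 4) from the Solr error is replaced;
    # tab and newline pass through unchanged.
    print(repr(strip_invalid_xml_chars("foo\x04bar\tbaz\n")))
```

If a filter like this is applied to all text before it is placed in the update request, a "CTRL-CHAR, code 4" error from Solr would suggest that some part of the data is bypassing the filter.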

> I've applied your last patch and the message is now:
> Error: fts_solr: Invalid XML input at 4:113: mismatched tag (near:
> <html><head><title>Apache Tomcat/6.0.35 - Rapport
> d'erreur</title><style><!--H1
> {font-family:Tahoma,Arial,sans-serif;color:white)

I don't get this either. Instead I get a clean error (if I explicitly change the code to allow control chars):

Jul 31 23:41:14 indexer-worker(tss 16345 ): Error: fts_solr: Indexing failed: 400 Illegal character ((CTRL-CHAR, code 4))  at [row,col {unknown-source}]: [858,254]
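
The difference between the two error messages can be illustrated with a short sketch, again in Python and again not taken from Dovecot's source. The helper `check_update_response` is hypothetical. It assumes the client knows the HTTP status code and reason phrase and only tries to parse the body as XML when the status is 200; parsing an HTML error page as XML is what would produce a "mismatched tag" style failure.

```python
import xml.etree.ElementTree as ET


def check_update_response(status: int, reason: str, body: str) -> str:
    """Classify a Solr update response (hypothetical helper).

    Returns "ok", or an error string modelled loosely on the two
    messages quoted in this thread.
    """
    if status != 200:
        # Report the HTTP error directly; the body is likely an HTML
        # error page from the servlet container, not XML.
        return f"Indexing failed: {status} {reason}"
    try:
        ET.fromstring(body)
    except ET.ParseError as exc:
        return f"Invalid XML input: {exc}"
    return "ok"


if __name__ == "__main__":
    html_error_page = "<html><head><title>Error</title><p></head></html>"
    # Status checked first: a clean error, the body is never parsed.
    print(check_update_response(
        400, "Illegal character ((CTRL-CHAR, code 4))", html_error_page))
    # Status ignored (simulated by passing 200): an XML parse error.
    print(check_update_response(200, "OK", html_error_page))
```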
