On 31.7.2012, at 22.46, tonio@starbridge.org wrote:
21500/59363 doveadm(clxx@spamguard.fr): Error: fts_solr: Invalid XML input at line 1: mismatched tag

No idea. Can you reproduce this? What does it log with this patch?
http://hg.dovecot.org/dovecot-2.1/rev/817b69b2b21f

It happens every time on the same mailboxes (very few), around the same UID. (I think I can find the exact UID with strace and send you the email message if it helps.)
catalina.out shows this at the same time:

INFO: {} 0 1
31 juil. 2012 21:19:56 org.apache.solr.common.SolrException log
GRAVE: org.apache.solr.common.SolrException: Illegal character ((CTRL-CHAR, code 4))
..

After a quick Google search, it seems related to an invalid control character being sent to Solr.

So it seems, but Dovecot already has code to filter out all control characters when sending data to Solr. I just looked through the source and did a few tests and I couldn't get it to send a control char to Solr.
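
For illustration only, here is a minimal sketch of the kind of filtering described above: replacing characters that XML 1.0 does not allow (such as the code 4 character from the Solr error) before text is placed in an XML document. This is not Dovecot's actual code, which is written in C; the function names strip_xml_illegal and find_xml_illegal are hypothetical, and replacing with a space is an assumption.

```python
import re

# Characters outside the XML 1.0 "Char" production: C0 control characters
# other than tab (0x09), LF (0x0A) and CR (0x0D), lone surrogates,
# and the noncharacters U+FFFE and U+FFFF.
_XML_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]")


def strip_xml_illegal(text: str, replacement: str = " ") -> str:
    """Return text with every XML-illegal character replaced (hypothetical helper)."""
    return _XML_ILLEGAL.sub(replacement, text)


def find_xml_illegal(text: str) -> list[tuple[int, int]]:
    """Return (offset, code point) for each XML-illegal character in text."""
    return [(m.start(), ord(m.group())) for m in _XML_ILLEGAL.finditer(text)]


if __name__ == "__main__":
    sample = "Subject: hello\x04world\n"
    print(find_xml_illegal(sample))
    print(repr(strip_xml_illegal(sample)))
```

Running find_xml_illegal over the decoded text of a suspect message could also help narrow down which mail contains the offending character.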

I've applied your last patch and the message is now:
Error: fts_solr: Invalid XML input at 4:113: mismatched tag (near:
<html><head><title>Apache Tomcat/6.0.35 - Rapport d'erreur</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white)
I don't get this either. Instead I get a clean error (if I explicitly change the code to allow control chars):
Jul 31 23:41:14 indexer-worker(tss 16345 ): Error: fts_solr: Indexing failed: 400 Illegal character ((CTRL-CHAR, code 4)) at [row,col {unknown-source}]: [858,254]