[Dovecot] panic fts_solr for bad attachment
Hi!
I use dovecot 2.1.7 on Ubuntu 12.10 with fts_solr and decode2text.sh for indexing attachments. This works great in general.
Just for one user there is a problem with an unknown bad attachment.
I run "doveadm index -A '*'". After a while I receive:
doveadm(xyz): Error: fts_solr: Invalid XML input at line 1: mismatched tag
doveadm(xyz): Panic: file solr-connection.c: line 545 (solr_connection_post_more): assertion failed: (maxfd >= 0)
doveadm(xyz): Error: Raw backtrace: /usr/lib/dovecot/libdovecot.so.0(+0x3c14a) [0x7f7ce2c1714a] -> /usr/lib/dovecot/libdovecot.so.0(default_fatal_handler+0x2a) [0x7f7ce2c1720a] -> /usr/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7f7ce2bee81a] -> /usr/lib/dovecot/modules/lib21_fts_solr_plugin.so(solr_connection_post_more+0x249) [0x7f7ce11913a9] -> /usr/lib/dovecot/modules/lib21_fts_solr_plugin.so(+0x4597) [0x7f7ce118e597] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(+0x6f57) [0x7f7ce159df57] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(fts_build_mail+0xf5) [0x7f7ce159e085] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(+0xba70) [0x7f7ce15a2a70] -> doveadm(+0x15309) [0x7f7ce35cc309] -> doveadm(+0x11f36) [0x7f7ce35c8f36] -> doveadm(+0x12bf1) [0x7f7ce35c9bf1] -> doveadm(doveadm_mail_try_run+0x161) [0x7f7ce35c9ed1] -> doveadm(main+0x3d1) [0x7f7ce35c8ae1] -> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f7ce283d76d] -> doveadm(+0x11d15) [0x7f7ce35c8d15]
In catalina.out I find:
Nov 18, 2012 2:59:09 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Invalid UTF-8 start byte 0xfc (at char #25214836, byte #26687495)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:81)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.valves.RequestFilterValve.process(RequestFilterValve.java:316)
    at org.apache.catalina.valves.RemoteAddrValve.invoke(RemoteAddrValve.java:81)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:722)
Caused by: com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 start byte 0xfc (at char #25214836, byte #26687495)
    at com.ctc.wstx.sr.StreamScanner.constructFromIOE(StreamScanner.java:625)
    at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:994)
    at com.ctc.wstx.sr.StreamScanner.getNext(StreamScanner.java:754)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2691)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1065)
    at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:309)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:156)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79)
    ... 19 more
Caused by: java.io.CharConversionException: Invalid UTF-8 start byte 0xfc (at char #25214836, byte #26687495)
    at com.ctc.wstx.io.UTF8Reader.reportInvalidInitial(UTF8Reader.java:303)
    at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:189)
    at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:87)
    at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
    at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:988)
    ... 25 more
doveadm index stops after this error.
How can I make doveadm skip the error and continue indexing?
Thanks Robert
-- Robert Strötgen Abteilungsleiter Informationsmanagement und Publikationen Georg-Eckert-Institut für internationale Schulbuchforschung Celler Str. 3 38114 Braunschweig Tel. +49 (0)531 59099-47 & +49 (0)531 123103-205 Fax +49 (0)531 59099-99 http://www.gei.de/
On 18.11.2012, at 16.54, Robert Strötgen wrote:
Nov 18, 2012 2:59:09 PM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Invalid UTF-8 start byte 0xfc (at char #25214836, byte #26687495)
Annoying. I guess these fix it:
http://hg.dovecot.org/dovecot-2.1/rev/172295f5a78b http://hg.dovecot.org/dovecot-2.1/rev/01550514f189 http://hg.dovecot.org/dovecot-2.1/rev/339e654f371e
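For context on the Solr error quoted above: 0xfc is never a valid lead byte in UTF-8 as defined by RFC 3629 (it is, for example, how ISO-8859-1 encodes "ü"), so text extracted from an attachment in a legacy charset can trip Solr's XML parser. The following is a minimal sketch, not the code from the changesets above, of replacing malformed bytes with U+FFFD before the text goes into an XML update. The names utf8_seq_len and utf8_sanitize are hypothetical and not part of Dovecot.

```c
#include <stddef.h>
#include <string.h>

/* Returns the length (1..4) of a well-formed UTF-8 sequence starting at p,
   or 0 if the bytes at p do not form one. Follows the RFC 3629 ranges:
   overlong forms, surrogates, values above U+10FFFF and the lead bytes
   0xC0, 0xC1 and 0xF5..0xFF (which includes 0xFC) are rejected.
   The caller must pass left >= 1. */
static size_t utf8_seq_len(const unsigned char *p, size_t left)
{
    unsigned char c = p[0];
    unsigned char lo = 0x80, hi = 0xBF; /* allowed range of the 2nd byte */
    size_t len, i;

    if (c < 0x80)
        return 1;
    if (c >= 0xC2 && c <= 0xDF) {
        len = 2;
    } else if (c >= 0xE0 && c <= 0xEF) {
        len = 3;
        if (c == 0xE0) lo = 0xA0; /* excludes overlong 3-byte forms */
        if (c == 0xED) hi = 0x9F; /* excludes UTF-16 surrogates */
    } else if (c >= 0xF0 && c <= 0xF4) {
        len = 4;
        if (c == 0xF0) lo = 0x90; /* excludes overlong 4-byte forms */
        if (c == 0xF4) hi = 0x8F; /* excludes values above U+10FFFF */
    } else {
        return 0;                 /* stray continuation or invalid lead byte */
    }
    if (left < len)
        return 0;                 /* sequence truncated at end of input */
    if (p[1] < lo || p[1] > hi)
        return 0;
    for (i = 2; i < len; i++) {
        if (p[i] < 0x80 || p[i] > 0xBF)
            return 0;
    }
    return len;
}

/* Copies in[0..in_len) to out, replacing every byte that does not start a
   well-formed sequence with U+FFFD (EF BF BD). out must have room for
   3 * in_len bytes. Returns the number of bytes written. */
size_t utf8_sanitize(const unsigned char *in, size_t in_len, unsigned char *out)
{
    size_t i = 0, o = 0;

    while (i < in_len) {
        size_t n = utf8_seq_len(in + i, in_len - i);

        if (n == 0) {
            out[o++] = 0xEF;
            out[o++] = 0xBF;
            out[o++] = 0xBD;
            i++;
        } else {
            memcpy(out + o, in + i, n);
            o += n;
            i += n;
        }
    }
    return o;
}
```

Replacing rather than dropping the bad byte keeps word boundaries roughly intact for the indexer; whether that is preferable to converting from the declared charset depends on what the decoding script emits.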
On 11/26/2012 5:50 PM, Timo Sirainen wrote:
On 18.11.2012, at 16.54, Robert Strötgen wrote:
Nov 18, 2012 2:59:09 PM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Invalid UTF-8 start byte 0xfc (at char #25214836, byte #26687495)
Annoying. I guess these fix it:
http://hg.dovecot.org/dovecot-2.1/rev/172295f5a78b http://hg.dovecot.org/dovecot-2.1/rev/01550514f189 http://hg.dovecot.org/dovecot-2.1/rev/339e654f371e
These patches have improved fts for me - but I still have errors like:
Nov 26 20:49:29 bubba dovecot: indexer-worker(dmiller@amfes.com): Panic: file solr-connection.c: line 547 (solr_connection_post_more): assertion failed: (maxfd >= 0)
Nov 26 20:49:29 bubba dovecot: indexer-worker(dmiller@amfes.com): Error: Raw backtrace: /usr/local/lib/dovecot/libdovecot.so.0(+0x45cea) [0x7f0c66c33cea] -> /usr/local/lib/dovecot/libdovecot.so.0(+0x45d2e) [0x7f0c66c33d2e] -> /usr/local/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7f0c66c07d10] -> /usr/local/lib/dovecot/lib21_fts_solr_plugin.so(+0x6de5) [0x7f0c653a6de5] -> /usr/local/lib/dovecot/lib21_fts_solr_plugin.so(+0x3867) [0x7f0c653a3867] -> /usr/local/lib/dovecot/lib20_fts_plugin.so(fts_build_mail+0x53b) [0x7f0c655b2b2b] -> /usr/local/lib/dovecot/lib20_fts_plugin.so(+0xc530) [0x7f0c655b7530] -> dovecot/indexer-worker dmiller@amfes.com Archives/2010 - 7000/7266 [0x402326] -> dovecot/indexer-worker dmiller@amfes.com Archives/2010 - 7000/7266 [0x4026cc] -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x36) [0x7f0c66c40b76] -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0xa7) [0x7f0c66c419c7] -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_run+0x28) [0x7f0c66c406b8] -> /usr/local/lib/dovecot/libdovecot.so.0(master_service_run+0x13) [0x7f0c66c2c203] -> dovecot/indexer-worker dmiller@amfes.com Archives/2010 - 7000/7266 [0x401dfa] -> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f0c6685276d] -> dovecot/indexer-worker dmiller@amfes.com Archives/2010 - 7000/7266 [0x401e9d]
The solr log shows:
Nov 26, 2012 8:49:29 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Illegal character ((CTRL-CHAR, code 8)) at [row,col {unknown-source}]: [1011144,197790]
-- Daniel
On 27.11.2012, at 6.51, Daniel L. Miller wrote:
On 11/26/2012 5:50 PM, Timo Sirainen wrote:
On 18.11.2012, at 16.54, Robert Strötgen wrote:
Nov 18, 2012 2:59:09 PM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Invalid UTF-8 start byte 0xfc (at char #25214836, byte #26687495)
Annoying. I guess these fix it:
http://hg.dovecot.org/dovecot-2.1/rev/172295f5a78b http://hg.dovecot.org/dovecot-2.1/rev/01550514f189 http://hg.dovecot.org/dovecot-2.1/rev/339e654f371e
Ugh. Should have known this was already being done. Reversed the whole thing.
These patches have improved fts for me - but I still have errors like:
..
Nov 26, 2012 8:49:29 PM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Illegal character ((CTRL-CHAR, code 8)) at [row,col {unknown-source}]: [1011144,197790]
Something's wrong. The Solr code was already supposed to catch all of these.
On 27.11.2012, at 7.50, Timo Sirainen wrote:
Nov 26, 2012 8:49:29 PM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Illegal character ((CTRL-CHAR, code 8)) at [row,col {unknown-source}]: [1011144,197790]
Something's wrong. The Solr code was already supposed to catch all of these.
http://dovecot.org/tmp/allchars.gz
If you send this mail to yourself and index it, does it fail? (Works for me.)
On 11/26/2012 10:08 PM, Timo Sirainen wrote:
On 27.11.2012, at 7.50, Timo Sirainen wrote:
Nov 26, 2012 8:49:29 PM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Illegal character ((CTRL-CHAR, code 8)) at [row,col {unknown-source}]: [1011144,197790]
Something's wrong. The Solr code was already supposed to catch all of these.
http://dovecot.org/tmp/allchars.gz
If you send this mail to yourself and index it, does it fail? (Works for me.)
I think it works - I tried sending it as an attachment (unzipped) and then with a command of "sendmail -t dmiller@amfes.com < allchars" - I don't know how else to do it.
Following that, a "doveadm search -u dmiller@amfes.com mailbox INBOX text test" indexed a couple of new messages (including these, I assume) without errors. Some of my other mailboxes continue to break.
I know you've got a filter that strips out control characters prior to sending to solr - so I'm left to assume:
- solr is breaking on its own
- I have a hardware problem that is corrupting memory (possible, but this server is using ECC, so I don't think so).
- Somehow in the communication with solr, control characters are being introduced. Perhaps it's a maximum length or buffer issue?
- Could it be attachment related?
- Could it be zlib related - as in compressed mail, or a mix of compressed & uncompressed mail, being processed?
-- Daniel
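To make the control-character point concrete: XML 1.0 allows only tab (0x09), LF (0x0A) and CR (0x0D) below 0x20, so the "CTRL-CHAR, code 8" in the Solr log is a backspace byte that reached the XML body. Below is a minimal sketch of such a filter, assuming UTF-8 input; it is not Dovecot's actual implementation, and the names xml10_forbidden_ctrl and xml10_blank_ctrl_chars are hypothetical. It covers only the C0 control range, not the other code points XML 1.0 excludes (such as U+FFFE and U+FFFF).

```c
#include <stddef.h>

/* True if byte c is a C0 control character that XML 1.0 forbids:
   everything below 0x20 except tab, LF and CR. */
static int xml10_forbidden_ctrl(unsigned char c)
{
    return c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D;
}

/* Overwrites forbidden control bytes in buf with spaces, in place.
   Working byte by byte is safe for UTF-8 because every byte of a
   multi-byte sequence is >= 0x80. Returns the number of bytes changed. */
size_t xml10_blank_ctrl_chars(unsigned char *buf, size_t len)
{
    size_t i, changed = 0;

    for (i = 0; i < len; i++) {
        if (xml10_forbidden_ctrl(buf[i])) {
            buf[i] = ' ';
            changed++;
        }
    }
    return changed;
}
```

If a filter like this is already applied and Solr still reports a control character, the byte is presumably entering on a path that bypasses the filter, which is consistent with the buffer and attachment guesses in the list above.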
On 11/26/2012 5:50 PM, Timo Sirainen wrote:
On 18.11.2012, at 16.54, Robert Strötgen wrote:
Nov 18, 2012 2:59:09 PM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Invalid UTF-8 start byte 0xfc (at char #25214836, byte #26687495)
Annoying. I guess these fix it:
http://hg.dovecot.org/dovecot-2.1/rev/172295f5a78b http://hg.dovecot.org/dovecot-2.1/rev/01550514f189 http://hg.dovecot.org/dovecot-2.1/rev/339e654f371e
The "waitFlush" option for solr's commit method has been deprecated - and removed completely in the current version. Suggest a change to fts-backend-solr.c:
in fts_backend_solr_update_deinit()
    str = t_strdup_printf("<commit "
                          "waitSearcher=\"%s\"/>",
                          ctx->documents_added ? "true" : "false");
-- Daniel
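The suggested change can be tried outside Dovecot with a standalone sketch that builds the same commit command without the waitFlush attribute. It uses snprintf in place of Dovecot's t_strdup_printf, and the helper name build_solr_commit is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes a Solr <commit/> update command that carries
   only the waitSearcher attribute (no waitFlush) into buf.
   Returns snprintf's result: the length the full string needs, excluding
   the terminating NUL, or a negative value on an output error. */
int build_solr_commit(char *buf, size_t size, int documents_added)
{
    return snprintf(buf, size, "<commit waitSearcher=\"%s\"/>",
                    documents_added ? "true" : "false");
}
```

A caller would POST the resulting string to the Solr update handler; comparing the return value against the buffer size detects truncation.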
participants (3)
- Daniel L. Miller
- Robert Strötgen
- Timo Sirainen