Re: [Dovecot] panic fts_solr for bad attachment
On 11/27/2012 7:28 AM, Daniel L. Miller wrote:
On 11/26/2012 10:08 PM, Timo Sirainen wrote:
On 27.11.2012, at 7.50, Timo Sirainen wrote:
Nov 26, 2012 8:49:29 PM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Illegal character ((CTRL-CHAR, code 8)) at [row,col {unknown-source}]: [1011144,197790]
Something's wrong. The Solr code was already supposed to catch all of these.
I was taking a brief scan of the code - and as usual I'm probably wrong - but I believe the protection comes from the xml_encode functions.
Could it be that there are some solr writes that don't go through that function - because it is assumed that the data in question doesn't need that processing? Like mailbox names, field names, or uids - that SHOULDN'T have any garbage but maybe something is creeping in?
-- Daniel
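For readers following along, here is a minimal sketch of the kind of protection being discussed: XML 1.0 only allows tab, LF, CR and characters from U+0020 upward (minus surrogates and U+FFFE/U+FFFF), so a backspace (code 8) must be dropped or replaced before the document is sent to Solr. The names is_xml10_char and xml_encode_sketch are hypothetical and this is not Dovecot's actual xml_encode code; replacing bad characters with a space is also just an assumption made for illustration.

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x09, 0x0A, 0x0D)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def xml_encode_sketch(text: str, replacement: str = " ") -> str:
    """Escape XML metacharacters and replace characters XML 1.0 forbids.

    Hypothetical helper for illustration only; not the fts_solr implementation.
    """
    escapes = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;"}
    out = []
    for ch in text:
        if not is_xml10_char(ord(ch)):
            # e.g. CTRL-CHAR code 8 from the Solr error above
            out.append(replacement)
        else:
            out.append(escapes.get(ch, ch))
    return "".join(out)
```

Any field written to Solr without passing through a filter like this (the mailbox names, field names or uids mentioned above) could in principle let such a character through.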
On 27.11.2012, at 17.38, Daniel L. Miller wrote:
I was taking a brief scan of the code - and as usual I'm probably wrong - but I believe the protection comes from the xml_encode functions. Could it be that there are some solr writes that don't go through that function - because it is assumed that the data in question doesn't need that processing? Like mailbox names, field names, or uids - that SHOULDN'T have any garbage but maybe something is creeping in?
I did go through the code looking for that a few times already but didn't notice anything. I went through it once more, and finally found the problem. :) http://hg.dovecot.org/dovecot-2.1/rev/6a97faf3e500
On 11/27/2012 1:07 PM, Timo Sirainen wrote:
I did go through the code looking for that a few times already but didn't notice anything. I went through it once more, and finally found the problem. :) http://hg.dovecot.org/dovecot-2.1/rev/6a97faf3e500
:( Mine still breaks. Both UTF-8 and Control-Char errors.
-- Daniel
On 28.11.2012, at 4.43, Daniel L. Miller wrote:
:( Mine still breaks. Both UTF-8 and Control-Char errors.
Can you grab the network traffic between Dovecot and Solr and find the problematic stream?
On 11/27/2012 6:45 PM, Timo Sirainen wrote:
Can you grab the network traffic between Dovecot and Solr and find the problematic stream?
Tell me how and I'll be happy to!
-- Daniel
On 28.11.2012, at 10.50, Daniel L. Miller wrote:
Tell me how and I'll be happy to!
Maybe the easiest would be to use tcpflow. It outputs different TCP streams to different files. From them you can then grep for the error and look closer into it. I guess something like wireshark would work too, but I've never been able to use its GUI in a useful way.
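As a possible alternative to grepping by hand, the capture files could be scanned with a small script. The sketch below assumes tcpflow has already written one file per TCP flow into a directory; find_bad_bytes and scan_directory are hypothetical helper names. It reports C0 control bytes other than tab, LF and CR, plus the first invalid UTF-8 sequence, which correspond to the two error types mentioned in this thread. It treats each file as raw bytes and does not parse HTTP.

```python
import re
from pathlib import Path

# Control bytes that XML 1.0 forbids (everything below 0x20 except TAB, LF, CR).
CTRL_RE = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def find_bad_bytes(data: bytes, context: int = 40):
    """Return (kind, byte_offset, surrounding_bytes) tuples for suspicious bytes."""
    findings = []
    for match in CTRL_RE.finditer(data):
        pos = match.start()
        findings.append(("control-char", pos, data[max(0, pos - context):pos + context]))
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Only the first invalid sequence is reported.
        pos = exc.start
        findings.append(("invalid-utf8", pos, data[max(0, pos - context):pos + context]))
    return findings


def scan_directory(directory: str) -> None:
    """Print findings for every regular file in directory (assumed tcpflow output)."""
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        for kind, offset, snippet in find_bad_bytes(path.read_bytes()):
            print(f"{path.name}: {kind} at byte {offset}: {snippet!r}")
```

Calling scan_directory(".") from the directory where the capture was written should list the flow files containing an offending byte, together with some surrounding context to help identify the message.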
On 11/28/2012 12:55 AM, Timo Sirainen wrote:
Maybe the easiest would be to use tcpflow. It outputs different TCP streams to different files. From them you can then grep for the error and look closer into it. I guess something like wireshark would work too, but I've never been able to use its GUI in a useful way.
Would I just do "tcpflow -i lo port 8983"? Or something else?
-- Daniel
participants (2)
- Daniel L. Miller
- Timo Sirainen