On 11/15/20 1:29 PM, John Fawcett wrote:
atm, listening on localhost, with Dovecot -> Tika direct, no proxy.
similarly fragile under load. throwing ~10 messages with .5-5MB attachments at it at once causes all sorts of complaints.
frequently, like this
Nov 15 15:59:40 test.loc tika[35696]: INFO tika/ (message/rfc822)
Nov 15 15:59:41 test.loc tika[35696]: WARN tika/: Text extraction failed (null)
Nov 15 15:59:41 test.loc tika[35696]: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.tika.server.resource.TikaResource$4.write(TikaResource.java:521)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.jaxrs.provider.BinaryDataProvider.writeTo(BinaryDataProvider.java:177)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.jaxrs.utils.JAXRSUtils.writeMessageBody(JAXRSUtils.java:1472)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.serializeMessage(JAXRSOutInterceptor.java:249)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.processResponse(JAXRSOutInterceptor.java:122)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.handleMessage(JAXRSOutInterceptor.java:84)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.Server.handle(Server.java:500)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
Nov 15 15:59:41 test.loc tika[35696]: at java.base/java.lang.Thread.run(Thread.java:832)
Nov 15 15:59:41 test.loc tika[35696]: ERROR Problem with writing the data, class org.apache.tika.server.resource.TikaResource$4, ContentType: text/plain
Nov 15 15:59:41 test.loc tika[35696]: INFO tika/ (message/rfc822)
Nov 15 15:59:41 test.loc tika[35696]: WARN tika/: Text extraction failed (Tried to contact you | Quote #Q4889744.eml)
Nov 15 15:59:41 test.loc tika[35696]: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.tika.server.resource.TikaResource$4.write(TikaResource.java:521)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.jaxrs.provider.BinaryDataProvider.writeTo(BinaryDataProvider.java:177)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.jaxrs.utils.JAXRSUtils.writeMessageBody(JAXRSUtils.java:1472)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.serializeMessage(JAXRSOutInterceptor.java:249)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.processResponse(JAXRSOutInterceptor.java:122)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.handleMessage(JAXRSOutInterceptor.java:84)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247)
Nov 15 15:59:41 test.loc tika[35696]: at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.Server.handle(Server.java:500)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
Nov 15 15:59:41 test.loc tika[35696]: at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
Nov 15 15:59:41 test.loc tika[35696]: at java.base/java.lang.Thread.run(Thread.java:832)
Nov 15 15:59:41 test.loc tika[35696]: ERROR Problem with writing the data, class org.apache.tika.server.resource.TikaResource$4, ContentType: text/plain
Nov 15 15:59:41 test.loc tika[35696]: INFO tika/ (image/jpeg)
Nov 15 15:59:41 test.loc tika[35696]: INFO tika/ (image/png)
seems fts_tika isn't going to be a well-behaved black box.
pulling it out of dovecot usage for now, to set up a standalone instance and throw test attachments at it directly ...
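
for the standalone test, something along these lines might do as a first pass. it's only a rough sketch, not what dovecot itself does internally: it assumes tika-server is listening on its default port 9998 on localhost and accepts a plain PUT to /tika, returning extracted text. the names put_to_tika and run_batch are made up for illustration, and the worker count of 10 just mirrors the ~10 simultaneous messages above.

```python
import concurrent.futures
import urllib.error
import urllib.request

# Assumption: standalone tika-server on its default port, no proxy in front.
TIKA_URL = "http://127.0.0.1:9998/tika"


def put_to_tika(data, url=TIKA_URL, timeout=60.0):
    """PUT one payload to Tika; return (HTTP status, extracted-text length)."""
    req = urllib.request.Request(
        url, data=data, method="PUT", headers={"Accept": "text/plain"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, len(resp.read())
    except urllib.error.HTTPError as err:
        # Non-2xx answers from the server still count as a result.
        return err.code, 0


def run_batch(payloads, send=put_to_tika, workers=10):
    """Send all payloads at once; return {name: result or "error: ..."}."""
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(send, data): name for name, data in payloads.items()}
        for fut in concurrent.futures.as_completed(futures):
            name = futures[fut]
            try:
                results[name] = fut.result()
            except Exception as exc:  # connection refused, timeout, ...
                results[name] = f"error: {exc}"
    return results
```

usage would be roughly run_batch({p.name: p.read_bytes() for p in pathlib.Path("samples").iterdir()}), where "samples" is a hypothetical directory of saved attachments. adding a zero-length file to the set should show whether the ZeroByteFileException above can be reproduced without dovecot in the path.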