tika 2.4.1 'Text extraction failed' errors when dovecot+fts 2.3.19.1 passes embedded *.eml (message/rfc822) files ; org.apache.tika.parser.mail.RFC822Parser or dovecot ?

PGNet Dev pgnet.dev at gmail.com
Mon Aug 1 13:51:32 UTC 2022


On 8/1/22 9:35 AM, Tim Allison wrote:
> This looks like zero-bytes are getting passed to Tika via dovecot.  I don't know enough about dovecot to figure out what's going on.

ok.  let's see what response comes from the Dovecot ML.
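
one way to test the zero-byte theory from the tika side is to PUT an empty body with the same Content-Type that shows up in the failing requests (message/rfc822), and see whether the same ZeroByteFileException lands in the journal. a minimal sketch, assuming tika-server is on 127.0.0.1:9998 as in the curl tests quoted below. the tika_put helper and the TIKA_URL / EMPTY variables are made-up names for illustration, and an empty file is only a stand-in for whatever dovecot actually sends:

```shell
#!/bin/sh
# Assumption: tika-server listens here, as in the direct curl tests in the quoted message.
TIKA_URL="${TIKA_URL:-http://127.0.0.1:9998/tika}"

# PUT a file to tika with an explicit message/rfc822 Content-Type
# and print only the HTTP status code of the response.
tika_put() {
    curl -s -o /dev/null -w '%{http_code}\n' \
         -T "$1" -H 'Content-Type: message/rfc822' "$TIKA_URL"
}

# A zero-byte file standing in for an empty attachment body (hypothetical input).
EMPTY=$(mktemp)
```

running `tika_put ./mime.eml` and then `tika_put "$EMPTY"` while watching `journalctl -f -u tika` would show whether the empty upload produces the same exception. a match would be consistent with dovecot sending an empty part, but it would not prove it.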

atm, it's only in the case of submissions with attached/embedded *.eml ...
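
to check that against a longer stretch of the journal, the failing attachment names can be pulled out of the "Text extraction failed (...)" lines that tika logs. a small sketch: failed_names is a hypothetical helper name, and the pattern assumes the WARN line format seen in the quoted log below:

```shell
#!/bin/sh
# Read tika journal lines on stdin and print each failed file name
# with the number of times it was reported.
failed_names() {
    sed -n 's/.*Text extraction failed (\(.*\))$/\1/p' | sort | uniq -c
}
```

usage would be along the lines of `journalctl -u tika | failed_names`; if only *.eml names show up, that supports the observation above.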

> On Sat, Jul 30, 2022 at 7:51 PM PGNet Dev <pgnet.dev at gmail.com> wrote:
> 
>     i'm running
> 
>              dovecot 2.3.19.1 + fts
>              tika-server-standard 2.4.1
> 
>     dovecot is feeding tika backend via fts_tika
> 
>     when dovecot passes data with *.eml attachments embedded, tika fails to correctly parse/extract content
> 
>     not clear if the issue is with tika, or what dovecot's passing in this case.
> 
>     other non-.eml attachments are fine.
> 
>     here's the current failing procedure,
> 
>     (1)
>     create a simple pdf
> 
>              enscript -p mime.ps /etc/mime.types
>              ps2pdf mime.ps mime.pdf
> 
>     (2)
>     send an email *with* mime.pdf attachment
> 
>              echo "test" | mailx -s "test" -a ./mime.pdf testuser at example.com
> 
>     tika processes OK
> 
>              journalctl -f -u tika
>                      ...
>                      Jul 30 19:09:24 mx-test tika[19682]: INFO  [qtp2112135199-30] 19:09:24,165 org.apache.tika.server.core.resource.TikaResource /tika (application/pdf)
>                      ...
> 
>     save the just-received email with .pdf attachment as mime.eml
> 
>     (3)
>     send an email with NO .pdf attachment
>     save the just-received email with NO .pdf attachment as mime2.eml
> 
>     (4)
>     send an email with mime.eml attachment, containing the embedded mime.pdf
> 
>              echo "test" | mailx -s "test" -a ./mime.eml testuser at example.com
> 
>     tika fails to extract message/rfc822
> 
>              journalctl -f -u tika | grep -v StatusLogger
>                      ...
>                      Jul 30 19:28:00 mx-test tika[20049]: INFO  [qtp2112135199-30] 19:28:00,834 org.apache.tika.server.core.resource.TikaResource /tika (message/rfc822)
>                      Jul 30 19:28:00 mx-test tika[20049]: WARN  [qtp2112135199-30] 19:28:00,840 org.apache.tika.server.core.resource.TikaResource tika/: Text extraction failed (mime.eml)
>                      Jul 30 19:28:00 mx-test tika[20049]: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:153) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:152) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.tika.parser.DigestingParser.parse(DigestingParser.java:55) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.tika.server.core.resource.TikaResource.parse(TikaResource.java:352) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.tika.server.core.resource.TikaResource.lambda$produceText$1(TikaResource.java:502) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.cxf.jaxrs.provider.BinaryDataProvider.writeTo(BinaryDataProvider.java:177) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.cxf.jaxrs.utils.JAXRSUtils.writeMessageBody(JAXRSUtils.java:1616) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.serializeMessage(JAXRSOutInterceptor.java:249) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.processResponse(JAXRSOutInterceptor.java:122) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.handleMessage(JAXRSOutInterceptor.java:84) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:265) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.Server.handle(Server.java:516) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:00 mx-test tika[20049]:         at java.lang.Thread.run(Thread.java:833) ~[?:?]
>                      Jul 30 19:28:00 mx-test tika[20049]: ERROR [qtp2112135199-30] 19:28:00,845 org.apache.cxf.jaxrs.utils.JAXRSUtils Problem with writing the data, class org.apache.tika.server.core.resource.TikaResource$$Lambda$338/0x0000000800eb4a38, ContentType: text/plain
> 
>     (5)
>     send an email with mime2.eml attachment, WITHOUT an embedded .pdf
> 
>              echo "test" | mailx -s "test" -a ./mime2.eml testuser at example.com
> 
>     again, tika fails to extract message/rfc822
> 
>              journalctl -f -u tika | grep -v StatusLogger
>                      ...
>                      Jul 30 19:28:33 mx-test tika[20049]: INFO  [qtp2112135199-30] 19:28:33,607 org.apache.tika.server.core.resource.TikaResource /tika (message/rfc822)
>                      Jul 30 19:28:33 mx-test tika[20049]: WARN  [qtp2112135199-30] 19:28:33,616 org.apache.tika.server.core.resource.TikaResource tika/: Text extraction failed (mime2.eml)
>                      Jul 30 19:28:33 mx-test tika[20049]: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:153) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:152) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.tika.parser.DigestingParser.parse(DigestingParser.java:55) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.tika.server.core.resource.TikaResource.parse(TikaResource.java:352) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.tika.server.core.resource.TikaResource.lambda$produceText$1(TikaResource.java:502) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.cxf.jaxrs.provider.BinaryDataProvider.writeTo(BinaryDataProvider.java:177) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.cxf.jaxrs.utils.JAXRSUtils.writeMessageBody(JAXRSUtils.java:1616) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.serializeMessage(JAXRSOutInterceptor.java:249) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.processResponse(JAXRSOutInterceptor.java:122) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.handleMessage(JAXRSOutInterceptor.java:84) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:265) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.Server.handle(Server.java:516) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034) ~[tika-server-standard-2.4.1.jar:2.4.1]
>                      Jul 30 19:28:33 mx-test tika[20049]:         at java.lang.Thread.run(Thread.java:833) ~[?:?]
>                      Jul 30 19:28:33 mx-test tika[20049]: ERROR [qtp2112135199-30] 19:28:33,630 org.apache.cxf.jaxrs.utils.JAXRSUtils Problem with writing the data, class org.apache.tika.server.core.resource.TikaResource$$Lambda$338/0x0000000800eb4a38, ContentType: text/plain
> 
>     (6)
>     submit mime.eml directly to tika
> 
>              curl -T ./mime.eml http://127.0.0.1:9998/tika
>              journalctl -f -u tika | grep -v StatusLogger
>                      ...
>                      Jul 30 19:30:08 mx-test tika[20049]: INFO  [qtp2112135199-34] 19:30:08,073 org.apache.tika.server.core.resource.TikaResource /tika (autodetecting type)
> 
>     (7)
>     submit mime2.eml directly to tika
> 
>              curl -T ./mime2.eml http://127.0.0.1:9998/tika
>              journalctl -f -u tika | grep -v StatusLogger
>                      ...
>                      Jul 30 19:30:52 mx-test tika[20049]: INFO  [qtp2112135199-30] 19:30:52,349 org.apache.tika.server.core.resource.TikaResource /tika (autodetecting type)
> 
>     (8)
>     where,
> 
>              cat mime.eml
>                      Return-Path: <msmtp at pgnd.example.com>
>                      Delivered-To: testuser at example.com
>                      ...
>                      From: msmtp at pgnd.example.com
>                      Date: Sat, 30 Jul 2022 18:53:38 -0400
>                      To: testuser at example.com
>                      Subject: test
>                      User-Agent: Heirloom mailx 12.5 7/5/10
>                      Content-Type: multipart/mixed;
>                       boundary="=_62e5b672.wAyBX+sGMbS7ZcNv8O/A1QeYuseaJ2NDRf8hfdbm/x8Vayp+"
>                      Message-Id: <4LwKS35QWSzWf3Q at mx-test.example.com>
> 
>                      This is a multi-part message in MIME format.
> 
>                      --=_62e5b672.wAyBX+sGMbS7ZcNv8O/A1QeYuseaJ2NDRf8hfdbm/x8Vayp+
>                      Content-Type: text/plain; charset=us-ascii
>                      Content-Transfer-Encoding: 7bit
>                      Content-Disposition: inline
> 
>                      test
> 
>                      --=_62e5b672.wAyBX+sGMbS7ZcNv8O/A1QeYuseaJ2NDRf8hfdbm/x8Vayp+
>                      Content-Type: application/pdf
>                      Content-Transfer-Encoding: base64
>                      Content-Disposition: attachment;
>                       filename="mime.pdf"
> 
>                      JVBERi0xLjQKJcfsj6IKJSVJbnZvY2F0aW9uOiBwYXRoL2dzIC1QLSAtZFNBRkVSIC1kQ29t
>                      ...
>                      Rgo=
> 
>                      --=_62e5b672.wAyBX+sGMbS7ZcNv8O/A1QeYuseaJ2NDRf8hfdbm/x8Vayp+--
> 
>     and,
> 
>              cat mime2.eml
>                      Return-Path: <msmtp at pgnd.example.com>
>                      Delivered-To: testuser at example.com
>                      ...
>                      From: msmtp at pgnd.example.com
>                      Date: Sat, 30 Jul 2022 19:14:59 -0400
>                      To: testuser at example.com
>                      Subject: test
>                      User-Agent: Heirloom mailx 12.5 7/5/10
>                      Content-Type: text/plain; charset=us-ascii
>                      Content-Transfer-Encoding: 7bit
>                      Message-Id: <4LwKwh5brVzWf3Q at mx-test.example.com>
> 
>                      test
> 


