[Dovecot] "Header is huge" in fts-solr

Timo Sirainen tss at iki.fi
Fri Feb 22 14:53:46 EET 2013


On 5.2.2013, at 15.58, Valery V. Sedletski <valerius at afterlogic.com> wrote:

> Hi, Timo and all!
> 
> I am trying to index mail in a test mailbox using the fts_solr plugin for
> full-text search. On most mailboxes it works fine, but on some big
> messages I get warnings like the following, then an out-of-memory error
> from Solr, and then the indexer-worker process (or doveadm) crashes with
> an "assertion failed" error and this backtrace:
> 
> ==========================================================
> doveadm(valerius at test.afterlogic.com): Warning:
> fts-solr(valerius at test.afterlogic.com): Mailbox gmail.com UID=48 header
> size is huge

I'm not sure why Solr would run out of memory. If it handles huge message bodies, I don't really see why it couldn't handle huge headers.

> doveadm(valerius at test.afterlogic.com): Panic: file
> ../../../../src/plugins/fts-solr/solr-connection.c: line 548
> (solr_connection_post_more): assertion failed: (maxfd >= 0)

This is hopefully fixed by v2.2, which uses its own lib-http instead of libcurl (which I'm apparently not using correctly).

> So it seems that Dovecot tries to parse the messages in the mailbox and can't
> correctly determine where the message header ends. It then thinks that the
> message header is big and passes a very large amount of data to Solr. While
> indexing it, Solr exhausts the available memory (though I have 8 GB of RAM on
> my machine, and Java uses more than 2 GB when indexing). Then the connections
> to Solr get closed and maxfd is invalid, hence the assertion fails.
> 
> Note also the following error
> 
> ==========================================================
> SEVERE: org.apache.solr.common.SolrException: undefined field text
> ==========================================================
> 
> which appears before the out-of-memory error.

I don't know about that one.

> I also tried tweaking the decode2text.sh script to ignore all attachments
> bigger than 1 MB (just test whether the file is bigger than 1 MB and, if so,
> return "1"). This didn't help. As I understand it, this is because of the big
> header, so attachments don't matter.

Yes.

> I isolated the set of messages that cause this error (by their UIDs),
> so I can provide them as a test case; the archive with all of them is about
> 40 MB. The error can be reproduced by putting all these messages into an empty
> mailbox and reindexing, either via IMAP search or via "doveadm index -u  ".

Is it really a message with a huge header? Note that MIME headers are also counted as headers.
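
One way to check this outside Dovecot is to measure the header sizes of the suspect messages directly. The following is only a rough sketch using Python's standard email module. The function name header_sizes is made up for illustration, and the byte counts are approximations that will not match whatever Dovecot counts internally. It reports the size of the top-level header separately from the combined size of the headers of all nested MIME parts:

```python
import email


def header_sizes(raw: bytes):
    """Approximate header sizes in bytes for one raw message.

    Returns (top_level, mime_parts): the size of the message's own header
    block and the combined size of the headers of all nested MIME parts.
    This is an estimate, not Dovecot's own accounting.
    """
    msg = email.message_from_bytes(raw)

    def size(part):
        # "Name: value" plus one line terminator per header field.
        return sum(len(name) + 2 + len(str(value)) + 1
                   for name, value in part.items())

    top_level = size(msg)
    # walk() yields the message itself first, then every nested part,
    # so subtract the top-level headers to get the MIME-part total.
    mime_parts = sum(size(part) for part in msg.walk()) - top_level
    return top_level, mime_parts
```

If the second number is large while the first is small, the "huge header" most likely comes from the MIME part headers rather than from the message's own header block.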

Anyway, http://hg.dovecot.org/dovecot-2.1/rev/0a932ba1f01f hopefully helps?



