Solr and FTS - assertion failure [proposed patch for upper bound on rows in solr search]

> In recent weeks I have been working on the Solr integration and
> immediately ran into assertion failure errors, on 2.0.19, 2.2.9
> and servers in our network.
> Reading the thread on the Debian ML, I realized this issue is related
> to nested MIME and that it affects large mailboxes.
> In my case, the error in dovecot.log pairs with the following in
> solr.log, and it seems the rows value is the same as the last
> UID recorded in the mailbox.
> For your reference, here are the Solr logs, where *2276996170* is the
> value passed by Dovecot as the rows number; it clearly doesn't fit
> the rows data type.
> Have you experienced the same behaviour? Is there a workaround?

Whatever the reason for this happening, it would make sense not to
supply unbounded values to Solr.
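
To illustrate why the quoted value cannot work, here is a minimal sketch. It assumes Solr parses "rows" as a signed 32-bit integer (a Java int); the helper name rows_fits_int32 is made up for this example and is not part of Dovecot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not Dovecot code: returns true if a uidnext-sized
   value can be represented as a signed 32-bit integer, which is what
   Solr is assumed to use for "rows". */
static bool rows_fits_int32(uint32_t rows)
{
	return rows <= (uint32_t)INT32_MAX;
}
```

Under that assumption, rows_fits_int32(2276996170u) is false because INT32_MAX is 2147483647, while a value such as 100000 passes.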

The "rows" value passed for a lookup on a single mailbox is the
uidnext of the searched mailbox. For lookups on multiple mailboxes
there is a hard-coded value:

#define SOLR_MAX_MULTI_ROWS 100000

If 100000 is a good maximum for lookups on multiple mailboxes, it should
also be a reasonable upper bound for lookups on a single mailbox.

My proposed patch, shown below, stops overly large "rows" values from
being sent to Solr. It doesn't explain why uidnext is so large in the
first place for the specific mailbox. Nevertheless, I think it makes
sense both as a potential workaround for the original issue and as a
safeguard worth incorporating. If the hard-coded value is too limiting,
it could be made configurable.

diff -ur dovecot-
--- dovecot-      2020-08-12 14:20:41.000000000 +0200
+++ dovecot-   2020-12-31 09:05:07.681897716 +0100
@@ -838,7 +838,7 @@

        str = t_str_new(256);
-                   status.uidnext);
+                   I_MIN(status.uidnext,SOLR_MAX_MULTI_ROWS));
        prefix_len = str_len(str);

        if (solr_add_definite_query_args(str, args, and_args)) {
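
The effect of the one-line change can be sketched in isolation. This is only an illustration, not the actual Dovecot code: I_MIN is redefined here as a plain two-argument minimum (assumed to match Dovecot's macro), and solr_rows_for_uidnext is a hypothetical helper name.

```c
#include <stdint.h>

/* Multi-mailbox row limit quoted earlier in this message. */
#define SOLR_MAX_MULTI_ROWS 100000

/* Stand-in for Dovecot's I_MIN macro, assumed here to be a plain
   two-argument minimum. */
#define I_MIN(a, b) (((a) < (b)) ? (a) : (b))

/* Hypothetical helper: the "rows" value that would be sent to Solr
   for a single-mailbox lookup once the patch is applied. */
static uint32_t solr_rows_for_uidnext(uint32_t uidnext)
{
	return I_MIN(uidnext, (uint32_t)SOLR_MAX_MULTI_ROWS);
}
```

With this, a uidnext of 2276996170 yields rows=100000, while a mailbox whose uidnext is below the limit is unaffected.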


