<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></head><body style='font-size: 9pt; font-family: Verdana,Geneva,sans-serif'>
<p>What about linking Dovecot with the Xapian libraries instead of going through the Solr nightmare?</p>
<p>https://xapian.org/features</p>
<div id="signature"> </div>
<p><br /></p>
<p id="reply-intro">On 2019-01-02 17:10, John Tulp wrote:</p>
<blockquote type="cite" style="padding: 0 0.4em; border-left: #1010ff 2px solid; margin: 0">
<div class="pre" style="margin: 0; padding: 0; font-family: monospace">On Wed, 2019-01-02 at 00:59 -0800, M. Balridge wrote:
<blockquote type="cite" style="padding: 0 0.4em; border-left: #1010ff 2px solid; margin: 0">
<blockquote type="cite" style="padding: 0 0.4em; border-left: #1010ff 2px solid; margin: 0">The main problem is: after some time of indexing from Dovecot, Dovecot<br />returns errors (invalid SID, etc.) and Solr returns "out of range<br />indexes" errors</blockquote>
<br />I've been watching the progress of this thread with no small concern, mainly<br />because I've been tasked with providing a server-side email search facility<br />with a budget and manpower level that comes down to mainly *1*, i.e., me.<br /><br />I was expecting, given the strongly worded language about "just use<br />lucene/SOLR" and "ignore squat", that I should invest time + effort into this<br />JAVA nightmare that is SOLR.<br /><br />I started with squat and another word-indexer system that used out-of-band<br />(not a dovecot plugin) software to provide rapid (sub-second) searches through<br />tens-of-GB-scale mailboxes.<br /><br />Unlike what I was led to believe, the squat indexes worked surprisingly well,<br />once you sorted out the odd resource-size (ulimit-related) limits (vsz &amp;<br />friends). I did notice the "worst-case" search performance had<br />worryingly high O(x) increases in time, but I'd not seen anything that was a<br />dealbreaker. It goes without saying that various substring searches worked as<br />expected, for the most part.<br /><br />My experiences with SOLR were similar to Mr. Moreau's: lots of startup<br />errors with provided schemata files. Lots of JAVA nonsense issues. Lots of<br />sensitivity to WHICH Java runtime, etc, etc. I finally fixed on a specific JVM,<br />version of SOLR, and dovecot to find the "best" working combination, only to<br />find that the searches didn't work out as expected. I expected to be able to<br />do date-range-based searches. Didn't work. 
I expected to search CONTENTS of<br />emails, and despite many days of tweaks, I couldn't get it to index even the<br />basics like filenames/types of attachments, so I could expose<br />attachment-based searching to my users.<br /><br />So, without rancour or antipathy, I ask the entire list: has ANYONE gotten a<br />Dovecot/solr-fts-plugin setup to work that provides as a BASELINE, all of the<br />following functionality:<br /><br />1) The ability to search for a string within any of the structured fields<br />(from/subject) that returns correct results?<br /><br />2) The ability to search for any string within the BODY of emails, including<br />the MIME attachment boundaries?<br /><br />3) The ability to do "ranging" searches for structures within emails that<br />decompose to "dates" or other simple-numeric data?<br /><br />OPTIONALLY, and this is probably way outside of the scope of the above,<br />despite the fact that it's listed as a "selling point" of SOLR versus other<br />full text search engines:<br /><br />4) The ability to do searches against any attachments that are able to be<br />post-processed and hyper-indexed by SOLR+Tika?<br /><br />-------------<br /><br />SOLR seems to have "brand cachet", so presumably it actually works (for somebody).<br /><br />Dovecot has not a little "brand cachet", and for me, I have innate faith and<br />trust in Timo and his software. I am no stranger to the "costs" of "free"<br />software, in that you sacrifice your own blood, sweat, and tears just to get<br />these disparate pieces to work together.<br /><br />I *DO* respect that Timo has to keep the lights (and sauna) on in Finland.<br />Maybe there's a super-secret (no advertised prices, "carrier-only" price list)<br />with _Dovecot, Oy_ wherein the above ARE actually available for something less<br />than 6.022 x 10^23 Euros per centi-second of licencing fees.<br /><br />But please, level with us faithful users. 
Does this morass of Java B.S.<br />actually work? If not, please just deprecate and remove this moribund<br />software, and stop trying to bury the only FTS plugin many of us HAVE actually<br />gotten to work. (Pretty please?)<br /><br />I respect that Mr. Moreau has made an earnest effort to get this JAVA B.S.<br />to actually work, as I have. <br /><br />He persevered where I'd given up. He's vocal about it, and now I'm chiming in<br />that this ornate collection of switchblades only cuts those who try to use them.<br /><br />Respectfully,<br />=M=<br /><br /></blockquote>
Fascinating...<br /><br />SOLR says the following are powered by SOLR...<br /><br /><a href="https://wiki.apache.org/solr/PublicServers" target="_blank" rel="noopener noreferrer">https://wiki.apache.org/solr/PublicServers</a><br /><br />Perhaps if you could find out from that list which of them are using<br />SOLR in conjunction with Dovecot...<br /><br />food for thought...<br /><br /><br /></div>
</blockquote>
</body></html>