FTS-lucene errors: language not available for stemming
I'm getting some log errors with clucene that I am having no luck tracking down on the interwebs.
Errors:
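May 19 05:05:16 indexer-worker(gessel@blackrosetech.com)<62971>: Error: lucene index /mail/blackrosetech.com/gessel//lucene-indexes: IndexWriter::addDocument() failed (#4): language not available for stemming
May 19 05:05:16 indexer-worker: Error:
May 19 05:05:16 indexer-worker(gessel@blackrosetech.com)<62971>: Error: Mailbox Security: Mail search failed: Internal error occurred. Refer to server log for more information. [2020-05-19 05:05:16]
May 19 05:05:16 indexer-worker(gessel@blackrosetech.com)<62971>: Error: Mailbox Security: Transaction commit failed: FTS transaction commit failed: transaction context (attempted to index 1 messages (UIDs 152736..152736))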
Config:
FreeBSD 11.3-RELEASE-p8 #0 r360490
dovecot-2.3.10_3
clucene-2.3.3.4_19
py37-pystemmer-2.0.0.1
py37-snowballstemmer-1.2.1
icu-67.1,1
plugin {
  #setting_name = value
  expire = Trash
  mail_log_events = delete undelete expunge copy mailbox_delete mailbox_rename
  mail_log_fields = uid box msgid size
  fts_autoindex=yes
  #zlib_save_level = 6 # 1..9
  #zlib_save = gz # or bz2
}
plugin {
  fts = lucene
  # Lucene-specific settings, good ones are:
  fts_lucene = whitespace_chars=@. mime_parts
}
I am considering switching to xapian (solr and java... pls no), as the port is quite tempting from an ease-of-integration perspective, but the easiest solution would be to resolve these odd indexing errors. Anyone have a clue?
-David
On 19.05.20 15:15, David Gessel wrote:
I'm getting some log errors with clucene that I am having no luck tracking down on the interwebs.
Errors:
May 19 05:05:16 indexer-worker(gessel@blackrosetech.com)<62971>: Error: lucene index /mail/blackrosetech.com/gessel//lucene-indexes: IndexWriter::addDocument() failed (#4): language not available for stemming
May 19 05:05:16 indexer-worker: Error:
May 19 05:05:16 indexer-worker(gessel@blackrosetech.com)<62971>: Error: Mailbox Security: Mail search failed: Internal error occurred. Refer to server log for more information. [2020-05-19 05:05:16]
May 19 05:05:16 indexer-worker(gessel@blackrosetech.com)<62971>: Error: Mailbox Security: Transaction commit failed: FTS transaction commit failed: transaction context (attempted to index 1 messages (UIDs 152736..152736))
I ran into the same problem a few weeks back. The workaround I found was to add no_snowball to fts_lucene. It disables the snowball algorithm.
On 2020-05-19, David Gessel gessel@blackrosetech.com wrote:
I'm getting some log errors with clucene that I am having no luck tracking down on the interwebs.
This looks relevant:
https://www.mail-archive.com/dovecot@dovecot.org/msg66366.html
I am considering switching to xapian (solr and java... pls no), as the port is quite tempting from an ease-of-integration perspective, but the easiest solution would be to resolve these odd indexing errors. Anyone have a clue?
dovecot-fts-xapian is easy to configure, but has a big downside compared to solr in that the indexer runs as root.
On 2020/05/19 17:04, Aki Tuomi wrote:
Dovecot indexer does not run as root.
Aki Tuomi
It does in the not entirely uncommon case where you have set up dovecot-fts-xapian, have multiple system users rather than a single uid owning all mailboxes, and need to index all mailboxes.
  PID USERNAME PRI NICE  SIZE   RES STATE   WAIT      TIME    CPU COMMAND
44468 root       2    0   11M   18M sleep/6 netio     1:41 68.36% doveadm index -A *
With solr the indexing is done out-of-process and typically under a safe uid.
On 2020-05-19 16:28, Aki Tuomi wrote:
Also, if you looked carefully at what happens, you'd notice dovecot calls seteuid() before actually doing the indexing work.
It would make more sense to have a dovecot shell in which all commands must be issued, like postgresql. setuid is nice, but it is not well known that it happens. It is the same as apache, which starts as root to bind a port under 1024 but afterwards forks workers that do not run as root.
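The pattern under discussion here (a process started as root that switches its effective uid before doing per-user work) can be illustrated with a short Python sketch. This is not Dovecot's code: the helper names `effective_uid` and `run_as` are hypothetical, and the only system calls used are `os.geteuid()` and `os.seteuid()`.

```python
import os
from contextlib import contextmanager


@contextmanager
def effective_uid(uid):
    """Temporarily switch the effective uid, then restore the previous one.

    Hypothetical helper for illustration only. Switching to a different
    uid needs privileges (typically a process started as root).
    """
    saved = os.geteuid()
    if uid != saved:
        os.seteuid(uid)  # only the effective uid changes; the saved uid is kept
    try:
        yield
    finally:
        if os.geteuid() != saved:
            os.seteuid(saved)  # allowed because the saved uid is still the old one


def run_as(uid, work):
    """Run the callable `work` with the given effective uid and return its result."""
    with effective_uid(uid):
        return work()


if __name__ == "__main__":
    # Unprivileged demo: "switching" to our own uid is a no-op.
    me = os.geteuid()
    print(run_as(me, os.geteuid) == me)
```

A process listed as root in top can therefore still be doing the per-user work under that user's effective uid, which is the point made about seteuid() above.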
On 2020-05-19 16:18, Stuart Henderson wrote:
It does in the not entirely uncommon case where you have set up dovecot-fts-xapian, have multiple system users rather than a single uid owning all mailboxes, and need to index all mailboxes.
  PID USERNAME PRI NICE  SIZE   RES STATE   WAIT      TIME    CPU COMMAND
44468 root       2    0   11M   18M sleep/6 netio     1:41 68.36% doveadm index -A *
With solr the indexing is done out-of-process and typically under a safe uid.
No doveconf -n, no problem.
If you really have a bug, it should be solved.
I think you could run
su --user=non-root-user doveadm index -A
so that it does not run as root; that works well for fangfrisch. Dovecot should not allow commands as root without thinking of the consequences.
On 2020-05-19 16:48, Stuart Henderson wrote:
On 2020-05-19, David Gessel gessel@blackrosetech.com wrote:
I'm getting some log errors with clucene that I am having no luck tracking down on the interwebs.
This looks relevant:
https://www.mail-archive.com/dovecot@dovecot.org/msg66366.html
Thanks Stuart & Jan - no_snowball seems to have cleared up the errors.
relevant config now reads:
plugin {
  fts = lucene
  # Lucene-specific settings, good ones are:
  fts_lucene = whitespace_chars=@. mime_parts no_snowball
}
May 20 04:40:50 indexer-worker(gessel@blackrosetech.com)<26130>rw0HDUIXxV6KBwEA0J78UA:4CgkD0IXxV4SZgAA0J78UA: Error: lucene index /mail/blackrosetech.com/gessel//lucene-indexes: IndexWriter::addDocument() failed (#4): language not available for stemming
May 20 04:40:50 indexer-worker: Error:
May 20 04:40:50 indexer-worker(gessel@blackrosetech.com)<26130>rw0HDUIXxV6KBwEA0J78UA:4CgkD0IXxV4SZgAA0J78UA: Error: Mailbox Lists.Spamassassin: Mail search failed: Internal error occurred. Refer to server log for more information. [2020-05-20 04:40:50]
May 20 04:40:50 indexer-worker(gessel@blackrosetech.com)<26130>rw0HDUIXxV6KBwEA0J78UA:4CgkD0IXxV4SZgAA0J78UA: Error: Mailbox Lists.Spamassassin: Transaction commit failed: FTS transaction commit failed: transaction context (attempted to index 2 messages (UIDs 7..8))
May 20 04:45:05 master: Warning: Killed with signal 15 (by pid=81740 uid=0 code=kill)
May 20 04:46:39 indexer-worker(gessel@blackrosetech.com)<87087><5jtvLp8YxV4tVAEA0J78UA:NexHM58YxV4vVAEA0J78UA>: Warning: fts-lucene: Settings have changed, rebuilding index for mailbox
(no further errors, various mailboxes being indexed.)
-David
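For anyone auditing which messages failed to index before the change, the "Transaction commit failed" lines carry the mailbox name and UID range. A minimal Python sketch for pulling those out follows. It assumes the log format shown in this thread; the regular expression and the function name `failed_ranges` are illustrative and not part of Dovecot.

```python
import re

# Matches the "Transaction commit failed" lines as they appear in this thread.
# The pattern is an assumption based on those samples only.
PATTERN = re.compile(
    r"Error: Mailbox (?P<mailbox>.+?): Transaction commit failed: .*"
    r"attempted to index (?P<count>\d+) messages "
    r"\(UIDs (?P<first>\d+)\.\.(?P<last>\d+)\)"
)


def failed_ranges(lines):
    """Return (mailbox, first_uid, last_uid) for each failed FTS transaction."""
    found = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            found.append(
                (match["mailbox"], int(match["first"]), int(match["last"]))
            )
    return found


if __name__ == "__main__":
    # Log line copied from the excerpt in this thread.
    sample = [
        "May 20 04:40:50 indexer-worker(gessel@blackrosetech.com)<26130>"
        "rw0HDUIXxV6KBwEA0J78UA:4CgkD0IXxV4SZgAA0J78UA: Error: Mailbox "
        "Lists.Spamassassin: Transaction commit failed: FTS transaction "
        "commit failed: transaction context (attempted to index 2 messages "
        "(UIDs 7..8))",
    ]
    for mailbox, first, last in failed_ranges(sample):
        print(mailbox, first, last)
```

Since the log above shows fts-lucene rebuilding the index after the settings change, this is only useful for checking what was affected, not for repairing anything.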
participants (6)
- Aki Tuomi
- Benny Pedersen
- David Gessel
- Jan Bramkamp
- Joan Moreau
- Stuart Henderson