Hello,
Is anyone running Dovecot 2.4.4 with multilingual FTS flatcurve on Debian 12/13, using the packages from Dovecot.org? I suspect a packaging/build issue related to language detection. It looks like libexttextcat is not included in the Debian build.
Simple reproduction with doveadm:
doveadm fts tokenize -u localpart@domain 'This is the house. Das ist das Haus'
-> Warning: Can't detect its language - assuming de ...
My suspicion: If the "libexttextcat-dev" package was not installed during the "configure" step of the package creation process, then "HAVE_LANG_EXTTEXTCAT" will not be set.
And if the define "HAVE_LANG_EXTTEXTCAT" is not set when "src/lib-language/language.c" is compiled, then "language_textcat_init" always returns "LANGUAGE_DETECT_RESULT_UNKNOWN". doveadm then always turns this return value into the message: "Warning: Can't detect its language - assuming $DEFAULT-LANG".
If I download the source code from Dovecot.org and build the dovecot vendor package myself, language detection works, provided that "libexttextcat-dev" was installed beforehand.
My simple test to check whether the installed version contains "libexttextcat" (my build finds references to libexttextcat; the build from Dovecot.org finds none):
nm --print-file-name --dynamic --undefined-only /usr/lib/x86_64-linux-gnu/dovecot/* | grep textcat
-> /usr/lib/x86_64-linux-gnu/dovecot/libdovecot-language.so.0.0.0: U special_textcat_Init
-> /usr/lib/x86_64-linux-gnu/dovecot/libdovecot-storage.so.0.0.0: U textcat_ReleaseClassifyFullOutput
Background: In my self-built package, exttextcat-related symbols appear in both libdovecot-language and libdovecot-storage; I assume this is expected due to the build/link layout.
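In case anyone wants to compare their own installation, here is a rough sketch of the same check wrapped in a shell function. The function name "check_textcat" and the default directory are just my own choices for illustration; it relies only on nm and grep, and it reports "no textcat references" also when the directory is empty or nm is not installed.

```shell
#!/bin/sh
# Hypothetical helper (name and default path are assumptions):
# reports whether any file in the given directory has undefined
# dynamic symbols whose name mentions "textcat".
check_textcat() {
    dir="${1:-/usr/lib/x86_64-linux-gnu/dovecot}"
    # nm errors (e.g. non-ELF files, empty directory) are discarded;
    # only the grep result decides the outcome.
    if nm --print-file-name --dynamic --undefined-only "$dir"/* 2>/dev/null \
        | grep -q textcat; then
        echo "textcat references found"
    else
        echo "no textcat references"
    fi
}
```

Calling "check_textcat" without arguments checks the default Debian path from above; a different library directory can be passed as the first argument.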
Regards, Jens