Bug: Dovecot index loosing sync with FTS despite "fts_autoindex = yes"

kfx kadafax at gmail.com
Mon May 28 13:28:59 EEST 2018


On 24/05/2018 15:33, Timo Sirainen wrote:
> On 21 May 2018, at 14.11, kadafax at gmail.com wrote:
>>
>> Le 21/05/2018 à 12:38, Aki Tuomi a écrit :
>>> can you try turning on pluign { fts_enforced = yes } and repeat your test?
>>
>> Same (wrong) result:
>> 1. Send an email with "too6Ouka" in the body
>>
>> 2. Search against "too6Ouka":
>> # doveadm search -u username mailbox INBOX body too6Ouka
>> --> No result
>>
>> 3. Force re-index:
>> # doveadm fts rescan -u username
>>
>> 4. Search again against "too6Ouka":
>> # doveadm search -u username mailbox INBOX body too6Ouka
>> --> e09cce0283e8695ab760000002deed92 29055
>>
>> Don't know if relevant, but on a side note, if I send a second message
>> with "too6Ouka" in the body, followed by:
>> # doveadm -v index -u username Inbox
>> --> doveadm(username): Info: INBOX: Cache is already up to date
>> And a search against the pattern immediately return only one result
>> instead of two.
> 
> So it looks like after "doveadm fts rescan" you can do initial indexing successfully. Afterwards the index isn't updated at all, unless you rebuild it entirely. fts_autoindex=yes appears to work as expected, because after fts rescan you did a search which automatically updated the index. So this doesn't have anything to do with fts_autoindex, just that your index updates don't seem to work at all.
> 
> What does it show in the header if you do:
> doveadm dump /var/vmail/username/dovecot.index

Detected file type: index
-- INDEX: /var/vmail/username/dovecot.index
version .................. = 7.3
base header size ......... = 120
header size .............. = 640
record size .............. = 16
compat flags ............. = 1
index id ................. = 1527164208 (2018-05-24 14:16:48)
flags .................... = 0
uid validity ............. = 1516890243 (2018-01-25 15:24:03)
next uid ................. = 30104
messages count ........... = 17068
seen messages count ...... = 17011
deleted messages count ... = 0
first recent uid ......... = 30104
first unseen uid lowwater  = 393
first deleted uid lowwater = 30087
log file seq ............. = 14
log file tail offset ..... = 28144
log file head offset ..... = 28144
log2 rotate time ......... = 1527487933 (2018-05-28 08:12:13)
last temp file scan ...... = 0 (1970-01-01 01:00:00)
day stamp ................ = 1527458400 (2018-05-28 00:00:00)
day first uid[0] ......... = 30037
day first uid[1] ......... = 29921
day first uid[2] ......... = 29814
day first uid[3] ......... = 29669
day first uid[4] ......... = 2
day first uid[5] ......... = 0
day first uid[6] ......... = 0
day first uid[7] ......... = 0
-- Extension 0 --
name ........ = maildir
hdr_size .... = 36
reset_id .... = 0
record_offset = 0
record_size . = 0
record_align  = 0
header
  - new_check_time .... = 2018-05-28 11:48:14
  - new_mtime ......... = 2018-05-28 11:47:59
  - new_mtime_nsecs ... = 129602175
  - cur_check_time .... = 2018-05-28 11:48:14
  - cur_mtime ......... = 2018-05-28 11:48:01
  - cur_mtime_nsecs.... = 277650283
  - uidlist_mtime ..... = 2018-05-28 11:47:58
  - uidlist_mtime_nsecs = 521588557
  - uidlist_size ...... = 1334026
-- Extension 1 --
name ........ = keywords
hdr_size .... = 292
reset_id .... = 0
record_offset = 5
record_size . = 3
record_align  = 1
-- Extension 2 --
name ........ = hdr-vsize
hdr_size .... = 16
reset_id .... = 0
record_offset = 0
record_size . = 0
record_align  = 8
header
  - highest uid . = 0
  - message count = 0
  - vsize ....... = 0
-- Extension 3 --
name ........ = cache
hdr_size .... = 0
reset_id .... = 1527164233
record_offset = 8
record_size . = 4
record_align  = 4
-- Extension 4 --
name ........ = vsize
hdr_size .... = 0
reset_id .... = 0
record_offset = 12
record_size . = 4
record_align  = 4
-- Extension 5 --
name ........ = fts
hdr_size .... = 12
reset_id .... = 0
record_offset = 0
record_size . = 0
record_align  = 0
header
  - last_indexed_uid ..... = 30103
  - settings_checksum .... = 0
-- Keywords --
   0 = unknown-1
   1 = unknown-2
   2 = unknown-5
   3 = unknown-8
   4 = unknown-6
   5 = unknown-3
   6 = unknown-7
   7 = unknown-0
   8 = NonJunk
   9 = $Forwarded
  10 = $label1
  11 = $label2
  12 = Junk
  13 = $label3
  14 = $label4


> 
> Especially what is in the "fts" header vs. "next uid" header? Does the UID in "fts" header keep changing every time you save a new mail? I suppose it will. 

Diff between 2 emails:
next uid = 30104    |    next uid = 30105
last_indexed_uid = 30103    |    last_indexed_uid = 30104


> You could also monitor (e.g. tcpdump/wireshark) the network traffic between Dovecot <-> Solr what happens when a new mail arrives. I suspect Dovecot sends it to Solr, which for whatever reason just ignores the update.

################### TCPDUMP ####################################
POST /solr/dovecot/update HTTP/1.1
Host: localhost:8983
Date: Mon, 28 May 2018 10:18:05 GMT
Transfer-Encoding: chunked
Connection: Keep-Alive
Content-Type: text/xml

<add><doc><field name="uid">37581</field><field 
name="box">e0c58a3093235153110800003ea484a8</field><field 
name="user">username</field><field 
name="id">37581/e0c58a3093235153110800003ea484a8/username</field><field 
name="body">Search Pattern: Kai8oovi

--
username


</field><field name="from">"username" <username at mydomain>
</field><field name="hdr">From: "username" <username at mydomain>
Subject: New search test: 4
To: "username" <username at mydomain>
Message-ID: <86442b87-aaad-c6bc-86f6-a74551a564c6 at mydomain>
Date: Mon, 28 May 2018 12:17:41 +0200
User-Agent:  _
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: fr-xx-classique+reforme1990
Content-Transfer-Encoding: 7bit
</field><field name="subject">New search test: 4
</field><field name="to">"username" <username at mydomain>
</field></doc></add>

######### SOLR'S RESPONSE ###############
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
   <int name="status">0</int>
   <int name="QTime">0</int>
</lst>
</response>



################### /TCPDUMP ####################################



More information about the dovecot mailing list