Race condition when setting flags (\Deleted) + expunge quickly, leaving mails not deleted

Tomas Forsman stric at cs.umu.se
Tue Jun 12 15:27:38 EEST 2018


On 22 April, 2018 - Tomas Forsman wrote:

> On 21 March, 2018 - Aki Tuomi wrote:
> 
> > Thank you for your thorough report, we'll look into it.
> 
> Has anyone managed to reproduce this (using my transcript for example)?

Anyone? I would guess more people are affected..

/Tomas
-- 
Tomas Forsman, stric at cs.umu.se, http://people.cs.umu.se/stric/
`- SysAdmin at Computing Science, University of Umeå

> With mutt, I get this problem.. If I set 'imap_pipeline_depth=0' in
> .muttrc, I can't seem to reproduce it anymore.
> 
> /Tomas
> -- 
> Tomas Forsman, stric at cs.umu.se, http://people.cs.umu.se/stric/
> `- SysAdmin at Computing Science, University of Umeå
> 
> > Aki
> > 
> > 
> > On 20.03.2018 16:56, Tomas Forsman wrote:
> > > Hello.
> > >
> > > I seem to have found a race condition, when setting flags on multiple emails
> > > rapidly. 5 commands including login to reproduce. Problem found using mutt in
> > > real world usage.
> > >
> > > Seems to happen both with UID STORE 1:3 and UID STORE 1,2,3 ..
> > >
> > > I have tried with the following packages, with a minimized config in a
> > > throwaway vm:
> > > https://packages.debian.org/stretch/dovecot-core
> > > Package: dovecot-core (1:2.2.27-3+deb9u2) 
> > >
> > > https://packages.debian.org/stretch-backports/dovecot-core
> > > Package: dovecot-core (1:2.2.34-2~bpo9+1) 
> > >
> > > Also tested pristine:
> > > https://dovecot.org/releases/2.3/dovecot-2.3.0.1.tar.gz
> > > https://dovecot.org/releases/2.2/dovecot-2.2.34.tar.gz
> > > https://dovecot.org/releases/2.2/dovecot-2.2.35.tar.gz
> > > .. built on Debian9 with:
> > > # apt-get build-dep dovecot-core
> > > # bash configure --prefix=/scratch/dovecot-test
> > >
> > > Tried with both mailbox over NFS (backed by ext4), and also mailbox on ext4.
> > >
> > > On my test systems, I am only able to reproduce it with mbox_lazy_writes=no,
> > > but with production settings on a Ubuntu 14.04 with 1:2.2.9-1ubuntu2.4 that has
> > > mbox_lazy_writes=yes it is still reproducable with mutt.
> > >
> > > I have found two different ways to reproduce it with netcat + cut'n'paste,
> > > below is done with 2.2.34/35.
> > >
> > > All servers are VMs in Ganeti/kvm clusters (Debian hosts in one
> > > cluster&organization, Ubuntu host in another cluster&organization).
> > >
> > >
> > > Debian minimized config:
> > >> doveconf -n
> > > # 2.2.34 (874deae): /etc/dovecot/dovecot.conf
> > > # Pigeonhole version 0.4.22 (22940fb7)
> > > # OS: Linux 4.9.0-6-amd64 x86_64 Debian 9.4 
> > > # Hostname: mail2-vm.cs.umu.se
> > > auth_mechanisms = plain login
> > > auth_verbose = yes
> > > debug_log_path = syslog
> > > info_log_path = syslog
> > > mail_location = mbox:~/Mail:INBOX=/var/mail/%u
> > > mail_privileged_group = mail
> > > mbox_lazy_writes = no
> > > namespace inbox {
> > >   inbox = yes
> > >   location = 
> > >   prefix = 
> > > }
> > > passdb {
> > >   driver = pam
> > > }
> > > protocols = imap
> > > service auth {
> > >   client_limit = 2500
> > >   unix_listener auth-client {
> > >     group = postfix
> > >     mode = 0660
> > >   }
> > > }
> > > service imap-login {
> > >   inet_listener imap {
> > >     port = 143
> > >   }
> > > }
> > > service imap {
> > >   executable = imap
> > > }
> > > ssl = no
> > > syslog_facility = local1
> > > userdb {
> > >   driver = passwd
> > > }
> > >
> > >
> > > Attached is a mailbox that can be used to reproduce the problem, called
> > > "error-box", anonymized with mbox-anonymize.pl. It was created by sending test
> > > messages with mutt 1.7.2 through Dovecot/Postfix/Amavis.
> > >
> > >
> > >
> > > **** Method 1, with IDLE:
> > >
> > > Setup a local account, MY-USER / MY-PASSWORD (replace below).
> > >
> > >> mkdir Mail
> > >> cp error-box Mail/error1
> > >> nc -v localhost 143
> > > Paste the following 3 lines in a go:
> > > a0000 LOGIN "MY-USER" "MY-PASSWORD"
> > > a0010 SELECT "error1"
> > > a0020 IDLE
> > >
> > > then wait a second.. then paste the following 3 in a go:
> > > DONE
> > > a0030 UID STORE 1:3 +FLAGS (\Deleted)
> > > a0040 EXPUNGE
> > >
> > > Notice that the results from UID STORE + EXPUNGE gives:
> > > * 2 FETCH (UID 2 FLAGS (\Recent))
> > > * 3 FETCH (UID 3 FLAGS (\Recent))
> > > * 1 EXPUNGE
> > > * 3 RECENT
> > > a0030 OK Store completed (0.004 + 0.000 + 0.003 secs).
> > > a0040 OK Expunge completed (0.004 + 0.000 + 0.003 secs).
> > >
> > >
> > > Verify what's left in the mailbox with:
> > > a0050 UID FETCH 1:4 FLAGS
> > > * 1 FETCH (UID 2 FLAGS (\Recent))
> > > * 2 FETCH (UID 3 FLAGS (\Recent))
> > > * 3 FETCH (UID 4 FLAGS (\Recent))
> > > a0050 OK Fetch completed (0.001 + 0.000 secs).
> > >
> > >
> > > Since 1 to 3 should be \Deleted and then EXPUNGEd, we should only have UID 4
> > > left, not uid 2-4 without Deleted flag.
> > >
> > >
> > > Try same again, but with 1 inserted delay (use a new mailbox name for every
> > > test, otherwise UID numbering will vary, indexes etc):
> > >> cp error-box Mail/error2
> > >> nc -v localhost 143
> > > Paste the following 3 lines in a go:
> > > a0000 LOGIN "MY-USER" "MY-PASSWORD"
> > > a0010 SELECT "error2"
> > > a0020 IDLE
> > >
> > > then wait a second.. then paste 1 line:
> > > DONE
> > >
> > > then [newly introduced] wait a second.. then paste 2 lines:
> > > a0030 UID STORE 1:3 +FLAGS (\Deleted)
> > > a0040 EXPUNGE
> > >
> > > Notice that the results from UID STORE + EXPUNGE gives:
> > > * 1 FETCH (UID 1 FLAGS (\Deleted \Recent))
> > > * 2 FETCH (UID 2 FLAGS (\Deleted \Recent))
> > > * 3 FETCH (UID 3 FLAGS (\Deleted \Recent))
> > > a0030 OK Store completed (0.003 + 0.000 + 0.002 secs).
> > > * 3 EXPUNGE
> > > * 2 EXPUNGE
> > > * 1 EXPUNGE
> > > * 1 RECENT
> > > a0040 OK Expunge completed (0.003 + 0.000 + 0.002 secs).
> > >
> > >
> > > Verify what's left in the mailbox with:
> > > a0050 UID FETCH 1:4 FLAGS
> > > * 1 FETCH (UID 4 FLAGS (\Recent))
> > > a0050 OK Fetch completed (0.001 + 0.000 secs).
> > >
> > >
> > > So if I wait some between DONE and UID STORE, it seems to be ok.
> > >
> > >
> > >
> > > If I, in the problem-triggering method, change
> > > a0030 UID STORE 1:3 +FLAGS (\Deleted)
> > > a0040 EXPUNGE
> > > to
> > > a0030 UID STORE 1:3 +FLAGS (\Deleted)
> > > a0035 UID FETCH 1:4 FLAGS
> > > a0040 EXPUNGE
> > >
> > > The untagged responses become:
> > > * 1 FETCH (UID 1 FLAGS (\Deleted \Recent))
> > > * 2 FETCH (UID 2 FLAGS (\Deleted \Recent))
> > > * 3 FETCH (UID 3 FLAGS (\Deleted \Recent))
> > > * 4 FETCH (UID 4 FLAGS (\Recent))
> > > * 2 FETCH (UID 2 FLAGS (\Recent))
> > > * 3 FETCH (UID 3 FLAGS (\Recent))
> > > * 1 EXPUNGE
> > > * 3 RECENT
> > >
> > >
> > >
> > >
> > >
> > > Full transcript (in&out from screen) from first, failing, run:
> > >
> > > localhost [127.0.0.1] 143 (imap2) open
> > > * OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE AUTH=PLAIN AUTH=LOGIN] Dovecot (Debian) ready.
> > > a0000 LOGIN "MY-USERNAME" "MY-PASSWORD"
> > > a0010 SELECT "error1"
> > > a0020 IDLE
> > > a0000 OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT MULTIAPPEND URL-PARTIAL CATENATE UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN CONTEXT=SEARCH LIST-STATUS BINARY MOVE SNIPPET=FUZZY] Logged in
> > > * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
> > > * OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft \*)] Flags permitted.
> > > * 4 EXISTS
> > > * 4 RECENT
> > > * OK [UNSEEN 1] First unseen.
> > > * OK [UIDVALIDITY 1520944853] UIDs valid
> > > * OK [UIDNEXT 5] Predicted next UID
> > > a0010 OK [READ-WRITE] Select completed (0.003 + 0.000 + 0.004 secs).
> > > + idling
> > > DONE
> > > a0030 UID STORE 1:3 +FLAGS (\Deleted)
> > > a0040 EXPUNGE
> > > a0020 OK Idle completed (7.805 + 7.804 + 7.804 secs).
> > > * 2 FETCH (UID 2 FLAGS (\Recent))
> > > * 3 FETCH (UID 3 FLAGS (\Recent))
> > > * 1 EXPUNGE
> > > * 3 RECENT
> > > a0030 OK Store completed (0.004 + 0.000 + 0.003 secs).
> > > a0040 OK Expunge completed (0.004 + 0.000 + 0.003 secs).
> > > a0050 UID FETCH 1:4 FLAGS
> > > * 1 FETCH (UID 2 FLAGS (\Recent))
> > > * 2 FETCH (UID 3 FLAGS (\Recent))
> > > * 3 FETCH (UID 4 FLAGS (\Recent))
> > > a0050 OK Fetch completed (0.001 + 0.000 secs).
> > >
> > >
> > >
> > > Full transcript (in&out from screen) from the second, slower, run:
> > >
> > > localhost [127.0.0.1] 143 (imap2) open
> > > * OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE AUTH=PLAIN AUTH=LOGIN] Dovecot (Debian) ready.
> > > a0000 LOGIN "MY-USERNAME" "MY-PASSWORD"
> > > a0010 SELECT "error2"
> > > a0020 IDLE
> > > a0000 OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT MULTIAPPEND URL-PARTIAL CATENATE UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN CONTEXT=SEARCH LIST-STATUS BINARY MOVE SNIPPET=FUZZY] Logged in
> > > * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
> > > * OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft \*)] Flags permitted.
> > > * 4 EXISTS
> > > * 4 RECENT
> > > * OK [UNSEEN 1] First unseen.
> > > * OK [UIDVALIDITY 1520944957] UIDs valid
> > > * OK [UIDNEXT 5] Predicted next UID
> > > a0010 OK [READ-WRITE] Select completed (0.003 + 0.000 + 0.004 secs).
> > > + idling
> > > DONE
> > > a0020 OK Idle completed (2.830 + 2.829 + 2.829 secs).
> > > a0030 UID STORE 1:3 +FLAGS (\Deleted)
> > > a0040 EXPUNGE
> > > * 1 FETCH (UID 1 FLAGS (\Deleted \Recent))
> > > * 2 FETCH (UID 2 FLAGS (\Deleted \Recent))
> > > * 3 FETCH (UID 3 FLAGS (\Deleted \Recent))
> > > a0030 OK Store completed (0.003 + 0.000 + 0.002 secs).
> > > * 3 EXPUNGE
> > > * 2 EXPUNGE
> > > * 1 EXPUNGE
> > > * 1 RECENT
> > > a0040 OK Expunge completed (0.003 + 0.000 + 0.002 secs).
> > > a0050 UID FETCH 1:4 FLAGS
> > > * 1 FETCH (UID 4 FLAGS (\Recent))
> > > a0050 OK Fetch completed (0.001 + 0.000 secs).
> > >
> > >
> > >
> > >
> > > Transcript from .muttdebug0 when using mutt 1.7.2 against the production server
> > > with NFS backed storage, marking message 1-3 for deletion and hitting $ for
> > > sync. This is how the problem was first found.
> > >
> > > [2018-03-12 13:46:28] 5> a0020 IDLE^M
> > > [2018-03-12 13:46:28] 5< + idling
> > > ...
> > > [2018-03-12 13:46:29] Marking 3 messages deleted...
> > > [2018-03-12 13:46:29] Expunging messages from server...
> > > [2018-03-12 13:46:29] 5> DONE^M
> > > a0021 UID STORE 1:3 +FLAGS.SILENT (\Deleted)^M
> > > a0022 EXPUNGE^M
> > > [2018-03-12 13:46:29] 5< a0020 OK Idle completed (0.001 + 1.634 + 1.633 secs).
> > > [2018-03-12 13:46:29] 5< * 2 FETCH (FLAGS (\Recent))
> > > [2018-03-12 13:46:29] Handling FETCH
> > > [2018-03-12 13:46:29] FETCH response ignored for this message
> > > [2018-03-12 13:46:29] 5< * 3 FETCH (FLAGS (\Recent))
> > > [2018-03-12 13:46:29] Handling FETCH
> > > [2018-03-12 13:46:29] FETCH response ignored for this message
> > > [2018-03-12 13:46:29] 5< * 1 EXPUNGE
> > > [2018-03-12 13:46:29] Handling EXPUNGE
> > > [2018-03-12 13:46:29] 5< * 3 RECENT
> > > [2018-03-12 13:46:29] 5< a0021 OK Store completed (0.001 + 0.000 + 0.001 secs).
> > > [2018-03-12 13:46:29] 5< a0022 OK Expunge completed (0.001 + 0.000 secs).
> > > [2018-03-12 13:46:29] IMAP queue drained
> > > [2018-03-12 13:46:29] imap_cmd_finish: Expunging mailbox
> > >
> > >
> > >
> > > *** Method 2, without IDLE (seems to require that I fetch flags before the wait/store)
> > >
> > > Copy'n'paste the following:
> > > a0000 LOGIN "MY-USER" "MY-PASSWORD"
> > > a0010 SELECT "error20"
> > > a0015 FETCH 1:4 (UID FLAGS)
> > > <wait a second, then paste the next two in one go>
> > > a0020 UID STORE 1:3 +FLAGS (\Deleted)
> > > a0025 EXPUNGE
> > >
> > >
> > > Gives the following transcript:
> > >
> > > mail3-vm:~/Mail> nc -v localhost 143
> > > localhost [127.0.0.1] 143 (imap2) open
> > > * OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE AUTH=PLAIN AUTH=LOGIN] Dovecot ready.
> > > a0000 LOGIN "MY-USER" "MY-PASSWORD"
> > > a0010 SELECT "error20"
> > > a0015 FETCH 1:4 (UID FLAGS)
> > > a0000 OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT MULTIAPPEND URL-PARTIAL CATENATE UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN CONTEXT=SEARCH LIST-STATUS BINARY MOVE SNIPPET=FUZZY] Logged in
> > > * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
> > > * OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft \*)] Flags permitted.
> > > * 4 EXISTS
> > > * 4 RECENT
> > > * OK [UNSEEN 1] First unseen.
> > > * OK [UIDVALIDITY 1521552107] UIDs valid
> > > * OK [UIDNEXT 5] Predicted next UID
> > > a0010 OK [READ-WRITE] Select completed (0.001 + 0.000 + 0.001 secs).
> > > * 1 FETCH (UID 1 FLAGS (\Recent))
> > > * 2 FETCH (UID 2 FLAGS (\Recent))
> > > * 3 FETCH (UID 3 FLAGS (\Recent))
> > > * 4 FETCH (UID 4 FLAGS (\Recent))
> > > a0015 OK Fetch completed (0.001 + 0.000 secs).
> > > a0020 UID STORE 1:3 +FLAGS (\Deleted)
> > > a0025 EXPUNGE
> > > * 2 FETCH (UID 2 FLAGS (\Recent))
> > > * 3 FETCH (UID 3 FLAGS (\Recent))
> > > * 1 EXPUNGE
> > > * 3 RECENT
> > > a0020 OK Store completed (0.001 + 0.000 secs).
> > > a0025 OK Expunge completed (0.001 + 0.000 secs).
> > >
> > >
> > >
> > > /Tomas
> > 


More information about the dovecot mailing list