Race condition when setting flags (\Deleted) + expunge quickly, leaving mails not deleted

Tomas Forsman stric at cs.umu.se
Sun Apr 22 18:55:28 EEST 2018


On 21 March, 2018 - Aki Tuomi wrote:

> Thank you for your thorough report, we'll look into it.

Has anyone managed to reproduce this (using my transcript for example)?

With mutt, I get this problem.. If I set 'imap_pipeline_depth=0' in
.muttrc, I can't seem to reproduce it anymore.

/Tomas
-- 
Tomas Forsman, stric at cs.umu.se, http://people.cs.umu.se/stric/
`- SysAdmin at Computing Science, University of Umeå

> Aki
> 
> 
> On 20.03.2018 16:56, Tomas Forsman wrote:
> > Hello.
> >
> > I seem to have found a race condition, when setting flags on multiple emails
> > rapidly. 5 commands including login to reproduce. Problem found using mutt in
> > real world usage.
> >
> > Seems to happen both with UID STORE 1:3 and UID STORE 1,2,3 ..
> >
> > I have tried with the following packages, with a minimized config in a
> > throwaway vm:
> > https://packages.debian.org/stretch/dovecot-core
> > Package: dovecot-core (1:2.2.27-3+deb9u2) 
> >
> > https://packages.debian.org/stretch-backports/dovecot-core
> > Package: dovecot-core (1:2.2.34-2~bpo9+1) 
> >
> > Also tested pristine:
> > https://dovecot.org/releases/2.3/dovecot-2.3.0.1.tar.gz
> > https://dovecot.org/releases/2.2/dovecot-2.2.34.tar.gz
> > https://dovecot.org/releases/2.2/dovecot-2.2.35.tar.gz
> > .. built on Debian9 with:
> > # apt-get build-dep dovecot-core
> > # bash configure --prefix=/scratch/dovecot-test
> >
> > Tried with both mailbox over NFS (backed by ext4), and also mailbox on ext4.
> >
> > On my test systems, I am only able to reproduce it with mbox_lazy_writes=no,
> > but with production settings on a Ubuntu 14.04 with 1:2.2.9-1ubuntu2.4 that has
> > mbox_lazy_writes=yes it is still reproducable with mutt.
> >
> > I have found two different ways to reproduce it with netcat + cut'n'paste,
> > below is done with 2.2.34/35.
> >
> > All servers are VMs in Ganeti/kvm clusters (Debian hosts in one
> > cluster&organization, Ubuntu host in another cluster&organization).
> >
> >
> > Debian minimized config:
> >> doveconf -n
> > # 2.2.34 (874deae): /etc/dovecot/dovecot.conf
> > # Pigeonhole version 0.4.22 (22940fb7)
> > # OS: Linux 4.9.0-6-amd64 x86_64 Debian 9.4 
> > # Hostname: mail2-vm.cs.umu.se
> > auth_mechanisms = plain login
> > auth_verbose = yes
> > debug_log_path = syslog
> > info_log_path = syslog
> > mail_location = mbox:~/Mail:INBOX=/var/mail/%u
> > mail_privileged_group = mail
> > mbox_lazy_writes = no
> > namespace inbox {
> >   inbox = yes
> >   location = 
> >   prefix = 
> > }
> > passdb {
> >   driver = pam
> > }
> > protocols = imap
> > service auth {
> >   client_limit = 2500
> >   unix_listener auth-client {
> >     group = postfix
> >     mode = 0660
> >   }
> > }
> > service imap-login {
> >   inet_listener imap {
> >     port = 143
> >   }
> > }
> > service imap {
> >   executable = imap
> > }
> > ssl = no
> > syslog_facility = local1
> > userdb {
> >   driver = passwd
> > }
> >
> >
> > Attached is a mailbox that can be used to reproduce the problem, called
> > "error-box", anonymized with mbox-anonymize.pl. It was created by sending test
> > messages with mutt 1.7.2 through Dovecot/Postfix/Amavis.
> >
> >
> >
> > **** Method 1, with IDLE:
> >
> > Setup a local account, MY-USER / MY-PASSWORD (replace below).
> >
> >> mkdir Mail
> >> cp error-box Mail/error1
> >> nc -v localhost 143
> > Paste the following 3 lines in a go:
> > a0000 LOGIN "MY-USER" "MY-PASSWORD"
> > a0010 SELECT "error1"
> > a0020 IDLE
> >
> > then wait a second.. then paste the following 3 in a go:
> > DONE
> > a0030 UID STORE 1:3 +FLAGS (\Deleted)
> > a0040 EXPUNGE
> >
> > Notice that the results from UID STORE + EXPUNGE gives:
> > * 2 FETCH (UID 2 FLAGS (\Recent))
> > * 3 FETCH (UID 3 FLAGS (\Recent))
> > * 1 EXPUNGE
> > * 3 RECENT
> > a0030 OK Store completed (0.004 + 0.000 + 0.003 secs).
> > a0040 OK Expunge completed (0.004 + 0.000 + 0.003 secs).
> >
> >
> > Verify what's left in the mailbox with:
> > a0050 UID FETCH 1:4 FLAGS
> > * 1 FETCH (UID 2 FLAGS (\Recent))
> > * 2 FETCH (UID 3 FLAGS (\Recent))
> > * 3 FETCH (UID 4 FLAGS (\Recent))
> > a0050 OK Fetch completed (0.001 + 0.000 secs).
> >
> >
> > Since 1 to 3 should be \Deleted and then EXPUNGEd, we should only have UID 4
> > left, not uid 2-4 without Deleted flag.
> >
> >
> > Try same again, but with 1 inserted delay (use a new mailbox name for every
> > test, otherwise UID numbering will vary, indexes etc):
> >> cp error-box Mail/error2
> >> nc -v localhost 143
> > Paste the following 3 lines in a go:
> > a0000 LOGIN "MY-USER" "MY-PASSWORD"
> > a0010 SELECT "error2"
> > a0020 IDLE
> >
> > then wait a second.. then paste 1 line:
> > DONE
> >
> > then [newly introduced] wait a second.. then paste 2 lines:
> > a0030 UID STORE 1:3 +FLAGS (\Deleted)
> > a0040 EXPUNGE
> >
> > Notice that the results from UID STORE + EXPUNGE gives:
> > * 1 FETCH (UID 1 FLAGS (\Deleted \Recent))
> > * 2 FETCH (UID 2 FLAGS (\Deleted \Recent))
> > * 3 FETCH (UID 3 FLAGS (\Deleted \Recent))
> > a0030 OK Store completed (0.003 + 0.000 + 0.002 secs).
> > * 3 EXPUNGE
> > * 2 EXPUNGE
> > * 1 EXPUNGE
> > * 1 RECENT
> > a0040 OK Expunge completed (0.003 + 0.000 + 0.002 secs).
> >
> >
> > Verify what's left in the mailbox with:
> > a0050 UID FETCH 1:4 FLAGS
> > * 1 FETCH (UID 4 FLAGS (\Recent))
> > a0050 OK Fetch completed (0.001 + 0.000 secs).
> >
> >
> > So if I wait some between DONE and UID STORE, it seems to be ok.
> >
> >
> >
> > If I, in the problem-triggering method, change
> > a0030 UID STORE 1:3 +FLAGS (\Deleted)
> > a0040 EXPUNGE
> > to
> > a0030 UID STORE 1:3 +FLAGS (\Deleted)
> > a0035 UID FETCH 1:4 FLAGS
> > a0040 EXPUNGE
> >
> > The untagged responses become:
> > * 1 FETCH (UID 1 FLAGS (\Deleted \Recent))
> > * 2 FETCH (UID 2 FLAGS (\Deleted \Recent))
> > * 3 FETCH (UID 3 FLAGS (\Deleted \Recent))
> > * 4 FETCH (UID 4 FLAGS (\Recent))
> > * 2 FETCH (UID 2 FLAGS (\Recent))
> > * 3 FETCH (UID 3 FLAGS (\Recent))
> > * 1 EXPUNGE
> > * 3 RECENT
> >
> >
> >
> >
> >
> > Full transcript (in&out from screen) from first, failing, run:
> >
> > localhost [127.0.0.1] 143 (imap2) open
> > * OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE AUTH=PLAIN AUTH=LOGIN] Dovecot (Debian) ready.
> > a0000 LOGIN "MY-USERNAME" "MY-PASSWORD"
> > a0010 SELECT "error1"
> > a0020 IDLE
> > a0000 OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT MULTIAPPEND URL-PARTIAL CATENATE UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN CONTEXT=SEARCH LIST-STATUS BINARY MOVE SNIPPET=FUZZY] Logged in
> > * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
> > * OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft \*)] Flags permitted.
> > * 4 EXISTS
> > * 4 RECENT
> > * OK [UNSEEN 1] First unseen.
> > * OK [UIDVALIDITY 1520944853] UIDs valid
> > * OK [UIDNEXT 5] Predicted next UID
> > a0010 OK [READ-WRITE] Select completed (0.003 + 0.000 + 0.004 secs).
> > + idling
> > DONE
> > a0030 UID STORE 1:3 +FLAGS (\Deleted)
> > a0040 EXPUNGE
> > a0020 OK Idle completed (7.805 + 7.804 + 7.804 secs).
> > * 2 FETCH (UID 2 FLAGS (\Recent))
> > * 3 FETCH (UID 3 FLAGS (\Recent))
> > * 1 EXPUNGE
> > * 3 RECENT
> > a0030 OK Store completed (0.004 + 0.000 + 0.003 secs).
> > a0040 OK Expunge completed (0.004 + 0.000 + 0.003 secs).
> > a0050 UID FETCH 1:4 FLAGS
> > * 1 FETCH (UID 2 FLAGS (\Recent))
> > * 2 FETCH (UID 3 FLAGS (\Recent))
> > * 3 FETCH (UID 4 FLAGS (\Recent))
> > a0050 OK Fetch completed (0.001 + 0.000 secs).
> >
> >
> >
> > Full transcript (in&out from screen) from the second, slower, run:
> >
> > localhost [127.0.0.1] 143 (imap2) open
> > * OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE AUTH=PLAIN AUTH=LOGIN] Dovecot (Debian) ready.
> > a0000 LOGIN "MY-USERNAME" "MY-PASSWORD"
> > a0010 SELECT "error2"
> > a0020 IDLE
> > a0000 OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT MULTIAPPEND URL-PARTIAL CATENATE UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN CONTEXT=SEARCH LIST-STATUS BINARY MOVE SNIPPET=FUZZY] Logged in
> > * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
> > * OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft \*)] Flags permitted.
> > * 4 EXISTS
> > * 4 RECENT
> > * OK [UNSEEN 1] First unseen.
> > * OK [UIDVALIDITY 1520944957] UIDs valid
> > * OK [UIDNEXT 5] Predicted next UID
> > a0010 OK [READ-WRITE] Select completed (0.003 + 0.000 + 0.004 secs).
> > + idling
> > DONE
> > a0020 OK Idle completed (2.830 + 2.829 + 2.829 secs).
> > a0030 UID STORE 1:3 +FLAGS (\Deleted)
> > a0040 EXPUNGE
> > * 1 FETCH (UID 1 FLAGS (\Deleted \Recent))
> > * 2 FETCH (UID 2 FLAGS (\Deleted \Recent))
> > * 3 FETCH (UID 3 FLAGS (\Deleted \Recent))
> > a0030 OK Store completed (0.003 + 0.000 + 0.002 secs).
> > * 3 EXPUNGE
> > * 2 EXPUNGE
> > * 1 EXPUNGE
> > * 1 RECENT
> > a0040 OK Expunge completed (0.003 + 0.000 + 0.002 secs).
> > a0050 UID FETCH 1:4 FLAGS
> > * 1 FETCH (UID 4 FLAGS (\Recent))
> > a0050 OK Fetch completed (0.001 + 0.000 secs).
> >
> >
> >
> >
> > Transcript from .muttdebug0 when using mutt 1.7.2 against the production server
> > with NFS backed storage, marking message 1-3 for deletion and hitting $ for
> > sync. This is how the problem was first found.
> >
> > [2018-03-12 13:46:28] 5> a0020 IDLE^M
> > [2018-03-12 13:46:28] 5< + idling
> > ...
> > [2018-03-12 13:46:29] Marking 3 messages deleted...
> > [2018-03-12 13:46:29] Expunging messages from server...
> > [2018-03-12 13:46:29] 5> DONE^M
> > a0021 UID STORE 1:3 +FLAGS.SILENT (\Deleted)^M
> > a0022 EXPUNGE^M
> > [2018-03-12 13:46:29] 5< a0020 OK Idle completed (0.001 + 1.634 + 1.633 secs).
> > [2018-03-12 13:46:29] 5< * 2 FETCH (FLAGS (\Recent))
> > [2018-03-12 13:46:29] Handling FETCH
> > [2018-03-12 13:46:29] FETCH response ignored for this message
> > [2018-03-12 13:46:29] 5< * 3 FETCH (FLAGS (\Recent))
> > [2018-03-12 13:46:29] Handling FETCH
> > [2018-03-12 13:46:29] FETCH response ignored for this message
> > [2018-03-12 13:46:29] 5< * 1 EXPUNGE
> > [2018-03-12 13:46:29] Handling EXPUNGE
> > [2018-03-12 13:46:29] 5< * 3 RECENT
> > [2018-03-12 13:46:29] 5< a0021 OK Store completed (0.001 + 0.000 + 0.001 secs).
> > [2018-03-12 13:46:29] 5< a0022 OK Expunge completed (0.001 + 0.000 secs).
> > [2018-03-12 13:46:29] IMAP queue drained
> > [2018-03-12 13:46:29] imap_cmd_finish: Expunging mailbox
> >
> >
> >
> > *** Method 2, without IDLE (seems to require that I fetch flags before the wait/store)
> >
> > Copy'n'paste the following:
> > a0000 LOGIN "MY-USER" "MY-PASSWORD"
> > a0010 SELECT "error20"
> > a0015 FETCH 1:4 (UID FLAGS)
> > <wait a second, then paste the next two in one go>
> > a0020 UID STORE 1:3 +FLAGS (\Deleted)
> > a0025 EXPUNGE
> >
> >
> > Gives the following transcript:
> >
> > mail3-vm:~/Mail> nc -v localhost 143
> > localhost [127.0.0.1] 143 (imap2) open
> > * OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE AUTH=PLAIN AUTH=LOGIN] Dovecot ready.
> > a0000 LOGIN "MY-USER" "MY-PASSWORD"
> > a0010 SELECT "error20"
> > a0015 FETCH 1:4 (UID FLAGS)
> > a0000 OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT MULTIAPPEND URL-PARTIAL CATENATE UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN CONTEXT=SEARCH LIST-STATUS BINARY MOVE SNIPPET=FUZZY] Logged in
> > * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
> > * OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft \*)] Flags permitted.
> > * 4 EXISTS
> > * 4 RECENT
> > * OK [UNSEEN 1] First unseen.
> > * OK [UIDVALIDITY 1521552107] UIDs valid
> > * OK [UIDNEXT 5] Predicted next UID
> > a0010 OK [READ-WRITE] Select completed (0.001 + 0.000 + 0.001 secs).
> > * 1 FETCH (UID 1 FLAGS (\Recent))
> > * 2 FETCH (UID 2 FLAGS (\Recent))
> > * 3 FETCH (UID 3 FLAGS (\Recent))
> > * 4 FETCH (UID 4 FLAGS (\Recent))
> > a0015 OK Fetch completed (0.001 + 0.000 secs).
> > a0020 UID STORE 1:3 +FLAGS (\Deleted)
> > a0025 EXPUNGE
> > * 2 FETCH (UID 2 FLAGS (\Recent))
> > * 3 FETCH (UID 3 FLAGS (\Recent))
> > * 1 EXPUNGE
> > * 3 RECENT
> > a0020 OK Store completed (0.001 + 0.000 secs).
> > a0025 OK Expunge completed (0.001 + 0.000 secs).
> >
> >
> >
> > /Tomas
> 


More information about the dovecot mailing list