doveadm-deduplicate deletes non-duplicates
Hi,

I've been trying to use `doveadm deduplicate` to deduplicate mailboxes. According to doveadm-deduplicate(1), "deduplication will be done by message GUIDs". However, deduplication deletes messages with distinct message GUIDs, i.e., it deletes messages that are not duplicates.

Is this a case of user error, do I have some form of corruption going on, or am I running into a bug?

In case it helps, I'm including:

1) the list of GUIDs for messages in my INBOX before a deduplication run (as documented in doveadm-deduplicate(1)),
2) the output of `doveadm -D deduplicate -u rak mailbox INBOX`,
3) the list of GUIDs after deduplication,
4) a diff of (1) and (3),
5) the output of doveconf -n.

Thanks,
Ryan

$ doveadm -f table fetch -u rak 'guid uid' mailbox INBOX | sort
0b3bee1414118f6282db0000226807b0 7100
0b4b681c18168f6220940000226807b0 7104
0b83ac1d79cf2962eb3a0100226807b0 6153
0b97791e83108f6282db0000226807b0 7095
1614451513.M180742P77875.hades.rak.ac,S=3516,W=3600 22
1614451513.M180779P77875.hades.rak.ac,S=5252,W=5370 52
1614452870.M315137P88362.hades.rak.ac,S=5516,W=5623 68
1614452870.M315152P88362.hades.rak.ac,S=5977,W=6085 69
23a5a0179e108f6282db0000226807b0 7097
33318e0686108f6282db0000226807b0 7096
3351581c9c0e8f624cd50000226807b0 7091
3359032360108f6282db0000226807b0 7093
3b3927352be48c62b65b0000226807b0 7072
3ba2252814118f62a2540100226807b0 7101
4154be0c2fc77761c2550000226807b0 4050
5b3a7327d73b866299cd0000226807b0 7013
638e073255235d62a5280100226807b0 6660
63b6422e14118f6282db0000226807b0 7102
8b9b942909118f6282db0000226807b0 7099
8bc95639a6168f62ad4e0000226807b0 7105
a316ee28980d8f627c460000226807b0 7090
ab802c1e37ac3d6214680100226807b0 6343
b3d12f21cd108f6282db0000226807b0 7098
bb89431b6e728d6230920000226807b0 7076
c386310f062adb61d1980000226807b0 5306
cb56992484478d62f3090100226807b0 7074
cb79bf3503128f6275220100226807b0 7103
eb2818367d108f62484b0100a558518d 7094
eb709020e70f8f62292e0000226807b0 7092
f19a8c06409a7d61271c0000226807b0 4128
f338ad1ff65e8562ff390100226807b0 7007
f9cd03363c457c613a0e0000226807b0 4107
guid uid

$ doveadm -D deduplicate -u rak mailbox INBOX
Debug: Loading modules from directory: /usr/local/lib/dovecot/doveadm
Debug: Skipping module doveadm_acl_plugin, because dlopen() failed: Cannot load specified object (this is usually intentional, so just ignore this message)
Debug: Skipping module doveadm_quota_plugin, because dlopen() failed: Cannot load specified object (this is usually intentional, so just ignore this message)
Debug: Module loaded: /usr/local/lib/dovecot/doveadm/lib10_doveadm_sieve_plugin.so
Debug: Skipping module doveadm_fts_plugin, because dlopen() failed: Cannot load specified object (this is usually intentional, so just ignore this message)
Debug: Skipping module doveadm_fts_flatcurve_plugin, because dlopen() failed: Cannot load specified object (this is usually intentional, so just ignore this message)
Debug: Skipping module doveadm_mail_crypt_plugin, because dlopen() failed: Cannot load specified object (this is usually intentional, so just ignore this message)
May 31 12:50:35 Debug: Loading modules from directory: /usr/local/lib/dovecot
May 31 12:50:35 Debug: Module loaded: /usr/local/lib/dovecot/lib15_notify_plugin.so
May 31 12:50:35 Debug: Module loaded: /usr/local/lib/dovecot/lib20_replication_plugin.so
May 31 12:50:35 Debug: Module loaded: /usr/local/lib/dovecot/lib20_virtual_plugin.so
May 31 12:50:35 Debug: Loading modules from directory: /usr/local/lib/dovecot/doveadm
May 31 12:50:35 Debug: Skipping module doveadm_acl_plugin, because dlopen() failed: Cannot load specified object (this is usually intentional, so just ignore this message)
May 31 12:50:35 Debug: Skipping module doveadm_quota_plugin, because dlopen() failed: Cannot load specified object (this is usually intentional, so just ignore this message)
May 31 12:50:35 Debug: Skipping module doveadm_fts_plugin, because dlopen() failed: Cannot load specified object (this is usually intentional, so just ignore this message)
May 31 12:50:35 Debug: Skipping module doveadm_fts_flatcurve_plugin, because dlopen() failed: Cannot load specified object (this is usually intentional, so just ignore this message)
May 31 12:50:35 Debug: Skipping module doveadm_mail_crypt_plugin, because dlopen() failed: Cannot load specified object (this is usually intentional, so just ignore this message)
May 31 12:50:35 doveadm(rak)<74469><>: Debug: auth-master: userdb lookup(rak): Started userdb lookup
May 31 12:50:35 doveadm(rak)<74469><>: Debug: auth-master: conn unix:/var/dovecot/auth-userdb: Connecting
May 31 12:50:35 doveadm(rak)<74469><>: Debug: auth-master: conn unix:/var/dovecot/auth-userdb (pid=6157,uid=0): Client connected (fd=9)
May 31 12:50:35 doveadm(rak)<74469><>: Debug: auth-master: userdb lookup(rak): auth USER input: rak uid=1002 gid=1002 home=/home/vmail/rak /etc/mail/auth
May 31 12:50:35 doveadm(rak)<74469><>: Debug: auth-master: userdb lookup(rak): Finished userdb lookup (username=rak uid=1002 gid=1002 home=/home/vmail/rak /etc/mail/auth)
May 31 12:50:35 doveadm(rak)<74469><>: Debug: Unknown userdb setting: plugin//etc/mail/auth=yes
May 31 12:50:35 doveadm(rak)<74469><PLteAltHlmLlIgEAImgHsA>: Debug: Effective uid=1002, gid=1002, home=/home/vmail/rak
May 31 12:50:35 doveadm(rak)<74469><PLteAltHlmLlIgEAImgHsA>: Debug: open(/proc/self/stat) failed: No such file or directory
May 31 12:50:35 doveadm(rak)<74469><PLteAltHlmLlIgEAImgHsA>: Debug: open(/proc/self/io) failed: No such file or directory
May 31 12:50:35 doveadm(rak)<74469><PLteAltHlmLlIgEAImgHsA>: Debug: Namespace inbox: type=private, prefix=, sep=/, inbox=yes, hidden=no, list=yes, subscriptions=yes location=maildir:~/mail:LAYOUT=fs:INDEX=~/indexes
May 31 12:50:35 doveadm(rak)<74469><PLteAltHlmLlIgEAImgHsA>: Debug: fs: root=/home/vmail/rak/mail, index=/home/vmail/rak/indexes, indexpvt=, control=, inbox=/home/vmail/rak/mail, alt=
May 31 12:50:35 doveadm(rak)<74469><PLteAltHlmLlIgEAImgHsA>: Debug: Namespace virtual: type=private, prefix=Virtual/, sep=/, inbox=no, hidden=no, list=yes, subscriptions=yes location=virtual:/etc/dovecot/virtual:LAYOUT=fs:INDEX=~/virtual
May 31 12:50:35 doveadm(rak)<74469><PLteAltHlmLlIgEAImgHsA>: Debug: fs: root=/etc/dovecot/virtual, index=/home/vmail/rak/virtual, indexpvt=, control=, inbox=, alt=
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: Mailbox opened
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7007: Expunge requested
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7013: Expunge requested
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7090: Expunge requested
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7091: Expunge requested
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7094: Expunge requested
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7096: Expunge requested
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7099: Expunge requested
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7101: Expunge requested
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7103: Expunge requested
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7007: Mail expunged
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7013: Mail expunged
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7090: Mail expunged
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7091: Mail expunged
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7094: Mail expunged
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7096: Mail expunged
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7099: Mail expunged
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7101: Mail expunged
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: UID 7103: Mail expunged
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: Purging (new file_seq=1654009655): Too many deleted records (9/23)
May 31 12:50:35 doveadm(rak): Debug: Mailbox INBOX: Purging finished, file_seq changed 1654009654 -> 1654009655, size=75108 -> 53936, max_uid=7105
May 31 12:50:35 doveadm(rak): Debug: User session is finished
May 31 12:50:35 doveadm(74469): Debug: auth-master: conn unix:/var/dovecot/auth-userdb (pid=6157,uid=0): Disconnected: Connection closed (fd=9)

$ doveadm -f table fetch -u rak 'guid uid' mailbox INBOX | sort
0b3bee1414118f6282db0000226807b0 7100
0b4b681c18168f6220940000226807b0 7104
0b83ac1d79cf2962eb3a0100226807b0 6153
0b97791e83108f6282db0000226807b0 7095
1614451513.M180742P77875.hades.rak.ac,S=3516,W=3600 22
1614451513.M180779P77875.hades.rak.ac,S=5252,W=5370 52
1614452870.M315137P88362.hades.rak.ac,S=5516,W=5623 68
1614452870.M315152P88362.hades.rak.ac,S=5977,W=6085 69
23a5a0179e108f6282db0000226807b0 7097
3359032360108f6282db0000226807b0 7093
3b3927352be48c62b65b0000226807b0 7072
4154be0c2fc77761c2550000226807b0 4050
638e073255235d62a5280100226807b0 6660
63b6422e14118f6282db0000226807b0 7102
8bc95639a6168f62ad4e0000226807b0 7105
ab802c1e37ac3d6214680100226807b0 6343
b3d12f21cd108f6282db0000226807b0 7098
bb89431b6e728d6230920000226807b0 7076
c386310f062adb61d1980000226807b0 5306
cb56992484478d62f3090100226807b0 7074
eb709020e70f8f62292e0000226807b0 7092
f19a8c06409a7d61271c0000226807b0 4128
f9cd03363c457c613a0e0000226807b0 4107
guid uid

$ diff -u /tmp/guid-prededuplicate.txt /tmp/guid-postdeduplicate.txt
--- /tmp/guid-prededuplicate.txt Tue May 31 12:50:24 2022
+++ /tmp/guid-postdeduplicate.txt Tue May 31 12:51:00 2022
@@ -7,27 +7,18 @@
 1614452870.M315137P88362.hades.rak.ac,S=5516,W=5623 68
 1614452870.M315152P88362.hades.rak.ac,S=5977,W=6085 69
 23a5a0179e108f6282db0000226807b0 7097
-33318e0686108f6282db0000226807b0 7096
-3351581c9c0e8f624cd50000226807b0 7091
 3359032360108f6282db0000226807b0 7093
 3b3927352be48c62b65b0000226807b0 7072
-3ba2252814118f62a2540100226807b0 7101
 4154be0c2fc77761c2550000226807b0 4050
-5b3a7327d73b866299cd0000226807b0 7013
 638e073255235d62a5280100226807b0 6660
 63b6422e14118f6282db0000226807b0 7102
-8b9b942909118f6282db0000226807b0 7099
 8bc95639a6168f62ad4e0000226807b0 7105
-a316ee28980d8f627c460000226807b0 7090
 ab802c1e37ac3d6214680100226807b0 6343
 b3d12f21cd108f6282db0000226807b0 7098
 bb89431b6e728d6230920000226807b0 7076
 c386310f062adb61d1980000226807b0 5306
 cb56992484478d62f3090100226807b0 7074
-cb79bf3503128f6275220100226807b0 7103
-eb2818367d108f62484b0100a558518d 7094
 eb709020e70f8f62292e0000226807b0 7092
 f19a8c06409a7d61271c0000226807b0 4128
-f338ad1ff65e8562ff390100226807b0 7007
 f9cd03363c457c613a0e0000226807b0 4107
 guid uid

$ doveconf -n
# 2.3.19 (b3ad6004dc): /etc/dovecot/dovecot.conf
# Pigeonhole version 0.5.19 (4eae2f79)
# OS: OpenBSD 7.1 amd64
# Hostname: hades.rak.ac
default_vsz_limit = 128 M
doveadm_password = # hidden, use -P to show it
first_valid_uid = 1000
mail_attribute_dict = file:%h/dovecot-attributes
mail_location = maildir:~/mail:LAYOUT=fs:INDEX=~/indexes
mail_plugins = " notify replication virtual"
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext imapsieve vnd.dovecot.imapsieve
mbox_write_locks = fcntl
mmap_disable = yes
namespace inbox {
  inbox = yes
  location =
  mailbox Archive {
    auto = subscribe
    special_use = \Archive
  }
  mailbox Bin {
    auto = subscribe
    special_use = \Trash
  }
  mailbox Drafts {
    auto = subscribe
    special_use = \Drafts
  }
  mailbox Sent {
    auto = subscribe
    special_use = \Sent
  }
  mailbox Spam {
    auto = subscribe
    special_use = \Junk
  }
  prefix =
  separator = /
}
namespace virtual {
  location = virtual:/etc/dovecot/virtual:LAYOUT=fs:INDEX=~/virtual
  mailbox All {
    auto = subscribe
    comment = All my messages
    special_use = \All
  }
  mailbox Flagged {
    auto = subscribe
    special_use = \Flagged
  }
  prefix = Virtual/
  separator = /
}
passdb {
  args = /etc/mail/auth
  driver = passwd-file
}
plugin {
  fts = flatcurve
  fts_autoindex = yes
  fts_autoindex_exclude = \Trash \Junk
  fts_enforced = yes
  fts_languages = en
  fts_tokenizers = generic email-address
  imapsieve_mailbox1_before = file:/usr/local/dovecot/sieve/report-spam.sieve
  imapsieve_mailbox1_causes = COPY,APPEND
  imapsieve_mailbox1_name = Spam
  imapsieve_mailbox2_before = file:/usr/local/dovecot/sieve/report-ham.sieve
  imapsieve_mailbox2_causes = COPY
  imapsieve_mailbox2_from = Spam
  imapsieve_mailbox2_name = *
  imapsieve_mailbox3_before = file:/usr/local/dovecot/sieve/report-ham.sieve
  imapsieve_mailbox3_causes = COPY,APPEND
  imapsieve_mailbox3_name = RAK
  imapsieve_url = sieve://localhost/
  mail_replica = tcps:eos.rak.ac:12507
  plugin = fts fts_flatcurve
  replication_dsync_parameters = -d -l 30 -U -n ""
  sieve = file:~/sieve;active=~/.dovecot.sieve
  sieve_before = /usr/local/dovecot/sieve-before.d/
  sieve_global_extensions = +vnd.dovecot.pipe +vnd.dovecot.environment
  sieve_pipe_bin_dir = /usr/local/dovecot/sieve-pipe
  sieve_pipe_socket_dir = sieve-pipe
  sieve_plugins = sieve_imapsieve sieve_extprograms
}
protocols = imap lmtp sieve
replication_dsync_parameters = -d -l 30 -U -n ""
service aggregator {
  fifo_listener replication-notify-fifo {
    mode = 0666
    user = _dovecot
  }
  unix_listener replication-notify {
    user = _dovecot
  }
}
service auth {
  unix_listener auth-userdb {
    user = vmail
  }
}
service doveadm {
  inet_listener {
    port = 12507
    ssl = yes
  }
  vsz_limit = 512 M
}
service imap-login {
  inet_listener imap {
    port = 0
  }
  inet_listener imaps {
    port = 993
    ssl = yes
  }
}
service imap {
  vsz_limit = 512 M
}
service indexer-worker {
  vsz_limit = 512 M
}
service lmtp {
  drop_priv_before_exec = yes
  process_min_avail = 5
}
service managesieve-login {
  inet_listener sieve {
    port = 4190
  }
  inet_listener sieve_deprecated {
    port = 2000
  }
}
service replicator {
  process_min_avail = 1
  unix_listener replicator-doveadm {
    mode = 0666
  }
}
service stats {
  unix_listener stats-reader {
    user = vmail
  }
  unix_listener stats-writer {
    user = vmail
  }
}
ssl_client_ca_file = /etc/ssl/cert.pem
ssl_dh = # hidden, use -P to show it
userdb {
  args = uid=vmail gid=vmail home=/home/vmail/%u /etc/mail/auth
  driver = static
}
userdb {
  args = /etc/mail/auth
  driver = passwd-file
}
protocol lmtp {
  mail_plugins = " notify replication virtual sieve"
}
protocol lda {
  mail_plugins = " notify replication virtual sieve"
}
protocol imap {
  imap_metadata = yes
  mail_plugins = " notify replication virtual imap_sieve"
}

-- 
|)|/  Ryan Kavanagh      | 4E46 9519 ED67 7734 268F
|\|\  https://rak.ac     | BD95 8F7B F8FC 4A11 C97A
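
A quick cross-check of the claim that the listed GUIDs are all distinct can be sketched in shell. This is only a minimal sketch: it assumes the 'guid uid' listing has been saved to a file (such as the /tmp/guid-prededuplicate.txt named in the diff), and `dup_guids` is a hypothetical helper name, not part of doveadm.

```shell
# dup_guids FILE
# Hypothetical helper (not a doveadm command): print every GUID that occurs
# more than once in a saved "doveadm -f table fetch ... 'guid uid'" listing.
dup_guids() {
    # Keep only the first column, skip the "guid uid" header row,
    # then let uniq -d report values that occur at least twice.
    awk '$1 != "guid" { print $1 }' "$1" | sort | uniq -d
}
```

Running `dup_guids /tmp/guid-prededuplicate.txt` would print nothing if every GUID in the listing is unique, and one line per repeated GUID otherwise.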