doveadm-deduplicate deletes non-duplicates

gravitini dovecot at gravitini.com
Sun Jun 12 23:09:10 UTC 2022


Replying to: https://dovecot.org/pipermail/dovecot/2022-May/124816.html


Hi,

Looking at the code (and having tested a local build from source), it looks
like doveadm deduplicate in 2.3.19 can cause significant data loss.

A 2022-02-11 commit removed the duplication of the key before it is inserted
into the hash table. This results in undefined behaviour, which often shows up
as truncation of a mailbox to 67 entries (HASH_TABLE_MIN_SIZE).

https://github.com/dovecot/core/commit/320844f50cd669b602d30210e2e5216f65d2050f?diff=split#diff-5842cf9d4248dc515d80ebb45575341b7d76832f979a8ac5f602784cb5b03f2cL121

The following change restores the key duplication:

diff --git a/src/doveadm/doveadm-mail-deduplicate.c b/src/doveadm/doveadm-mail-deduplicate.c
index caec758112..2152482876 100644
--- a/src/doveadm/doveadm-mail-deduplicate.c
+++ b/src/doveadm/doveadm-mail-deduplicate.c
@@ -63,8 +63,10 @@ cmd_deduplicate_box(struct doveadm_mail_cmd_context *_ctx,
                 if (key != NULL && *key != '\0') {
                         if (hash_table_lookup(hash, key) != NULL)
                                 mail_expunge(mail);
-                        else
+                        else {
+                                key = p_strdup(pool, key);
                                 hash_table_insert(hash, key, POINTER_CAST(1));
+                        }
                 }
         }
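
To illustrate why the missing copy matters, here is a minimal standalone
sketch. It is not Dovecot code. It assumes the key string lives in a buffer
that is reused for every message (I have not confirmed that this is exactly
what happens inside Dovecot), and it uses a hypothetical chained hash table
with a fixed NBUCKETS rather than Dovecot's hash_table API. The names
table_lookup, table_insert and count_expunged are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 8 /* hypothetical fixed table size, for illustration only */

struct entry {
    const char *key; /* the table stores the pointer, not the bytes */
    struct entry *next;
};

struct table {
    struct entry *buckets[NBUCKETS];
};

static unsigned int str_hash(const char *s)
{
    unsigned int h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h;
}

static int table_lookup(const struct table *t, const char *key)
{
    const struct entry *e = t->buckets[str_hash(key) % NBUCKETS];
    for (; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0)
            return 1;
    }
    return 0;
}

static void table_insert(struct table *t, const char *key)
{
    unsigned int i = str_hash(key) % NBUCKETS;
    struct entry *e = malloc(sizeof(*e));
    assert(e != NULL);
    e->key = key;
    e->next = t->buckets[i];
    t->buckets[i] = e;
}

static void table_free(struct table *t, int owns_keys)
{
    for (size_t i = 0; i < NBUCKETS; i++) {
        while (t->buckets[i] != NULL) {
            struct entry *e = t->buckets[i];
            t->buckets[i] = e->next;
            if (owns_keys)
                free((char *)e->key);
            free(e);
        }
    }
}

/* Feeds n messages with distinct keys through the dedup loop and returns
   how many would be expunged. copy_key != 0 mimics the p_strdup() call. */
static size_t count_expunged(size_t n, int copy_key)
{
    struct table t = { { NULL } };
    char scratch[32]; /* assumption: key storage is reused per message */
    size_t expunged = 0;

    for (size_t i = 0; i < n; i++) {
        snprintf(scratch, sizeof(scratch), "<msg-%zu@example.com>", i);
        const char *key = scratch;
        if (table_lookup(&t, key)) {
            expunged++; /* corresponds to mail_expunge(mail) */
        } else {
            if (copy_key) {
                char *copy = malloc(strlen(key) + 1);
                assert(copy != NULL);
                strcpy(copy, key);
                key = copy;
            }
            table_insert(&t, key);
        }
    }
    table_free(&t, copy_key);
    return expunged;
}
```

With copy_key set, none of the distinct messages is expunged. Without it,
every stored key aliases the scratch buffer, so any lookup that lands in a
non-empty bucket compares the current key with itself and reports a
duplicate. At most NBUCKETS messages can survive in this model, which is at
least consistent with a mailbox shrinking to roughly the size of the table.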


