doveadm deduplicate commands
Hello, I just migrated my emails from Gmail using getmail. In the process, some emails were doubled or tripled. How do I run the doveadm command to delete the copies of the same emails?
I tried running the following:
doveadm deduplicate -u user@domain.net inbox
but I get this error:
doveadm(root): Fatal: Unknown argument INBOX
Could someone share a way to automatically remove the duplicated messages?
Thanks, Kevin
On 13.2.2015 15:54, Kevin Laurie wrote:
[...]
See the documentation: man doveadm-deduplicate and man doveadm-search-query should be enough. I'm guessing you're missing the "mailbox" search key in front of the folder name in the command.
Also, when I use deduplicate on my servers, it usually doesn't remove every duplicate on the first run and needs to be executed repeatedly. I'm not sure whether that is a bug in the Debian version (2.2.13) or an upstream one. YMMV
Dear Jiri, Noted. Thanks.
On Fri, Feb 13, 2015 at 11:15 PM, Jiri Bourek bourek@thinline.cz wrote:
[...]
Dear Jiri,
I tried the following to get the inbox deduplicated. My inbox is quite large and I urgently need to remove the duplicated messages. Is there an easy way to do this? Sorry for being so persistent, but I need help.
The command I tried:-
doveadm deduplicate -u user@domain.net mailbox inbox
On Fri, Feb 13, 2015 at 11:40 PM, Kevin Laurie superinterstellar@gmail.com wrote:
[...]
On 13.2.2015 16:59, Kevin Laurie wrote:
[...]
I'd try this (in shell):
doveadm search -u user@domain.net mailbox inbox | wc -l
doveadm prints a mailbox-guid and uid pair for every message in the inbox, one per line. Piping that into "wc -l" counts the lines, so the number it outputs is the count of messages in INBOX.
Then run doveadm deduplicate, and after that run the search command above again. If the count changed, deduplicate is working; you may just need to run it multiple times.
If the count doesn't change, Dovecot is unable to recognize the duplicates in your mailbox and you need to find another solution. Maybe check out the "-m" option in man doveadm-deduplicate, which deduplicates by the Message-Id header instead of by message GUID.
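The count, deduplicate, count-again procedure above could be wrapped in a small script along these lines. This is only a rough sketch: the function names count_messages and dedup_until_stable and the MAX_PASSES variable are made up for illustration and are not part of doveadm. It assumes that a pass which leaves the count unchanged means nothing more will be removed.

```shell
#!/bin/sh
# Sketch only: count messages, deduplicate, and repeat until the count
# stops changing. Function names and MAX_PASSES are hypothetical helpers.

# Print the number of messages in a mailbox
# (doveadm search prints one result line per message).
count_messages() {
    # $1 = user, $2 = mailbox name
    doveadm search -u "$1" mailbox "$2" | wc -l | tr -d ' '
}

# Run deduplicate repeatedly until a pass no longer changes the count,
# or until MAX_PASSES (default 10) passes have been made.
dedup_until_stable() {
    user=$1
    box=$2
    max_passes=${MAX_PASSES:-10}
    before=$(count_messages "$user" "$box")
    pass=0
    while [ "$pass" -lt "$max_passes" ]; do
        doveadm deduplicate -u "$user" mailbox "$box" || return 1
        after=$(count_messages "$user" "$box")
        pass=$((pass + 1))
        echo "pass $pass: $before -> $after messages"
        # An unchanged count means no (further) duplicates were recognized.
        [ "$after" -eq "$before" ] && break
        before=$after
    done
}
```

Something like `dedup_until_stable user@domain.net INBOX` would then print one line per pass. If the very first pass reports an unchanged count, Dovecot did not recognize any duplicates, which is the case where "-m" is worth a look.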
Dear Jiri, Thanks for your feedback. It does not work. I guess it's because of how I used getmail: I had set read_all = true, which downloaded all the messages several times. I will purge the entire box and use getmail again to move all messages. Thanks, Kevin
On Sat, Feb 14, 2015 at 12:47 AM, Jiri Bourek bourek@thinline.cz wrote:
[...]
On 2015-02-13 at 17:47, Jiri Bourek wrote:
[...]
One should take great care: GUIDs are not always unique, e.g. after consolidating several folders into one, which is exactly when deduplication might become really useful!
The shell commands below give a tentative view of what will be expunged. Don't deduplicate if you do not like what diff says:
## Beware - it's not just duplicates sometimes…
# $BOX and $USR are left unquoted below on purpose so they split into separate arguments
BOX="mailbox INBOX"
USR="-u myname"

# by guid
doveadm -f table fetch $USR 'guid hdr.Message-ID hdr.Subject' $BOX | sort --stable -k1,1 > /tmp/F1A.txt
doveadm -f table fetch $USR 'guid hdr.Message-ID hdr.Subject' $BOX | sort --stable --uniq -k1,1 > /tmp/F1B.txt
diff -u /tmp/F1A.txt /tmp/F1B.txt | less -S
# run this only if the diff looks right
doveadm deduplicate $USR $BOX

# by Message-ID
doveadm -f table fetch $USR 'guid hdr.Message-ID hdr.Subject' $BOX | sort --stable -k2,2 > /tmp/F2A.txt
doveadm -f table fetch $USR 'guid hdr.Message-ID hdr.Subject' $BOX | sort --stable --uniq -k2,2 > /tmp/F2B.txt
diff -u /tmp/F2A.txt /tmp/F2B.txt | less -S
# run this only if the diff looks right
doveadm deduplicate -m $USR $BOX
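As a complement to the diff view, a summary of which keys occur more than once can be easier to skim on a large mailbox. The following is a rough sketch, not something from the doveadm tool set: dup_keys is a hypothetical helper name, and it assumes that the first line of the "-f table" output is a header row and that the chosen column contains no spaces (a message with an empty Message-ID would shift the columns).

```shell
# Sketch: read whitespace-separated rows on stdin and print every value of the
# given column (default 1) that occurs more than once, prefixed by its count.
# Assumption: the first input line is a header row and is skipped.
dup_keys() {
    col=${1:-1}
    awk -v col="$col" '
        NR > 1 { seen[$col]++ }
        END { for (k in seen) if (seen[k] > 1) print seen[k], k }
    ' | sort -k2
}
```

For example, `doveadm -f table fetch -u myname 'guid hdr.Message-ID hdr.Subject' mailbox INBOX | dup_keys 2` would list the repeated Message-IDs, and `dup_keys 1` the repeated GUIDs. Nothing is expunged by this; it only reads the fetch output.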
-- peter