In a quest to remove “duplicate” messages sent both to me and to lists I subscribe to, I came up with this. I think it should clean out my Archive folder, but I’ve been unable to get it to scan all the mailboxes for my list-user email.
$ doveadm -f table fetch -u kremels 'hdr.message-id guid uid hdr.x-listname' mailbox "Archive" | sort | awk 'cnt[$1]++{if (cnt[$1]==2) print prev[$1]; print} {prev[$1]=$0}' | grep -E "[0-9] +$" | awk '{print "doveadm expunge -u kremels MAILBOX-GUID "$2" UID "$3}'
X-listname is a header that my list user applies to each message that comes from a list, and the output is just text to the screen that I can then run manually (I am not confident enough to automatically delete the messages).
I include the X-listname header at the end so that I can exclude lines that don’t end in a number, which means the copies sent directly to me are the ones expunged and the ones from the list are preserved.
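To make the filtering logic above easier to follow, here is a minimal sketch that runs the same sort/awk/grep chain on made-up rows instead of real doveadm output. The message IDs, GUIDs, UIDs and list names are hypothetical, and the trailing space on the direct copies is an assumption about how the table format pads an empty last column. It only prints the GUID and UID of the rows that would be selected; it does not call doveadm or delete anything.

```shell
#!/bin/sh
# Hypothetical rows: message-id, guid, uid, x-listname.
# Direct copies have no list name, so they end in "<uid><space>"
# (assumed to mimic the padding of an empty last table column).
sample_rows() {
  printf '%s\n' \
    '<a@example.org> 1111.1_1.host 101 dovecot' \
    '<a@example.org> 2222.2_1.host 102 ' \
    '<b@example.org> 3333.3_1.host 103 ' \
    '<c@example.org> 4444.4_1.host 104 freebsd' \
    '<c@example.org> 5555.5_1.host 105 '
}

find_direct_dupes() {
  # 1. sort so rows with the same message-id sit together
  # 2. first awk: print every row that belongs to a duplicated message-id
  # 3. grep: keep only rows ending in a digit plus spaces (no list name)
  # 4. second awk: report guid and uid of the rows that survive
  LC_ALL=C sort |
    awk 'cnt[$1]++ { if (cnt[$1] == 2) print prev[$1]; print } { prev[$1] = $0 }' |
    grep -E '[0-9] +$' |
    awk '{ print "guid=" $2 " uid=" $3 }'
}

sample_rows | find_direct_dupes
```

With this sample, `<b@example.org>` is dropped because it is not duplicated, and of each duplicated pair only the copy without a list name is reported.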
So far so good. But there are issues.
First, even after expunging a message and running doveadm index -u kremels "Archive", subsequent runs still show the same duplicate messages.
Second, what I really want to do is run this over ALL the mailboxes, except for Junk and Sent, but if that is possible I can’t find the right syntax.
-- "He has all the virtues I dislike and none of the vices I admire." Winston Churchill
OK, perhaps I tried to cover too much, so let's just look at this:
If I run this command, I get no errors:
doveadm expunge -u kremels MAILBOX-GUID 1488800748.47633_1.mail.covisp.net UID 22908
But, if I search again
doveadm -f table fetch -u kremels 'hdr.message-id guid uid hdr.x-listname' mailbox 'Archive' | sort | awk 'cnt[$1]++{if (cnt[$1]==2) print prev[$1]; print} {prev[$1]=$0}' | grep -E "[0-9] +$" | awk '{print "doveadm expunge -u kremels MAILBOX-GUID "$2" UID "$3}' | grep 22908
hdr.message-id guid uid hdr.x-listname
doveadm expunge -u kremels MAILBOX-GUID 1488800748.47633_1.mail.covisp.net UID 22908
The message is still listed.
Am I misunderstanding something about how expunge works or what it does?
How do I remove the messages so that they do not show up in subsequent searches? As far as I can tell, assuming 1488800748.47633_1 is the first part of the file name in the maildir, the message is actually deleted:
$ find Maildir -name "1488800*"
Maildir/.Archive/cur/1488800350.46962_1.mail.covisp.net:2,S
Maildir/.Archive/cur/1488800633.47337_1.mail.covisp.net:2,S
Maildir/.Sent/cur/1488800118.M2833P43167.mail.covisp.net,S=1221,W=1251:2,Sad
-- Tragic heroes always moan when the gods take an interest in them, but it's the people the gods ignore who get the really tough deals. --Mort
On Fri, 23 Feb 2018, @lbutlr wrote:
$ doveadm -f table fetch -u kremels 'hdr.message-id guid uid hdr.x-listname' mailbox "Archive" | sort | awk 'cnt[$1]++{if (cnt[$1]==2) print prev[$1]; print} {prev[$1]=$0}' | grep -E "[0-9] +$" | awk '{print "doveadm expunge -u kremels MAILBOX-GUID "$2" UID "$3}'
I was unaware of the syntax "hdr.{header}" -- all the reference materials I've seen only refer to "hdr", which returns the entire header block. This is handy to know because up to now, I've been filtering "hdr" fetches through grep. I've tried updating the Wiki, but it's immutable, so would someone update the documentation:
https://wiki.dovecot.org/Tools/Doveadm/Fetch
(and man page in distribution)
hdr[.{x}]
Header {x} of message. If missing, the
entire header is fetched.
First, even after expunging a message and running doveadm index -u kremels "Archive", subsequent runs still show the same duplicate messages.
I suspect client side caching. If you query IMAP directly, does it report the correct number of messages?
(Using openssl s_client, or netcat or telnet, or whatever)
x1 LOGIN kremels yourpassword
x2 SELECT INBOX
... look for "* {count} EXISTS" ...
x3 LOGOUT
If {count} is what you expected, then dovecot has the correct information and it's likely some client-side caching issue.
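As a small aid for the check described above, the sketch below pulls the message count out of a saved IMAP transcript. The transcript here is invented for illustration (the count 42 and the response lines are not from a real server), and `exists_count` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the N from every untagged "* N EXISTS" response read on stdin.
# IMAP lines end in CRLF, so carriage returns are stripped first.
exists_count() {
  tr -d '\r' | awk '$1 == "*" && $3 == "EXISTS" { print $2 }'
}

# Made-up SELECT response, for illustration only.
printf '* FLAGS (\\Seen \\Deleted)\r\n* 42 EXISTS\r\n* 0 RECENT\r\nx2 OK [READ-WRITE] Select completed.\r\n' |
  exists_count
```

Saving the count before and after an expunge makes it possible to compare the two numbers afterwards.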
Second, what I really want to do is run this over ALL the mailboxes, except for Junk and Sent but if that is possible I can't find the right syntax.
You mean to remove duplicates from any 2 mailboxes, or remove duplicates in mailboxes also found in Archive?
If the latter, try
doveadm -f table fetch -u kremels \
hdr.message-id \
mailbox Archive \
| sort -b >list0
doveadm -f table fetch -u kremels \
'hdr.message-id guid uid' \
NOT mailbox Archive \
NOT mailbox Junk \
NOT mailbox Sent \
| sort -b >list1
The list of duplicate message-id, guid and uid will then be ...
join -j1 list0 list1
You can process it via awk with one invocation of doveadm (2nd form without exclusion of Archive) but you'll need to know the guid of Archive beforehand.
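To show what the join step produces, here is a minimal sketch using two small hand-made files in place of the real doveadm output. All message IDs, GUIDs and UIDs are hypothetical. It uses the POSIX spelling `join -1 1 -2 1` rather than `-j1`, on the assumption that not every join implementation parses `-j1` the same way, and it forces the C locale so that sort and join agree on ordering.

```shell
#!/bin/sh
# Work in a throwaway directory that is removed on exit.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT

# list0: message-ids found in Archive (made-up sample data).
printf '%s\n' '<b@example.org>' '<a@example.org>' |
  LC_ALL=C sort -b > "$tmp/list0"

# list1: message-id, guid, uid from the other mailboxes (made-up sample data).
printf '%s\n' \
  '<c@example.org> 3333.3_1.host 7' \
  '<a@example.org> 1111.1_1.host 5' |
  LC_ALL=C sort -b > "$tmp/list1"

# Join on field 1 of both files: rows of list1 whose message-id is also in list0.
dupes=$(LC_ALL=C join -1 1 -2 1 "$tmp/list0" "$tmp/list1")
printf '%s\n' "$dupes"
```

Only `<a@example.org>` appears in both files, so that is the single row printed, together with its GUID and UID from list1.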
Joseph Tam <jtam.home@gmail.com>
On 2018-02-23 (16:47 MST), Joseph Tam <jtam.home@gmail.com> wrote:
On Fri, 23 Feb 2018, @lbutlr wrote:
$ doveadm -f table fetch -u kremels 'hdr.message-id guid uid hdr.x-listname' mailbox "Archive" | sort | awk 'cnt[$1]++{if (cnt[$1]==2) print prev[$1]; print} {prev[$1]=$0}' | grep -E "[0-9] +$" | awk '{print "doveadm expunge -u kremels MAILBOX-GUID "$2" UID "$3}'
I was unaware of the syntax "hdr.{header}" -- all the reference materials I've seen only refer to "hdr", which returns the entire header block.
The error message from doveadm if you specify an invalid field is:
Available fetch fields: hdr.<name> body.<section> binary.<section> user mailbox mailbox-guid seq uid guid flags modseq hdr body body.snippet text text.utf8 size.physical size.virtual date.received date.sent date.saved date.received.unixtime date.sent.unixtime date.saved.unixtime imap.envelope imap.body imap.bodystructure pop3.uidl pop3.order refcount storageid
First, even after expunging a message and running doveadm index -u kremels "Archive", subsequent runs still show the same duplicate messages.
I suspect client side caching.
No, there is no client side involved. I am executing all of these commands on the mail server. I expunge the messages, I index (or even force-resync), and the next search shows the same messages even though they are not in the Maildir anymore.
If {count} is what you expected, then dovecot has the correct information and it's likely some client-side caching issue.
I would have needed to check the count before doing this, and I did not.
Second, what I really want to do is run this over ALL the mailboxes, except for Junk and Sent but if that is possible I can't find the right syntax.
You mean to remove duplicates from any 2 mailboxes, or remove duplicates in mailboxes also found in Archive?
I want to find any duplicates (based on msg ID) across all mailboxes, except Sent.
doveadm -f table fetch -u kremels \
'hdr.message-id guid uid' \
NOT mailbox Archive \
NOT mailbox Junk \
NOT mailbox Sent \
| sort -b >list1
Aha! Didn't know you could use NOT mailbox. That probably solves my issue on that score.
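For the "any duplicates across all mailboxes" goal stated above, one possible shape for the post-processing is sketched here on invented rows. It assumes each row carries message-id, mailbox name and UID, and that mailbox names contain no spaces. `list_dupes` is a hypothetical helper; it only lists candidates and leaves the decision about which copy to keep to the reader.

```shell
#!/bin/sh
# Made-up rows: message-id, mailbox, uid.
sample_rows() {
  printf '%s\n' \
    '<a@example.org> INBOX 11' \
    '<b@example.org> INBOX 12' \
    '<a@example.org> Archive 901' \
    '<c@example.org> dovecot 44' \
    '<a@example.org> dovecot 45'
}

# Buffer all rows, count each message-id, then print (in input order)
# every row whose message-id was seen more than once.
list_dupes() {
  awk '
    { count[$1]++; row[NR] = $0; key[NR] = $1 }
    END {
      for (i = 1; i <= NR; i++)
        if (count[key[i]] > 1) print row[i]
    }
  '
}

sample_rows | list_dupes
```

Because the rows are buffered rather than compared with a previous line, the input does not need to be sorted first, at the cost of holding the whole listing in memory.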
-- "It's unacceptable to think" - George W Bush 15/Sep/2006
On 2018-02-23 (18:01 MST), @lbutlr <kremels@kreme.com> wrote:
First, even after expunging a message and running doveadm index -u kremels "Archive", subsequent runs still show the same duplicate messages.
I suspect client side caching.
No, there is no client side involved. I am executing all of these commands on the mail server. I expunge the messages, I index (or even force-resync), and the next search shows the same messages even though they are not in the Maildir anymore.
OK, I did another run yesterday and expunged 732 message dupes. I checked for a few of them before the expunge and there were files there. I checked after the expunge (several minutes after) and the files were NOT there.
They still show up if I do another search.
Just now:
# doveadm -f table fetch -u kremels 'hdr.message-id guid uid hdr.x-listname mailbox' mailbox 'Archive' | sort | awk 'cnt[$1]++{if (cnt[$1]>=2) print prev[$1]; print} {prev[$1]=$0}' | sort -u -k1,1 | awk '{print "doveadm expunge -u kremels MAILBOX-GUID "$2" UID "$3}' | grep 1514660575.66506_1
hdr.message-id guid uid hdr.x-listname mailbox
doveadm expunge -u kremels MAILBOX-GUID 1514660575.66506_1.mail.covisp.net UID 60608
# find ~kremels/Maildir -name "1514660575.66506_1*"
# doveadm fetch -u kremels MAILBOX-GUID 1514660575.66506_1.mail.covisp.net UID 60608
Fatal: Invalid messageset
So, how do I get expunged messages to not show up in the fetch?
(Yes, I ran doveadm index -u kremels "*")
# doveadm -Dv index -u kremels "*"
Debug: Loading modules from directory: /usr/local/lib/dovecot
Debug: Module loaded: /usr/local/lib/dovecot/lib20_virtual_plugin.so
Debug: Loading modules from directory: /usr/local/lib/dovecot/doveadm
Debug: Skipping module doveadm_acl_plugin, because dlopen() failed: /usr/local/lib/dovecot/doveadm/lib10_doveadm_acl_plugin.so: Undefined symbol "acl_user_module" (this is usually intentional, so just ignore this message)
Debug: Skipping module doveadm_expire_plugin, because dlopen() failed: /usr/local/lib/dovecot/doveadm/lib10_doveadm_expire_plugin.so: Undefined symbol "expire_set_lookup" (this is usually intentional, so just ignore this message)
Debug: Skipping module doveadm_quota_plugin, because dlopen() failed: /usr/local/lib/dovecot/doveadm/lib10_doveadm_quota_plugin.so: Undefined symbol "quota_user_module" (this is usually intentional, so just ignore this message)
Debug: Module loaded: /usr/local/lib/dovecot/doveadm/lib10_doveadm_sieve_plugin.so
Debug: Skipping module doveadm_fts_plugin, because dlopen() failed: /usr/local/lib/dovecot/doveadm/lib20_doveadm_fts_plugin.so: Undefined symbol "fts_filter_filter" (this is usually intentional, so just ignore this message)
Debug: Skipping module doveadm_mail_crypt_plugin, because dlopen() failed: /usr/local/lib/dovecot/doveadm/libdoveadm_mail_crypt_plugin.so: Undefined symbol "mail_crypt_user_get_public_key" (this is usually intentional, so just ignore this message)
doveadm(kremels): Debug: Effective uid=1004, gid=1004, home=/home/kremels
doveadm(kremels): Debug: Namespace inbox: type=private, prefix=, sep=, inbox=yes, hidden=no, list=yes, subscriptions=yes location=maildir:~/Maildir
doveadm(kremels): Debug: maildir++: root=/home/kremels/Maildir, index=, indexpvt=, control=, inbox=/home/kremels/Maildir, alt=
doveadm(kremels): Debug: Archive: Mailbox opened because: index
doveadm(kremels): Info: Archive: Cache is already up to date
doveadm(kremels): Debug: Drafts: Mailbox opened because: index
doveadm(kremels): Info: Drafts: Cache is already up to date
doveadm(kremels): Debug: Junk: Mailbox opened because: index
doveadm(kremels): Info: Junk: Cache is already up to date
doveadm(kremels): Debug: Misc.not-to-me: Mailbox opened because: index
doveadm(kremels): Info: Misc.not-to-me: Cache is already up to date
doveadm(kremels): Debug: bind: Mailbox opened because: index
doveadm(kremels): Info: bind: Cache is already up to date
doveadm(kremels): Debug: Sent: Mailbox opened because: index
doveadm(kremels): Info: Sent: Cache is already up to date
doveadm(kremels): Debug: Trash: Mailbox opened because: index
doveadm(kremels): Info: Trash: Cache is already up to date
doveadm(kremels): Debug: dovecot: Mailbox opened because: index
doveadm(kremels): Info: dovecot: Cache is already up to date
doveadm(kremels): Debug: freebsd: Mailbox opened because: index
doveadm(kremels): Info: freebsd: Cache is already up to date
doveadm(kremels): Debug: httpd: Mailbox opened because: index
doveadm(kremels): Info: httpd: Cache is already up to date
doveadm(kremels): Debug: spamassassin: Mailbox opened because: index
doveadm(kremels): Info: spamassassin: Cache is already up to date
doveadm(kremels): Debug: bbedit: Mailbox opened because: index
doveadm(kremels): Info: bbedit: Cache is already up to date
doveadm(kremels): Debug: postfix: Mailbox opened because: index
doveadm(kremels): Info: postfix: Cache is already up to date
doveadm(kremels): Debug: macosx: Mailbox opened because: index
doveadm(kremels): Info: macosx: Cache is already up to date
doveadm(kremels): Debug: tidbits: Mailbox opened because: index
doveadm(kremels): Info: tidbits: Cache is already up to date
doveadm(kremels): Debug: INBOX: Mailbox opened because: index
doveadm(kremels): Info: INBOX: Cache is already up to date
-- How soon after the USPS issues the Calvin stamp will you send a letter with one on the envelope? Watterson: Immediately. I'm going to get in my horse and buggy and snail-mail a check for my newspaper subscription.