Need help deduplicating messages fetched with getmail into dovecot mailbox

Joseph Tam jtam.home at gmail.com
Fri Jan 11 22:56:53 EET 2019


On Fri, 11 Jan 2019, Gabriel Kaufmann wrote:

> Hello Joseph,
>
> thanks for your reply.
>
>> doveadm fetch -u my-mailbox@domain.net 'guid hdr.message-id' ...
>>
>> You're on your own for everything else.
>
> That works, and maybe I can make it work with that, using a shell script
> interacting with getmail as a filter. But it's fetching ALL
> message-ids. It would be perfect if I could run a search query on 'guid
> hdr.message-id' and get a result only if there is a message
> matching the message-id (and nothing otherwise).

Whether fetching everything is a problem depends on how much duplication
there is.  If you're adding a small number of messages to a large corpus,
it *may* be better to loop through the message-ids one at a time.  If
you're merging in a large mailbox, it's probably better to do bulk dumps
of both boxes, then process them.

I'm not sure whether lookups in dovecot's caches are sequential, O(n), or
hashed, O(1), but each query has overhead, so you may be better off doing
a dump of message-IDs, then cross-referencing.
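
An untested sketch of the bulk approach, in case it helps.  It assumes
grep supports -o (GNU and BSD grep do) and that every message-id appears
in angle brackets somewhere in the fetch output, whatever the output
format.  The function names and the example addresses are made up for
illustration.

```shell
#!/bin/sh
# Sketch: bulk comparison of Message-IDs between two mailboxes.

# extract_ids: read doveadm fetch output on stdin and print each
# <message-id> once, one per line, sorted in the C locale.
extract_ids() {
    grep -o '<[^<>]*>' | LC_ALL=C sort -u
}

# dump_ids USER: list every Message-ID found in USER's mail.
dump_ids() {
    doveadm fetch -u "$1" 'hdr.message-id' ALL | extract_ids
}

# common_ids FILE_A FILE_B: print the IDs present in both files.
# Both files must come from extract_ids so the sort order matches.
common_ids() {
    LC_ALL=C comm -12 "$1" "$2"
}

# Hypothetical usage:
#   dump_ids old-box@domain.net > old.ids
#   dump_ids new-box@domain.net > new.ids
#   common_ids old.ids new.ids
```

This costs one doveadm call per mailbox rather than one per message, and
the comparison is exact and case-sensitive.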

> Do you have any idea if it's possible to use doveadm search for single
> message-id without having to query over all messages?

Yes: append a search query after the field list.  ("-ftable" is just to
make the output easier to parse.)

 	doveadm -ftable fetch -u my-mailbox@domain.net \
 		'guid hdr.message-id' \
 		HEADER message-id '<1546519978.5428@paypal.com>'

Keep in mind that the search matches case-insensitive fragments, so a
pattern like '1546519978.5428@PAYPAL.COM' matches a superset of what the
pattern above matches.
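
If you want a yes/no answer for a single message-id, something like this
untested sketch might do.  The function name is made up, and the second
grep is my own addition: it rejects fragment matches by requiring the
whole id to match exactly (and, unlike the search itself,
case-sensitively).

```shell
#!/bin/sh
# msgid_exists USER MSGID: succeed (exit 0) if USER already has a
# message whose Message-ID is exactly MSGID, angle brackets included.
msgid_exists() {
    # Let dovecot narrow the candidates with a HEADER search, pull the
    # <...> ids out of the output, then keep only an exact, whole-line,
    # fixed-string match.
    doveadm fetch -u "$1" 'hdr.message-id' HEADER message-id "$2" |
        grep -o '<[^<>]*>' |
        grep -qxF -- "$2"
}

# Hypothetical usage:
#   if msgid_exists my-mailbox@domain.net '<1546519978.5428@paypal.com>'
#   then echo "duplicate"; fi
```

A filter script called from getmail could use the exit status to decide
whether to deliver the message.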

Joseph Tam <jtam.home at gmail.com>

