OT - Finding/removing duplicate emails - WAS: Re: [Dovecot] dovecot/lmtp munmap()-ing a lot

Steffen Kaiser skdovecot at smail.inf.fh-brs.de
Tue Jun 10 14:31:13 UTC 2014

On Tue, 10 Jun 2014, Reindl Harald wrote:
> On 10.06.2014 15:39, Steffen Kaiser wrote:
>> On Tue, 10 Jun 2014, Reindl Harald wrote:
>>> On 10.06.2014 15:17, Steffen Kaiser wrote:
>>>> The basic question is: what is a duplicate?
>>>> However, neither script I would call general enough for automatic processing
>>> dbmail has just "suppress_duplicates = yes" and silently ignores
>>> *new received* messages with the same message-id to the same user
>>> as a global setting
>> Wasn't there a thread some days/weeks ago, that Pigeonhole behaves the same by default and the poster asked how
>> long the timeframe is Pigeonhole remembers the ids?
>> Actually, I still wonder whether the same message-id is sufficient to decide to "silently drop" a
>> message, as I interpret "to ignore a message" as "to drop". Copies might have come by different paths, some MUA
>> might not generate ids that are unique world-wide or time-dependent, ... . It's a matter of taste, IMHO
> if it generates one it's unlikely to have the same message-id
> for the same RCPT

Yes, but then some recipients forward (automatically or manually). Or you
have a fetchmail-like grabber that re-transmits the message, ... .

>                    - usually the current timestamp is part of it

That is what I meant by "time-dependent", but you also used "unlikely" and
"usually". So you still see a small chance that the message-id is not
world-wide unique. ;-)
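
To illustrate the "what is a duplicate?" question, here is a minimal sketch
of a stricter check that treats two messages as duplicates only if both the
message-id and the body match. This is not how dbmail or Pigeonhole decide;
the function names dedup_key and find_duplicates are hypothetical, and the
choice of a SHA-256 hash over the body is an assumption made for the example.

```python
import hashlib
from email import message_from_string
from typing import List, Tuple


def dedup_key(raw_message: str) -> Tuple[str, str]:
    """Build a (message-id, body hash) key for one raw RFC 5322 message.

    Hypothetical helper: headers such as Received are ignored on purpose,
    because they differ when the same mail arrives by different paths.
    """
    msg = message_from_string(raw_message)
    message_id = (msg.get("Message-ID") or "").strip()

    if msg.is_multipart():
        # Concatenate the leaf parts (their own headers plus content).
        body = "".join(
            part.as_string() for part in msg.walk() if not part.is_multipart()
        )
    else:
        body = msg.get_payload()

    digest = hashlib.sha256(body.encode("utf-8", "replace")).hexdigest()
    return message_id, digest


def find_duplicates(raw_messages: List[str]) -> List[int]:
    """Return the indices of messages whose key was already seen earlier."""
    seen = set()
    duplicates = []
    for index, raw in enumerate(raw_messages):
        key = dedup_key(raw)
        if key in seen:
            duplicates.append(index)
        else:
            seen.add(key)
    return duplicates
```

With this key, a re-transmitted copy that only gained a Received header is
flagged, while a message that reuses an id but carries different text is
kept. Messages without any message-id fall back to the body hash alone,
which may or may not be what you want.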

I know, nowadays all MUAs should be capable of generating sensible message
ids, and some claims about bandwidth and such are outdated, too. You have to
rely on information you do not control -> you have to decide how far to 

Steffen Kaiser
