[Dovecot] v1.1.rc8 released

Ed W lists at wildgooses.com
Tue Jun 3 09:11:33 EEST 2008


Timo Sirainen wrote:
> On Mon, 2008-06-02 at 23:25 +0100, Ed W wrote:
>   
>> Hi
>>
>>     
>>> 	+ deliver: Added -p parameter to provide path to delivered mail.
>>> 	  This allows maildir to save identical mails to multiple recipients
>>> 	  using hard links.
>>>   
>>>       
>> Funnily enough it was on my todo list to whip up a small perl program to 
>> go and scan my maildirs and figure out if this theoretical idea actually 
>> amounted to anything. 
>>
>> Algorithm would be this:
>>
>> Open each message,
>> scan for first blank line. 
>> SHA the rest of the message, store the SHA in a hash (along with the 
>> message size)
>> rinse and repeat and see if we end up with any hashes showing count 
>> greater than 1...
>>
>> This would represent the best case that we could achieve assuming body 
>> content fixed and we find some way to manage variable headers.
>>     
>
> Somewhat faster way would be to get a list of file sizes first and not
> bother checksumming any files which have a unique size.
>   


Could do, but I was trying to cover the case where the headers are 
different but the body is the same (e.g. I suspect that mailing list 
managers might deliver emails one by one (VERP), without customising 
the body).  Anyway, I just wanted to checksum the body of the message, 
not the whole message, so files of different sizes could still match.
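
A minimal sketch of that body-only scan, in Python rather than the perl 
mentioned above. The function names are made up for illustration, SHA-1 
is just one possible choice of hash, and it assumes a Maildir layout 
where messages sit in cur/ and new/ directories:

```python
import hashlib
import os
from collections import defaultdict


def body_digest(path):
    """Return (SHA-1 hex digest, size) of everything after the first blank line."""
    sha = hashlib.sha1()
    size = 0
    in_body = False
    with open(path, "rb") as f:
        for line in f:
            if in_body:
                sha.update(line)
                size += len(line)
            elif line in (b"\n", b"\r\n"):
                # First blank line ends the headers; the body starts after it.
                in_body = True
    return sha.hexdigest(), size


def find_duplicate_bodies(root):
    """Group messages under root by body digest; keep groups with count > 1."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        # Assumption: only cur/ and new/ hold delivered messages.
        if os.path.basename(dirpath) not in ("cur", "new"):
            continue
        for name in filenames:
            path = os.path.join(dirpath, name)
            groups[body_digest(path)].append(path)
    return {key: paths for key, paths in groups.items() if len(paths) > 1}
```

For each group returned, body size times (count - 1) gives a rough upper 
bound on the space that sharing the body could save. The file-size 
shortcut does not apply here, since messages with identical bodies can 
have different total sizes.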

Actually, the motivation for this was that I was wondering about the 
benefit of a storage backend where each body is stored in its own file 
and the headers are stored separately (perhaps in a maildir-type 
format).  I haven't looked to see if this is what dbox does already...

I have been looking at git and brackup for backing up maildirs, and it's 
got me thinking a bit more about mail storage algorithms.

Ed W

