sis deduplication broken from 2.2.16 upwards

Timo Sirainen tss at iki.fi
Fri Mar 11 00:56:00 UTC 2016


> On 11 Mar 2016, at 02:37, Charles Marcus <CMarcus at Media-Brokers.com> wrote:
> 
> On 3/9/2016 9:02 PM, Timo Sirainen <tss at iki.fi> wrote:
>> On 08 Mar 2016, at 01:50, Pavel Stano <stanojr at websupport.sk> wrote:
>>> 
>>> SIS attachment deduplication is broken in 2.2.16 and later.
>>> It is caused by this commit:
>>> https://github.com/dovecot/core/commit/664bf3e236c214aee86294483c379e4fa66c2e63
>>> 
>>> In src/lib-fs/fs-sis.c, the function fs_sis_try_link() compares the
>>> inodes of the hash files.
>>> After that commit, fs_stat() uses fstat() on the open fd of the temporary
>>> file instead of stat() on the filename, but that temporary file has a
>>> different inode, so the inodes never match.
>>> 
>>> It does not cause any corruption, but it does not save any space either,
>>> because every duplicate attachment ends up in a separate file.
>> Thanks, fixed: https://github.com/dovecot/core/commit/3b39022ea0513363241cf852b7d454c841584ea1
> 
> So, after the fix is applied, does dovecot silently delete the
> duplicated files, or is there a command that needs to be run manually?

You'd have to do it manually in some way. A script that does something like:

Go through all attachment directories and, in each one:
 - Sort files by filename
 - Identify files A and B that are the same (their filenames begin with the same hash) but have different inodes
 - ln A B.tmp && mv B.tmp B
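
The steps above could be sketched in Python roughly as follows. This is only an illustration, not a tested Dovecot tool: it assumes each attachment filename has the form `<hash>-<suffix>`, so that everything before the first `-` is the hash, and the function name `relink_duplicates` is made up for this example. As an extra precaution that is not part of the steps above, it also compares file contents before linking. Try it with `dry_run=True` on a copy of the data first.

```python
import filecmp
import os
from collections import defaultdict


def relink_duplicates(directory, dry_run=True):
    """Hard-link files in `directory` whose names share a hash prefix.

    Returns a list of (kept_path, replaced_path) pairs. With dry_run=True
    nothing is modified; the pairs only show what would be relinked.
    """
    # Assumption: names look like "<hash>-<suffix>"; group by the hash part.
    groups = defaultdict(list)
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if "-" not in name or name.endswith(".tmp"):
            continue
        if os.path.islink(path) or not os.path.isfile(path):
            continue
        groups[name.split("-", 1)[0]].append(path)

    relinked = []
    for paths in groups.values():
        keep = paths[0]  # file A: first name in sorted order
        keep_stat = os.stat(keep)
        for dup in paths[1:]:  # each candidate file B
            dup_stat = os.stat(dup)
            if (dup_stat.st_dev, dup_stat.st_ino) == (keep_stat.st_dev, keep_stat.st_ino):
                continue  # already the same inode, nothing to do
            # Extra safety check (not in the original steps): the bytes
            # must really be identical before B is replaced by a link to A.
            if not filecmp.cmp(keep, dup, shallow=False):
                continue
            if not dry_run:
                tmp = dup + ".tmp"
                os.link(keep, tmp)    # ln A B.tmp
                os.replace(tmp, dup)  # mv B.tmp B (atomic rename)
            relinked.append((keep, dup))
    return relinked
```

To cover all attachment directories, the function could be called for every directory yielded by `os.walk()` under the attachment root. Linking to a temporary name and then renaming means B never disappears, even if the script is interrupted.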



