[Dovecot] Deduplication active - but how well does it perform?
I have deduplication active in my first mdbox: type mailbox, but how do I find out how well the deduplication works? Is there a way of finding out how much disk space I saved (if I saved some :) )?
-- 
Ralf Hildebrandt
Geschäftsbereich IT | Abteilung Netzwerk
Charité - Universitätsmedizin Berlin
Campus Benjamin Franklin
Hindenburgdamm 30 | D-12203 Berlin
Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962
ralf.hildebrandt@charite.de | http://www.charite.de
On 6.1.2012, at 12.09, Ralf Hildebrandt wrote:
> I have deduplication active in my first mdbox: type mailbox, but how do I find out how well the deduplication works? Is there a way of finding out how much disk space I saved (if I saved some :) )?
You could look at the files in the attachments directory, and see how many links they have. Each file has 2 initially. Each additional link has saved you <size of file> bytes of space.
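The link-count rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `dedup_savings` is made up for this example, the directory you pass in is assumed to be whatever your `mail_attachment_dir` setting points to, and the baseline of 2 links per file is taken from the description above rather than checked against the Dovecot source.

```python
import os
import stat


def dedup_savings(attachment_dir, baseline_links=2):
    """Estimate bytes saved by hard-link deduplication under attachment_dir.

    Assumption (from the thread): every attachment file starts with
    `baseline_links` links, and each additional link saves one file size.
    Returns (unique_files, bytes_saved).
    """
    seen = set()          # (device, inode) pairs already counted
    unique_files = 0
    bytes_saved = 0
    for root, _dirs, files in os.walk(attachment_dir):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, etc.
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # another name for a file we already counted
            seen.add(key)
            unique_files += 1
            extra_links = max(st.st_nlink - baseline_links, 0)
            bytes_saved += extra_links * st.st_size
    return unique_files, bytes_saved
```

Calling `dedup_savings("/path/to/attachments")` (the path is a placeholder) returns the number of distinct files and the estimated number of bytes saved. Each inode is counted once, because the same file is reachable under several names inside the tree.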
On 2012-01-06 5:54 AM, Timo Sirainen <tss@iki.fi> wrote:
> On 6.1.2012, at 12.09, Ralf Hildebrandt wrote:
>> I have deduplication active in my first mdbox: type mailbox, but how do I find out how well the deduplication works? Is there a way of finding out how much disk space I saved (if I saved some :) )?
> You could look at the files in the attachments directory, and see how many links they have. Each file has 2 initially. Each additional link has saved you <size of file> bytes of space.
Maybe there could be a doveadm command for this? That would be really useful for some kind of stats applications... especially for promoting its use in environments where large attachments are common...
--
Best regards,
Charles
On 2012-01-06 6:58 AM, Charles Marcus <CMarcus@Media-Brokers.com> wrote:
> On 2012-01-06 5:54 AM, Timo Sirainen <tss@iki.fi> wrote:
>> On 6.1.2012, at 12.09, Ralf Hildebrandt wrote:
>>> I have deduplication active in my first mdbox: type mailbox, but how do I find out how well the deduplication works? Is there a way of finding out how much disk space I saved (if I saved some :) )?
>> You could look at the files in the attachments directory, and see how many links they have. Each file has 2 initially. Each additional link has saved you <size of file> bytes of space.
> Maybe there could be a doveadm command for this?
Incidentally, I use rsnapshot (which is simply a wrapper script for rsync) for my disk-based backups. It uses hard links so that you can have hourly/daily/weekly/monthly (or whatever naming scheme you want) snapshots of your backups. Each snapshot simply contains hard links to the unchanged files in previous snapshots, so you can literally have hundreds of snapshots that consume only a little more space than one single whole snapshot.
Anyway, rsnapshot has to leverage the du command to determine the amount of disk space each snapshot uses (when considered as a separate/standalone snapshot), or how much *actual* space each snapshot consumes (i.e., only the files that are *not* hard-linked against a previous backup)...
Maybe this could be a starting point for how to do this...
http://rsnapshot.org/rsnapshot.html#usage
and scroll down to the rsnapshot du command...
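As a rough illustration of the du-style accounting described above, the sketch below totals a directory tree twice: once counting every path, and once counting each inode only once, which is how du treats hard links within a single invocation. The function name `tree_usage` is hypothetical, and the sketch sums apparent file sizes rather than allocated blocks, so its numbers will not match du exactly.

```python
import os
import stat


def tree_usage(top):
    """Return (per_path_bytes, per_inode_bytes) for regular files under top.

    per_path_bytes counts a hard-linked file once per name;
    per_inode_bytes counts it once in total.
    """
    seen = set()      # (device, inode) pairs already counted
    per_path = 0
    per_inode = 0
    for root, _dirs, files in os.walk(top):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            if not stat.S_ISREG(st.st_mode):
                continue
            per_path += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                per_inode += st.st_size
    return per_path, per_inode
```

The difference between the two totals is the space that hard links inside that tree avoid using. Links that point to files outside the tree are not visible to this approach.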
--
Best regards,
Charles
Ralf Hildebrandt wrote:
> I have deduplication active in my first mdbox: type mailbox, but how do I find out how well the deduplication works? Is there a way of finding out how much disk space I saved (if I saved some :) )?
You could check how much disk space all the mail uses (or the mail of a single user) and compare it to the usage that Dovecot's quota reports. But I think you would need quotas enabled for this.
E.g. on my small server, actual disk usage is 2 GB, whereas doveadm quota reports that all users together use 3.1 GB.
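The comparison above is simple arithmetic. A minimal sketch using the figures from this message (the function name `dedup_ratio` is made up for illustration):

```python
def dedup_ratio(logical_bytes, disk_bytes):
    """Return (bytes_saved, fraction_saved) given the quota-reported size
    and the space actually used on disk."""
    saved = logical_bytes - disk_bytes
    return saved, saved / logical_bytes


GB = 1000 ** 3
# Figures from the example above: 3.1 GB reported by quota, 2 GB on disk.
saved, fraction = dedup_ratio(3_100_000_000, 2_000_000_000)
print(f"saved about {saved / GB:.1f} GB ({fraction:.0%} of the reported size)")
# prints "saved about 1.1 GB (35% of the reported size)"
```

Treat the result as a rough estimate only: the gap between the two numbers may include effects other than deduplication, such as index files or filesystem overhead.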
participants (4)
- Charles Marcus
- Nick Rosier
- Ralf Hildebrandt
- Timo Sirainen