[Dovecot] Questions about single instance storage
Lorens Kockum
dovecot.fdop at tagged.lorens.org
Mon Dec 5 09:19:27 EET 2011
On Mon, Dec 05, 2011 at 01:51:05AM +0200, Timo Sirainen wrote:
> I'm mainly wondering if it's common for backup programs to
> support using a separate program to generate the backups. For
> example if there was a "dovecot-backup" binary that just
> dumps all (or new-since-last-backup) of the users' mails into
> stdout, which the backup program can use. Or perhaps in that
> case there wouldn't really be much of anything for the backup
> to do except to write it to tape..
For databases, most commercial programs use some kind of
application-specific plugins. Bacula has "Client Run Before Job",
which lets you specify a command to be run before proceeding
with the backup; maybe there is more. However, I have not heard
of any standardized way of doing it that would let the
application provider interact with multiple backup programs.
It's more like each backup vendor boasting that his backup
software can back up application XXX without downtime.
> SIS was designed to work with hard links. They couldn't be
> replaced with symlinks without a redesign (which would be less
> efficient in normal operation).
Right, but if the backup program recognized this then maybe
the replacement by a symlink could be done only in the backup.
rsync has to keep all the inodes in memory and check the list
every time a hard link is found. If it knew that a hard link in
attachments should link to an identical file name in the common
attachment SiS store, it could skip that bookkeeping.
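
To make the inode bookkeeping concrete, here is a minimal Python sketch of
how a backup tool might detect hard links: it remembers the (device, inode)
pair of every multiply-linked file it meets. This is not rsync's actual
code, and the directory names "attachments" and "mailbox" are made up for
the demo.

```python
import os
import tempfile

def find_hard_links(root):
    """Group paths under root that share the same (device, inode)."""
    seen = {}  # (st_dev, st_ino) -> paths; one entry per multiply-linked inode
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if st.st_nlink > 1:  # singly-linked files need no tracking
                seen.setdefault((st.st_dev, st.st_ino), []).append(path)
    return {key: sorted(paths) for key, paths in seen.items() if len(paths) > 1}

def demo():
    """Build a tiny tree with one hard-linked attachment and report it."""
    with tempfile.TemporaryDirectory() as root:
        store = os.path.join(root, "attachments")  # hypothetical SiS store
        box = os.path.join(root, "mailbox")        # hypothetical user mailbox
        os.makedirs(store)
        os.makedirs(box)
        original = os.path.join(store, "abc123")
        with open(original, "wb") as f:
            f.write(b"attachment body")
        os.link(original, os.path.join(box, "abc123"))  # same name, same inode
        with open(os.path.join(box, "plain"), "wb") as f:
            f.write(b"not linked")
        groups = find_hard_links(root)
        return [[os.path.relpath(p, root) for p in paths]
                for paths in groups.values()]

if __name__ == "__main__":
    print(demo())
```

The `seen` table grows with the number of multiply-linked inodes, which is
the memory cost in question. A tool that could rely on the "same file name
in the store" convention would not need the table at all.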
> [Zimbra]
>
> You mean you first create uncompressed zip files (why not
> just tar?) of all the mails to the filesystem and the backup
> software then backups those zip files?
Well, not I, this is Zimbra's backup system :-) The backups are
the destination zips. Maybe zip is used because the extension
and extraction method is the same whether compressed or not;
that way compression is just an option to the backup program to
be turned on or off.
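
As an illustration of that point (not Zimbra's code, just a sketch using
Python's standard zipfile module), the same reader extracts a stored
archive and a deflated one, and only one flag differs when writing. The
message name and body are invented.

```python
import io
import zipfile

def make_zip(messages, compress):
    """Write messages into an in-memory zip, stored or deflated."""
    method = zipfile.ZIP_DEFLATED if compress else zipfile.ZIP_STORED
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        for name, body in messages.items():
            zf.writestr(name, body)
    return buf.getvalue()

def read_zip(data):
    """Extract every member; the caller never says how it was compressed."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}

# Invented sample message with a highly compressible body.
messages = {"msg1.eml": b"Subject: hello\r\n\r\n" + b"x" * 4096}

stored = make_zip(messages, compress=False)
deflated = make_zip(messages, compress=True)

if __name__ == "__main__":
    print(len(stored), len(deflated))
    print(read_zip(stored) == read_zip(deflated))
```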
> Dovecot's mdbox files already contain multiple messages in
> each file, so it should be a lot more efficient to do backups
> on those. And each message in an mdbox file can be compressed
> if zlib plugin is enabled. So I think that sounds quite a lot
> like what you propose.
Is that combined or combinable with SiS? If attachments are
in separate files, that means they are aligned on block
boundaries, which makes block-level SiS (like NetApp's) much more
efficient. Think of an attachment sent to all department heads,
all of whom forward the attachment to all their subordinates.
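
A rough Python sketch of why alignment matters, assuming a deduplicator
that hashes fixed 4096-byte blocks (the block size, the fake attachment,
and the header lengths are all assumptions for illustration, and the
"container" is not the real mdbox layout):

```python
import hashlib

BLOCK = 4096  # assumed fixed dedup block size

def blocks(data):
    """Yield consecutive BLOCK-sized slices of data (the last may be short)."""
    for off in range(0, len(data), BLOCK):
        yield data[off:off + BLOCK]

def unique_blocks(files):
    """Number of distinct blocks a block-level dedup would have to store."""
    return len({hashlib.sha256(b).digest() for data in files for b in blocks(data)})

def total_blocks(files):
    """Number of blocks the files occupy before any deduplication."""
    return sum(1 for data in files for _ in blocks(data))

def fake_attachment(n_blocks):
    """Deterministic, non-repeating bytes standing in for an attachment."""
    chunks = (hashlib.sha256(i.to_bytes(4, "big")).digest()
              for i in range(n_blocks * BLOCK // 32))
    return b"".join(chunks)

attachment = fake_attachment(8)

# Case 1: three copies of the attachment, each in its own file, so each
# copy starts on a block boundary.
separate = [attachment] * 3

# Case 2: the same three copies embedded in one container file, each
# preceded by a header of a different (made-up) length, so every copy
# starts at a different offset within a block.
container = b"".join(b"H" * n + attachment for n in (100, 217, 333))

if __name__ == "__main__":
    print("separate: ", total_blocks(separate), unique_blocks(separate))
    print("container:", total_blocks([container]), unique_blocks([container]))
```

In the separate-file case every copy hashes to the same eight blocks. In
the container case the shifted copies share no blocks, so nothing is saved.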