[Dovecot] SIS and restoring from backups
On 2012-03-24 7:49 AM, Timo Sirainen tss@iki.fi wrote:
> This is already optionally done in v2.0+dbox. MIME attachments can be stored in plain binary form if they can be reconstructed back into their original form. It doesn't break any signed stuff.
Hey Timo,
Splitting this off into a separate thread...
On the question of the existing SIS capability for attachments... have you given any thought as to how to solve the problem of restoring from backups when SIS is used? I was planning on using it initially, until I read on list that restoring from (normal disk-to-disk) backups would not work when SIS was enabled - this is obviously a deal breaker for anyone who relies on backups - which I would think would be almost everyone?
Or maybe I misunderstood the problem?
--
Best regards,
Charles
On 24.3.2012, at 14.01, Charles Marcus wrote:
> On the question of the existing SIS capability for attachments... have you given any thought as to how to solve the problem of restoring from backups when SIS is used? I was planning on using it initially, until I read on list that restoring from (normal disk-to-disk) backups would not work when SIS was enabled - this is obviously a deal breaker for anyone who relies on backups - which I would think would be almost everyone?
> Or maybe I misunderstood the problem?
You can do full backups from a filesystem snapshot, which works "well enough" (might leave some unused attachments lying around in some rare cases, but that can also happen if Dovecot crashes/dies).
The other possibility is to already use dsync (doveadm backup) to do full backups. With the redesigned dsync you would be able to do incremental backups also. In any case the solution involves de-SISing mails for backup.
On 2012-03-24 8:08 AM, Timo Sirainen tss@iki.fi wrote:
> You can do full backups from a filesystem snapshot, which works "well enough" (might leave some unused attachments lying around in some rare cases, but that can also happen if Dovecot crashes/dies).
But the problem isn't with backups, but with restores, right?
> The other possibility is to already use dsync (doveadm backup) to do full backups. With the redesigned dsync you would be able to do incremental backups also. In any case the solution involves de-SISing mails for backup.
So, this would make the backup storage requirements larger - maybe dramatically larger for sites that have a lot of large attachments?
Doesn't sound ideal...
I currently use rsnapshot to keep many multiple (daily, weekly, and monthly) hardlinked snapshots, each of which consumes only a tiny fraction of extra storage over and above the first/main snapshot.
Am I correct that enabling SIS as it is currently implemented would break this backup tool?
I was also thinking of asking about how to provide read-only access to these backup snapshots to the users in some kind of special namespace, so that they could all essentially go 'back in time' to grab any emails that they may have inadvertently deleted...
--
Best regards,
Charles
On 24.3.2012, at 14.54, Charles Marcus wrote:
> On 2012-03-24 8:08 AM, Timo Sirainen tss@iki.fi wrote:
>> You can do full backups from a filesystem snapshot, which works "well enough" (might leave some unused attachments lying around in some rare cases, but that can also happen if Dovecot crashes/dies).
> But the problem isn't with backups, but with restores, right?
Ah, right. Then it gets tricky.
>> The other possibility is to already use dsync (doveadm backup) to do full backups. With the redesigned dsync you would be able to do incremental backups also. In any case the solution involves de-SISing mails for backup.
> So, this would make the backup storage requirements larger - maybe dramatically larger for sites that have a lot of large attachments?
Some backup systems can do internal deduplication.
> I currently use rsnapshot to keep many multiple (daily, weekly, and monthly) hardlinked snapshots, each of which consumes only a tiny fraction of extra storage over and above the first/main snapshot.
> Am I correct that enabling SIS as it is currently implemented would break this backup tool?
I'm not sure. Are you running rsnapshot on live filesystem or on a snapshot? On live filesystem there would be race conditions.
> I was also thinking of asking about how to provide read-only access to these backup snapshots to the users in some kind of special namespace, so that they could all essentially go 'back in time' to grab any emails that they may have inadvertently deleted...
This should be possible, just point the namespace to such snapshot. You may need to point CONTROL dir to some temporary directory and index dir as well to either temp or to memory.
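As a rough illustration of the suggestion above, the following Python sketch renders a Dovecot namespace block whose location points at a backup snapshot, with the control directory redirected to a scratch path and the index kept in memory. The snapshot path, the "Backups/" prefix, the scratch directory, and the function name are all hypothetical, and the %d/%n layout is only assumed from the Maildir path quoted elsewhere in this thread. Treat the output as a starting point to check against the documentation for your Dovecot version, not as a tested configuration.

```python
def render_backup_namespace(snapshot_root: str, label: str, control_root: str) -> str:
    """Render one namespace block for a single backup snapshot.

    snapshot_root, label and control_root are hypothetical example values.
    CONTROL= and INDEX=MEMORY keep Dovecot from writing its own files
    into the snapshot directory.
    """
    location = (
        f"maildir:{snapshot_root}/%d/%n"
        f":CONTROL={control_root}/{label}/%u"
        ":INDEX=MEMORY"
    )
    lines = [
        "namespace {",
        f"  prefix = Backups/{label}/",
        "  separator = /",
        f"  location = {location}",
        "  list = yes",
        "  subscriptions = no",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_backup_namespace("/snapshots/daily.0/vmail", "daily.0", "/tmp/backup-control"))
```

Calling the function once per snapshot directory would give one namespace per backup date, which is the kind of layout discussed later in the thread.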
On 2012-03-24 9:16 AM, Timo Sirainen tss@iki.fi wrote:
> On 24.3.2012, at 14.54, Charles Marcus wrote:
>> On 2012-03-24 8:08 AM, Timo Sirainen tss@iki.fi wrote:
>>> You can do full backups from a filesystem snapshot, which works "well enough" (might leave some unused attachments lying around in some rare cases, but that can also happen if Dovecot crashes/dies).
>> But the problem isn't with backups, but with restores, right?
> Ah, right. Then it gets tricky.
Yeah, I seem to remember it was a comment like that that scared me about enabling it...
Can you expand on what exactly is 'tricky' about it? Also, have you given any thought to how to eliminate the 'trickiness'? I'm of the old school and like for my backups to not have any 'trickiness' about them - including performing restores... ;)
>> So, this would make the backup storage requirements larger - maybe dramatically larger for sites that have a lot of large attachments?
> Some backup systems can do internal deduplication.
Hmmm... and actually, rsnapshot (which uses rsync) does just that, which is *why* each additional snapshot only requires a small fraction of additional disk space (compared to the first main/full snapshot).
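To make the space saving concrete, here is a minimal Python sketch of the hard-link idea behind such snapshots: a file that is unchanged since the previous snapshot is hard-linked rather than copied, so it takes no extra data blocks. This is only an illustration, not rsnapshot's actual implementation; the function names are made up, and comparing size plus whole-second modification time is an assumption chosen for simplicity.

```python
import os
import shutil


def _unchanged(src: str, old: str) -> bool:
    """Treat a file as unchanged if size and whole-second mtime match."""
    a, b = os.stat(src), os.stat(old)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def hardlink_snapshot(source: str, prev_snapshot: str, new_snapshot: str) -> None:
    """Create new_snapshot from source, hard-linking files that are
    unchanged relative to prev_snapshot and copying everything else."""
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        target_dir = os.path.normpath(os.path.join(new_snapshot, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            old = os.path.normpath(os.path.join(prev_snapshot, rel, name))
            dst = os.path.join(target_dir, name)
            if os.path.isfile(old) and _unchanged(src, old):
                os.link(old, dst)        # same inode, no extra data blocks
            else:
                shutil.copy2(src, dst)   # new or changed file: real copy
```

Note that this saves space only for files that are identical between snapshots; it is a different mechanism from content-level deduplication inside a backup system.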
>> Am I correct that enabling SIS as it is currently implemented would break this backup tool?
> I'm not sure. Are you running rsnapshot on live filesystem or on a snapshot? On live filesystem there would be race conditions.
I've been running it on a live system for a long time, and never had a problem beyond occasional messages like this:
file has vanished: "/var/vmail/example.com/username/cur/1332602593.Vfe02I9e7acdM308676.myhost.example.com:2,"
rsync warning: some files vanished before they could be transferred (code 24) at main.c(1052) [sender=3.0.9]
but the rsnapshot guys assured me this does not and will not cause any real problems, other than those files not getting backed up.
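For anyone scripting around this, a small Python sketch of how a backup wrapper might treat that condition: rsync exit code 24 (files vanished during transfer) is reported as a warning instead of a failure. The function names and the three-way ok/warning/error classification are illustrative choices, not something rsnapshot itself prescribes.

```python
import subprocess
from typing import Sequence

# rsync(1) documents exit code 24 as a partial transfer caused by
# source files that vanished while rsync was running.
RSYNC_VANISHED = 24


def classify_rsync_exit(code: int) -> str:
    """Map an rsync exit code to 'ok', 'warning' or 'error'."""
    if code == 0:
        return "ok"
    if code == RSYNC_VANISHED:
        return "warning"
    return "error"


def run_rsync(args: Sequence[str]) -> str:
    """Run rsync with the given arguments and classify the result."""
    result = subprocess.run(["rsync", *args])
    return classify_rsync_exit(result.returncode)
```

A "warning" result still means the vanished files are missing from that backup run, so it is worth logging rather than ignoring.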
I am, however, looking forward to migrating this to a VM so I can take snapshots for backups and get consistent point-in-time backups.
>> I was also thinking of asking about how to provide read-only access to these backup snapshots to the users in some kind of special namespace, so that they could all essentially go 'back in time' to grab any emails that they may have inadvertently deleted...
> This should be possible, just point the namespace to such snapshot. You may need to point CONTROL dir to some temporary directory and index dir as well to either temp or to memory.
This is great news! I'm looking forward to getting this all working.
--
Best regards,
Charles
On 25.3.2012, at 18.12, Charles Marcus wrote:
> On 2012-03-24 9:16 AM, Timo Sirainen tss@iki.fi wrote:
>> On 24.3.2012, at 14.54, Charles Marcus wrote:
>>> On 2012-03-24 8:08 AM, Timo Sirainen tss@iki.fi wrote:
>>>> You can do full backups from a filesystem snapshot, which works "well enough" (might leave some unused attachments lying around in some rare cases, but that can also happen if Dovecot crashes/dies).
>>> But the problem isn't with backups, but with restores, right?
>> Ah, right. Then it gets tricky.
> Yeah, I seem to remember it was a comment like that that scared me about enabling it...
> Can you expand on what exactly is 'tricky' about it? Also, have you given any thought to how to eliminate the 'trickiness'? I'm of the old school and like for my backups to not have any 'trickiness' about them - including performing restores... ;)
It's easy to restore a full backup. And it's easy to restore specific users if you have the full backup easily accessible (just run doveadm import with proper settings pointing to backup). What's difficult is if you just want to restore a specific user from the backup and can't easily do random access to all files. Then you'll first need to restore the user's dbox files and then somehow figure out which attachments to restore from the SIS directory.
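As a hedged sketch of the "easy" case, the Python helper below only builds a `doveadm import` command line for one user, with the source location pointing at the accessible backup copy. The user name, backup path, and destination folder are hypothetical. The "proper settings" mentioned above presumably also cover where the backup's attachment directory lives; the thread does not spell out how to set that, so the sketch leaves it out rather than guess.

```python
from typing import List


def doveadm_import_argv(user: str, backup_location: str,
                        dest_parent: str = "restored") -> List[str]:
    """Build (but do not run) a doveadm import command line.

    backup_location is a Dovecot mail location such as
    'mdbox:/backup/...'; the concrete path is up to the admin.
    'all' is the search query that selects every message.
    """
    return ["doveadm", "import", "-u", user, backup_location, dest_parent, "all"]


if __name__ == "__main__":
    # Hypothetical example values, printed for inspection only.
    print(" ".join(doveadm_import_argv("user@example.com", "mdbox:/backup/user/mdbox")))
```

Printing the command first and running it by hand keeps a mistyped location from touching a live mailbox.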
>>> Am I correct that enabling SIS as it is currently implemented would break this backup tool?
>> I'm not sure. Are you running rsnapshot on live filesystem or on a snapshot? On live filesystem there would be race conditions.
> I've been running it on a live system for a long time, and never had a problem beyond occasional messages like this:
> file has vanished: "/var/vmail/example.com/username/cur/1332602593.Vfe02I9e7acdM308676.myhost.example.com:2,"
> rsync warning: some files vanished before they could be transferred (code 24) at main.c(1052) [sender=3.0.9]
I'd guess that with rsnapshot + Maildir you can get duplicate Maildir files if the rsnapshot is accessing a large maildir at the same time as user is changing a message flag. Dovecot usually notices these duplicates and logs a warning about them.
On 2012-03-28 7:57 PM, Timo Sirainen tss@iki.fi wrote:
> It's easy to restore a full backup. And it's easy to restore specific users if you have the full backup easily accessible (just run doveadm import with proper settings pointing to backup). What's difficult is if you just want to restore a specific user from the backup and can't easily do random access to all files. Then you'll first need to restore the user's dbox files and then somehow figure out which attachments to restore from the SIS directory.
Well, I think I'm not going to worry about this, since you recently said:
On 2012-03-24 9:16 AM, Timo Sirainen tss@iki.fi wrote:
> On 24.3.2012, at 14.54, Charles Marcus wrote:
>> I was also thinking of asking about how to provide read-only access to these backup snapshots to the users in some kind of special namespace, so that they could all essentially go 'back in time' to grab any emails that they may have inadvertently deleted...
> This should be possible, just point the namespace to such snapshot. You may need to point CONTROL dir to some temporary directory and index dir as well to either temp or to memory.
If we really can get these snapshots to automatically show up under a 'Backups' namespace, with each user's folders under each snapshot showing by date, so they can easily 'go back in time' and retrieve anything they want from them, that totally eliminates any need for me to do individual restores... :)
> I'd guess that with rsnapshot + Maildir you can get duplicate Maildir files if the rsnapshot is accessing a large maildir at the same time as user is changing a message flag. Dovecot usually notices these duplicates and logs a warning about them.
This won't be a problem either, because our new system will be performing filesystem snapshots for rsnapshot to use as a source.
Thanks again!
--
Best regards,
Charles