[Dovecot] Safely restoring backups
I'm trying to determine what the best way to restore mail with mdbox is. Restoring using maildir was trivial, I just used rsync --ignore-existing which wrote any mails that were removed and didn't touch things that already existed[1]. With mdbox things have become more complicated, and I haven't found a way to restore mail that doesn't result in many message duplicates.
My backup setup is simple: I do daily and weekly rsync backups of users' mailstores, so my backup server ends up with daily.1, daily.2, daily.3, daily.4, weekly.1, weekly.2... each containing the entire contents of the user's mdbox.
The different restore methods I've tried are:
- I tried rsyncing the different backup directories back to the mail storage host, and then doing:
'dsync -R backup -u $user mdbox:/path/to/to/daily.1/mdbox'[2]
'dsync -R backup -u $user mdbox:/path/to/to/daily.2/mdbox'
This works ok, but for every daily/weekly I dsync, it gives duplicates of every mail that already exists. This is what the rsync --ignore-existing avoided. This is particularly annoying if I restore the weekly and multiple daily directories, because then I get a copy for every one I restore. I had thought that the individual messages' GUIDs would keep them from being duplicated?
- I also tried to use 'doveadm import' in two different ways. In the first, I created a 'restored_from_backups' folder and then imported each of the daily.#, weekly.# mdboxes into its own subfolder within that folder, for example:
'doveadm import -u $username mdbox:/path/to/daily.1/mdbox restored_from_backups/daily1 all'
'doveadm import -u $username mdbox:/path/to/daily.2/mdbox restored_from_backups/daily2 all'
... etc.
I then go through and subscribe the new folders[3] and the user ends up with a folder structure like this:
restored_from_backups/
    daily.1/
        INBOX
        Trash
        folder1
        folder2
    daily.2/
        INBOX
        Trash
        folder1
        folder2
    etc.
This works fine, except that the user ends up with an entire duplicate copy of their mailbox for each daily/weekly that I restore. That will quickly bring people over quota.
- I also tried to use 'doveadm import' to import all the different directories all into the same restored_from_backups directory, so there are no subdirectories for each daily/weekly under restored_from_backups, like so:
'doveadm mailbox create -u $user -s restored_from_backups'
'doveadm -v -D import -u $user mdbox:/path/to/daily.1/mdbox restored_from_backups all'
I then go through and subscribe the folders[3].
Using this method, the 'restored_from_backups' mailbox is created, and populated with the folders. The only problem with this method is the same as method #1: for every backup I restore, mails are duplicated.
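The subscribe step mentioned in both import methods can be sketched as a small dry-run helper. This is only an illustration under assumptions: the function name print_subscribe_commands is hypothetical, the mailbox names are expected one per line on stdin (for example from 'doveadm mailbox list'), and the commands are printed rather than executed so they can be reviewed first. Names containing a single quote are not handled.

```shell
# Hypothetical helper: print one "doveadm mailbox subscribe" command per
# mailbox name read from stdin. Nothing is executed; output is for review.
print_subscribe_commands() {
    user=$1
    while IFS= read -r mailbox; do
        # Skip blank lines so an empty listing produces no commands.
        [ -n "$mailbox" ] || continue
        printf "doveadm mailbox subscribe -u '%s' '%s'\n" "$user" "$mailbox"
    done
}
```

Piping the output to a shell would run the commands; reading it first makes it easy to spot folders that should stay unsubscribed.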
Is there a way I can restore things from backup and avoid duplicates? Is there another method I should try?
thanks for any ideas, pointers, suggestions for improvement, etc.
micah
[1] this would of course bring back mails that were deleted, but that was fine as the user could deal with that.
[2] yes, I know I could setup dsync on the backup server, and use dsync directly to pull the mails from there, but this is complicated in my situation due to how the backupserver works.
[3] why doesn't doveadm import have a -s option to subscribe?
--
On Fri, 2011-10-07 at 11:09 -0400, Micah Anderson wrote:
I'm trying to determine what the best way to restore mail with mdbox is. Restoring using maildir was trivial, I just used rsync --ignore-existing which wrote any mails that were removed and didn't touch things that already existed[1].
If a mail had changed flag, the maildir file got duplicated, which Dovecot complained about if it noticed it.
With mdbox things have become more complicated, and I haven't found a way to restore mail that doesn't result in many message duplicates.
Do you need to restore mails so often that this is really a problem? :)
- I tried rsyncing the different backup directories back to the mail storage host, and then doing:
'dsync -R backup -u $user mdbox:/path/to/to/daily.1/mdbox'[2]
'dsync -R backup -u $user mdbox:/path/to/to/daily.2/mdbox'
This works ok, but for every daily/weekly I dsync, it gives duplicates of every mail that already exists. This is what the rsync --ignore-existing avoided. This is particularly annoying if I restore the weekly and multiple daily directories, because then I get a copy for every one I restore. I had thought that the individual messages' GUIDs would keep them from being duplicated?
GUIDs can be used to identify messages, but there's no automatic deduplication. It's fine to e.g. copy a message from INBOX to INBOX, which duplicates it. Dovecot shouldn't prevent that.
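Since GUIDs identify messages, a listing of them can at least show where a restore produced duplicates. A minimal sketch, assuming a copy keeps the GUID of the message it was made from (not verified here) and that the input is one "mailbox-guid<TAB>guid" pair per line, as a 'doveadm -f tab fetch ... 'mailbox-guid guid' all' listing would give. The function name duplicated_pairs is hypothetical.

```shell
# Hypothetical helper: read "mailbox-guid<TAB>guid" lines on stdin and print
# each pair that occurs more than once, i.e. a GUID repeated within one mailbox.
duplicated_pairs() {
    # uniq -d needs sorted input; LC_ALL=C keeps the ordering byte-wise.
    LC_ALL=C sort | uniq -d
}
```

Each duplicated pair is printed once regardless of how many copies exist, so the output is a list of affected messages, not a count of extra copies.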
Is there a way I can restore things from backup and avoid duplicates? Is there another method I should try?
Here's one way, although somewhat slow (and not fully tested):
doveadm -f tab fetch -u user@domain 'mailbox-guid guid' all | sort > guids1
doveadm -f tab fetch -o mdbox:/backups/user -u user@domain 'mailbox-guid guid' all | sort > guids2
diff -u guids1 guids2 | grep '^+[^+]' | sed 's/^+//' | awk '{ system("doveadm import -u user@domain mdbox:/backups/user restored mailbox-guid "$1" guid "$2); }'
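The comparison stage of the recipe above can also be written with comm(1), which prints the lines found only in the second listing without any diff markers to strip. This is a sketch under assumptions, not a tested restore procedure: the function names missing_in_live and print_import_commands are hypothetical, both listings are expected to hold one "mailbox-guid<TAB>guid" pair per line, and the import commands are only printed so they can be checked before anything is run.

```shell
# Hypothetical helper: print pairs present in the backup listing ($2)
# but absent from the live listing ($1).
missing_in_live() {
    live_sorted=$(mktemp) || return 1
    backup_sorted=$(mktemp) || { rm -f "$live_sorted"; return 1; }
    # comm needs both inputs sorted in the same collation, so re-sort here.
    LC_ALL=C sort -u "$1" > "$live_sorted"
    LC_ALL=C sort -u "$2" > "$backup_sorted"
    # -13 suppresses lines unique to file 1 and lines common to both.
    LC_ALL=C comm -13 "$live_sorted" "$backup_sorted"
    rm -f "$live_sorted" "$backup_sorted"
}

# Hypothetical helper: turn pairs on stdin into doveadm import commands.
# $1 = user, $2 = backup mail location, $3 = destination folder.
print_import_commands() {
    while read -r mailbox_guid guid; do
        # Skip blank or incomplete lines.
        [ -n "$guid" ] || continue
        printf 'doveadm import -u %s %s %s mailbox-guid %s guid %s\n' \
            "$1" "$2" "$3" "$mailbox_guid" "$guid"
    done
}
```

With the guids1 and guids2 files from the recipe, 'missing_in_live guids1 guids2 | print_import_commands user@domain mdbox:/backups/user restored' would list one import per message missing from the live store.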
- why doesn't doveadm import have a -s option to subscribe?
I suppose it could.. Added to v2.1: http://hg.dovecot.org/dovecot-2.1/rev/afec4ceda8e1
participants (2)
- Micah Anderson
- Timo Sirainen