Hi all,
First of all: I'm using Dovecot version 2.3.4.1.
I manage some rather large IMAP mailboxes which I want to back up on a regular basis. Some of them have relatively heavy traffic, and one of them is larger than 30 GB.
I studied the docs for doveadm backup (https://wiki2.dovecot.org/Tools/Doveadm/Sync) and even read some of the source code to better understand the process.
The docs state that using stateful synchronization is the most efficient way to synchronize mailboxes, therefore I chose this approach.
High-level overview:
- store a copy of the whole maildir in a separate directory (/var/vmail/backup)
- back up to this directory once a minute (trying to make the most of the transaction logs), using the last state stored in a file
- create a backup once a day using tar (full, differential and incremental ones), blocking the sync process of the aforementioned step while tar runs
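The choice between full, differential and incremental follows the cron entries at the top of the script: a full backup on the first Saturday of the month, a differential on the other Saturdays, and an incremental on every other day. A minimal sketch of that rule; `backup_mode` is a hypothetical helper for illustration only and is not part of the script:

```shell
# Hypothetical helper mirroring the cron schedule.
# $1 = day of month (1-31)
# $2 = ISO weekday as printed by `date +%u` (1 = Monday ... 7 = Sunday)
backup_mode() {
    if [ "$2" -ne 6 ]; then
        # Any day that is not a Saturday: incremental run
        echo incremental
    elif [ "$1" -le 7 ]; then
        # A Saturday within the first seven days: first Saturday of the month
        echo full
    else
        # Every other Saturday
        echo differential
    fi
}
```

Called as `backup_mode "$(date +%d)" "$(date +%u)"`, it prints the mode that cron would select for today.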
I quite often receive notifications that doveadm backup returned an exit code of 2, which should be quite normal. These notifications look like this:
dsync(another_address@my.domain): Warning: Failed to do incremental sync for mailbox INBOX, retry with a full sync (Modseq 171631 no longer in transaction log (highest=177818, last_common_uid=177308, nextuid=177309))
dsync(another_address@my.domain): Warning: Mailbox changes caused a desync. You may want to run dsync again: Remote lost mailbox GUID e9149d0ae4e02d532505000026ca4352 (maybe it was just deleted?)
Synced another_address@my.domain successfully but missing some changes. Took 3 seconds. Starting retry 1...
The first message seems to say that the transaction log has been rotated and no longer contains the entries that the backup directory's state refers to, right? I considered setting mail_index_log_rotate_min_age to 1 hour so that the transaction logs are rotated less often, but dropped the idea and instead increased the backup frequency to once a minute. The warnings still appear, so maybe my understanding of the transaction logs is wrong. The second message seems less alarming to me.
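To see how far the stored state lags behind when this happens, the two modseq values can be pulled out of the warning text. A minimal sketch, assuming the warning keeps exactly the format shown above; `modseq_gap` is a hypothetical helper name, not something doveadm provides:

```shell
# Hypothetical helper: print the distance between the modseq stored in the
# state and the mailbox's current highest modseq, taken from a dsync
# "no longer in transaction log" warning passed as $1.
modseq_gap() {
    local pair
    # Extract "<stale modseq> <highest modseq>"; stays empty if the format differs.
    pair=$(printf '%s\n' "$1" |
        sed -n 's/.*Modseq \([0-9][0-9]*\) no longer in transaction log (highest=\([0-9][0-9]*\).*/\1 \2/p')
    [ -n "$pair" ] || return 1
    # ${pair#* } is the second number, ${pair% *} the first.
    echo $(( ${pair#* } - ${pair% *} ))
}
```

Fed the first warning quoted above, it prints 6187 (177818 - 171631), i.e. the number of modseq steps between the stored state and the live mailbox at that moment.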
How does doveadm backup behave in such situations? Does it fall back directly to a less efficient way of syncing mail? Does the state record the "retry with a full sync" information so that the next run uses that mode? To investigate this, I simply measured the runtimes and saw that the second (retry) run takes a bit longer (up to about 15 seconds) to sync the directory.
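The retry logic the script relies on boils down to the pattern below. This is a sketch only: `run_with_retry` is a hypothetical name, and treating exit code 2 as "synced, but some changes were missed" is the script's interpretation, not something confirmed here.

```shell
# Hypothetical wrapper: run a command, retrying only while it exits with 2,
# for at most $1 attempts. Exit code 0 or any other failure stops at once.
# Returns the exit code of the last attempt.
run_with_retry() {
    local max_tries="$1" tries=0 rc=0
    shift
    while [ "$tries" -lt "$max_tries" ]; do
        tries=$((tries + 1))
        rc=0
        "$@" || rc=$?
        # Only exit code 2 is worth another attempt.
        [ "$rc" -eq 2 ] || break
    done
    return "$rc"
}
```

Because each attempt has to start from the state written by the previous one, the command passed in should be a function that re-reads the state file on every call, e.g. `run_with_retry 3 sync_once`, where `sync_once` is a placeholder for the doveadm backup invocation.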
I'm afraid of losing messages with my approach. Is it safe to always use doveadm backup -s $state? Simply counting the files of one maildir in the live directory and in the backup copy shows 100 fewer files in the backup directory, although the script has only been running for a few days.
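A raw file count can also pick up Dovecot's index and control files, which need not match between the two trees, so counting only the message files in cur/ and new/ gives a cleaner comparison. A minimal sketch, assuming a standard Maildir layout; `count_maildir_messages` is a hypothetical helper:

```shell
# Hypothetical helper: count message files in the cur/ and new/ directories
# of every folder below the Maildir root given as $1. Files in tmp/ and
# index/control files next to the folders are ignored.
count_maildir_messages() {
    find "$1" -type f \( -path '*/cur/*' -o -path '*/new/*' \) | wc -l | tr -d ' '
}
```

Running it once on the live Maildir and once on the backup copy of the same account gives two numbers that can be compared directly. Messages delivered between the last sync and the count will still show up as a small difference.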
For reference, see my backup script below.
Regards
Christian
#!/bin/bash
# * * * * * /root/bin/backup.sh --sync-only
# 12 2 1-7 * * test $(date +\%u) -eq 6 && /root/bin/backup.sh --full
# 12 2 8-31 * * test $(date +\%u) -eq 6 && /root/bin/backup.sh --differential
# 12 2 * * * test $(date +\%u) -ne 6 && /root/bin/backup.sh

synconly=0
differential=0
fullbackup=0
if [ $# -gt 0 ] ; then
    if [ "$1" == "--sync-only" ] ; then
        synconly=1
    elif [ "$1" == "--differential" ] ; then
        differential=1
    elif [ "$1" == "--full" ] ; then
        fullbackup=1
    fi
fi

basedir="/var/vmail/backup"
targetdir="/var/vmail/backup/done"
mailaddresses="one_address@my.domain
another_address@my.domain
yet_another@my.domain"

if [ ! -d "$basedir" ] ; then
    mkdir -p "$basedir"
    chown vmail:vmail "$basedir"
fi
if [ ! -d "$targetdir" ] ; then
    mkdir -p "$targetdir"
    chown vmail:vmail "$targetdir"
fi

for mailaddr in ${mailaddresses} ; do
    #echo "Creating backup for $mailaddr."

    domainpart=${mailaddr#*@}
    localpart=${mailaddr%%@*}
    lockfile="$basedir/$mailaddr.lock"
    statefile="$basedir/$mailaddr.state"
    backupdir="$domainpart/$localpart/Maildir"
    snapshotfile_full="$basedir/$mailaddr.full.snar"
    snapshotfile="$basedir/$mailaddr.snar"
    backup_basename="$basedir/${mailaddr}_$(date '+%Y%m%d_%H%M%S')"

    (
        if [ $synconly -eq 1 ] ; then
            flock -xn 200
            if [ $? -eq 1 ] ; then
                # failed to acquire lock. Skip mailbox silently.
                exit
            fi
        fi

        # try to acquire exclusive lock for one minute
        flock -xw 60 200
        if [ $? -eq 1 ] ; then
            echo "Failed to acquire write lock within 60 seconds. Skipping $mailaddr."
            exit
        fi

        retries=0
        retval=1

        until [ $retval -eq 0 ] || [ $retries -ge 3 ] ; do
            let 'retries++'
            if [ -f "$statefile" ] ; then
                oldstate=$(head -1 "$statefile")
            else
                oldstate=""
            fi
            start_time=$(date +%s)
            ERROR=$( (doveadm backup -u "$mailaddr" -s "$oldstate" "maildir:$basedir/$backupdir") 2>&1 > "$statefile")
            retval=$?
            end_time=$(date +%s)
            let 'duration=end_time-start_time'
            if [ $retval -eq 2 ] ; then
                #if [ $retries -gt 1 ] ; then
                echo "$ERROR"
                echo "Synced $mailaddr successfully but missing some changes. Took $duration seconds. Starting retry $retries..."
                #fi
            elif [ $retval -ne 0 ] ; then
                echo "$ERROR"
                echo "Syncing $mailaddr failed. Return code $retval. Took $duration seconds. Removing backup directory and starting retry $retries..."
                rm -rf "$basedir/$backupdir"
                rm -f "$statefile" "$snapshotfile"
            elif [ $retries -gt 1 ] ; then
                echo "Successful sync took $duration seconds."
            fi
        done

        # downgrade lock to shared lock
        flock -sn 200
        [ $synconly -eq 1 ] && exit

        if [ $retval -ne 0 ] ; then
            echo "Too many retries. Aborting backup of $mailaddr."
            exit
        fi

        cd "$basedir"
        if [ $fullbackup -eq 1 ] || [ ! -f "$snapshotfile_full" ] ; then
            tar -cpzf "${backup_basename}_full.tar.gz" --level=0 -g "$snapshotfile_full" "$backupdir"
            cp -f "$snapshotfile_full" "$snapshotfile"
        else
            suffix=""
            if [ $differential -eq 1 ] ; then
                cp -f "$snapshotfile_full" "$snapshotfile"
                suffix="_diff"
            fi

            tar -cpzf "${backup_basename}${suffix}.tar.gz" -g "$snapshotfile" "$backupdir"
        fi
        cd - > /dev/null
        mv "${basedir}/"*.tar.gz "$targetdir"
    ) 200>"$lockfile"

    [ $synconly -eq 1 ] && continue

    # housekeeping
    newest_full=$(ls -1 "${targetdir}/${mailaddr}_"*_full.tar.gz 2>/dev/null | sort | tail -1)
    if [ -n "$newest_full" ] ; then
        #echo "Cleaning up files older than $newest_full..."
        find "$targetdir" -depth -maxdepth 1 -name "${mailaddr}_*" ! -newer "$newest_full" ! -samefile "$newest_full" -printf 'Deleting %p...\n' -delete
    fi
done