doveadm stateful backup
Christian
christian+dc at famiru.de
Sun Jan 9 20:57:53 UTC 2022
Hi all,
first: I'm using version 2.3.4.1
I manage some rather large IMAP mailboxes which I want to back up on a
regular basis. Some of them have relatively heavy traffic, and one of
them is larger than 30 GB.
I studied the docs for doveadm backup
(https://wiki2.dovecot.org/Tools/Doveadm/Sync) and even did some code
research to better understand the process.
The docs state that using stateful synchronization is the most efficient
way to synchronize mailboxes, therefore I chose this approach.
High-level overview:
- store a copy of the whole maildir in a separate directory
(/var/vmail/backup)
- back up to this directory once a minute (trying to make the most of
the transaction logs) using the last state stored in a file
- create a backup once a day using tar (full, differential and
incremental ones), which blocks the sync from the previous step while
it runs
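The schedule for the daily tar step is encoded in the crontab comments at the top of the script further down: a full backup on the first Saturday of the month, a differential one on the other Saturdays, and an incremental one on all remaining days. A minimal sketch of that decision as a shell function; the name backup_mode_for_date is only illustrative and not part of the script, and GNU date is assumed for the -d option.

```shell
#!/bin/bash
# Hypothetical helper (not part of the backup script): classify a date
# the same way the crontab lines do. Assumes GNU date for -d.
backup_mode_for_date() {
  local day dow dom
  day=$1
  dow=$(date -d "$day" +%u)   # ISO weekday: 1 = Monday ... 6 = Saturday, 7 = Sunday
  dom=$(date -d "$day" +%d)   # day of month, zero-padded
  dom=${dom#0}                # strip the leading zero so 08 and 09 compare as numbers
  if [ "$dow" -eq 6 ] && [ "$dom" -le 7 ] ; then
    echo "full"
  elif [ "$dow" -eq 6 ] ; then
    echo "differential"
  else
    echo "incremental"
  fi
}

# Example: classify today's date
backup_mode_for_date "$(date +%Y-%m-%d)"
```

Since a full backup only happens once a month, a restore needs the last full archive, plus the last differential one if any, plus every incremental archive made after that.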
I quite often receive notifications that doveadm backup returned an exit
code of 2, which should be quite normal. These notifications look like this:
dsync(another_address at my.domain): Warning: Failed to do incremental sync
for mailbox INBOX, retry with a full sync (Modseq 171631 no longer in
transaction log (highest=177818, last_common_uid=177308, nextuid=177309))
dsync(another_address at my.domain): Warning: Mailbox changes caused a
desync. You may want to run dsync again: Remote lost mailbox GUID
e9149d0ae4e02d532505000026ca4352 (maybe it was just deleted?)
Synced another_address at my.domain successfully but missing some changes.
Took 3 seconds. Starting retry 1...
The first message seems to indicate that the transaction log was rotated
and no longer contains the changes the backup dir is missing, right? I
thought about setting mail_index_log_rotate_min_age to 1hour to prevent
the transaction logs from being rotated too often, but abandoned this
idea and instead increased the backup frequency to once a minute. The
warnings still appear, so maybe my assumptions about transaction logs
are wrong. The second message seems less alarming to me.
How does doveadm backup behave in such situations? Does it directly
fall back to a less efficient way of syncing mails? Or does the state
store the information "retry with a full sync" so that the next run uses
this mode? To investigate this, I simply measured runtimes and saw that
the second (retry) run takes a bit longer (up to about 15 seconds) to
sync the dir.
I'm afraid of losing messages with my approach. Is it safe to always
use doveadm backup -s $state? Simply counting the files of one maildir
in the live directory and in the backup copy shows 100 fewer files in
the backup dir, although the script has only been running for a few days.
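The file count comparison can be sketched roughly as follows. The function names are only illustrative, and the sketch assumes that both trees are in maildir format, so that every message is one file below a cur/ or new/ directory.

```shell
#!/bin/bash
# Hypothetical helper: count message files below a Maildir, looking only at
# cur/ and new/ of the INBOX and of every subfolder (index files, tmp/ and
# control files are ignored).
count_maildir_messages() {
  find "$1" -type f \( -path '*/cur/*' -o -path '*/new/*' \) | wc -l
}

# Print how many more message files the live tree has than the backup tree.
maildir_count_diff() {
  local live_count backup_count
  live_count=$(count_maildir_messages "$1")
  backup_count=$(count_maildir_messages "$2")
  echo $((live_count - backup_count))
}

# Usage: maildir_count_diff <live Maildir> <backup Maildir>
```

A difference alone does not prove lost mail, since messages delivered or expunged between the last sync and the count show up in it as well.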
For reference, see my backup script below.
Regards
Christian
#!/bin/bash
# Crontab entries:
# * * * * * /root/bin/backup.sh --sync-only
# 12 2 1-7 * * test $(date +\%u) -eq 6 && /root/bin/backup.sh --full
# 12 2 8-31 * * test $(date +\%u) -eq 6 && /root/bin/backup.sh --differential
# 12 2 * * * test $(date +\%u) -ne 6 && /root/bin/backup.sh
synconly=0
differential=0
fullbackup=0
if [ $# -gt 0 ] ; then
  if [ "$1" == "--sync-only" ] ; then
    synconly=1
  elif [ "$1" == "--differential" ] ; then
    differential=1
  elif [ "$1" == "--full" ] ; then
    fullbackup=1
  fi
fi
basedir="/var/vmail/backup"
targetdir="/var/vmail/backup/done"
mailaddresses="one_address@my.domain another_address@my.domain yet_another@my.domain"
if [ ! -d "$basedir" ] ; then
  mkdir -p "$basedir"
  chown vmail:vmail "$basedir"
fi
if [ ! -d "$targetdir" ] ; then
  mkdir -p "$targetdir"
  chown vmail:vmail "$targetdir"
fi
for mailaddr in ${mailaddresses} ; do
  #echo "Creating backup for $mailaddr."
  domainpart=${mailaddr#*@}
  localpart=${mailaddr%%@*}
  lockfile="$basedir/$mailaddr.lock"
  statefile="$basedir/$mailaddr.state"
  backupdir="$domainpart/$localpart/Maildir"
  snapshotfile_full="$basedir/$mailaddr.full.snar"
  snapshotfile="$basedir/$mailaddr.snar"
  backup_basename="$basedir/${mailaddr}_$(date '+%Y%m%d_%H%M%S')"
  (
    if [ $synconly -eq 1 ] ; then
      # sync-only runs must not wait: try the lock once, non-blocking
      flock -xn 200
      if [ $? -eq 1 ] ; then
        # failed to acquire lock. Skip mailbox silently.
        exit
      fi
    fi
    # try to acquire exclusive lock for one minute
    flock -xw 60 200
    if [ $? -eq 1 ] ; then
      echo "Failed to acquire write lock within 60 seconds. Skipping $mailaddr."
      exit
    fi
    retries=0
    retval=1
    until [ $retval -eq 0 ] || [ $retries -ge 3 ] ; do
      let 'retries++'
      if [ -f "$statefile" ] ; then
        oldstate=$(head -n 1 "$statefile")
      else
        oldstate=""
      fi
      start_time=$(date +%s)
      # stderr is captured in ERROR, stdout (the new state) goes to the state file
      ERROR=$( (doveadm backup -u "$mailaddr" -s "$oldstate" "maildir:$basedir/$backupdir") 2>&1 > "$statefile" )
      retval=$?
      end_time=$(date +%s)
      let 'duration=end_time-start_time'
      if [ $retval -eq 2 ] ; then
        #if [ $retries -gt 1 ] ; then
        echo "$ERROR"
        echo "Synced $mailaddr successfully but missing some changes. Took $duration seconds. Starting retry $retries..."
        #fi
      elif [ $retval -ne 0 ] ; then
        echo "$ERROR"
        echo "Syncing $mailaddr failed. Return code $retval. Took $duration seconds. Removing backup directory and starting retry $retries..."
        rm -rf "$basedir/$backupdir"
        rm -f "$statefile" "$snapshotfile"
      elif [ $retries -gt 1 ] ; then
        echo "Successful sync took $duration seconds."
      fi
    done
    # downgrade lock to shared lock
    flock -sn 200
    [ $synconly -eq 1 ] && exit
    if [ $retval -ne 0 ] ; then
      echo "Too many retries. Aborting backup of $mailaddr."
      exit
    fi
    cd "$basedir"
    if [ $fullbackup -eq 1 ] || [ ! -f "$snapshotfile_full" ] ; then
      tar -cpzf "${backup_basename}_full.tar.gz" --level=0 -g "$snapshotfile_full" "$backupdir"
      cp -f "$snapshotfile_full" "$snapshotfile"
    else
      suffix=""
      if [ $differential -eq 1 ] ; then
        cp -f "$snapshotfile_full" "$snapshotfile"
        suffix="_diff"
      fi
      tar -cpzf "${backup_basename}${suffix}.tar.gz" -g "$snapshotfile" "$backupdir"
    fi
    cd - > /dev/null
    mv "${basedir}/"*.tar.gz "$targetdir"
  ) 200>"$lockfile"
  [ $synconly -eq 1 ] && continue
  # housekeeping
  newest_full=$(ls -1 "${targetdir}/${mailaddr}_"*_full.tar.gz 2>/dev/null | sort | tail -1)
  if [ -n "$newest_full" ] ; then
    #echo "Cleaning up files older than $newest_full..."
    find "$targetdir" -depth -maxdepth 1 -name "${mailaddr}_*" ! -newer "$newest_full" ! -samefile "$newest_full" -printf 'Deleting %p...\n' -delete
  fi
done
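
The retry loop in the script captures the warnings from doveadm in a variable while the new state goes to the state file. This relies on redirections being processed from left to right: 2>&1 first points stderr at the pipe of the command substitution, and only afterwards is stdout sent to the file. A small self-contained sketch of that idiom; fake_sync is a made-up stand-in that only imitates a command printing a state string on stdout and a warning on stderr and exiting with code 2.

```shell
#!/bin/bash
# Hypothetical stand-in for the real sync command.
fake_sync() {
  echo "STATE-123"
  echo "Warning: something was skipped" >&2
  return 2
}

statefile=$(mktemp)
# The space in "$( (" keeps the shell from reading this as arithmetic "$((".
warnings=$( (fake_sync) 2>&1 > "$statefile" )
retval=$?                    # exit status of the command substitution, here 2
state=$(cat "$statefile")    # stdout of fake_sync ended up in the file
rm -f "$statefile"

echo "exit code: $retval"
echo "captured:  $warnings"
echo "state:     $state"
```

Note that the file is truncated before the command runs, so the old state has to be read into a variable first, as the script does with oldstate.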