doveadm stateful backup

Christian christian+dc at famiru.de
Sun Jan 9 20:57:53 UTC 2022


Hi all,

first: I'm using version 2.3.4.1

I manage some rather large IMAP mailboxes which I want to back up on a 
regular basis. Some of them have relatively heavy traffic, and one of 
them is larger than 30 GB.

I studied the docs for doveadm backup 
(https://wiki2.dovecot.org/Tools/Doveadm/Sync) and even read parts of 
the code to better understand the process.

The docs state that using stateful synchronization is the most efficient 
way to synchronize mailboxes, therefore I chose this approach.

High-level overview:

- store a copy of the whole maildir in a separate directory 
(/var/vmail/backup)
- back up to this directory once a minute (trying to make the most of 
the transaction logs), using the last state stored in a file
- create a backup once a day using tar (full, differential and 
incremental ones), which blocks the sync from the previous step while 
it runs
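
The once-a-minute step boils down to: read the last state, pass it to 
doveadm backup -s, and store the state that is printed on stdout. A 
minimal sketch of that cycle follows. The names stateful_sync and DOVEADM 
are hypothetical, and writing the new state to a temporary file first is 
an assumption on my side, so that a run which prints nothing does not 
wipe the previous state.

```shell
#!/bin/bash
# Hypothetical helper: run one stateful sync for a user.
# Usage: stateful_sync <user> <statefile> <destination>
# DOVEADM is an assumed override hook so the function can be tried with a stub.
stateful_sync() {
  local user=$1 statefile=$2 dest=$3
  local oldstate="" tmp rc=0

  # An empty state string is passed when no state file exists yet.
  [ -f "$statefile" ] && oldstate=$(head -n 1 "$statefile")

  tmp=$(mktemp "${statefile}.XXXXXX") || return 1

  # stdout (the new state) goes to the temp file; stderr is left untouched.
  "${DOVEADM:-doveadm}" backup -u "$user" -s "$oldstate" "$dest" > "$tmp" || rc=$?

  # Replace the old state only if a new one was actually printed.
  if [ -s "$tmp" ]; then
    mv -f "$tmp" "$statefile"
  else
    rm -f "$tmp"
  fi
  return "$rc"
}
```

The caller still has to look at the return code and decide whether to 
retry.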

I quite often receive notifications that doveadm backup returned an exit 
code of 2, which should be quite normal. These notifications look like this:

dsync(another_address@my.domain): Warning: Failed to do incremental sync for mailbox INBOX, retry with a full sync (Modseq 171631 no longer in transaction log (highest=177818, last_common_uid=177308, nextuid=177309))
dsync(another_address@my.domain): Warning: Mailbox changes caused a desync. You may want to run dsync again: Remote lost mailbox GUID e9149d0ae4e02d532505000026ca4352 (maybe it was just deleted?)
Synced another_address@my.domain successfully but missing some changes. Took 3 seconds. Starting retry 1...


The first message seems to say that the transaction log was rotated 
and no longer contains the entries the backup dir was last synced 
against, right? I thought about setting mail_index_log_rotate_min_age to 
1hour to prevent the transaction logs from being rotated too often, but 
abandoned this idea and instead increased the backup frequency to once a 
minute. The warnings still appear, so maybe my assumptions about the 
transaction logs are wrong. The second message seems less alarming to me.

How does doveadm backup behave in such situations? Does it directly 
fall back to a less efficient way of syncing mails? Does the state store 
the information "retry with a full sync", so that the next run uses this 
mode? To investigate this I simply measured runtimes and saw that the 
second (retry) run takes a bit longer (up to about 15 seconds) to sync 
the dir.
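
For clarity, this is how a wrapper could tell the exit codes apart. It is 
only a sketch: sync_action is a made-up helper name, and the meaning of 
the codes (0 = fully synced, 2 = synced but some changes are missing, 
anything else = failure) is how I understand the doveadm-sync man page, 
not something I verified for 2.3.4.1.

```shell
#!/bin/bash
# Hypothetical helper: map a doveadm backup exit code to the action a
# wrapper script might take. Meanings assumed from doveadm-sync(1).
sync_action() {
  case "$1" in
    0) echo "done" ;;    # synchronized without problems
    2) echo "retry" ;;   # finished, but some changes are still missing
    *) echo "failed" ;;  # 1 or anything above 2: treat as an error
  esac
}
```

A retry loop would then call sync_action "$?" right after doveadm backup 
and continue only while the answer is "retry".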

I'm afraid of losing messages with my approach. Is it safe to always 
use doveadm backup -s $state? Simply counting the files of one maildir 
in the live directory and in the backup copy shows 100 fewer files in 
the backup dir, although the script has only been running for a few days.
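
A raw file count may also pick up index and control files (for example 
dovecot-uidlist) and leftovers in tmp/, so a comparison restricted to 
message files is probably more meaningful. A minimal sketch, assuming a 
standard Maildir layout with cur/ and new/ in every folder; the function 
name count_maildir_messages is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: count message files in a Maildir tree.
# Only regular files below a cur/ or new/ directory are counted, so
# index files, dovecot-uidlist and tmp/ contents are ignored.
count_maildir_messages() {
  local root=$1
  find "$root" -type f \( -path '*/cur/*' -o -path '*/new/*' \) | wc -l | tr -d ' '
}
```

Running it once against the live Maildir and once against the backup 
copy gives two numbers that can be compared directly. On a busy mailbox 
they can still differ slightly if mail arrives between the two counts.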

For reference, see my backup script below.


Regards

Christian


#!/bin/bash

# * * * * * /root/bin/backup.sh --sync-only
# 12 2 1-7 * * test $(date +\%u) -eq 6 && /root/bin/backup.sh --full
# 12 2 8-31 * * test $(date +\%u) -eq 6 && /root/bin/backup.sh --differential
# 12 2 * * * test $(date +\%u) -ne 6 && /root/bin/backup.sh

synconly=0
differential=0
fullbackup=0
if [ $# -gt 0 ] ; then
   if [ "$1" == "--sync-only" ] ; then
     synconly=1
   elif [ "$1" == "--differential" ] ; then
     differential=1
   elif [ "$1" == "--full" ] ; then
     fullbackup=1
   fi
fi

basedir="/var/vmail/backup"
targetdir="/var/vmail/backup/done"
mailaddresses="one_address@my.domain another_address@my.domain yet_another@my.domain"

if [ ! -d "$basedir" ] ; then
   mkdir -p "$basedir"
   chown vmail:vmail "$basedir"
fi
if [ ! -d "$targetdir" ] ; then
   mkdir -p "$targetdir"
   chown vmail:vmail "$targetdir"
fi

for mailaddr in ${mailaddresses} ; do
   #echo "Creating backup for $mailaddr."

   domainpart=${mailaddr#*@}
   localpart=${mailaddr%%@*}
   lockfile="$basedir/$mailaddr.lock"
   statefile="$basedir/$mailaddr.state"
   backupdir="$domainpart/$localpart/Maildir"
   snapshotfile_full="$basedir/$mailaddr.full.snar"
   snapshotfile="$basedir/$mailaddr.snar"
   backup_basename="$basedir/${mailaddr}_$(date '+%Y%m%d_%H%M%S')"

   (
     if [ $synconly -eq 1 ] ; then
       flock -xn 200
       if [ $? -eq 1 ] ; then
         # failed to acquire lock. Skip mailbox silently.
         exit
       fi
     fi

     # try to acquire exclusive lock for one minute
     flock -xw 60 200
     if [ $? -eq 1 ] ; then
       echo "Failed to acquire write lock within 60 seconds. Skipping $mailaddr."
       exit
     fi

     retries=0
     retval=1

     until [ $retval -eq 0 ] || [ $retries -ge 3 ] ; do
       let 'retries++'
       if [ -f "$statefile" ] ; then
         oldstate=$(head -1 "$statefile")
       else
         oldstate=""
       fi
       start_time=$(date +%s)
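       # stderr is captured in ERROR, stdout (the new state) goes to the state file.
       # The space in "$( (" avoids ambiguity with arithmetic expansion.
       ERROR=$( (doveadm backup -u "$mailaddr" -s "$oldstate" "maildir:$basedir/$backupdir") 2>&1 > "$statefile")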
       retval=$?
       end_time=$(date +%s)
       let 'duration=end_time-start_time'
       if [ $retval -eq 2 ] ; then
         #if [ $retries -gt 1 ] ; then
           echo "$ERROR"
           echo "Synced $mailaddr successfully but missing some changes. Took $duration seconds. Starting retry $retries..."
         #fi
       elif [ $retval -ne 0 ] ; then
         echo "$ERROR"
         echo "Syncing $mailaddr failed. Return code $retval. Took $duration seconds. Removing backup directory and starting retry $retries..."
         rm -rf "$basedir/$backupdir"
         rm -f "$statefile" "$snapshotfile"
       elif [ $retries -gt 1 ] ; then
         echo "Successful sync took $duration seconds."
       fi
     done

     # downgrade lock to shared lock
     flock -sn 200
     [ $synconly -eq 1 ] && exit

     if [ $retval -ne 0 ] ; then
       echo "Too many retries. Aborting backup of $mailaddr."
       exit
     fi


     cd "$basedir"
     if [ $fullbackup -eq 1 ] || [ ! -f "$snapshotfile_full" ] ; then
      tar -cpzf "${backup_basename}_full.tar.gz" --level=0 -g "$snapshotfile_full" "$backupdir"
      cp -f "$snapshotfile_full" "$snapshotfile"
     else
      suffix=""
      if [ $differential -eq 1 ] ; then
        cp -f "$snapshotfile_full" "$snapshotfile"
        suffix="_diff"
      fi

      tar -cpzf "${backup_basename}${suffix}.tar.gz" -g "$snapshotfile" "$backupdir"
     fi
     cd - > /dev/null
     mv "${basedir}/"*.tar.gz "$targetdir"
   ) 200>"$lockfile"

   [ $synconly -eq 1 ] && continue
   # housekeeping
   newest_full=$(ls -1 "${targetdir}/${mailaddr}_"*_full.tar.gz 2>/dev/null | sort | tail -1)
   if [ -n "$newest_full" ] ; then
     #echo "Cleaning up files older than $newest_full..."
     find "$targetdir" -depth -maxdepth 1 -name "${mailaddr}_*" ! -newer "$newest_full" ! -samefile "$newest_full" -printf 'Deleting %p...\n' -delete
   fi
done


