[Dovecot] dsync is SLOW compared to rsync
Hi all,

We are currently using snapshots and rsync to back up a large mail server to a backup mail server. I have been looking into using dsync to replace rsync in hopes that it would make backups more efficient. I decided to test the performance using a single mailbox. Unfortunately dsync seems to run much slower than rsync. Rsync was able to sync the mailbox in 2 seconds. dsync took over a minute. The test was run so that the source and destination are on the same filesystem.

We would like to use the new replication system, but that doesn't seem likely since the performance of the underlying dsync is so much slower than rsync. Even with the extra work that dsync is doing, I can't believe the difference in performance would be that great. I realize that dsync is actively being worked on, and I hope that bringing attention to this performance issue will provoke some ideas on how to improve it.

Here is the output of the tests using dovecot 2.1.3:
[root@n24 bu]# du -hs /home/10.0.1.101/1009/users/testuser%domain.com/Maildir/
517M /home/10.0.1.101/1009/users/testuser%domain.com/Maildir/
[root@n24 bu]# time rsync -va /home/10.0.1.101/1009/users/testuser%domain.com/Maildir/ .
sending incremental file list
Maildir/
Maildir/dovecot-uidlist
[ ... deleted cruft ... ]
Maildir/cur/1332387577.M381054P27635.n24,S=14215502,W=14448554:2,
Maildir/new/
Maildir/tmp/
sent 540927820 bytes received 1222 bytes 216371616.80 bytes/sec
total size is 540855755 speedup is 1.00
real 0m2.677s
user 0m3.184s
sys 0m1.513s
[root@n24 bu]# time dsync backup -u testuser@domain.com mdbox:/home/bu/testuser
real 1m9.519s
user 1m7.592s
sys 0m1.126s
[root@n24 bu]# time dsync backup -u testuser@domain.com sdbox:/home/bu/testuser2
real 1m2.164s
user 1m0.882s
sys 0m0.993s
Hi,
maybe try "dsync -o mail_fsync=never".
Cheers, Christoph
-- Christoph Bußenius Rechnerbetriebsgruppe der Fakultäten Informatik und Mathematik Technische Universität München +49 89-289-18519 <> Raum 00.05.055 <> Boltzmannstr. 3 <> Garching
On Fri, 2012-03-23 at 19:02 +0100, Christoph Bußenius wrote:
Hi,
maybe try "dsync -o mail_fsync=never".
That didn't seem to make much of a difference. On a 3.1GB backup it
shaved off 5 seconds. dsync's time was over 6 minutes with or without the mail_fsync=never. rsync copied the same 3.1GB mailbox in 15 seconds. It seems to me that dsync *should* be able to be just as fast, but it currently is spending way too much time doing something. What is it? ...Jeff
Jeff Gustafson wrote:
On Fri, 2012-03-23 at 19:02 +0100, Christoph Bußenius wrote:
Hi,
maybe try "dsync -o mail_fsync=never".
That didn't seem to make much of a difference. On a 3.1GB backup it shaved off 5 seconds. dsync's time was over 6 minutes with or without the mail_fsync=never. rsync copied the same 3.1GB mailbox in 15 seconds. It seems to me that dsync *should* be able to be just as fast, but it currently is spending way too much time doing something. What is it? ...Jeff
Next -- bench "cp -ax", against rsync -axHAX when it has to copy >75% of the data (cp ~6-8x speed). But for file speed, 'dd' is king, as it can use large buffers (~16MB gives best results on my local Gbit network), but it misses all those pesky acls and extended attrs, not to mention file perms...*sigh* Compare that to the I/O done 4k at a time by many older utils...
If I'm writing to the LOCAL HD, instead of the network, then a 1GB-4GB buffer size gives best results (1GB/s raid5). Small buffers are such a PITA!
On Fri, 2012-03-23 at 23:12 -0700, Linda Walsh wrote:
Next -- bench "cp -ax", against rsync -axHAX when it has to copy >75% of the data (cp ~6-8x speed). But for file speed, 'dd' is king, as it can use large buffers (~16MB gives best results on my local Gbit network), but it misses all those pesky acls and extended attrs, not to mention file perms...*sigh* Compare that to the I/O done 4k at a time by many older utils...
cp -ax:
real 0m3.088s
user 0m0.034s
sys 0m3.054s
rsync -axHAX:
real 0m15.850s
user 0m19.314s
sys 0m8.816s
dsync's time was over six minutes. Each time I cleared out the
destination folder. dsync is doing something that is taking much, much, much longer to do.
...Jeff
On Fri, 23 Mar 2012, Jeff Gustafson wrote:
That didn't seem to make much of a difference. On a 3.1GB backup it shaved off 5 seconds. dsync's time was over 6 minutes with or without the mail_fsync=never. rsync copied the same 3.1GB mailbox in 15 seconds. It seems to me that dsync *should* be able to be just as fast, but it currently is spending way too much time doing something. What is it?
Syncing 3.1GB in 15 seconds would require a speed of more than 200MB per second. Depending on the harddisks used, that would be quite a challenge. If you use rsync to only transfer the files that changed (based on file modification time) you may or may not miss files that have changed but still have the same time stamp. I assume you didn't use the --checksum parameter to rsync, right?
dsync does so much more than simply copy some files...
-- Maarten
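The throughput figure in this message can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes "3.1GB" means 3100 decimal megabytes, and it treats "over 6 minutes" as exactly 360 seconds, so the dsync number is an upper bound on the rate actually observed.

```python
# Back-of-the-envelope throughput for the figures quoted in this thread.
# Assumption: "3.1GB" is taken as 3100 MB (decimal units).
MAILBOX_MB = 3100


def throughput_mb_per_s(size_mb: float, seconds: float) -> float:
    """Average transfer rate in MB/s for a copy of size_mb taking `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_mb / seconds


if __name__ == "__main__":
    # rsync: 15 seconds, as reported in the thread.
    rsync_rate = throughput_mb_per_s(MAILBOX_MB, 15)
    # dsync: "over 6 minutes"; 360 s is used as a lower bound on the time.
    dsync_rate = throughput_mb_per_s(MAILBOX_MB, 6 * 60)
    print(f"rsync: {rsync_rate:.1f} MB/s")
    print(f"dsync: at most {dsync_rate:.1f} MB/s")
```

A rate of roughly 200 MB/s is plausible only if the data is already in the page cache or the storage is fast, which is why the question about what rsync actually transferred matters.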
On 24/03/2012 13:21, Maarten Bezemer wrote:
On Fri, 23 Mar 2012, Jeff Gustafson wrote:
That didn't seem to make much of a difference. On a 3.1GB backup it
shaved off 5 seconds. dsync's time was over 6 minutes with or without the mail_fsync=never. rsync copied the same 3.1GB mailbox in 15 seconds. It seems to me that dsync *should* be able to be just as fast, but it currently is spending way too much time doing something. What is it?
Syncing 3.1GB in 15 seconds would require a speed of more than 200MB per second. Depending on the harddisks used, that would be quite a challenge.
rsync is only going to transfer files it believes have changed, so the transfer bandwidth will likely be lower.
If you use rsync to only transfer the files that changed (based on file modification time) you may or may not miss files that have changed but still have the same time stamp. I assume you didn't use the --checksum parameter to rsync, right?
Dovecot is not very resilient to a file's contents changing underneath it while the filename stays the same. I have no idea whether that is supposed to work at all, but you might at least expect to see problems if you start doing this.
dsync does so much more than simply copy some files...
Quite probably, but I don't think your expose above illustrates this?
Regards
Ed W
On Sat, 2012-03-24 at 14:21 +0100, Maarten Bezemer wrote:
On Fri, 23 Mar 2012, Jeff Gustafson wrote:
That didn't seem to make much of a difference. On a 3.1GB backup it shaved off 5 seconds. dsync's time was over 6 minutes with or without the mail_fsync=never. rsync copied the same 3.1GB mailbox in 15 seconds. It seems to me that dsync *should* be able to be just as fast, but it currently is spending way too much time doing something. What is it?
Syncing 3.1GB in 15 seconds would require a speed of more than 200MB per second. Depending on the harddisks used, that would be quite a challenge. If you use rsync to only transfer the files that changed (based on file modification time) you may or may not miss files that have changed but still have the same time stamp. I assume you didn't use the --checksum parameter to rsync, right?
The destination directory was empty. I was doing a full backup.
dsync does so much more than simply copy some files...
I realize that. I am hoping that the extra data that dsync has available to it would improve the speed of syncing backups. My baseline testing of simply backing up a mailbox to an empty directory shows that dsync takes way too long to back up a single mailbox. I have over a terabyte of data to back up.
I'm currently using rsync and it must traverse tens of thousands of
files and check the time information. It works, but I was hoping dsync
would be a better solution. dsync should be able to sync faster, by
gulping in the index information for each mailbox. I haven't even moved
to the point of sync'ing since the baseline test of simply exporting a
mailbox is so slow.
...Jeff
On 22.3.2012, at 23.25, Jeff Gustafson wrote:
[root@n24 bu]# time dsync backup -u testuser@domain.com mdbox:/home/bu/testuser
real 1m9.519s
user 1m7.592s
sys 0m1.126s
Most of the time is spent on usermode CPU code. I doubt the problem is dsync itself, most likely the problem is mdbox's saving code. Or possibly index/cache code. Try the same dsync backup for:
- mbox:/tmp/mbox
- mbox:/tmp/mbox:INDEX=MEMORY
- sdbox:/tmp/sdbox
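The observation that most of the time is user-mode CPU can be read straight off the `time` output quoted above. The following is a minimal sketch that parses the `1m9.519s` style values and reports the user and sys shares of wall-clock time; the function names are illustrative and not part of any Dovecot tool.

```python
import re


def parse_time_value(value: str) -> float:
    """Convert a shell `time` value such as '1m9.519s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def cpu_shares(real: str, user: str, sys_time: str) -> dict:
    """Return user and sys CPU time as fractions of wall-clock time."""
    real_s = parse_time_value(real)
    return {
        "user": parse_time_value(user) / real_s,
        "sys": parse_time_value(sys_time) / real_s,
    }


if __name__ == "__main__":
    # Figures from the mdbox run quoted in this thread.
    shares = cpu_shares("1m9.519s", "1m7.592s", "0m1.126s")
    print(f"user: {shares['user']:.1%}  sys: {shares['sys']:.1%}")
```

For the mdbox run, user time is about 97% of the wall-clock time, which points at CPU work inside the process rather than at waiting on disk I/O.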
On Thu, 2012-03-29 at 02:12 +0300, Timo Sirainen wrote:
On 22.3.2012, at 23.25, Jeff Gustafson wrote:
[root@n24 bu]# time dsync backup -u testuser@domain.com mdbox:/home/bu/testuser
real 1m9.519s
user 1m7.592s
sys 0m1.126s
Most of the time is spent on usermode CPU code. I doubt the problem is dsync itself, most likely the problem is mdbox's saving code. Or possibly index/cache code. Try the same dsync backup for:
- mbox:/tmp/mbox
- mbox:/tmp/mbox:INDEX=MEMORY
- sdbox:/tmp/sdbox
My tests show that maildir to mdbox or sdbox backups/conversions take about the same length of time (I noticed maybe a second or two of difference between mdbox and sdbox). On a 3.1GB mailbox either one took about 6 minutes. Rsync, on the other hand, took less than a minute. I will re-run the tests with a maildir to maildir backup and see how long it takes.
...Jeff
On 29.3.2012, at 2.51, Jeff Gustafson wrote:
Most of the time is spent on usermode CPU code. I doubt the problem is dsync itself, most likely the problem is mdbox's saving code. Or possibly index/cache code. Try the same dsync backup for:
- mbox:/tmp/mbox
- mbox:/tmp/mbox:INDEX=MEMORY
- sdbox:/tmp/sdbox
My tests show that maildir to mdbox or sdbox backups/conversions take about the same length of time (I noticed maybe a second or two of difference between mdbox and sdbox). On a 3.1GB mailbox either one took about 6 minutes. Rsync, on the other hand, took less than a minute. I will re-run the tests with a maildir to maildir backup and see how long it takes.
Try also with INDEX=MEMORY, since the problem may be related to updating the indexes.
Another way to test if the problem is dsync or Dovecot's generic mail saving code is to run:
time doveadm -o mail=mdbox:/tmp/mdbox import mdbox:/path/to/real/mdbox "" all
Or if it's the mail reading code:
time doveadm fetch -u user@domain text all > /dev/null
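The isolation tests suggested so far (different destination formats, an in-memory index, and a read-only fetch) could be timed with a small harness. This is a sketch under assumptions: the command lines are copied from the messages in this thread, the helper names are made up for illustration, and when run directly the script only prints the commands rather than executing them. Note that two of the targets share /tmp/mbox, so the destination would need to be cleared between runs.

```python
import shlex
import subprocess
import sys
import time


def build_commands(user: str) -> dict:
    """Argv lists for the tests suggested in this thread (paths as posted)."""
    dsync = ["dsync", "backup", "-u", user]
    return {
        "mbox": dsync + ["mbox:/tmp/mbox"],
        "mbox, INDEX=MEMORY": dsync + ["mbox:/tmp/mbox:INDEX=MEMORY"],
        "sdbox": dsync + ["sdbox:/tmp/sdbox"],
        "read only": ["doveadm", "fetch", "-u", user, "text", "all"],
    }


def time_command(argv: list) -> tuple:
    """Run argv, discard its output, return (elapsed seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode


if __name__ == "__main__":
    # Dry run: show what would be timed. "user@domain" is a placeholder.
    user = sys.argv[1] if len(sys.argv) > 1 else "user@domain"
    for label, argv in build_commands(user).items():
        print(f"{label:20s} {shlex.join(argv)}")
```

To take real measurements, each argv would be passed to `time_command` on a test account, with the destination directory emptied before every run so that each test starts from the same state.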
On Thu, 2012-03-29 at 03:06 +0300, Timo Sirainen wrote:
time doveadm -o mail=mdbox:/tmp/mdbox import mdbox:/path/to/real/mdbox "" all
This tried to write to /root for some reason and failed (dovecot 2.1.3):
# time doveadm -o mail=maildir:/home/bu/test.mdbox import maildir:/home/users/user@domain.com/Maildir "" all
doveadm(root): Error: chdir(/root/) failed: Permission denied (euid=10025(vmail) egid=10025(vmail) missing +x perm: /root, we're not in group 0(root), dir owned by 0:0 mode=0550)
doveadm(root): Error: chdir(/root) failed: Permission denied
doveadm(root): Error: Can't find namespace for mailbox Trash
doveadm(root): Error: Can't find namespace for mailbox test
Or if it's the mail reading code:
time doveadm fetch -u user@domain text all > /dev/null
This ran quicker than a full dsync. Only 40s for 3.1GB. rsync still
beat it clocking in at 16s. I ran the fetch command twice figuring the files would get cached by the OS.
...Jeff
On 29.3.2012, at 3.48, Jeff Gustafson wrote:
On Thu, 2012-03-29 at 03:06 +0300, Timo Sirainen wrote:
time doveadm -o mail=mdbox:/tmp/mdbox import mdbox:/path/to/real/mdbox "" all
This tried to write to /root for some reason and failed (dovecot 2.1.3):
# time doveadm -o mail=maildir:/home/bu/test.mdbox import maildir:/home/users/user@domain.com/Maildir "" all
doveadm(root): Error: chdir(/root/) failed: Permission denied (euid=10025(vmail) egid=10025(vmail) missing +x perm: /root, we're not in group 0(root), dir owned by 0:0 mode=0550)
doveadm(root): Error: chdir(/root) failed: Permission denied
doveadm(root): Error: Can't find namespace for mailbox Trash
doveadm(root): Error: Can't find namespace for mailbox test
Maybe -o mail_home=/tmp parameter makes it happier? Or possibly it needs -u user@domain, but I'd test that first with a test account to make sure it doesn't break the mailbox in case the userdb lookup overrides some fields.
On Thu, 2012-03-29 at 04:07 +0300, Timo Sirainen wrote:
On 29.3.2012, at 3.48, Jeff Gustafson wrote:
On Thu, 2012-03-29 at 03:06 +0300, Timo Sirainen wrote:
time doveadm -o mail=mdbox:/tmp/mdbox import mdbox:/path/to/real/mdbox "" all
This tried to write to /root for some reason and failed (dovecot 2.1.3):
# time doveadm -o mail=maildir:/home/bu/test.mdbox import maildir:/home/users/user@domain.com/Maildir "" all
doveadm(root): Error: chdir(/root/) failed: Permission denied (euid=10025(vmail) egid=10025(vmail) missing +x perm: /root, we're not in group 0(root), dir owned by 0:0 mode=0550)
doveadm(root): Error: chdir(/root) failed: Permission denied
doveadm(root): Error: Can't find namespace for mailbox Trash
doveadm(root): Error: Can't find namespace for mailbox test
Maybe -o mail_home=/tmp parameter makes it happier? Or possibly it needs -u user@domain, but I'd test that first with a test account to make sure it doesn't break the mailbox in case the userdb lookup overrides some fields.
That fixed some errors, but it still is having some sort of trouble
with that command:
# time doveadm -o mail=maildir:/home/bu/user.mdbox import -u user@domain.com maildir:/home/users/user%domain.com/Maildir/ "" all
doveadm(user@domain.com): Error: Can't find namespace for mailbox Trash
doveadm(user@domain.com): Error: Can't find namespace for mailbox test
...Jeff
On 29.3.2012, at 5.07, Jeff Gustafson wrote:
That fixed some errors, but it still is having some sort of trouble with that command:
# time doveadm -o mail=maildir:/home/bu/user.mdbox import -u user@domain.com maildir:/home/users/user%domain.com/Maildir/ "" all
doveadm(user@domain.com): Error: Can't find namespace for mailbox Trash
doveadm(user@domain.com): Error: Can't find namespace for mailbox test
Oh, you don't have prefix="" namespace? If you have e.g. prefix="INBOX." namespace then use:
time doveadm -o mail=maildir:/home/bu/user.mdbox import -u user@domain maildir:/home/users/user%domain.com/Maildir/ INBOX all
On Thu, 29 Mar 2012 07:04:26 +0300, Timo Sirainen wrote:
On 29.3.2012, at 5.07, Jeff Gustafson wrote:
That fixed some errors, but it still is having some sort of trouble with that command:
# time doveadm -o mail=maildir:/home/bu/user.mdbox import -u user@domain.com maildir:/home/users/user%domain.com/Maildir/ "" all
doveadm(user@domain.com): Error: Can't find namespace for mailbox Trash
doveadm(user@domain.com): Error: Can't find namespace for mailbox test
Oh, you don't have prefix="" namespace? If you have e.g. prefix="INBOX." namespace then use:
time doveadm -o mail=maildir:/home/bu/user.mdbox import -u user@domain maildir:/home/users/user%domain.com/Maildir/ INBOX all
Oh! I should have known that was the problem. This was very, very fast. This test is maildir to maildir:
# time doveadm -o mail=maildir:/home/bu/test import -u user@domain.com maildir:/home/users/user%domain.com/Maildir INBOX all
real 0m0.412s
user 0m0.036s
sys 0m0.088s
But it was just as slow to import into mdbox:
# time doveadm -o mail=mdbox:/home/bu/test2 import -u user@domain.com maildir:/home/users/user%domain.com/Maildir INBOX all
real 7m12.738s
user 6m46.161s
sys 0m7.046s
mbox... still pretty fast:
# time doveadm -o mail=mbox:/home/bu/test3 import -u user@domain.com maildir:/home/users/user%domain.com/Maildir INBOX all
real 0m58.534s
user 0m52.264s
sys 0m5.762s
sdbox seems a little on the slow side too:
# time doveadm -o mail=sdbox:/home/bu/test4 import -u user@domain.com maildir:/home/users/user%domain.com/Maildir INBOX all
real 6m11.616s
user 6m6.924s
sys 0m4.579s
Does this information help? It seems that [sm]dbox is on the slow side for the purpose of doing backups.
...Jeff
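To put the four import timings side by side, the short sketch below converts the reported wall-clock times to seconds and expresses each as a multiple of a chosen baseline. The numbers are the "real" values from the message above; the choice of mbox as the baseline and the helper names are only for illustration.

```python
# Wall-clock ("real") import times reported above, as (minutes, seconds).
IMPORT_TIMES = {
    "maildir": (0, 0.412),
    "mdbox": (7, 12.738),
    "mbox": (0, 58.534),
    "sdbox": (6, 11.616),
}


def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a (minutes, seconds) pair into seconds."""
    return minutes * 60 + seconds


def slowdown_vs(baseline: str, times: dict = IMPORT_TIMES) -> dict:
    """Each format's wall-clock time as a multiple of the baseline's time."""
    base = to_seconds(*times[baseline])
    return {fmt: to_seconds(*pair) / base for fmt, pair in times.items()}


if __name__ == "__main__":
    ratios = slowdown_vs("mbox")
    # Print fastest first.
    for fmt, ratio in sorted(ratios.items(), key=lambda item: item[1]):
        secs = to_seconds(*IMPORT_TIMES[fmt])
        print(f"{fmt:8s} {secs:8.1f}s  {ratio:6.2f}x mbox")
```

Against mbox, the sdbox and mdbox imports come out roughly six to seven times slower, while the maildir to maildir import is far faster than any of the others.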
participants (7)
- Christoph Bußenius
- Ed W
- Jeff Gustafson
- Linda Walsh
- Maarten Bezemer
- ncjeffgus
- Timo Sirainen