[Dovecot] repeating dsync - questions
I'm moving/converting email from one system to another
The source system is:
  - Dual core x86_64
  - 6GB memory
  - 180 GB raid1 disks, ext4
  - Fedora 9
  - Dovecot 1.0.15
  - Maildir format

The destination system is:
  - Dual core x86_64
  - 2GB memory
  - 1TB raid1 disks, ext4
  - Fedora 18
  - Dovecot 2.1.15
  - sdbox format
I am moving mail in a series of steps:
1) rsync mail from the source system (hoho4) to the current system (hoho0):

    cd (to mail user home directory)
    mkdir Maildir
    time rsync -arv --times hoho4:/home/bobgus/Maildir/ /home/bobgus/Maildir/

   This step takes about 37 minutes the first time.

2) dsync mirror/convert to sdbox format on the current system:

    time dsync mirror maildir:~/Maildir

   This step takes about 858 minutes (!!)

   Looking at -D messages indicates that dsync is deciding between duplicates much of the time.

   Looking at the results in a mail browser (Evolution), it seems fine, although the latest mails are not there.

3) Pick up more current mail, using the same script as in 1):

    time rsync -arv --times hoho4:/home/bobgus/Maildir/ /home/bobgus/Maildir/

   This step takes about 5 minutes, although this varies depending on the amount of new mail.

4) Repeat the 2) dsync step:

    time dsync mirror maildir:~/Maildir
This is where things look peculiar. When I look at the directory of sdbox/mailboxes, I see duplicated directories
    [bobgus@hoho0 sdbox]$ cd mailboxes
    [bobgus@hoho0 mailboxes]$ ls
    Apple Mail To Do
    Apple Mail To Do_2a47983780615e5179600000ba55d82c
    Deleted Messages
    Deleted Messages_2847983780615e5179600000ba55d82c
    Drafts
    Drafts_2447983780615e5179600000ba55d82c
    Important
    Important_2947983780615e5179600000ba55d82c
    INBOX
    INBOX_1547983780615e5179600000ba55d82c
    Sent
    Sent_2547983780615e5179600000ba55d82c
    Trash
    Trash_2747983780615e5179600000ba55d82c
The 2nd dsync step has not completed yet. I'm wondering whether the extra directory will be magically moved into the older directory.
(This is unlikely because the file names are duplicated in the new directory)
Is this expected behavior? Are there command changes I can make to speed up the process? Eliminate the duplicate directories?
I've found that when going in one direction, using "backup -R" rather than mirror, works better. I'm going from mbox to sdbox, but doing roughly the same thing you are, rsync and then dsync.
Ken A.
On 4/5/2013 10:40 AM, Bob Gustafson wrote:
I'm moving/converting email from one system to another
-- Ken Anderson Pacific Internet - http://www.pacific.net
I tried that a week or so ago, with a 'dsync -R backup', but got the funny named directories, so I read more and am trying the 'dsync mirror' which doesn't require the -R.
How long does the sync step take for you? (Normalize to # of messages..)
Bob G
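To compare timings per message rather than per gigabyte, something like the following could be used. It is only a sketch: it assumes every regular file under a `cur/` or `new/` directory is one message, and the function names `count_maildir_messages` and `seconds_per_message` are made up for illustration.

```shell
#!/bin/sh
# Count messages in a Maildir tree. Assumption: each regular file below a
# cur/ or new/ directory is one message (tmp/ and index files are ignored).
count_maildir_messages() {
    find "$1" -type f \( -path '*/cur/*' -o -path '*/new/*' \) | wc -l | tr -d ' '
}

# Average seconds per message, given elapsed minutes and a message count.
# The count must be non-zero.
seconds_per_message() {
    awk -v min="$1" -v n="$2" 'BEGIN { printf "%.2f\n", (min * 60) / n }'
}

# Usage (paths from the thread; the 858 is the first dsync pass in minutes):
#   n=$(count_maildir_messages /home/bobgus/Maildir)
#   seconds_per_message 858 "$n"
```

The message count for the 858-minute run is not given in the thread, so no concrete per-message figure can be stated here.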
On Apr 5, 2013, at 13:48, Ken A <ka@pacific.net> wrote:
I've found that when going in one direction, using "backup -R" rather than mirror, works better. I'm going from mbox to sdbox, but doing roughly the same thing you are, rsync and then dsync.
Ken A.
It's about 300GB of mboxes (I don't know how many messages) and dsync took about 3 hours from scratch in the destination. But, when I sync more frequently, it's much quicker, and depends on the number of changes. With no changes it runs in about 15 min. Ken A.
On 4/5/2013 1:54 PM, Bob Gustafson wrote:
I tried that a week or so ago, with a 'dsync -R backup', but got the funny named directories, so I read more and am trying the 'dsync mirror' which doesn't require the -R.
How long does the sync step take for you? (Normalize to # of messages..)
Bob G
Something must be wrong with my setup. It took 14+ hours for the first dsync pass and it hasn't finished yet on the rerun to pick up the latest mail (around 12+ hours). I have about 9 GB of mail! The destination system is not fast, but..
Maybe I will wait for the 2.2 release..
Thanks for your response.
Bob G
On Fri, 2013-04-05 at 22:45 -0500, Ken A wrote:
It's about 300GB of mboxes (I don't know how many messages) and dsync took about 3 hours from scratch in the destination. But, when I sync more frequently, it's much quicker, and depends on the number of changes. With no changes it runs in about 15 min. Ken A.
Maybe take a look at "vmstat 2" and see if i/o is blocking a lot or you are hitting swap space? Someone else may be more informed about the inner workings of dsync and how it handles maildir as opposed to mbox. Best of luck, Ken
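A rough way to read the `vmstat 2` output for the two things mentioned (swap use and blocked I/O) is sketched below. It assumes the default Linux procps column order (`r b swpd free buff cache si so ...`); the helper name `vmstat_summary` is made up for illustration.

```shell
#!/bin/sh
# Summarise vmstat samples read from stdin.
# Assumed column order (default procps layout on Linux):
#   r b swpd free buff cache si so bi bo in cs us sy id wa st
vmstat_summary() {
    awk '
        $1 ~ /^[0-9]+$/ {    # data rows only; header rows start with text
            n++
            if ($7 > 0 || $8 > 0) { printf "sample %d: swapping (si=%s so=%s)\n", n, $7, $8 }
            if ($2 > 0)           { printf "sample %d: %s process(es) blocked\n", n, $2 }
        }
        END { printf "%d sample(s) read\n", n }
    '
}

# Usage: take 5 samples, 2 seconds apart, while dsync is running.
#   vmstat 2 5 | vmstat_summary
```

The first vmstat row reports averages since boot, so it is the later samples that reflect what is happening during the dsync run.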
On 4/6/2013 1:09 AM, Bob Gustafson wrote:
Something must be wrong with my setup. It took 14+ hours for the first dsync pass and it hasn't finished yet on the rerun to pick up the latest mail (around 12+ hours). I have about 9 GB of mail! The destination system is not fast, but..
Maybe I will wait for the 2.2 release..
Thanks for your response.
Bob G
I whacked Maildir and sdbox and started over.
Tweak of rsync script - removed trailing / on destination.
Did yum update, restart
Avoided running Firefox on that machine..
It is now running the 1st pass of dsync - estimated finish is about 4.3 hours on 14G of mails
vmstat 2 shows no swapping (now..)
Thanks for your suggestion.
Bob G
On Apr 6, 2013, at 10:19, Ken A <ka@pacific.net> wrote:
Maybe take a look at "vmstat 2" and see if i/o is blocking a lot or you are hitting swap space? Someone else may be more informed about the inner workings of dsync and how it handles maildir as opposed to mbox. Best of luck, Ken
I am still on my quest for a quick way to move mail from a live Maildir system to a 'soon to be live' sdbox system.
I copy Maildir to new system using:

    rsync -ar --times hoho4:/home/bobgus/Maildir/ /home/bobgus/Maildir

Then I convert from Maildir to sdbox with:

    dsync mirror maildir:~/Maildir

Then I copy more messages from live system using rsync.

Then I do the 'dsync mirror maildir:~/Maildir' again.
There were only a few messages that were copied over in the 2nd rsync pass and it went quickly, but the 2nd dsync pass is taking a long time.
Also, I see strange directories in the sdbox directory (see below)
Is this normal? Why is it taking a long time? (debug is set..?) See dovecot -n below
Thanks for your time
Bob G
    [bobgus@hoho0 ~]$ du -h sdbox
    4.5G    sdbox/mailboxes/Sent_09e4633955496151c51a0000ba55d82c/dbox-Mails
    4.5G    sdbox/mailboxes/Sent_09e4633955496151c51a0000ba55d82c
    8.0K    sdbox/mailboxes/Apple Mail To Do_0ee4633955496151c51a0000ba55d82c/dbox-Mails
    12K     sdbox/mailboxes/Apple Mail To Do_0ee4633955496151c51a0000ba55d82c
    358M    sdbox/mailboxes/Drafts/dbox-Mails
    358M    sdbox/mailboxes/Drafts
    4.5G    sdbox/mailboxes/INBOX_f9e3633955496151c51a0000ba55d82c/dbox-Mails
    4.5G    sdbox/mailboxes/INBOX_f9e3633955496151c51a0000ba55d82c
    88K     sdbox/mailboxes/Important/dbox-Mails
    92K     sdbox/mailboxes/Important
    ...
    ...
    [bobgus@hoho0 ~]$ dovecot -n
    # 2.1.15: /etc/dovecot/dovecot.conf
    # OS: Linux 3.8.5-201.fc18.x86_64 x86_64 Fedora release 18 (Spherical Cow)
    auth_debug = yes
    auth_mechanisms = plain login cram-md5
    auth_verbose = yes
    disable_plaintext_auth = no
    first_valid_gid = 1000
    first_valid_uid = 1000
    mail_debug = yes
    mail_location = sdbox:~/sdbox
    managesieve_notify_capability = mailto
    managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date ihave
    mbox_write_locks = fcntl
    namespace inbox {
      inbox = yes
      location =
      mailbox Drafts {
        special_use = \Drafts
      }
      mailbox Junk {
        special_use = \Junk
      }
      mailbox Sent {
        special_use = \Sent
      }
      mailbox "Sent Messages" {
        special_use = \Sent
      }
      mailbox Trash {
        special_use = \Trash
      }
      prefix =
      separator = /
    }
    passdb {
      args = scheme=CRYPT username_format=%u /etc/dovecot/users
      driver = passwd-file
    }
    plugin {
      sieve = ~/.dovecot.sieve
      sieve_dir = ~/sieve
    }
    protocols = imap
    service auth {
      unix_listener auth-userdb {
        mode = 0777
      }
      user = root
    }
    service imap-login {
      vsz_limit = 128 M
    }
    service imap {
      vsz_limit = 768 M
    }
    service managesieve-login {
      vsz_limit = 128 M
    }
    service managesieve {
      vsz_limit = 768 M
    }
    ssl = required
    ssl_cert = </etc/pki/dovecot/certs/dovecot.pem
    ssl_key = </etc/pki/dovecot/private/dovecot.pem
    ssl_require_crl = no
    userdb {
      args = username_format=%u /etc/dovecot/users
      driver = passwd-file
    }
    [bobgus@hoho0 ~]$
On 7.4.2013, at 17.12, Bob Gustafson <bobgus@rcn.com> wrote:
I am still on my quest for a quick way to move mail from a live Maildir system to a 'soon to be live' sdbox system.
I copy Maildir to new system using: rsync -ar --times hoho4:/home/bobgus/Maildir/ /home/bobgus/Maildir
Then I convert from Maildir to sdbox with: dsync mirror maildir:~/Maildir
Then I copy more messages from live system using rsync
^^ that is the mistake
Then I do the 'dsync mirror maildir:~/Maildir' again
There were only a few messages that were copied over in the 2nd rsync pass and it went quickly, but the 2nd dsync pass is taking a long time.
The second rsync is overwriting all the metadata changes (mailbox GUIDs most importantly) that the first dsync run did.
Also, I see strange directories in the sdbox directory (see below)
Also caused by the same thing.
v2.2 dsync should be able to handle this much better, but in general you shouldn't be mixing rsync and dsync in that way. You could for example install Dovecot v2.1 dsync to the source server (could even be under /tmp by compiling from sources) and then do the conversion directly from source server maildir to destination server sdbox.
On Sun, 2013-04-07 at 20:50 +0300, Timo Sirainen wrote:
On 7.4.2013, at 17.12, Bob Gustafson <bobgus@rcn.com> wrote:
I am still on my quest for a quick way to move mail from a live Maildir system to a 'soon to be live' sdbox system.
I copy Maildir to new system using: rsync -ar --times hoho4:/home/bobgus/Maildir/ /home/bobgus/Maildir
Then I convert from Maildir to sdbox with: dsync mirror maildir:~/Maildir
Then I copy more messages from live system using rsync
^^ that is the mistake
I guess I have a basic misunderstanding of what 'dsync mirror' is doing.
My understanding is that going from Maildir to sdbox, dsync does not mess with the data in Maildir. The Maildir metadata is in one form and the sdbox metadata is in another form (in the sdbox directory).
No new email messages enter the sdbox system to be 'mirrored' to the Maildir system.
I thought of using the 'dsync backup' command, but the sentence "Any changes done in destination are discarded." seems to indicate that each time 'dsync backup' is done, it starts from the beginning. No incremental backup (but this is done in 2.2 ?)
Then I do the 'dsync mirror maildir:~/Maildir' again
There were only a few messages that were copied over in the 2nd rsync pass and it went quickly, but the 2nd dsync pass is taking a long time.
The second rsync is overwriting all the metadata changes (mailbox GUIDs most importantly) that the first dsync run did.
Why does dsync mess with the Maildir metadata? Won't that just confuse the dovecot running on the Maildir system?
Also, I see strange directories in the sdbox directory (see below)
Also caused by the same thing.
v2.2 dsync should be able to handle this much better, but in general you shouldn't be mixing rsync and dsync in that way. You could for example install Dovecot v2.1 dsync to the source server (could even be under /tmp by compiling from sources) and then do the conversion directly from source server maildir to destination server sdbox.
I used rsync because I really don't want the source system messed with. This has been a learning experience with the possibility (and reality) of starting over on the destination system by doing 'rm -rf Maildir' and 'rm -rf sdbox'.
Until I figure it all out, I want that option.
This also means that the 2nd (and nth) spin of (rsync; dsync) needs to take less time, approaching the mean time between emails (although I can disconnect from ISP to do the last batch and then switch dovecots)
On 8.4.2013, at 0.10, Bob Gustafson <bobgus@rcn.com> wrote:
I am still on my quest for a quick way to move mail from a live Maildir system to a 'soon to be live' sdbox system.
I copy Maildir to new system using: rsync -ar --times hoho4:/home/bobgus/Maildir/ /home/bobgus/Maildir
Then I convert from Maildir to sdbox with: dsync mirror maildir:~/Maildir
Then I copy more messages from live system using rsync
^^ that is the mistake
I guess I have a basic misunderstanding of what 'dsync mirror' is doing.
My understanding is that going from Maildir to sdbox, dsync does not mess with the data in Maildir. The Maildir metadata is in one form and the sdbox metadata is in another form (in the sdbox directory).
dsync does mess with metadata in the maildir. With dsync mirror (as opposed to dsync backup) it can also modify the contents. The main problem here is:
- dsync sees that folder A in maildir doesn't have a GUID (because dsync is just about the only tool that uses it right now), and assigns the mailbox a new GUID
- dsync syncs the mailbox to sdbox with that GUID
- rsync comes and wipes out the maildir's dovecot-uidlist that contained the GUID
- the second dsync sees that folder A in maildir doesn't have a GUID, and assigns a new GUID to it
- now maildir has folder A with GUID 1, and sdbox has folder A with GUID 2
- dsync thinks they are two different folders, and duplicates them as A and A_2. The A_2 also gets copied back to maildir, because you're using dsync mirror. This is why the second dsync is slow: it's doing all the work again, and in fact twice the work, since it's copying the mails from sdbox to maildir as well.
v2.2 dsync is somewhat smarter and can figure out that they are actually the same folder A and it simply changes the other's GUID instead of duplicating all data.
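To see which mailboxes were hit by the GUID mismatch described above, the duplicated directories can be listed with a small script. This is a heuristic sketch based only on the naming pattern shown in the `ls` output in this thread (a name, an underscore, then 32 hex digits); `find_guid_duplicates` is a made-up helper name, and a legitimately named mailbox could in principle match too.

```shell
#!/bin/sh
# List directories named "<name>_<32 hex digits>" whose plain "<name>"
# sibling also exists in the same parent directory.
find_guid_duplicates() {
    root=$1
    for dir in "$root"/*_*; do
        [ -d "$dir" ] || continue
        name=${dir##*/}        # directory name without the parent path
        suffix=${name##*_}     # text after the last underscore
        base=${name%_*}        # text before the last underscore
        case $suffix in
            *[!0-9a-f]*) continue ;;   # contains a non-hex character
        esac
        [ "${#suffix}" -eq 32 ] || continue
        if [ -d "$root/$base" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Usage (path from the thread):
#   find_guid_duplicates ~/sdbox/mailboxes
```

The script only reports candidates; it does not merge or delete anything.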
No new email messages enter the sdbox system to be 'mirrored' to the Maildir system.
I thought of using the 'dsync backup' command, but the sentence "Any changes done in destination are discarded." seems to indicate that each time 'dsync backup' is done, it starts from the beginning. No incremental backup (but this is done in 2.2 ?)
dsync backup is incremental. it just wipes out any changes done at the other side (if there happens to be any).
Then I do the 'dsync mirror maildir:~/Maildir' again
There were only a few messages that were copied over in the 2nd rsync pass and it went quickly, but the 2nd dsync pass is taking a long time.
The second rsync is overwriting all the metadata changes (mailbox GUIDs most importantly) that the first dsync run did.
Why does dsync mess with the Maildir metadata? Won't that just confuse the dovecot running on the Maildir system?
Incremental dsync doesn't work (well) without additional metadata.
Ok, see interspersed.
On Mon, 2013-04-08 at 00:53 +0300, Timo Sirainen wrote:
dsync does mess with metadata in the maildir. With dsync mirror (as opposed to dsync backup) it can also modify the contents. The main problem here is:
- dsync sees that folder A in maildir doesn't have a GUID (because dsync is just about the only tool that uses it right now), and assigns the mailbox a new GUID
- dsync syncs the mailbox to sdbox with that GUID
- rsync comes and wipes out the maildir's dovecot-uidlist that contained the GUID
- the second dsync sees that folder A in maildir doesn't have a GUID, and assigns a new GUID to it
- now maildir has folder A with GUID 1, and sdbox has folder A with GUID 2
- dsync thinks they are two different folders, and duplicates them as A and A_2. The A_2 also gets copied back to maildir, because you're using dsync mirror. This is why the second dsync is slow: it's doing all the work again, and in fact twice the work, since it's copying the mails from sdbox to maildir as well.
v2.2 dsync is somewhat smarter and can figure out that they are actually the same folder A and it simply changes the other's GUID instead of duplicating all data.
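One way to see whether an rsync pass has clobbered the GUID is to read it from the dovecot-uidlist header before and after. This is only a sketch: the function name `uidlist_guid` is made up here, and it assumes the version 3 header layout ("3 V<uidvalidity> N<next_uid> G<guid>") that current Dovecot writes. An older uidlist with no G field simply prints nothing.

```shell
#!/bin/sh
# Print the mailbox GUID stored in a Maildir folder's dovecot-uidlist header.
# Assumption: version 3 header, e.g.
#   3 V1275660208 N25022 G3085f01b7f11094c501100008c4a11c1
# uidlist_guid is a hypothetical helper, not a Dovecot tool.
uidlist_guid() {
    # $1: path to a Maildir folder that contains dovecot-uidlist
    # Split the first line into fields and print the one starting with G,
    # with the G prefix removed.
    head -n 1 "$1/dovecot-uidlist" | tr ' ' '\n' | sed -n 's/^G//p'
}

# Possible use around the second rsync (paths are examples):
#   before=$(uidlist_guid "$HOME/Maildir")
#   ... run rsync ...
#   after=$(uidlist_guid "$HOME/Maildir")
#   [ "$before" = "$after" ] || echo "mailbox GUID changed or lost" >&2
```

If the value differs or disappears after the rsync, the next dsync run will most likely treat the folder as a new one, as described in the steps above.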
Ok, how is this for a scheme:
- Initially rsync Maildir to destination system
- Do initial 'dsync backup' from Maildir to sdbox on destination system
- Do a 2nd rsync of new Maildir data from live system to destination system, but don't copy the old dovecot-uidlist still in the Maildir of the live system: rsync -ar --times --exclude=dovecot-uidlist <live sys Maildir> <dest system Maildir>
I could also use the --ignore-existing option
In theory, rsync would not touch the dovecot-uidlist file, would not touch the existing message files, but would copy over the new messages received during the time 'dsync backup' was doing its previous run.
- Do a 2nd 'dsync backup' from the Maildir to the sdbox (which hasn't changed since the 1st 'dsync backup')
No new email messages enter the sdbox system to be 'mirrored' to the Maildir system.
I thought of using the 'dsync backup' command, but the sentence "Any changes done in destination are discarded." seems to indicate that each time 'dsync backup' is done, it starts from the beginning. No incremental backup (but this is done in 2.2 ?)
dsync backup is incremental. It just wipes out any changes done on the other side (if there happen to be any).
On 8.4.2013, at 1.32, Bob Gustafson <bobgus@rcn.com> wrote:
Ok, how is this for a scheme:
- Initially rsync Maildir to destination system
- Do initial 'dsync backup' from Maildir to sdbox on destination system
- Do a 2nd rsync of new Maildir data from live system to destination system, but don't copy the old dovecot-uidlist still in the Maildir of the live system: rsync -ar --times --exclude=dovecot-uidlist <live sys Maildir> <dest system Maildir>
I could also use the --ignore-existing option
In theory, rsync would not touch the dovecot-uidlist file, would not touch the existing message files, but would copy over the new messages received during the time 'dsync backup' was doing its previous run.
- Do a 2nd 'dsync backup' from the Maildir to the sdbox (which hasn't changed since the 1st 'dsync backup')
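Without rsync --delete you may end up with duplicates if message flags changed in the meantime. Maildir keeps the flags in the filename, so a flag change renames the file; rsync then copies the renamed file while the old copy stays on the destination.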
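A rough way to check for such duplicates after an rsync pass is to look for message base names that appear in cur/ under more than one flag suffix. This is a sketch only: `flag_duplicates` is a made-up helper name, and it assumes the usual Maildir ":2,<flags>" filename suffix.

```shell
#!/bin/sh
# List message base names that occur more than once in a folder's cur/,
# i.e. the same message stored under two different flag suffixes.
# Assumption: standard Maildir names of the form <base>:2,<flags>.
# flag_duplicates is a hypothetical helper, not part of rsync or Dovecot.
flag_duplicates() {
    # $1: path to a Maildir folder
    for f in "$1"/cur/*; do
        [ -e "$f" ] || continue          # empty directory: glob did not match
        name=${f##*/}                    # strip the directory part
        printf '%s\n' "${name%%:2,*}"    # strip the ":2,<flags>" suffix
    done | sort | uniq -d                # keep only names seen 2+ times
}

# Possible use (path is an example):
#   flag_duplicates "$HOME/Maildir"
```

Any name it prints is a message that exists twice on the destination and would be imported twice by the next dsync run.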
There's also another potential problem. Since you now don't update the dovecot-uidlist, the message UIDs may change. Some clients cache messages by their UID. These clients may lose messages or show wrong messages to users. So for example if:
- rsync + dsync is run to sdbox. dovecot-uidlist now says that next_uid=123
- Maildir receives mail A that gets assigned UID 123
- Maildir receives mail B that gets assigned UID 124
- User deletes mail A
- rsync is run, which copies the new mail B
- dsync is run, which notices a new mail B, and assigns it a new UID 123
- You switch user to new Dovecot
- dbox receives a new mail C, which gets assigned UID 124
- User's client is now pretty much completely confused about what UIDs 123 and 124 contain. User may see different mails as subject and body. User may not even see the mail B anymore without a client cache rebuild.
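The scenario above can be replayed with two plain counters, which makes the collision easier to see. Nothing here touches Dovecot; the variable names are invented for the illustration and the starting value 123 is the one from the example.

```shell
#!/bin/sh
# Replay of the UID scenario with shell arithmetic only (no Dovecot involved).
# All variable names are made up for this illustration.
live_next=123   # next_uid on the live Maildir after the first rsync + dsync
dest_next=123   # next_uid recorded on the destination at the same point

# Live system: mail A and mail B arrive and get consecutive UIDs.
uid_a_live=$live_next; live_next=$((live_next + 1))
uid_b_live=$live_next; live_next=$((live_next + 1))

# User deletes A. rsync copies only B, and the destination's
# dovecot-uidlist is not updated, so dsync numbers B from dest_next.
uid_b_dest=$dest_next; dest_next=$((dest_next + 1))

# After the switch, mail C arrives on the destination.
uid_c_dest=$dest_next; dest_next=$((dest_next + 1))

echo "live system: A=$uid_a_live B=$uid_b_live"
echo "destination: B=$uid_b_dest C=$uid_c_dest"
```

A client that cached A under one UID and B under the next now finds B and C under those same two UIDs, which is the confusion described in the last step.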
Yet another possibility would be to use dsync to migrate the mails using IMAP protocol rather than from Maildir directly: http://wiki2.dovecot.org/Migration/Dsync
My need at the moment is only a *one* time deal.
I just need to copy/convert all of the messages over to the new system and new (sdbox) format *once*. Then all of the clients can start from zero to build their caches based on the new mail box (not that many clients).
I will take a look at the IMAP copy process - maybe that would be simpler in the long run. No rsync needed (as long as the source system is not changed by the IMAP copy process)
Thanks much for your comments and suggestions
Bob G
In reply to Timo Sirainen's message of Mon, 2013-04-08 at 13:57 +0300:
OK, success. Timings (real) are listed after each command.
Initial copy of Maildir from live system to test sys (14G of data):
rsync -ar --times hoho4:/home/bobgus/Maildir/ /home/bobgus/Maildir
real 37m
Then 1st dsync:
dsync -R backup maildir:~/Maildir
real 828m
Then 2nd rsync to pick up new mail - **don't touch existing files**:
rsync -ar --times --ignore-existing hoho4:/home/bobgus/Maildir/ /home/bobgus/Maildir
real 3m
Then 2nd dsync:
dsync -R backup maildir:~/Maildir
real 12m
The --ignore-existing option on the 2nd rsync allows dsync to process the additional emails in a reasonable amount of time.
The dovecot-uidlist, which dsync modifies in the Maildir, is not overwritten by the 2nd rsync, and therefore the 2nd dsync just processes the added messages. (There were no deletes between rsync runs)
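For anyone repeating this, the four steps can be strung together roughly as follows. This is a sketch, not a tested migration tool: the commands are the ones quoted in this thread, while the names SRC, DST, DRY_RUN, run and migrate are invented here. It defaults to printing the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the rsync / dsync / rsync / dsync sequence from this thread.
# SRC, DST, DRY_RUN, run and migrate are hypothetical names; the rsync and
# dsync command lines are the ones reported above.
SRC=${SRC:-hoho4:/home/bobgus/Maildir/}
DST=${DST:-/home/bobgus/Maildir}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

migrate() {
    # Each step runs only if the previous one succeeded.
    run rsync -ar --times "$SRC" "$DST" &&
    run dsync -R backup "maildir:$DST" &&
    run rsync -ar --times --ignore-existing "$SRC" "$DST" &&
    run dsync -R backup "maildir:$DST"
}

# Print the plan:   migrate
# Execute it:       DRY_RUN=0; migrate
```

As noted earlier in the thread, --ignore-existing does not handle deletions or flag changes made on the live system between the two rsync runs, so this only fits a one-time move where clients rebuild their caches afterwards.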
Thanks much for your hints and comments.
Bob G
participants (3): Bob Gustafson, Ken A, Timo Sirainen