[Dovecot] Maildirs location migration
Hello Timo,
I'm using dovecot-2.1.15 with Maildir mailboxes located on an NFS filer. Postfix is used to deliver mail through procmail (via "mailbox_command") as LDA.
Our team has bought another filer and everything is meant to go on it, so we're planning to migrate those mailboxes to the new filer.
To test it on only a subset of users, I ended up with the following solution:
- keep new messages in postfix mailqueue for those users (via a service and a transport)
- prevent new dovecot authentication for those users (via the "auth-deny" passwd-file type passdb)
- doveadm kick those users
- copy or rsync the mailboxes
- symlink maildir, control and indexes directories to the new filer (nfs mounted on the mail server)
- re-enable mail delivery and imap authentication
I was wondering if such a migration could be done differently in order to be seamless to the user :
for the postfix part, it would be easy, I guess, to create a new service (and a new transport map using this service) to deliver to the new location
but for the imap part, I'm not sure if it can be done. My guess is that, if feasible, it would involve some namespace settings and a dsync copy but I cannot figure out how exactly.
What do you think ?
Thanks
-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Groupe Exploitation et Infrastructure
On 3.4.2013, at 18.42, Thomas Hummel <hummel@pasteur.fr> wrote:
I'm using dovecot-2.1.15 with Maildir mailboxes located on an NFS filer. Postfix is used to deliver mail through procmail (via "mailbox_command") as LDA.
Our team has bought another filer and everything is meant to go on it, so we're planning to migrate those mailboxes to the new filer. .. I was wondering if such a migration could be done differently in order to be seamless to the user :
for the postfix part, it would be easy, I guess, to create a new service (and a new transport map using this service) to deliver to the new location
but for the imap part, I'm not sure if it can be done. My guess is that, if feasible, it would involve some namespace settings and a dsync copy but I cannot figure out how exactly.
http://wiki2.dovecot.org/Tools/Dsync#example_converting works for moving mailboxes as well as converting. It works even while procmail is used to deliver mails.
On Thu, Apr 04, 2013 at 10:27:57PM +0300, Timo Sirainen wrote:
http://wiki2.dovecot.org/Tools/Dsync#example_converting works for moving mailboxes as well as converting. It works even while procmail is used to deliver mails.
Thanks. I guess it works for conversion from Maildir to Maildir somewhere else too, right?
So basically, it works like moving conventional data with rsync in two phases (an initial copy, then a sync of the (hopefully small) remainder to minimize «downtime» or an incomplete state), except that dsync is used because it knows the mailbox format and Dovecot's metadata, right?
-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Groupe Exploitation et Infrastructure
On Wed, Apr 10, 2013 at 06:28:47PM +0200, Thomas Hummel wrote:
So basically, it works like moving conventional data with rsync in two phases (an initial copy, then a sync of the (hopefully small) remainder to minimize «downtime» or an incomplete state), except that dsync is used because it knows the mailbox format and Dovecot's metadata, right?
Isn't there still a critical section: new imap connections could be created (if auth is not temporarily denied for this user) while the final sync has yet to start or finish?
Thanks.
-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Groupe Exploitation et Infrastructure
On Wed, 2013-04-10 at 18:59 +0200, Thomas Hummel wrote:
On Wed, Apr 10, 2013 at 06:28:47PM +0200, Thomas Hummel wrote:
So basically, it works like moving conventional data with rsync in two phases (an initial copy, then a sync of the (hopefully small) remainder to minimize «downtime» or an incomplete state), except that dsync is used because it knows the mailbox format and Dovecot's metadata, right?
Isn't there still a critical section: new imap connections could be created (if auth is not temporarily denied for this user) while the final sync has yet to start or finish?
Not if you kick the users out at the correct time:
- dsync
- switch user to new format
- kick users
- final dsync
It doesn't matter if new connections arrive during the final dsync, because they are using the new format already. dsync merges changes, it doesn't destroy any changes.
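The ordering above can be illustrated with a toy model. This is only a sketch under loud assumptions: `ToyMailbox` and `toy_two_way_sync` are hypothetical names, the "merge" is a plain union of messages and flags, and deletions are not modelled at all. It is not how dsync is implemented; it only shows why a change made on either side before the final sync is not lost, and why the user may briefly see a stale state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMailbox:
    # uid -> set of IMAP-style flags; purely illustrative, not Dovecot's storage
    flags: dict = field(default_factory=dict)


def toy_two_way_sync(a: ToyMailbox, b: ToyMailbox) -> None:
    """Merge by union: every message and every flag seen on either side survives.

    Deletions and flag removals are deliberately not modelled.
    """
    for uid in set(a.flags) | set(b.flags):
        merged = a.flags.get(uid, set()) | b.flags.get(uid, set())
        a.flags[uid] = set(merged)
        b.flags[uid] = set(merged)


def run_migration():
    old, new = ToyMailbox({1: set()}), ToyMailbox()
    toy_two_way_sync(old, new)         # 1. initial dsync
    old.flags[1].add("\\Seen")         #    user reads mail 1 in the old location
    # 2. switch the user to the new location, 3. kick the user
    stale_view = {uid: set(f) for uid, f in new.flags.items()}
    new.flags[1].add("\\Flagged")      #    user reconnects and flags mail 1
    toy_two_way_sync(old, new)         # 4. final dsync merges both changes
    return stale_view, new.flags


stale_view, final_view = run_migration()
print("before final sync:", stale_view)
print("after final sync: ", final_view)
```

In this model the reconnecting user sees mail 1 without `\Seen` until step 4 completes, after which both the old-side and the new-side change are present.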
On Wed, Apr 10, 2013 at 09:21:40PM +0300, Timo Sirainen wrote:
Not if you kick the users out at the correct time:
- dsync
- switch user to new format
- kick users
- final dsync
It doesn't matter if new connections arrive during the final dsync, because they are using the new format already. dsync merges changes, it doesn't destroy any changes.
It doesn't destroy changes but the user may see an incorrect state for a small amount of time, doesn't he ?
For instance (using dsync to change the Maildir location from filer1 to filer2):
- Maildir in source: message tagged as New
- initial dsync
- user reads the message in the source; the message is now tagged as Read
- switch user to Maildir in destination
- kick user
- user reconnects and sees, in destination, the message he just read tagged as New as long as the final dsync is not finished
?
-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Groupe Exploitation et Infrastructure
On 11.4.2013, at 13.07, Thomas Hummel <hummel@pasteur.fr> wrote:
It doesn't matter if new connections arrive during the final dsync, because they are using the new format already. dsync merges changes, it doesn't destroy any changes.
It doesn't destroy changes but the user may see an incorrect state for a small amount of time, doesn't he ?
For a small amount of time, yes.
For instance (using dsync to change the Maildir location from filer1 to filer2):
- Maildir in source: message tagged as New
- initial dsync
- user reads the message in the source; the message is now tagged as Read
- switch user to Maildir in destination
- kick user
- user reconnects and sees, in destination, the message he just read tagged as New as long as the final dsync is not finished
?
Which is probably a few seconds, so I don't see this as much of a problem.
On Thu, Apr 11, 2013 at 01:09:18PM +0300, Timo Sirainen wrote:
the user may see an incorrect state for a small amount of time, doesn't he ?
[...]
For a small amount of time, yes.
[...]
Which is probably a few seconds, so I don't see this as much of a problem.
Well, isn't the time needed to walk the filesystem (to find out what has to be synced) irreducible, as with rsync? In that case it would take more than a few seconds on a large mailbox (I'm testing, but under more complex conditions).
Is dsync, for that matter, faster than rsync (maybe because it uses dovecot-uidlist or similar)?
Besides, how about client side indexing while in this incoherent, not yet fully sync'ed state ? Wouldn't there be corruption risk ?
Thanks.
-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Groupe Exploitation et Infrastructure
On Tue, 2013-04-16 at 12:38 +0200, Thomas Hummel wrote:
On Thu, Apr 11, 2013 at 01:09:18PM +0300, Timo Sirainen wrote:
the user may see an incorrect state for a small amount of time, doesn't he ?
[...]
For a small amount of time, yes.
[...]
Which is probably a few seconds, so I don't see this as much of a problem.
Well, isn't the time needed to walk the filesystem (to find out what has to be synced) irreducible, as with rsync? In that case it would take more than a few seconds on a large mailbox (I'm testing, but under more complex conditions).
Is dsync, for that matter, faster than rsync (maybe because it uses dovecot-uidlist or similar)?
dsync doesn't scan through filesystem. It reads the changes from the index files. If there are no changes it's pretty much instant even with 1M mail mailbox. With changes it's still fast enough (and could be faster still by using incremental syncing with saved state via -s parameter).
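The difference between walking the filesystem and reading recorded changes can be sketched with a toy change log. Everything here is an assumption made for illustration: `ToyChangeLog` and `incremental_sync` are hypothetical, and the log format has nothing to do with Dovecot's real index files. The point is only that the work is proportional to the number of changes since the saved state, not to the mailbox size.

```python
class ToyChangeLog:
    """Append-only list of changes; entry i carries sequence number i + 1."""

    def __init__(self):
        self.entries = []

    def record(self, uid, change):
        self.entries.append((uid, change))

    def changes_since(self, saved_seq):
        # Everything after the saved sequence number, without touching older entries
        return self.entries[saved_seq:]


def incremental_sync(log, saved_seq):
    """Return the pending changes and the new state to save for the next run."""
    pending = log.changes_since(saved_seq)
    return pending, saved_seq + len(pending)


log = ToyChangeLog()
for uid in range(1, 1001):
    log.record(uid, "save")

pending, state = incremental_sync(log, 0)      # first run replays everything
idle, state = incremental_sync(log, state)     # nothing changed: nothing to do
log.record(42, "flag \\Seen")
delta, state = incremental_sync(log, state)    # only the single new change
print(len(pending), len(idle), delta)
```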
Besides, how about client side indexing while in this incoherent, not yet fully sync'ed state ? Wouldn't there be corruption risk ?
The worst that can happen is:
- Client sees new mail 123 in old server
- Client sees only mails up to 122 in the new server
- Client again will see mail 123 after a while
I'm actually not sure how clients will handle that. It is an IMAP protocol violation. It would be possible to add a new flag to dsync where it would treat all new emails as conflicts and give them new UIDs, so in the above case it wouldn't save a mail 123 but 124.
On Tue, Apr 16, 2013 at 02:00:38PM +0300, Timo Sirainen wrote:
dsync doesn't scan through filesystem. It reads the changes from the index files. If there are no changes it's pretty much instant even with 1M mail mailbox. With changes it's still fast enough (and could be faster still by using incremental syncing with saved state via -s parameter).
Ok. Actually, I had benchmarked an initial dsync (i.e. no mail in the destination) against a parallelized rsync of chunks of the maildir's files, precalculated by a home-made tool. For a ~3.3G Maildir, dsync took ~1 hour vs 10 min with 4 rsyncs at a time. This of course is a very unfair comparison for dsync, since I was using a cluster to parallelise the rsyncs.
But as you said, dsync could be wiser, so I was thinking of using parallel rsync to make the initial mirror and then use dsync instead of rsync in the final step described in the dsync wiki.
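For scale, the benchmark figures above translate into rough throughput numbers as follows. This is a back-of-the-envelope sketch that assumes "3.3G" means 3.3 GiB and takes the reported durations as exact, which they are not.

```python
SIZE_MIB = 3.3 * 1024            # assumed: 3.3 GiB of Maildir data

dsync_seconds = 60 * 60          # ~1 hour for the initial dsync
rsync_seconds = 10 * 60          # ~10 min with 4 rsyncs in parallel

dsync_rate = SIZE_MIB / dsync_seconds   # MiB/s
rsync_rate = SIZE_MIB / rsync_seconds   # MiB/s
speedup = rsync_rate / dsync_rate

print(f"dsync: {dsync_rate:.2f} MiB/s, parallel rsync: {rsync_rate:.2f} MiB/s, ratio: {speedup:.1f}x")
```

The ratio covers the initial bulk copy only; it says nothing about the final incremental step, where dsync reads changes from the index files.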
I'm still not sure if I should forbid dovecot auth temporarily (using auth-deny for instance) or try the seamless way.
Besides, how about client side indexing while in this incoherent, not yet fully sync'ed state ? Wouldn't there be corruption risk ?
The worst that can happen is:
- Client sees new mail 123 in old server
- Client sees only mails up to 122 in the new server
- Client again will see mail 123 after a while
I'm actually not sure how clients will handle that. It is an IMAP protocol violation. It would be possible to add a new flag to dsync where it would treat all new emails as conflicts and give them new UIDs, so in the above case it wouldn't save a mail 123 but 124.
I see. But there are other cases :
for instance, the user deletes a mail foobar in the new server because he reconnects after the kick. I guess dsync would merge the change and would not sync the foobar message from the old server in the final step. But what if another, new mail foobaz is delivered: wouldn't it get the nextuid, which was the uid of the deleted foobar mail, thus confusing the client's local indexes?
Thanks
-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Groupe Exploitation et Infrastructure
On 16.4.2013, at 17.35, Thomas Hummel <hummel@pasteur.fr> wrote:
Besides, how about client side indexing while in this incoherent, not yet fully sync'ed state ? Wouldn't there be corruption risk ?
The worst that can happen is:
- Client sees new mail 123 in old server
- Client sees only mails up to 122 in the new server
- Client again will see mail 123 after a while
I'm actually not sure how clients will handle that. It is an IMAP protocol violation. It would be possible to add a new flag to dsync where it would treat all new emails as conflicts and give them new UIDs, so in the above case it wouldn't save a mail 123 but 124.
I see. But there are other cases :
for instance, the user deletes a mail foobar in the new server because he reconnects after the kick. I guess dsync would merge the change and would not sync the foobar message from the old server in the final step. But what if another, new mail foobaz is delivered: wouldn't it get the nextuid, which was the uid of the deleted foobar mail, thus confusing the client's local indexes?
dsync in general resolves UID conflicts. If there's any chance that an IMAP client could have seen two different messages with the same UID, both of the messages get assigned new UIDs. That's why I was wondering only about the case that I mentioned. There the client couldn't have seen two different messages, but it's possible that some client could hide the mail 123 because it thought it got lost.
On Tue, Apr 16, 2013 at 05:51:21PM +0300, Timo Sirainen wrote:
dsync in general resolves UID conflicts. If there's any chance that an IMAP client could have seen two different messages with the same UID, both of the messages get assigned new UIDs.
I'm not sure I understand this correctly :
let's say that :
- in old, foobar has uid 100
- initial dsync
- user gets relocated, kicked and reconnects to new, then deletes foobar
- final dsync. dsync somehow manages to understand it should not sync foobar from old to new
- migration is over, new message foobaz comes in. Oh, I get it, you mean that since uids only ever get incremented, this new message could not get uid 100 and thus confuse the client index?
-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Groupe Exploitation et Infrastructure
On 16.4.2013, at 18.03, Thomas Hummel <hummel@pasteur.fr> wrote:
On Tue, Apr 16, 2013 at 05:51:21PM +0300, Timo Sirainen wrote:
dsync in general resolves UID conflicts. If there's any chance that an IMAP client could have seen two different messages with the same UID, both of the messages get assigned new UIDs.
I'm not sure I understand this correctly :
let's say that :
- in old, foobar has uid 100
- initial dsync
- user gets relocated, kicked and reconnects to new, then deletes foobar
- final dsync. dsync somehow manages to understand it should not sync foobar from old to new
Yes. It sees that uid 100 was deleted, and keeps nextuid=101.
- migration is over, new message foobaz comes in. Oh, I get it, you mean that since uids only ever get incremented, this new message could not get uid 100 and thus confuse the client index?
The new message gets uid 101, as according to nextuid value. A slightly more complex one would have been:
- you have mails up to 100
- dsync
- old server gets new mail uid=101
- old server deletes uid 101
- new server gets new mail uid 101
- dsync sees that there's a conflict (even though the old mail was already deleted), and gives the new server's new mail uid 102
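Both scenarios above can be replayed in a small toy model. This is a sketch under stated assumptions, not dsync's algorithm: `Side` and `toy_merge` are hypothetical, each side is assumed to remember which message every UID was ever assigned to (even after expunge), and a conflict is resolved by giving every surviving message with the contested UID a fresh one.

```python
class Side:
    def __init__(self, nextuid=1):
        self.live = {}       # uid -> message id, messages currently present
        self.history = {}    # uid -> message id, every UID ever assigned
        self.nextuid = nextuid

    def deliver(self, guid):
        uid = self.nextuid
        self.live[uid] = self.history[uid] = guid
        self.nextuid += 1
        return uid

    def expunge(self, uid):
        del self.live[uid]   # history keeps the record that the UID was used


def toy_merge(old, new):
    """Merge old into new; nextuid never goes backwards."""
    nextuid = max(old.nextuid, new.nextuid)
    conflicts = sorted(uid for uid, guid in new.history.items()
                       if uid in old.history and old.history[uid] != guid)
    for uid in conflicts:
        survivors = []
        if uid in new.live:
            survivors.append(new.live.pop(uid))
        if uid in old.live:
            survivors.append(old.live[uid])
        for guid in survivors:               # contested UID: assign fresh UIDs
            new.live[nextuid] = new.history[nextuid] = guid
            nextuid += 1
    for uid, guid in old.live.items():       # copy mails that new has never seen
        if uid not in new.history:
            new.live[uid] = new.history[uid] = guid
    for uid in list(new.live):               # propagate expunges done on old
        if old.history.get(uid) == new.live[uid] and uid not in old.live:
            del new.live[uid]
    new.nextuid = nextuid


def simple_case():
    old, new = Side(nextuid=100), Side()
    old.deliver("foobar")                    # uid 100 on old
    toy_merge(old, new)                      # initial dsync
    new.expunge(100)                         # user deletes foobar on new
    toy_merge(old, new)                      # final dsync: foobar stays deleted
    return new.deliver("foobaz"), dict(new.live)


def conflict_case():
    old, new = Side(nextuid=100), Side()
    old.deliver("m100")
    toy_merge(old, new)                      # dsync, mails up to 100
    old.expunge(old.deliver("A"))            # old gets uid 101, then deletes it
    new.deliver("B")                         # new also assigns uid 101
    toy_merge(old, new)                      # conflict on uid 101
    return dict(new.live), new.nextuid


foobaz_uid, simple_live = simple_case()
conflict_live, conflict_nextuid = conflict_case()
print(foobaz_uid, simple_live, conflict_live, conflict_nextuid)
```

In the simple case foobaz gets uid 101 and foobar is not resurrected; in the conflict case the new server's mail ends up with uid 102 even though the old mail with uid 101 had already been deleted.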
On Tue, Apr 16, 2013 at 02:00:38PM +0300, Timo Sirainen wrote:
The worst that can happen is:
- Client sees new mail 123 in old server
- Client sees only mails up to 122 in the new server
- Client again will see mail 123 after a while
I'm actually not sure how clients will handle that. It is an IMAP protocol violation.
Why is it a protocol violation ? if new was up to 122, nextuid would have been 123 so what's the problem, protocol wise, to see 123 come later ?
Of course if a new mail is delivered in new as 123, there's a conflict. But as you said, dsync knows how to handle this and would assign new uids to both, and the client could get confused about what he thought was 123. But even in that case, wouldn't he see the message (as a new one with its new uid)? I mean, nothing would be "lost"?
Thanks
-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Groupe Exploitation et Infrastructure
On 17.4.2013, at 13.19, Thomas Hummel <hummel@pasteur.fr> wrote:
On Tue, Apr 16, 2013 at 02:00:38PM +0300, Timo Sirainen wrote:
The worst that can happen is:
- Client sees new mail 123 in old server
- Client sees only mails up to 122 in the new server
- Client again will see mail 123 after a while
I'm actually not sure how clients will handle that. It is an IMAP protocol violation.
Why is it a protocol violation ? if new was up to 122, nextuid would have been 123 so what's the problem, protocol wise, to see 123 come later ?
Because client saw uidnext=124 on the old server, which shrank back to uidnext=123 on the new server. That shouldn't happen even temporarily.
Of course if a new mail is delivered in new as 123, there's a conflict. But as you said, dsync knows how to handle this and would assign new uids to both, and the client could get confused about what he thought was 123. But even in that case, wouldn't he see the message (as a new one with its new uid)? I mean, nothing would be "lost"?
Yeah, when conflicts are fixed nothing gets lost. In that "worst case" I mentioned there's no conflict really, just a message that disappears and appears back. Hmm. Maybe Dovecot should keep track of what messages IMAP clients have seen, and automatically figure out when it should change UIDs in those cases.
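The "worst case" discussed above, where uidnext shrinks from 124 to 123 and later returns to 124, can be expressed as a small check over what a client observes. This is an illustrative sketch: `uidnext_regressions` is a hypothetical helper, and the UIDVALIDITY value of 1 is an assumption (the thread does not say what it is, only that the mailbox is the same from the client's point of view).

```python
def uidnext_regressions(observations):
    """Return (previous, current) pairs where UIDNEXT went backwards.

    observations: iterable of (uidvalidity, uidnext) as seen by one client.
    """
    regressions = []
    highest = {}   # uidvalidity -> highest UIDNEXT seen so far
    for uidvalidity, uidnext in observations:
        previous = highest.get(uidvalidity)
        if previous is not None and uidnext < previous:
            regressions.append((previous, uidnext))
        highest[uidvalidity] = max(uidnext, previous or 0)
    return regressions


# old server (mail 123 present), new server before final dsync, new server after it
seen = [(1, 124), (1, 123), (1, 124)]
problems = uidnext_regressions(seen)
print(problems)
```

Nothing is lost in this sequence, but the middle observation is the temporary shrink that a client may or may not tolerate.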
Participants (2): Thomas Hummel, Timo Sirainen