[Dovecot] Replication protocol design
I'll probably be implementing multi-master replication this summer. My previous thoughts about it are here: http://dovecot.org/list/dovecot/2007-December/027284.html
Below is a description of how the replication protocol will probably work. It should work just as well for master-slave and multi-master setups. Comments welcome. I'll write a separate mail later about how it'll be implemented in Dovecot.
Goals
1. If a single server (or a configurable number of servers) dies, no mails that have been reported to IMAP/SMTP clients as successfully stored may be lost.
2. Must be able to automatically recover from a server desynchronization, such as:
- server has been offline for a long time
- some mail files have been manually added/deleted
- corrupted data/mail files, if they're noticed
3. In a multi-master setup, if the link between servers dies, the servers must be able to proceed autonomously (kind of conflicts with goal 1 though). When the link comes back, the changes are replicated as soon as possible.
4. Normal IMAP commands must not be able to cause desynchronization between servers. For example, making conflicting flag changes simultaneously on two servers must not result in the servers having different flags.
5. Must perform well with at least 3 servers in a multi-master setup, preferably still with tens of servers.
6. Latency shouldn't be increased noticeably when using servers distributed into 3 or more data centers. Must be usable over high-latency links (in optimistic async mode).
7. In normal operation send minimal incremental changes.
Protocol
There are two major parts in the protocol: Handling the normal incremental changes and fixing desynchronization.
Originally I thought that maybe the changes could be sent using the same format as Dovecot's transaction logs, but now I'm beginning to think that it's probably not that good an idea. The code reuse potential is quite minimal, and the format would still have to be extended in several ways, so it wouldn't be directly compatible anyway.
So I'm thinking the protocol could be something text-based. The main benefit is that text-based protocols are easier to debug. Stream compression should drop most of the extra overhead if bandwidth is a problem.
In any case, the exact on-wire protocol doesn't matter for the design discussed below.
The commands have tags similar to IMAP commands, because some commands may have to be forwarded to other servers and it may take a while to get a reply. During the wait the server may process other commands.
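A minimal sketch of how tags let replies arrive out of order. The wire format here (tab-separated fields, tag first) and the names `encode_command`, `decode_line` and `PendingCommands` are assumptions made for illustration; the mail deliberately leaves the on-wire format open.

```python
# Hypothetical wire format: "<tag>\t<COMMAND>\t<arg>...\n". The design does
# not define one; this only shows why tags allow out-of-order replies.

def encode_command(tag, name, *args):
    """Serialize one command as a single tab-separated text line."""
    fields = [tag, name, *map(str, args)]
    if any("\t" in f or "\n" in f for f in fields):
        raise ValueError("fields must not contain tabs or newlines")
    return "\t".join(fields) + "\n"


def decode_line(line):
    """Split a line back into (tag, name, args)."""
    tag, name, *args = line.rstrip("\n").split("\t")
    return tag, name, args


class PendingCommands:
    """Tracks commands that are still waiting for a reply, keyed by tag."""

    def __init__(self):
        self._next = 0
        self._pending = {}

    def send(self, name, *args):
        """Allocate a tag, remember the command, return the line to send."""
        self._next += 1
        tag = f"t{self._next}"
        self._pending[tag] = name
        return encode_command(tag, name, *args)

    def on_reply(self, line):
        """Match a reply to its command; replies may arrive in any order."""
        tag, status, args = decode_line(line)
        return self._pending.pop(tag), status, args
```

Because the reply carries the tag, a server can keep processing other commands while a forwarded one is still in flight.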
Mailbox master
Each mailbox has a single master server selected. In multi-master setups the master server may be moved by having the destination server simply request it from the current master. The current master must then give it up. If the link to the current master is lost, one of the remaining servers becomes the new master among them.
Each server must be connected to the current master server. Since each mailbox can have a different master server, this typically means all servers are connected to each other. However, it's possible to create setups where a server connects to only one other server, which in turn connects to more servers. This is useful if there are bandwidth bottlenecks between some servers. This kind of situation can also happen if, in a network A-B-C, the link A-C dies but A-B and B-C continue to work. Because of this, every command must work when a server proxies it to the current master, instead of failing the command and making it the caller's problem to resend it to the actual master.
When the cluster starts up, a single server is selected as the root master for all mailboxes. If a server doesn't know who the current mailbox master is, it asks from the root. All servers cache the currently known mailbox masters to avoid constant requests to the root.
If the root server dies, another server is selected as the root. Because the new root doesn't know what masters have been requested (and asking all of them from all servers would just waste bandwidth), all the servers are expected to flush their master caches and drop their own master status. The new root doesn't respond to any requests before all servers have notified that they've dropped being a master.
The master status doesn't have to be at mailbox level granularity. It could just as well be configured to move at user, domain or even global level. Perhaps this could be done dynamically, so that a coarser granularity is used when the master begins to change too often between servers.
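The caching behaviour described in this section could look roughly like this. `MasterCache` and the `ask_root` callable are hypothetical names; in a real server `ask_root` would be a network request to the root master rather than a local function.

```python
class MasterCache:
    """Per-server cache of mailbox masters (a sketch, not Dovecot code).

    `ask_root` is a hypothetical callable standing in for a request to
    the root master; it maps a mailbox key to a server ID.
    """

    def __init__(self, ask_root):
        self._ask_root = ask_root
        self._masters = {}

    def master_for(self, mailbox):
        # Ask the root only when the master is not already cached.
        if mailbox not in self._masters:
            self._masters[mailbox] = self._ask_root(mailbox)
        return self._masters[mailbox]

    def master_moved(self, mailbox, new_master):
        # A reply carrying a new master SID updates the cache.
        self._masters[mailbox] = new_master

    def root_changed(self, ask_new_root):
        # A new root does not know earlier assignments, so the server
        # flushes its cache and asks again on the next lookup.
        self._masters.clear()
        self._ask_root = ask_new_root
```

The sketch leaves out the part where each server also drops its own master status and notifies the new root before the root starts answering.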
Mailbox ID
Mailbox IDs are session-specific numbers dynamically assigned for user+mailbox+UIDVALIDITY combinations. All connections have different mailbox IDs. Also send and receive directions have different IDs. This allows the sender to easily replace existing IDs to point to new mailboxes without causing any confusion.
MBOX:
- Mailbox ID
- User name
- Mailbox name
- Mailbox UIDVALIDITY
- Mailbox UIDNEXT
- Mailbox message count
If the receiver finds out it has a different UIDVALIDITY, the mailbox requires a full resync. Message count and UIDNEXT may also be used to determine if replication servers are out of sync.
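As a sketch of the check the receiver makes, assuming the MBOX fields are carried in a simple record. The `Mbox` class, `compare_mbox` and the three result labels are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mbox:
    """The MBOX fields listed above."""
    mailbox_id: int
    user: str
    name: str
    uidvalidity: int
    uidnext: int
    message_count: int


def compare_mbox(local, remote):
    """Classify the receiver's state against an incoming MBOX.

    The return values are labels invented for this sketch.
    """
    if local.uidvalidity != remote.uidvalidity:
        # Different UIDVALIDITY: UIDs are not comparable at all.
        return "full-resync"
    if (local.uidnext, local.message_count) != (remote.uidnext, remote.message_count):
        # Same UIDVALIDITY but the counters disagree.
        return "out-of-sync"
    return "in-sync"
```

Matching counters do not prove the mailboxes are identical; they are only a cheap first test before a SYNC.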
Requesting master status
MASTER-MOVE:
- Mailbox ID
- [Destination SID] (if forwarding)
The command is sent to the last known master for the mailbox. The server will keep forwarding the command until it reaches the current master. During the forwarding other servers may want to request something from the master. These requests must be delayed by the forwarding servers until the move is finished.
Saving messages
SAVE:
- Mailbox IDs
- Received date
- [IMAP UID] (only if we're the master)
- Global UID (stays the same when copying the message)
- Message text
Reply:
- [IMAP UID] (only if not specified in parameters)
- [Current mailbox master SID] (if it was moved)
If current server is not the master, the SAVE is sent to the master which gives the message its UID. The master server then replicates the message to other servers with the UID parameter set.
The mailbox master may have already changed by the time server receives a save request. If server receives a SAVE without IMAP UID parameter, it's responsible for finding out the new mailbox master and sending a new SAVE request to it. Once the new master replies with the IMAP UID, the server can reply to the original SAVE request, also providing the new master SID so the future requests can be sent there directly.
To be sure the message doesn't get lost, the server should not reply OK to the IMAP/SMTP client until it has received a reply from SAVE.
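The SAVE routing above can be sketched in-process as follows. The `Server` class and its attributes are hypothetical, direct method calls stand in for network round trips, and a single mailbox is assumed.

```python
class Server:
    """In-process stand-in for one replication server (hypothetical API)."""

    def __init__(self, sid):
        self.sid = sid
        self.peers = {}        # sid -> Server, filled in by the caller
        self.master_sid = sid  # who this server believes is the master
        self.uidnext = 1
        self.mails = {}        # IMAP UID -> (global UID, text)

    def save(self, guid, text, uid=None):
        """Handle SAVE; returns (IMAP UID, current master SID)."""
        if uid is not None:
            # Replicated from the master: store under the given UID.
            self.mails[uid] = (guid, text)
            self.uidnext = max(self.uidnext, uid + 1)
            return uid, self.master_sid
        if self.master_sid != self.sid:
            # Not the master: forward and wait for the assigned UID.
            # The reply also carries the master's SID in case it moved.
            uid, master = self.peers[self.master_sid].save(guid, text)
            self.master_sid = master
            return uid, master
        # We are the master: allocate the UID, then replicate it.
        uid = self.uidnext
        self.uidnext += 1
        self.mails[uid] = (guid, text)
        for peer in self.peers.values():
            peer.save(guid, text, uid=uid)
        return uid, self.sid
```

A server with a stale idea of the master still gets an answer, because the server it asks forwards the request onward and the reply tells it where the master is now.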
Copying messages
COPY:
- Source mailbox ID
- Destination mailbox ID
- Source IMAP UID
- Global UID
- [Destination IMAP UID] (only if we're the master)
- Destination received date
Reply:
- [Destination IMAP UID] (only if not specified in parameters)
- [Current mailbox master SID] (if it was moved)
Source mailbox ID + source IMAP UID identifies the message to be copied. It's expected to contain the given global UID (which is just an extra sanity check). Otherwise it works the same way as SAVE.
Since the message already exists, it's probably not necessary to wait for a reply before replying OK to originating IMAP client.
Expunging messages
EXPUNGE:
- Mailbox ID
- UID range
(No reply)
Expunges also have to be sent via the master server (the same way as SAVE) to avoid a COPY command failing on some servers because its source message was just expunged.
Changing message flags
STORE:
- Mailbox ID
- UID range
- Added flags/keywords
- Removed flags/keywords
- [Current modseq] (master sends)
- [Highest modseq of the messages before this change] (non-master sends)
- [flag: this is a CONDSTORE STORE UNCHANGEDSINCE] (non-master may send)
[Reply:
- UIDs where STORE was rejected (if CONDSTORE flag was used)]
Stores also have to be sent via master server to avoid flag desynchronization. Master first checks if it has higher modseqs in the messages. Then it applies all the changes and forwards the changes to other servers. For messages that had higher initial modseqs their flags are sent to the server sending the STORE to fix a potential desync.
If CONDSTORE flag is set, the change is rejected for messages that had a higher modseq. Non-masters shouldn't reply to a STORE UNCHANGEDSINCE command before the change has been replicated to master and the rejections have been processed.
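A sketch of the master-side STORE handling, under a few assumptions: messages are plain dicts, the caller picks the new modseq, and flags are also sent back for rejected messages (the text only says they are sent for messages with a higher initial modseq). `master_store` is a made-up name.

```python
def master_store(messages, uids, add, remove, sender_modseq,
                 next_modseq, unchangedsince=False):
    """Apply a STORE on the master.

    messages: dict of UID -> {"flags": set, "modseq": int}
    sender_modseq: highest modseq the non-master saw before its change
    Returns (rejected UIDs, {UID: flags} to send back to fix a desync).
    """
    rejected, fixups = [], {}
    for uid in uids:
        msg = messages[uid]
        stale = msg["modseq"] > sender_modseq
        if stale and unchangedsince:
            # CONDSTORE semantics: refuse the change for this message.
            rejected.append(uid)
        else:
            msg["flags"] = (msg["flags"] | set(add)) - set(remove)
            msg["modseq"] = next_modseq
        if stale:
            # The sender missed an earlier change, so it gets the
            # master's current flags for this message.
            fixups[uid] = set(msg["flags"])
    return rejected, fixups
```

Forwarding the applied change to the other servers is left out; it would reuse the same added/removed flag lists with the master's modseq attached.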
Mailbox synchronization
If a mailbox is determined to have changed externally (e.g. network connection down for too long, causing replication logs to get full) the mailbox state needs to be synchronized between servers.
SYNC:
- UIDVALIDITY
- UIDNEXT
- Message count
- For each message:
  - UID
  - Global UID
  - Modseq
  - Flags and keywords
  - Received date
Reply:
- (Sync finished)
Receiving server compares the parameters with its own mailbox state. If it finds previously unseen global UIDs, their message texts are requested:
FETCH:
- Mailbox ID
- UID
Reply:
- Message text
SAVE, EXPUNGE and STORE commands are used to synchronize the mailbox.
A special case is when two servers have been saving messages independently of each other. In this case it's possible that the servers have used the same UIDs for different messages (different global UIDs). These need to be resolved by giving both conflicting messages new unused UIDs, otherwise IMAP clients may show the wrong messages from their caches.
FIXME: If the other server had expunged a conflicting UID it still should be given a new UID. How do we find out this has happened?
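The conflict detection for the special case could be sketched like this. `resolve_uid_conflicts` is a hypothetical helper; it assumes both servers already know each other's UID to global UID mapping from SYNC, and it does nothing about the expunged case in the FIXME.

```python
def resolve_uid_conflicts(local, remote, uidnext):
    """Find UIDs that name different messages on two servers.

    local, remote: dict of IMAP UID -> global UID
    uidnext: first UID unused on both servers
    Returns ({global UID: new IMAP UID}, new uidnext).
    """
    reassigned = {}
    for uid in sorted(local.keys() & remote.keys()):
        if local[uid] == remote[uid]:
            continue  # same message on both sides, nothing to do
        # Both messages get fresh UIDs so that no client cache can
        # keep showing the old UID with the wrong contents.
        for guid in sorted((local[uid], remote[uid])):
            reassigned[guid] = uidnext
            uidnext += 1
    return reassigned, uidnext
```

Sorting the UIDs and the global UIDs makes the result independent of which server runs the function, so both sides arrive at the same new UIDs as long as they start from the same `uidnext`.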
On Tue, 29 Apr 2008, Timo Sirainen wrote:
I'll probably be implementing multi-master replication this summer. My previous thoughts about it are here: http://dovecot.org/list/dovecot/2007-December/027284.html
Below is a description of how the replication protocol will probably work. It should work just as well for master-slave and multi-master setups. Comments welcome. I'll write later a separate mail about how it'll be implemented to Dovecot.
(I'm CC:ing tv since he and I once chatted about IMAP replication.)
Multi-master will be very interesting. I'm really curious what you will do as far as this scenario:
- A mail arrives in server1
- User reads it (therefore the mail has a UID assigned)
- server2 gets isolated from server1
- A mail arrives at server2
- User logs in to server2 and reads it (therefore the mail has a UID assigned)
- server2 and server1 are connected again, and are told to sync
Questions:
a. Does the client get told two different messages have the same UID in the folder?
b. What is the graceful sync proposal?
Oh, look - you mention that right at the end of your protocol:
A special case is when two servers have been saving messages independently from each others. In this case it's possible that the servers have used the same UIDs for different messages (different global UIDs). These need to be resolved by giving both conflicting UIDs new unused UIDs, otherwise IMAP clients may show them as wrong messages from their caches.
FIXME: If the other server had expunged a conflicting UID it still should be given a new UID. How do we find out this has happened?
I guess this doesn't address my Question (a), though.
The above scenario may be handled by your root-based election process - but what prevents two roots from simultaneously existing, and therefore two masters existing, and then the above scenario happening?
I'm really interested to see where this goes - I think this would be really swell as a replacement for my use of offlineimap. I also want to have you consider letting non-root users use Dovecot replication features; that way, I could just have a cron job that asks my laptop's Dovecot to replicate from my server's Dovecot, without giving it any special permissions.
Thanks for all the great work Dovecot represents!
-- Asheesh.
-- I joined scientology at a garage sale!!
On Mon, 2008-04-28 at 16:33 -0700, Asheesh Laroia wrote:
Multi-master will be very interesting. I'm really curious what you will do as far as this scenario:
- A mail arrives in server1
- User reads it (therefore the mail has a UID assigned)
I also thought about this kind of a late UID allocation (or late UID conflict resolution), but it probably becomes too complex and it should happen only rarely anyway. So the UID will be assigned immediately when mail is being stored.
- server2 gets isolated from server1
- A mail arrives at server2
- User logs in to server2 and reads it (therefore the mail has a UID assigned)
- server2 and server1 are connected again, and are told to sync
Questions:
a. Does the client get told two different messages have the same UID in the folder?
If the user is able to log into either server1 or server2 then there's really nothing that can be done to avoid seeing the same UID contain different messages. But I think in normal situations if a user is able to connect to either one of the servers, the servers should be able to connect to each other as well.
In step 6 the UID conflict will be noticed and all conflicting messages will be given new unused UIDs so that caching clients won't be confused.
The above scenario may be handled by your root-based election process - but what prevents two roots from simultaneously existing, and therefore two masters existing, and then the above scenario happening?
The master process tries to prevent the situation from happening in normal conditions to avoid conflict resolution, but it doesn't rely on it working.
I'm really interested to see where this goes - I think this would be really swell as a replacement for my use of offlineimap. I also want to have you consider letting non-root users use Dovecot replication features; that way, I could just have a cron job that asks my laptop's Dovecot to replicate from my server's Dovecot, without giving it any special permissions.
Hmm. The replication itself could probably be done pretty safely. Mainly by just not allowing user to become replication master. But it would have to be treated somewhat differently from normal replication servers, like not writing (long) replication logs to disk while waiting for it to become online.
I also thought about using IMAP protocol for initiating the replication and performing the replication using some IMAP extensions. But it might be a bit too chatty/bloaty. But maybe for user-initiated replication it would be useful. Regular IMAP login and after a X-REPLICATE command it executes the replication binary. Or maybe just make the replication server directly listen on a different port using a different protocol. I'm not sure yet. :)
On Tue, Apr 29, 2008 at 02:59:52AM +0300, Timo Sirainen wrote:
- A mail arrives in server1
- User reads it (therefore the mail has a UID assigned)
- server2 gets isolated from server1
- A mail arrives at server2
- User logs in to server2 and reads it (therefore the mail has a UID assigned)
- server2 and server1 are connected again, and are told to sync
If user is able to log into either server1 or server2 then there's really nothing that can be done to avoid seeing the same UID contain different messages. But I think in normal situations if a user is able to connect to either one of the servers, the servers should be able to connect to each others as well.
One specialization of "gets isolated" is "goes down", "crashes", etc. That fulfills the above scenario; note nothing implies synchronicity.
But your answer of "old UIDs are removed, new ones chosen" does answer the question. It does have some "interesting" side effects, though, like messages disappearing from INBOX and showing up again soon after.
Note that replication really must handle the conflict at the same time on both sides, giving the messages the same new UIDs, with no deliveries interfering with that. Otherwise you get very silly behavior.
You're gonna live through some interesting times.
-- :(){ :|:&};:
Timo Sirainen wrote:
I'll probably be implementing multi-master replication this summer. My previous thoughts about it are here: http://dovecot.org/list/dovecot/2007-December/027284.html
Below is a description of how the replication protocol will probably work. It should work just as well for master-slave and multi-master setups. Comments welcome. I'll write later a separate mail about how it'll be implemented to Dovecot.
My first thoughts are that although this seems like a really exciting feature set - it doesn't *appear* to solve many use cases ie fully disconnected modifications?
I'm thinking:
- personal imap server on my laptop which will intermittently sync back to the office. While offline I will copy messages around and perhaps create new messages (drafts, folders containing notes, etc)
- two offices connected by dialup. Users can connect to either server and create/delete content to their heart's content. For example, local SMTP on each server delivers to the local inbox but syncs back up to the main server, which means we can mail other local users directly without bringing up the dialup link. If you go back to the main office, your mailbox there eventually syncs up to be the same as in the satellite office.
I think the imapsync-style protocol is the most powerful starting point for master/master sync. However, that style of sync can be made more efficient by using the LEMONADE features you previously added. In particular, by offering a limited log file of recent actions in a folder, we can avoid a full sync and apply only the recent changes to a given folder (falling back to a full sync and compare if folders fall out of sync).
The other advantage of the imapsync style protocol is that it's much simpler to support partial replication, eg replicate only certain folders over certain date ranges (or only parts of certain messages, eg excluding large attachments)
My interest is for scenarios like a cruiseliner where we want crew to be able to email each other instantly onboard without involving the satellite link, but when they go ashore they should still be able to see the same mail in their inbox back on the server onshore (or they move to another ship then we sync their mailbox across)
Any comments?
Ed W
On Tue, 2008-04-29 at 16:28 +0100, Ed W wrote:
I think the imapsync style protocol is the most powerful starting point for master/master sync. However, that style of sync can be made more efficient by using the LEMONADE features you previously added, in particular by offering a limited log file of recent actions in a folder we can avoid a full sync and apply only the recent changes to a given folder (fallback to full sync and compare if folders fall out of sync)
The updated design #2 should address this. The mailbox synchronization step works pretty much the same as QRESYNC.
The other advantage of the imapsync style protocol is that it's much simpler to support partial replication, eg replicate only certain folders over certain date ranges (or only parts of certain messages, eg excluding large attachments)
This isn't really there yet.. QRESYNC supports replicating only a specific UID range and that would be easy to add to my replication protocol as well if needed. Excluding large attachments then.. Well, two possibilities:
a) Add a FETCH-SMALL command that drops large attachments and somehow remembers this so that they could later be downloaded again when there's more bandwidth or the user requests it.
b) Use IMAP protocol for the synchronization and let the client figure out itself what/when/how it wants to fetch.
Using IMAP protocol for replication has at least two disadvantages:
- It's a bit too chatty, wasting bandwidth on replies the replication isn't interested in.
- Sending updated flag/keyword changes can't be done in a standard way, because it only shows the last flag+keyword state, not the changes that were done (e.g. "\Seen" vs. "+\Seen -\Flagged").
Hi
The updated design #2 should address this. The mailbox synchronization step works pretty much the same as QRESYNC.
Thanks will look through that
a) Add a FETCH-SMALL command that drops large attachments and somehow remembers this so that they could later be downloaded again when there's more bandwidth or user requests it.
Agreed - it's tricky.
Not 100% sure myself how to handle this yet. I think it requires some extensions quite a long way outside of the normal IMAP process, but it's probably a very nice feature for those that need it.
Using IMAP protocol for replication has at least two disadvantages:
- It's a bit too chatty, wasting bandwidth on replies the replication isn't interested in.
Compression eliminates a huge amount of traffic (I typically see 12:1 or better). Also pipelining commands eliminates much of the disadvantage of the chatty behaviour (let's assume >1sec latency on most dialup links, so latency is definitely a killer)
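A rough way to measure what stream compression does to a repetitive text protocol, using Python's standard `zlib`. The `compressed_size` helper is made up for this sketch, and the ratio depends entirely on the traffic, so no particular figure is claimed here.

```python
import zlib


def compressed_size(lines):
    """Return (raw bytes, compressed bytes) for a stream of text lines."""
    comp = zlib.compressobj()
    raw = out = 0
    for line in lines:
        data = line.encode()
        raw += len(data)
        # Z_SYNC_FLUSH makes each line deliverable immediately, as an
        # interactive stream needs, at some cost in ratio.
        out += len(comp.compress(data)) + len(comp.flush(zlib.Z_SYNC_FLUSH))
    out += len(comp.flush())
    return raw, out
```

Feeding it a few hundred similar untagged replies and comparing the two numbers gives a feel for the saving. Flushing less often, for instance once per pipelined batch, should improve the ratio further.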
I currently use a small self written proxy app which does some simple analysis of what imap client is talking and does some prefetching via pipelined commands to reduce latency and also sets up a compressed pipe back to the server. Even over broadband it gives quite a significant speedup on large folders and on dialup it gives a huge performance boost. Nothing too clever going on though
Perhaps we could look at some optimisations like that in the first instance?
- Sending updated flag/keyword changes can't be done in a standard way, because it only shows the last flag+keyword state, not the changes that were done (e.g. "\Seen" vs. "+\Seen -\Flagged").
Hmm... I guess that can only be done anyway by storing the state before and after and figuring out the changes based on a comparison?
I do like the idea of making this more generic and hence hackable than writing all the code into dovecot itself. Perhaps we could start with an external proxy app at each end of the link which is external to the imap server, ie basically start with IMAP sync. This seems easy to knock up in the scripting language of choice (eg imapsync) and we can then easily hack on the protocol and choice of commands to bring the servers into sync. I guess if some obvious bottlenecks occur then it's simple to make the protocol across the wire slightly different and subsequently look at how those changes could be moved into the imap protocol itself?
The proxy app then gives a clean break to monitor stuff like changes in flags (expecting this to need some support from the server though to avoid duplicating the index data?). It would potentially make it possible to support other imap servers than dovecot, although I don't believe that should be on the roadmap, but others may want to code that up themselves?
So effectively start with ImapSync style app and then use knowledge of the IMAP server and the QRESYNC stuff to make it very much more optimised. Other imap servers then have the option to code up the required missing features and we have invented a standardised way to sync two servers...
Sound any good?
Ed W
On May 1, 2008, at 3:23 PM, Ed W wrote:
- Sending updated flag/keyword changes can't be done in a standard
way, because it only shows the last flag+keyword state, not the changes
that were done (e.g. "\Seen" vs. "+\Seen -\Flagged").
Hmm... I guess that can only be done anyway by storing the state
before and after and figuring out the changes based on a comparison?
Dovecot stores flag changes as "added flags" and "removed flags" in
transaction file, so it doesn't need to do any comparing to figure out
what had changed. This makes the flag changes also more reliable. For
example if a message originally had flags (\Flagged) and then two
servers changed them:
S1: STORE 1 +FLAGS \Answered
S2: STORE 1 +FLAGS \Seen
S2: STORE 1 -FLAGS \Flagged
If replication protocol sent the changes as +flags -flags, it would be
unambiguous what the final flags are: (\Answered \Seen).
If replication protocol instead sent the flags as their currently
known flag states (as IMAP protocol does):
S1: * 1 FLAGS (\Answered \Flagged)
S2: * 1 FLAGS (\Seen)
There aren't any good ways to figure out what the wanted final flags
are supposed to be.
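The example can be checked mechanically. The sketch below replays the three changes in every possible order and collects the resulting flag sets; `apply_delta`, `final_states` and `CHANGES` are names made up for this illustration.

```python
from itertools import permutations


def apply_delta(flags, add=(), remove=()):
    """Apply one '+flags -flags' change to a set of flags."""
    return (set(flags) | set(add)) - set(remove)


def final_states(initial, changes):
    """Replay the deltas in every possible order; collect the outcomes."""
    outcomes = set()
    for order in permutations(changes):
        flags = set(initial)
        for add, remove in order:
            flags = apply_delta(flags, add, remove)
        outcomes.add(frozenset(flags))
    return outcomes


# The three changes from the example, as (added, removed) pairs.
CHANGES = [
    (["\\Answered"], []),   # S1: STORE 1 +FLAGS \Answered
    (["\\Seen"], []),       # S2: STORE 1 +FLAGS \Seen
    ([], ["\\Flagged"]),    # S2: STORE 1 -FLAGS \Flagged
]
```

Starting from (\Flagged), every ordering ends at (\Answered \Seen) because the deltas touch different flags. Two deltas on the same flag, such as +\Seen and -\Seen, do depend on order, which is one reason the design still routes STOREs through the mailbox master.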
I do like the idea of making this more generic and hence hackable
than writing all the code into dovecot itself. Perhaps we could
start with an external proxy app at each end of the link which is
external to the imap server, ie basically start with IMAP sync.
That would work for the mailbox synchronization part, but I'm more
interested in the incremental synchronization part which replicates
all changes in all mailboxes immediately. That's not really possible
to base on an external proxy. Mostly because the IMAP protocol
supports seeing changes only in a single mailbox at a time, and trying
to change that would most likely make the protocol different enough
from IMAP that there's not much point in using IMAP as a base anymore.
Dovecot stores flag changes as "added flags" and "removed flags" in transaction file, so it doesn't need to do any comparing to figure out what had changed. This makes the flag changes also more reliable. For example if a message originally had flags (\Flagged) and then two servers changed them:
S1: STORE 1 +FLAGS \Answered
S2: STORE 1 +FLAGS \Seen
S2: STORE 1 -FLAGS \Flagged
If replication protocol sent the changes as +flags -flags, it would be unambiguous what the final flags are: (\Answered \Seen).
If replication protocol instead sent the flags as their currently known flag states (as IMAP protocol does):
S1: * 1 FLAGS (\Answered \Flagged)
S2: * 1 FLAGS (\Seen)
There aren't any good ways to figure out what the wanted final flags are supposed to be.
Sounds like a good candidate for a slightly customised IMAP command to get that info?
I do like the idea of making this more generic and hence hackable than writing all the code into dovecot itself. Perhaps we could start with an external proxy app at each end of the link which is external to the imap server, ie basically start with IMAP sync.
That would work for the mailbox synchronization part, but I'm more interested in the incremental synchronization part which replicates all changes in all mailboxes immediately. That's not really possible to base on an external proxy. Mostly because the IMAP protocol supports seeing changes only in a single mailbox at a time, and trying to change that would most likely make the protocol different enough from IMAP that there's not much point in using IMAP as a base anymore.
I'm not sure. Consider a design where we have two ways to sync servers.
1) Live instant replication. Done by setting a given folder to be monitored for live changes. All changes made to that folder cause a transaction log to be generated (actually probably two logs, one listing the operations and another possibly listing the data relating to the affected messages). These log files could be a simple incremental bz2 file with occasional flush points so that they can be truncated up to a flush point easily. At any point it would be possible to simply take that file and use the transport mechanism of choice (usb stick, cd, internet, etc) to replay that log back on the other server.
2) We can guarantee that any such transactional sync will go wrong for lots of reasons, not least on-disk changes outside of the control of the server, eg backup/restore, corruption, etc. Therefore there is a need for an online style sync where we simply compare the list of files in both folders and resolve the changes to bring both into sync (IMAPSync style).
Now where I was going with this is that it's going to need a custom protocol to get at those log files in 1) above anyway, and we might want to turn it on and off per folder, so it could end up being a runtime parameter; hence does it matter whether it lives inside the server code or outside? However, I have lost my train of thought now so I will just quietly slink away...
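For what it's worth, a compressed log with restartable flush points can be sketched with Python's standard `zlib`. This swaps in raw deflate for the bz2 mentioned above, because zlib's `Z_FULL_FLUSH` gives exactly this kind of restart point; the `ReplayLog` class and its record format are made up for the example.

```python
import zlib


class ReplayLog:
    """Append-only compressed log with restartable flush points."""

    def __init__(self):
        # Raw deflate (negative wbits): no header, so a reader can
        # start at any full-flush boundary.
        self._comp = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
                                      zlib.DEFLATED, -15)
        self.data = bytearray()
        self.flush_points = [0]  # byte offsets where reading may start

    def append(self, record):
        """Add one text record to the log."""
        self.data += self._comp.compress(record.encode() + b"\n")

    def flush_point(self):
        # Z_FULL_FLUSH byte-aligns the output and resets the
        # compressor's history, so later data is self-contained.
        self.data += self._comp.flush(zlib.Z_FULL_FLUSH)
        self.flush_points.append(len(self.data))

    def read_from(self, offset):
        """Decompress the flushed records starting at a flush point."""
        if offset not in self.flush_points:
            raise ValueError("offset is not a flush point")
        raw = zlib.decompressobj(-15).decompress(bytes(self.data[offset:]))
        return raw.decode().splitlines()
```

Truncating the log up to a flush point is then just dropping the bytes before that offset, and the remainder can still be replayed on the other server.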
Ed W
On Thu, 1 May 2008, Ed W wrote:
I currently use a small self written proxy app which does some simple analysis of what imap client is talking and does some prefetching via pipelined commands to reduce latency and also sets up a compressed pipe back to the server. Even over broadband it gives quite a significant speedup on large folders and on dialup it gives a huge performance boost. Nothing too clever going on though
Perhaps we could look at some optimisations like that in the first instance?
Are you serious? Can I have a copy!?
-- Asheesh.
--
You're currently going through a difficult transition period called "Life."
Asheesh Laroia wrote:
On Thu, 1 May 2008, Ed W wrote:
I currently use a small self written proxy app which does some simple analysis of what imap client is talking and does some prefetching via pipelined commands to reduce latency and also sets up a compressed pipe back to the server. Even over broadband it gives quite a significant speedup on large folders and on dialup it gives a huge performance boost. Nothing too clever going on though
Perhaps we could look at some optimisations like that in the first instance?
Are you serious? Can I have a copy!?
We get sensible amounts of email down a 2,400 baud connection, that's 20KB per *minute* for those of you using broadband right now...
Ed W
Timo Sirainen wrote:
Using IMAP protocol for replication has at least two disadvantages:
It occurs to me that there is at least one advantage of putting as much as possible into the imap protocol:
Mail clients are kind of a special case of replication client. Really they want to be able to do as much of this stuff as possible as well... Although clients typically lag a few weeks behind new RFCs being written (grin), I guess the point is that if some of this stuff is available via IMAP or an IMAP extension then others might one day use or benefit from it?
Just a thought...
Does a modification to IDLE to monitor more folders help us at all?
Ed W
On Fri, 2008-05-02 at 07:30 +0100, Ed W wrote:
Timo Sirainen wrote:
Using IMAP protocol for replication has at least two disadvantages:
It occurs to me that there is at least one advantage of putting as much as possible into the imap protocol:
Mail clients are kind of a special case of replication client. Really they want to be able to do as much of this stuff as possible as well... Although clients typically lag a few weeks behind new RFCs being written (grin), I guess the point is that if some of this stuff is available via IMAP or an IMAP extension then others might one day use or benefit from it?
I agree that it would be nice to offer this capability for IMAP clients, but I don't see a way to do this in any reasonable way. If the client wants a continuous replication for all mailboxes the resulting protocol will barely even look like IMAP anymore. And trying to merge all the initial sync + continuous replication + IMAP functionality to the same process would be quite ugly.
There is still a small hope though. Merging initial sync + IMAP functionality could be possible somewhat cleanly since the initial sync is quite close to QRESYNC functionality. But I'm not yet convinced that it's a good idea to separate initial sync and continuous replication to separate processes. I'll think more about this once I start planning replication milestone 2 details.
Does a modification to IDLE to monitor more folders help us at all?
Not really. IDLE in general doesn't allow anything else than getting EXPUNGE notifications immediately. Anything else is allowed to be sent immediately to the client even if it doesn't use IDLE (old Dovecot versions used to do this, but it caused problems with some clients).
Anyway even if Dovecot finds out that there were changes in other mailboxes, it would have to notify about these to the client somehow. It basically means either a) adding a mailbox parameter to all untagged replies (ugly) or b) sending a "mailbox xyz changed" notification and have the client change the mailbox to find out what changed (inefficient).
Timo Sirainen wrote:
I agree that it would be nice to offer this capability for IMAP clients, but I don't see a way to do this in any reasonable way. If the client wants a continuous replication for all mailboxes the resulting protocol will barely even look like IMAP anymore. And trying to merge all the initial sync + continuous replication + IMAP functionality to the same process would be quite ugly.
My first thought was an IMAP command which would read a folder's "log files" from a certain point in time and continue to stream all changes down the connection from that point onwards. Normal IMAP commands could then be used to fetch the details of the changes
However, that's not compatible with the idea of a single iteration sync (takes at least two round trips)
A command to grab the log files and relevant data would seem to work, but doesn't really offer any benefits to be done down the same socket (other than perhaps to be able to use normal imap auth to limit the mailboxes which are accessible?). Actually being able to limit the accessible mailboxes suddenly seems quite important... We can't necessarily trust the remote location admin so we do need some way to restrict their access - perhaps a normal login is required (this presumably means the need for multiple connections to sync multiple mailboxes though? This may not scale for large numbers of mailboxes?)
My thought is still that it's hard to separate the needs of a typical IMAP client from the requirements to sync two servers - there must be some overlap here that we can exploit? Ideally we also want this stuff to be documented in a way that would allow other IMAP servers to participate in sync (although there is nearly zero chance of it ever happening I guess...)
Given how little stuff like QRESYNC is apparently implemented in the real world I guess it's no big problem to "improve" the spec further if this helps (as long as we document where we deviate)
Hmm
Ed W
participants (4)
- Asheesh Laroia
- Ed W
- Timo Sirainen
- Tommi Virtanen