Scaling to 10 Million IMAP sessions on a single server
I just read this blog about scaling to 12 Million Concurrent Connections on a single server, and it got me thinking: https://mrotaru.wordpress.com/2013/10/10/scaling-to-12-million-concurrent-connections-how-migratorydata-did-it/
Would it be possible to scale Dovecot IMAP server to 10 Million IMAP sessions on a single server?
I think the current implementation, where a separate process manages each active IMAP session (with the possibility of moving idling sessions to a single hibernate process), will never allow a single server to manage 10 Million IMAP sessions.
But would it be possible to implement a new IMAP server plugin that uses a fixed, configurable pool of “worker” processes, much like NGINX or PHP-FPM do? Such a server could probably scale to 10 Million TCP connections, if it is carefully tuned and has enough cores/memory to support that many active sessions.
I’m thinking that the new IMAP server could use an external database (e.g., Redis or Memcached) to save all the session state, and have the “worker” processes poll the TCP sockets for new IMAP commands to process, fetching the session state from the external database only when a command is waiting for a response. The Dovecot IMAP proxies could even queue incoming commands, multiplexing many incoming requests onto a smaller number of backend connections (like ProxySQL does for MySQL requests). That might allow each Dovecot proxy to support 10 Million IMAP sessions, and a single backend could support multiple front-end Dovecot proxies: 10 proxies at 10 Million connections each would give 100 Million concurrent IMAP connections, with each backend server handling 10 Million of them.
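A minimal sketch of what I mean (all names hypothetical; a plain dict stands in for the Redis/Memcached session store, and a real worker would fetch and write state over the network): each worker process multiplexes many client sockets and only touches session state when a command actually arrives.

```python
import selectors
import socket

# Hypothetical stand-in for the external session store (Redis/Memcached):
# session state lives outside the worker, keyed per connection.
session_store = {}

sel = selectors.DefaultSelector()

def accept(server_sock):
    conn, _addr = server_sock.accept()
    conn.setblocking(False)
    # New session: initialize its state in the external store.
    session_store[conn.fileno()] = {"authenticated": False, "selected": None}
    sel.register(conn, selectors.EVENT_READ, handle_command)

def handle_command(conn):
    data = conn.recv(4096)
    if not data:  # client disconnected: drop the session state
        session_store.pop(conn.fileno(), None)
        sel.unregister(conn)
        conn.close()
        return
    # Fetch session state only now that a command is actually pending.
    state = session_store[conn.fileno()]
    # ... parse `data` and run the IMAP command against `state` here ...
    conn.sendall(b"* OK processed\r\n")
    session_store[conn.fileno()] = state  # write back any state changes

def worker_loop(listen_sock):
    # One of N fixed workers: multiplexes thousands of sockets per process.
    sel.register(listen_sock, selectors.EVENT_READ, accept)
    while True:
        for key, _mask in sel.select():
            key.data(key.fileobj)
```

This is only the NGINX-style event-loop shape, not a real IMAP parser; the point is that no per-session process exists, only per-session state in the store.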
Of course, the backend server may need to be beefy and have very fast NVMe SSDs for local storage, but changing the IMAP server to manage a pool of workers instead of requiring a process per active session would allow bigger scale-up and could save large sites a lot of money.
Is this a good idea? Or, am I missing something?
Kevin
On Tue, 21 Feb 2017 09:49:39 -0500 KT Walrus wrote:
I just read this blog about scaling to 12 Million Concurrent Connections on a single server, and it got me thinking: https://mrotaru.wordpress.com/2013/10/10/scaling-to-12-million-concurrent-connections-how-migratorydata-did-it/
While that's a nice article, nothing in it was news to me or particularly complex for anyone doing large-scale stuff, like Ceph for example.
Would it be possible to scale Dovecot IMAP server to 10 Million IMAP sessions on a single server?
I'm sure Timo's answer will (or would, if he could be bothered) be along the lines of: "Sure, if you give me all your gold and then some for a complete rewrite of, well, everything".
What you're missing, and where the idea goes bad, is that, as mentioned before, scale-up only goes so far. I felt that my goal of 500k users/sessions in a 2-node active/active cluster was quite ambitious, and currently I'm looking at 200k sessions as something achievable with the current Dovecot and other limitations.
But even if you were to implement something that can handle 1 million or more sessions per server, would you want to? If that server goes down, the resulting packet and authentication storm will be huge and will most likely result in a proverbial shit storm later. Having more than 10% or so of your customers on one machine, and thus involved in an outage that you KNOW will hit you eventually, strikes me as a bad idea.
I'm not sure how the design below meshes with Timo's lofty goals and standards when it comes to security as well.
And a push with the right people (clients) to support IMAP NOTIFY would of course reduce the number of sessions significantly.
Finally, Dovecot in proxy mode already scales quite well.
Christian
--
Christian Balzer Network/Systems Engineer
chibi@gol.com Global OnLine Japan/Rakuten Communications
http://www.gol.com/
A more efficient algorithm would reduce computational complexity, and the need for expensive power-hungry CPUs.
Sent from ProtonMail Mobile
On 22 Feb 2017, at 6.12, Christian Balzer <chibi@gol.com> wrote:
On Tue, 21 Feb 2017 09:49:39 -0500 KT Walrus wrote:
Would it be possible to scale Dovecot IMAP server to 10 Million IMAP sessions on a single server?
I'm sure Timo's answer will (or would, if he could be bothered) be along the lines of: "Sure, if you give me all your gold and then some for a complete rewrite of, well, everything".
Well, the current bottleneck in achieving that would probably be the amount of memory required. With 12M active (non-hibernated) sessions, the memory requirement for that 12M-active-user single-instance server would be huge: approximately 10TB.
If 12M active sessions is the target then the architecture of one user per imap process needs to be abandoned.
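A quick back-of-the-envelope check of that figure (illustrative numbers only): ~10TB spread across 12M imap processes comes to a bit under 1 MiB per process.

```python
sessions = 12_000_000
total_bytes = 10 * 1024**4              # the ~10TB estimate above
per_process = total_bytes / sessions    # implied footprint per imap process
print(f"{per_process / 1024**2:.2f} MiB per imap process")
# → prints "0.87 MiB per imap process"
```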
Sami
On Feb 21, 2017, at 11:12 PM, Christian Balzer <chibi@gol.com> wrote:
I'm sure Timo's answer will (or would, if he could be bothered) be along the lines of: "Sure, if you give me all your gold and then some for a complete rewrite of, well, everything".
It will be a long time before I need to scale to 10 Million users, and I will be happy to pay for the rewrite of the IMAP plugin when the time comes, if it isn't done before then by someone else.
I have seen proposals for a new client protocol called JMAP that seem to be all about running a mail server at scale, the way an NGINX HTTPS web server can scale. That got me thinking about whether there is anything fundamental about IMAP that makes it difficult to scale. After looking into Dovecot’s current IMAP implementation, I think an approach was taken that fundamentally has scaling issues (namely, one backend process per IMAP session). I see that a couple of years ago, work was done to “migrate” idling IMAP sessions to a single process that “remembers” the state of each IMAP session and can restore it to a backend process when the idling is done.
But the only estimate I have read is that this “idle migration” is likely to give only a 20% reduction in the number of concurrent processes you need if you are running 50,000 IMAP sessions per mail server. A 20% reduction is not nearly enough for scale; I would need to see at least an order of magnitude improvement (and hopefully several orders of magnitude).
So, in my mind, since these IMAP sessions are long-lived with infrequent bursts of activity, a better approach would be to keep the session data in memory or in an external datastore and only run processing against it when there is activity, much like WebSockets and even HTTPS requests are handled today by installations that need to scale to millions of active users.
As for Dovecot, I would think the work done to “migrate” idling IMAP sessions would be a good start to implementing managing a large number of sessions with a fixed pool of worker processes like other web servers do.
So, my question really is:
Is there anything about the IMAP protocol that would prevent an implementation from scaling to 10 Million users per server? Or do we need to push for a new protocol like JMAP that has been designed to scale better (by making server requests stateless)?
Kevin
On 22 Feb 2017, at 17.07, KT Walrus <kevin@my.walr.us> wrote:
I have seen proposals for a new client protocol called JMAP that seem to be all about running a mail server at scale, the way an NGINX HTTPS web server can scale. That got me thinking about whether there is anything fundamental about IMAP that makes it difficult to scale. After looking into Dovecot’s current IMAP implementation, I think an approach was taken that fundamentally has scaling issues (namely, one backend process per IMAP session). I see that a couple of years ago, work was done to “migrate” idling IMAP sessions to a single process that “remembers” the state of each IMAP session and can restore it to a backend process when the idling is done.
But the only estimate I have read is that this “idle migration” is likely to give only a 20% reduction in the number of concurrent processes you need if you are running 50,000 IMAP sessions per mail server. A 20% reduction is not nearly enough for scale; I would need to see at least an order of magnitude improvement (and hopefully several orders of magnitude).
My long-term plans are something like this:
- The imap-hibernate process can be used more aggressively: not necessarily just for IDLEing sessions, but for any session that isn't actively being used. And if the server is too busy, even active sessions could be hibernated, somewhat similar to cooperative multitasking. When this is done, you can think of the current imap processes as the worker processes.
- More state will be transferred to the imap-hibernate process, so it can perform simpler commands without recreating the imap process. For example, STATUS replies can be returned from cached state as long as it hasn't actually changed.
- imap-hibernate currently tracks changed state via inotify (etc.). This mostly works, but it also sometimes wakes up unnecessarily. For example, just because one IMAP session performed a FETCH that added something to dovecot.index.cache doesn't mean there are any real changes. We'll need some mail plugin that notifies the imap-hibernate process when a real change has happened.
- Hibernated sessions can even be moved away entirely from backends into IMAP proxies. The IMAP proxy can then reconnect to a backend to re-establish the session. This even allows switching backends entirely, as long as the storage is shared. It requires that backends notify the proxy whenever something changes for the user, which is mostly a continuation of the previous item (just TCP notification instead of UNIX socket notification).
- IMAP proxies can also perform the same kind of limited functionality as imap-hibernate processes, possibly by running the same imap-hibernate processes.
- And kind of a reverse of hibernation: imap processes can also preserve the user's IMAP session and opened folder indexes in memory even after the IMAP client has disconnected. If the same user connects back, the imap process can quickly be re-used with all the state already open. This is especially useful for clients that create many short-lived connections, such as webmails.
So after all these changes there would practically be something like 1000 imap processes constantly open and either doing work or waiting for a recently disconnected IMAP client to come back.
As Christian already mentioned, the Dovecot proxies are supposed to be able to handle quite a lot of connections. I wouldn't be surprised if you can already do millions of connections with them. Most of our customers haven't tried scaling them very hard because they don't really want to create multiple IP addresses for servers, which is required to avoid running out of TCP ports (or I guess there could be multiple destination ports, but that also complicates things and Dovecot doesn't currently support that in an easy way either).
Is there anything about the IMAP protocol that would prevent an implementation from scaling to 10 Million users per server? Or, do we need to push for a new protocol like JMAP that has been designed to scale better (by being stateless with the server requests)?
I guess mainly the message sequence numbers in the IMAP protocol make this more difficult, but it's not an impossible problem to solve.
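To illustrate why sequence numbers are awkward (a toy class, not Dovecot code): they are per-session and get renumbered on every EXPUNGE, so each session has to carry this mapping somewhere server-side.

```python
class SessionSeqMap:
    """Toy per-session mapping between 1-based IMAP sequence numbers
    and immutable UIDs (illustrative only, not Dovecot code)."""

    def __init__(self, uids):
        self.uids = list(uids)  # index i holds the UID of seqnum i + 1

    def uid_for_seq(self, seq):
        return self.uids[seq - 1]

    def seq_for_uid(self, uid):
        return self.uids.index(uid) + 1

    def expunge(self, uid):
        # EXPUNGE renumbers every later message, so this mapping is
        # session-local state that has to live *somewhere* server-side.
        self.uids.remove(uid)

m = SessionSeqMap([101, 105, 110])
m.expunge(105)   # UID 110 was seqnum 3; it is now seqnum 2
```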
On Feb 22, 2017, at 2:44 PM, Timo Sirainen <tss@iki.fi> wrote:
I guess mainly the message sequence numbers in the IMAP protocol make this more difficult, but it's not an impossible problem to solve.
Any thoughts on the wisdom of supporting an external database for session state or even mailbox state (like using Redis or even MySQL)?
Also, would it help reliability or scalability to store a copy of the index data in an external database?
I want to use mdbox format but I have heard that these index files do get corrupted occasionally and have to be rebuilt (possibly using an older version of the index file to construct a new one). I worry that using mdbox might cause my users to see the IMAP flags suddenly reset back to a previous state (like seeing previously read messages becoming unread in their mail clients).
If a copy of the index data were stored in an external database, problems such as duplicate messages occurring in a Dovecot cluster could be handled by having the cluster look up the index data in the external database instead of the local copy stored on the server. An external database could easily implement cluster-wide unique serial numbers. In the site I’m working on building, I even use Redis to implement “message queues” between Postfix and Dovecot (via Redis’s push/pop feature). Currently, I am delivering new messages only via IMAP instead of LMTP (no LMTP will be available to my backend mail servers, only IMAP).
If you stored the MD5 checksum of the index files (and even the message files) in the external database, you could also run a background process that would periodically check for corruption of the local index files using the checksums from the database, making mdbox format even more bulletproof.
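A rough sketch of that background check (hypothetical helper names; a dict stands in for the external checksum database):

```python
import hashlib

def index_md5(path):
    """MD5 of a file, read in chunks (a corruption check, not security)."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical external database of known-good checksums.
checksum_db = {}

def record_checksum(path):
    checksum_db[path] = index_md5(path)

def verify_checksum(path):
    # The background process would call this periodically per index file.
    return checksum_db.get(path) == index_md5(path)
```

A real version would have to account for index files that are legitimately being rewritten while the check runs.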
And, the best thing about using an external database is that making the external database highly available is not a problem (as most sites already do that). The index data stored in the database would become the “source of truth” with the local index files/session data being an efficient cache for the mailstore. And, re-caching could occur as needed to make the whole cluster more reliable.
Kevin
On 2/22/2017, 3:46:08 PM, KT Walrus <kevin@my.walr.us> wrote:
I want to use mdbox format but I have heard that these index files do get corrupted occasionally and have to be rebuilt (possibly using an older version of the index file to construct a new one). I worry that using mdbox might cause my users to see the IMAP flags suddenly reset back to a previous state (like seeing previously read messages becoming unread in their mail clients).
This is the only reason I haven't moved to mdbox myself. I really, really wish there was a way to not have to worry about losing flags.
On 22 Feb 2017, at 22.46, KT Walrus <kevin@my.walr.us> wrote:
On Feb 22, 2017, at 2:44 PM, Timo Sirainen <tss@iki.fi> wrote:
I guess mainly the message sequence numbers in the IMAP protocol make this more difficult, but it's not an impossible problem to solve.
Any thoughts on the wisdom of supporting an external database for session state or even mailbox state (like using Redis or even MySQL)?
Also, would it help reliability or scalability to store a copy of the index data in an external database?
I mainly see such external databases as additional reasons for things to break. And even if not, additional extra layers of latency.
The thoughts I've had about storing such internal state in the Dovecot Proxy layer make sense because the IMAP sessions have to have active TCP connections. All the state can be stored by the process that is responsible for the TCP connection itself. There's not much point storing such state outside the process: If the process or the TCP connection dies, the state needs to be forgotten about in any case since there's no "state resume" command in IMAP (and even if there were, the state probably should then be stored in that command itself rather than on the server side).
I want to use mdbox format but I have heard that these index files do get corrupted occasionally and have to be rebuilt (possibly using an older version of the index file to construct a new one). I worry that using mdbox might cause my users to see the IMAP flags suddenly reset back to a previous state (like seeing previously read messages becoming unread in their mail clients).
Both sdbox and mdbox formats have this problem in theory. Practically, there are many huge mdbox/sdbox installations and I don't think they see such problems much, if ever. Dovecot attempts pretty hard already not to lose flags with sdbox/mdbox. There are also separate dovecot.index.backup files that are kept just for this purpose.
If a copy of the index data were stored in an external database, such problems of duplicate messages occurring in a dovecot cluster could be handled by having the cluster “lookup” the index data using the external database instead of the local copy stored on the server.
This sounds a bit similar to the "obox" format that we use for storing emails and indexes in object storage in Dovecot Pro. That isn't open source though.
If you stored the MD5 checksum of the index files (and even the message files) in the external database, you could also run a background process that would periodically check for corruption of the local index files using the checksums from the database, making mdbox format even more bulletproof.
I don't see why this would need an external database. I've long had in my TODO to add hashes/checksums to all of the Dovecot index files so it could properly detect corruption and ignore that. Hopefully that's not too far into the future anymore.
And, the best thing about using an external database is that making the external database highly available is not a problem (as most sites already do that). The index data stored in the database would become the “source of truth” with the local index files/session data being an efficient cache for the mailstore. And, re-caching could occur as needed to make the whole cluster more reliable.
In my opinion, an external database just shifts the problem from one place to another. Yes, sometimes it's still useful. Dovecot supports all kinds of databases for all kinds of purposes; with the dict API you can access LDAP, SQL or Cassandra. I mostly like Cassandra nowadays, but it has its problems as well (tombstones). I'm not aware of any highly available database that actually scales and really just works without problems. (I'm talking about clusters with more than just 2 servers, ideally more than just 2 datacenters.)
On 23 Feb 2017, at 23.00, Timo Sirainen <tss@iki.fi> wrote:
I mainly see such external databases as additional reasons for things to break. And even if not, additional extra layers of latency.
Oh, just thought that I should clarify this and I guess other things I said. I think there are two separate things we're possibly talking about in here:
- Temporary state: This is what I was mainly talking about. State related to a specific IMAP session. This doesn't take much space and can be stored in the proxy's memory since it's specific to the TCP session anyway.
- Permanent state: This is mainly about the storage. A lot of people use Dovecot with NFS. So one possibility for storing the permanent state is NFS. Another possibility with Dovecot Pro is to store it to object storage as blobs and keep a local cache of the state. A 3rd possibility might be to use some kind of a database for storing the permanent state. I'm fine with the first two, but with the 3rd I see a lot of problems and not a whole lot of benefit. But if you think of the databases (or even NFS) as blob storage, you can think of them the same as any object storage and use the same obox format with them. What I'm mainly against is attempting to create some kind of a database that has a structured format like (imap_uid, flags, ...) - I'm sure that can be useful for various purposes, but performance or scalability isn't one of them.
On Feb 23, 2017, at 4:21 PM, Timo Sirainen <tss@iki.fi> wrote:
On 23 Feb 2017, at 23.00, Timo Sirainen <tss@iki.fi> wrote:
I mainly see such external databases as additional reasons for things to break. And even if not, additional extra layers of latency.
Oh, just thought that I should clarify this and I guess other things I said. I think there are two separate things we're possibly talking about in here:
- Temporary state: This is what I was mainly talking about. State related to a specific IMAP session. This doesn't take much space and can be stored in the proxy's memory since it's specific to the TCP session anyway.
Moving the IMAP session state to the proxy, so the backend can just have a fixed pool of worker processes, is really what I think is necessary for scaling to millions of IMAP sessions. I still think it would be best to store this state in a way that at least “remembers” the backend server that is implementing the IMAP session, along with the auth data. To me, that means using Redis for session state. Redis is a very efficient in-memory database where the data is persistent and replicated, and it is popular enough to be well tested and easy to use (the API is very simple).
I use HAProxy for my web servers and HAProxy supports “stick” tables to map a client IP to the same backend server that was selected when the session was first established. HAProxy then supports proxy “peers” where the “stick” tables are shared between multiple proxies. That way, if a proxy fails, I can move the VIP over (or let DNS round-robin) to another proxy and still get the same backend (which has session state) without having the proxy pick some other backend (losing the backend session state). It might be fairly complex for HAProxy to share these “stick” tables across a cluster of proxies, but I would think it would be easy to use Redis to cache this data so all proxies could access this shared data.
I’m not sure if Dovecot proxies would benefit from “sticks and peers” for IMAP protocol, but it would be nice if Dovecot proxies could maintain the IMAP session if the connections needed to be moved to another proxy (for failover). Maybe it isn’t so bad if a dovecot proxy all of a sudden “kicked” 10 Million IMAP sessions, but this might lead to a “login” flood for the remaining proxies. So, at least the authorization data (the passdb queries) should be shared between proxies using Redis.
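A minimal in-process sketch of the “stick” behavior I mean (hypothetical names; a plain dict with TTLs stands in for the shared Redis table that would let any proxy in the cluster see the same entries after a failover):

```python
import time

class StickTable:
    """HAProxy-style stick table mapping client IP to chosen backend.
    A dict stands in for Redis, which would make the table shared across
    proxies and let it survive a proxy failure."""

    def __init__(self, ttl=3600):
        self.ttl = ttl
        self.entries = {}  # client_ip -> (backend, expiry)

    def backend_for(self, client_ip, choose_backend):
        entry = self.entries.get(client_ip)
        now = time.monotonic()
        if entry and entry[1] > now:
            return entry[0]  # sticky hit: reuse the previous backend
        backend = choose_backend(client_ip)  # e.g. least-connections pick
        self.entries[client_ip] = (backend, now + self.ttl)
        return backend
```

With Redis behind it, any proxy taking over a VIP would do the same lookup and land the client on the backend that already holds its session state.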
- Permanent state: This is mainly about the storage. A lot of people use Dovecot with NFS. So one possibility for storing the permanent state is NFS. Another possibility with Dovecot Pro is to store it to object storage as blobs and keep a local cache of the state. A 3rd possibility might be to use some kind of a database for storing the permanent state. I'm fine with the first two, but with 3rd I see a lot of problems and not a whole lot of benefit. But if you think of the databases (or even NFS) as blob storage, you can think of them the same as any object storage and use the same obox format with them. What I'm mainly against is attempting to create some kind of a database that has structured format like (imap_uid, flags, ...) - I'm sure that can be useful for various purposes but performance or scalability isn't one of them.
I would separate the permanent state into two parts: the indexes and the message data. As I understand it, the indexes are the metadata about the message data. I believe that, to scale, the indexes need fast read access, which means storing them on local NVMe SSD storage. But I want the indexes to be reliably shared between all backend servers in a Dovecot cluster. Again, this means to me that you need some fast in-memory database like Redis to be the “source of truth” for the indexes. Read requests to Redis are very fast, so you might not need to keep a cache of the index on local NVMe SSD storage, but maybe I’m wrong.
As for the message data, I would really like the option of storing it in an external database like MongoDB. MongoDB stores documents as JSON (actually BSON) data, which seems well suited to email storage since emails are all text. This would let me manage storage using the tools and techniques of an external database. MongoDB is designed to be hugely scalable and supports high availability. I would rather manage a cluster of MongoDB instances containing a petabyte of data than try to distribute the data among many Dovecot IMAP servers. The IMAP servers would then only be responsible for implementing IMAP, not loaded down with all sorts of I/O, so they might be able to scale to 10 Million IMAP sessions per server.
If a MongoDB option weren’t available, using cloud object storage would be a reasonable second choice. Unfortunately, the “obox” support you mentioned doesn’t seem to be open source, so I am stuck using local disks (hopefully SSDs, but this is pricey) on multiple backend servers. I had reliability problems using NFS for a previous project and am hesitant to try that solution for scaling Dovecot. Fortunately, my mailboxes are all very small (maybe 2MB per user), since I delete messages older than 30 days and store attachments (photos and videos) in cloud object storage served with local web server caching. So scaling message data shouldn’t be an issue for me for a long time.
Kevin
Comparison of Dovecot, Uwash, Courier, Cyrus and M-Box: http://www.isode.com/whitepapers/mbox-benchmark.html
Quoting Ruga <ruga@protonmail.com>:
Comparison of Dovecot, Uwash, Courier, Cyrus and M-Box: http://www.isode.com/whitepapers/mbox-benchmark.html
Wow. That comparison is only 11.5 years old.
The "default" file system of reiserfs and gcc-3.3 were dead giveaways.
I suspect Dovecot's changed a tad since that test.
=R=
Yes, and they (Isode) still use it as marketing evidence. The benchmarking tool project also seems to be unmaintained.
On 23 Feb 2017, at 00:33, Ruga <ruga@protonmail.com> wrote:
Comparison of Dovecot, Uwash, Courier, Cyrus and M-Box: http://www.isode.com/whitepapers/mbox-benchmark.html
Uwash? as in UW IMAP that I used briefly in 1999? That hasn't seen an update in a decade?
-- Apple broke AppleScripting signatures in Mail.app, so no random signatures.
On Feb 21, 2017, at 11:12 PM, Christian Balzer <chibi@gol.com> wrote:
But even if you were to implement something that can handle 1 million or more sessions per server, would you want to? As in, if that server goes down, the resulting packet, authentication storm will be huge and most like result in a proverbial shit storm later. Having more than 10% or so of your customers on one machine and thus involved in an outage that you KNOW will hit you eventually strikes me as a bad idea.
The idea would be to store session state in an external database like Redis. I use Redis for PHP session data on the web servers and Redis is implemented as a high-availability cluster (using Redis Sentinels). If the IMAP session state is maintained externally in a high-availability datastore, then rebooting a mail server or having it go down unexpectedly should not mean that all existing sessions are “kicked” and the clients would need to log in again. Rather, a backup mail server or servers could take the load and just use the high-availability datastore to manage the sessions that were on the old server.
One potential problem, if not using shared storage for the mailboxes, is that Dovecot replication is asynchronous, so a small number of IMAP sessions might be out of date with the data on the replacement server, and some of the data in Redis might need to be re-cached to reflect the state of the backup mailstore. Other than that, I don’t think there would be much of a "proverbial shit storm" caused by the failure of one mail server, even if that server handled 1 million or more sessions. The remaining mail servers in the cluster would need to be able to absorb the load (maybe 3-server clusters would be the norm, so each remaining server only has to take 50% of the sessions from the failed server while it is unavailable).
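The sizing above works out like this (illustrative numbers): with N servers in a cluster, one failure splits its sessions across the N-1 survivors.

```python
servers = 3
sessions_per_server = 1_000_000
# If one server fails, its sessions split across the survivors:
extra_per_survivor = sessions_per_server / (servers - 1)
headroom_needed = extra_per_survivor / sessions_per_server
print(f"each survivor absorbs {headroom_needed:.0%} extra load")
# → prints "each survivor absorbs 50% extra load"
```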
Kevin
Participants (8): @lbutlr, Christian Balzer, KT Walrus, M. Balridge, Ruga, Sami Ketola, Tanstaafl, Timo Sirainen