A more efficient algorithm would reduce both the computational complexity and the need for expensive, power-hungry CPUs.
On Wed, Feb 22, 2017 at 5:12 AM, Christian Balzer <chibi@gol.com> wrote:
On Tue, 21 Feb 2017 09:49:39 -0500 KT Walrus wrote:
I just read this blog: https://mrotaru.wordpress.com/2013/10/10/scaling-to-12-million-concurrent-connections-how-migratorydata-did-it/ about scaling to 12 Million Concurrent Connections on a single server and it got me thinking.
While that's a nice article, nothing in it was news to me or particularly complex for anyone who does large-scale stuff, like Ceph for example.
Would it be possible to scale Dovecot IMAP server to 10 Million IMAP sessions on a single server?
I'm sure Timo's answer will (or would, if he could be bothered) be along the lines of: "Sure, if you give me all your gold and then some for a complete rewrite of, well, everything".
What you're missing, and why this is a bad idea, is that, as mentioned before, scale-up only goes so far. I felt that my goal of 500k users/sessions in a 2-node active/active cluster was quite ambitious, and currently I'm looking at 200k sessions as something achievable with the current Dovecot and other limitations.
But even if you were to implement something that can handle 1 million or more sessions per server, would you want to? As in, if that server goes down, the resulting packet and authentication storm will be huge and will most likely result in a proverbial shit storm later. Having more than 10% or so of your customers on one machine, and thus involved in an outage that you KNOW will hit you eventually, strikes me as a bad idea.
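To put rough numbers on the outage argument above, here is a back-of-the-envelope sketch. The 10% cap and the 1 million and 200k session counts come from this thread; the 60-second reconnect window is purely an assumed figure for illustration, and both function names are hypothetical.

```python
import math


def min_servers(max_percent_per_server):
    """Smallest number of servers such that none holds more than the given
    percentage of all customers (assuming an even spread)."""
    return math.ceil(100 / max_percent_per_server)


def reconnect_rate(sessions_on_failed_server, reconnect_window_s):
    """Average logins per second if every dropped session reconnects within
    the window. The window length is an assumption, not a measured value."""
    return sessions_on_failed_server / reconnect_window_s


if __name__ == "__main__":
    print("servers needed for a 10% cap:", min_servers(10))
    for sessions in (200_000, 1_000_000):
        rate = reconnect_rate(sessions, 60)
        print(f"{sessions} sessions lost -> ~{rate:.0f} logins/s over 60 s")
```

Whatever the real window turns out to be, the login rate grows linearly with the number of sessions that were parked on the failed machine.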
I'm not sure how the design below meshes with Timo's lofty goals and standards when it comes to security as well.
And a push with the right people (clients) to support IMAP NOTIFY would of course reduce the number of sessions significantly.
Finally, Dovecot in proxy mode already scales quite well.
Christian
I think the current implementation, with a separate process managing each active IMAP session (w/ the possibility of moving idling sessions to a single hibernate process), will never allow a single server to manage 10 Million IMAP sessions.
But would it be possible to implement a new IMAP server plugin that uses a fixed, configurable pool of "worker" processes, much like NGINX or PHP-FPM do? These servers can probably scale to 10 Million TCP connections, if the server is carefully tuned and has enough cores/memory to support that many active sessions.
I’m thinking that the new IMAP server could use some external database (e.g., Redis or Memcached) to save all the session state and have the "worker" processes poll the TCP sockets for new IMAP commands to process (fetching the session state from the external database when a command is waiting on a response). The Dovecot IMAP proxies could even queue incoming commands to proxy many incoming requests to a smaller number of backend connections (like ProxySQL does for MySQL requests). That might allow each Dovecot proxy to support 10 Million IMAP sessions and a single backend could support multiple front end Dovecot proxies (to scale to 100 Million concurrent IMAP connections using 10 proxies for 100 Million connections and 1 backend server for 10 Million connections).
Of course, the backend server may need to be beefy and have very fast NVMe SSDs for local storage, but changing the IMAP server to manage a pool of workers instead of requiring a process per active session would allow bigger scale-up and could save large sites a lot of money.
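The worker-pool idea described above could be sketched roughly as follows. This is a minimal illustration, not Dovecot code: `SessionStore` is a hypothetical in-memory stand-in for an external store such as Redis or Memcached, `Worker` is a hypothetical name for one process of the fixed pool, and the reply it sends is a placeholder rather than real IMAP command handling.

```python
import selectors
import socket


class SessionStore:
    """Hypothetical stand-in for an external store (e.g. Redis/Memcached)."""

    def __init__(self):
        self._data = {}

    def load(self, session_id):
        # Unknown sessions start with an empty state.
        return self._data.get(session_id, {"commands_seen": 0})

    def save(self, session_id, state):
        self._data[session_id] = state


class Worker:
    """One worker of a fixed pool: it multiplexes many idle connections and
    only touches session state when a command actually arrives."""

    def __init__(self, store):
        self.store = store
        self.selector = selectors.DefaultSelector()

    def add_connection(self, sock, session_id):
        sock.setblocking(False)
        self.selector.register(sock, selectors.EVENT_READ, data=session_id)

    def poll_once(self, timeout=0.1):
        """Handle every socket that is readable right now; return the count."""
        handled = 0
        for key, _events in self.selector.select(timeout):
            sock, session_id = key.fileobj, key.data
            line = sock.recv(4096)
            if not line:  # peer closed the connection
                self.selector.unregister(sock)
                sock.close()
                continue
            # Fetch state only now that a command is waiting on a response.
            state = self.store.load(session_id)
            state["commands_seen"] += 1
            self.store.save(session_id, state)
            tag = line.split(b" ", 1)[0]
            # Placeholder reply; a real server would parse and run the command.
            # sendall on a non-blocking socket is fine for tiny replies only.
            sock.sendall(tag + b" OK (sketch) command %d\r\n"
                         % state["commands_seen"])
            handled += 1
        return handled
```

In this sketch idle sessions cost one registered socket each and no process, which is the saving the proposal is after; it says nothing about mailbox I/O, authentication, or process isolation, which is where the real cost and the security concerns raised in this thread would lie.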
Is this a good idea? Or, am I missing something?
Kevin
-- Christian Balzer Network/Systems Engineer chibi@gol.com Global OnLine Japan/Rakuten Communications http://www.gol.com/