On Tue, 2010-03-16 at 10:55 -0500, Stan Hoeppner wrote:
Timo Sirainen put forth on 3/16/2010 6:37 AM:
On Tue, 2010-03-16 at 02:45 -0500, Stan Hoeppner wrote:
Concentrate on rewriting imapd into a threaded model, and get it right.
I could give a lot of reasons for this, but: no.
Ok, what am I missing? Given the current clone/fork imap parallelism architecture, wouldn't spawning imap worker threads from a master imap process be the most straightforward way to shrink the process count? It would need the fewest code changes, and thus be least likely to introduce new bugs.
Threads are useful when you need to do CPU-intensive work in parallel and the threads need access to shared data structures. Dovecot doesn't do CPU-intensive work, so threads do nothing but add extra overhead (scheduling, memory).
Maybe I didn't read your previous post correctly. It sure sounded like worker threads were exactly what you were describing.
I wanted asynchronous disk I/O from the kernel. Since the kernel doesn't support it yet, I'm forced to use threads until that happens. As soon as the kernel supports AIO well, I can get rid of threads.
And anyway there's only going to be 1..n AIO worker threads per process, where n depends on load. So a single process could handle even thousands of IMAP connections, but have only a few AIO worker threads if the connections are mostly idling.
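To illustrate the idea of one process serving many connections while only a few worker threads do blocking disk reads, here is a minimal Python sketch. It is not Dovecot code: `MAX_AIO_WORKERS`, `blocking_read`, `handle_connection` and `serve` are hypothetical names, and a standard-library thread pool stands in for the "1..n AIO worker threads" described above.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

MAX_AIO_WORKERS = 4  # hypothetical upper bound "n" on worker threads


def blocking_read(path):
    """Ordinary blocking disk read; runs inside a worker thread."""
    with open(path, "rb") as f:
        return f.read()


async def handle_connection(loop, pool, path):
    """One logical client: hand the disk read to the pool and yield
    to the event loop until the data is available."""
    return await loop.run_in_executor(pool, blocking_read, path)


async def serve(paths, max_workers=MAX_AIO_WORKERS):
    """Serve many 'connections' from a single event-loop thread.

    The pool starts threads only as work arrives, up to max_workers,
    so mostly idle connections cost no extra threads.
    """
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        tasks = [handle_connection(loop, pool, p) for p in paths]
        return await asyncio.gather(*tasks)
```

Calling `asyncio.run(serve(list_of_paths))` returns the file contents in the same order as the paths. The number of connections is independent of the number of threads, which is the point being made: threads exist here only to hide blocking disk reads, not to parallelize CPU work.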
If you're not looking at threads, can you briefly describe the program flow and subroutine layout of the new imap server process(es)? You've got me really curious about this now.
It's mostly the same as now, except now some functions return "try again later" if data isn't yet in memory. The first message in this thread tried to explain it and also had a link to previous thoughts about it.
If you don't use clone/fork or threads, how else can you get decent parallel client scalability? Are you looking at nested fast looping serial code, and using AIO to minimize stalling the loops due to blocked I/O?
Yes to looping serial code, but if by nested you mean some kind of recursive looping, no. Recursive looping would just make everything break.
So there's just a single loop that waits for new things to do and then does as much of it as it can, and when it finds that it can't do more without waiting on disk I/O, it returns to the loop to find more stuff to do. Dovecot already does non-blocking network I/O that way, so things already support that well. It's just the disk I/O part that now needs to be made non-blocking. Previously I thought it would make the mailbox accessing APIs horribly difficult and ugly to use, but nowadays I've figured out a plan that doesn't change the API much (described in those two mails).
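The single-loop design with "try again later" returns can be sketched roughly as follows. This is only an illustration in Python, not Dovecot's actual API: `TRY_AGAIN`, `FakeMailStore`, `complete_io` and `run_loop` are all hypothetical names, and the disk is simulated with a dictionary.

```python
from collections import deque

TRY_AGAIN = object()  # hypothetical "try again later" return value


class FakeMailStore:
    """Hypothetical mail store: data is either already in memory or
    still has to be fetched from (simulated) disk."""

    def __init__(self, on_disk):
        self.on_disk = dict(on_disk)
        self.in_memory = {}
        self.pending = set()

    def read(self, key):
        """Never blocks: returns the data if cached, otherwise queues
        a disk read and tells the caller to try again later."""
        if key in self.in_memory:
            return self.in_memory[key]
        self.pending.add(key)
        return TRY_AGAIN

    def complete_io(self):
        """Simulate the queued disk reads finishing."""
        for key in self.pending:
            self.in_memory[key] = self.on_disk[key]
        self.pending.clear()


def run_loop(store, requests):
    """Single non-recursive loop: do all work that needs no waiting,
    park the rest, then wait for I/O and retry the parked requests."""
    queue = deque(requests)
    replies = []
    while queue:
        retry = deque()
        while queue:
            client, key = queue.popleft()
            result = store.read(key)
            if result is TRY_AGAIN:
                retry.append((client, key))  # come back to this later
            else:
                replies.append((client, result))
        # A real server would wait here for network or disk events.
        store.complete_io()
        queue = retry
    return replies
```

For example, with message `"a"` already cached and `"b"` only on disk, `run_loop(store, [("c1", "b"), ("c2", "a")])` answers `c2` first and `c1` after the simulated read completes. The calling code looks almost the same as blocking code, except that it has to handle the `TRY_AGAIN` result.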