Re: [Dovecot] Design: Asynchronous I/O for single/multi-dbox

16 Mar 2010


On Tue, 2010-03-16 at 10:55 -0500, Stan Hoeppner wrote:
...
> Timo Sirainen put forth on 3/16/2010 6:37 AM:
...
>> On Tue, 2010-03-16 at 02:45 -0500, Stan Hoeppner wrote:
...
>>> Concentrate on rewriting imapd into a threaded model, and get it right.
>> I could give a lot of reasons for this, but: no.
> Ok, what am I missing?  Given the current clone/fork imap parallelism
> architecture, wouldn't spawning imap worker threads from a master imap
> process be the most straightforward change to accomplish the process count
> shrink?  With the least code changes, and thus be least likely to introduce
> new bugs?
Threads are useful when you need to do CPU intensive work in parallel,
and threads need access to shared data structures. Dovecot doesn't do
CPU intensive work, so threads do nothing but add extra overhead
(scheduling, memory).
...
> Maybe I didn't read your previous post correctly.  It sure sounded like
> worker threads were exactly what you were describing.
I wanted asynchronous disk I/O from the kernel. Since the kernel doesn't
support it yet, I'm forced to use threads until that happens. As soon as
the kernel supports AIO well, I can get rid of the threads.
And anyway there are only going to be 1..n AIO worker threads per process,
where n depends on load. So a single process could handle even thousands
of IMAP connections, yet have only a few AIO worker threads if the
connections are mostly idle.
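
As a rough illustration of the "1..n worker threads, where n depends on
load" idea, here is a minimal Python sketch. It is not Dovecot code: the
class name AioWorkerPool, the max_workers limit and the pending counter
are assumptions made up for this example, and Python threads merely stand
in for whatever a real implementation would use. A new worker is started
only when more requests are outstanding than there are workers.

```python
import queue
import threading


class AioWorkerPool:
    """Hypothetical pool that grows from 0 up to max_workers with load."""

    def __init__(self, max_workers=4):
        self.max_workers = max_workers
        self.requests = queue.Queue()
        self.workers = []
        self.pending = 0          # submitted but not yet finished requests
        self.lock = threading.Lock()

    def _worker(self):
        while True:
            func, callback = self.requests.get()
            result = func()       # the blocking "disk read" happens here
            with self.lock:
                self.pending -= 1
            callback(result)      # hand the result back to the caller

    def submit(self, func, callback):
        with self.lock:
            self.pending += 1
            # Start another worker only if the existing ones cannot keep up.
            if (self.pending > len(self.workers)
                    and len(self.workers) < self.max_workers):
                thread = threading.Thread(target=self._worker, daemon=True)
                self.workers.append(thread)
                thread.start()
        self.requests.put((func, callback))
```

With requests arriving one at a time (mostly idle connections) the pool
stays at a single worker; only a burst of simultaneous requests makes it
grow, and never beyond max_workers.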
...
> If you're not looking at threads, can you briefly describe the program
> flow and subroutine layout of the new imap server process(es)?  You've got
> me really curious about this now since you're not looking at threads.
It's mostly the same as now, except now some functions return "try again
later" if data isn't yet in memory. The first message in this thread
tried to explain it and also had a link to previous thoughts about it.
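
To make the "try again later" convention concrete, here is a small
Python sketch. The names Status, MailCache, get_body and complete_read
are hypothetical and do not come from Dovecot's API; the sketch only
shows the general shape of a lookup that never blocks: it either returns
the data already in memory, or records that a read is needed and tells
the caller to retry.

```python
from enum import Enum


class Status(Enum):
    OK = 0
    TRY_AGAIN = 1


class MailCache:
    """Hypothetical in-memory cache filled by background disk reads."""

    def __init__(self):
        self.loaded = {}        # uid -> message body already in memory
        self.requested = set()  # uids with a read in flight

    def get_body(self, uid):
        if uid in self.loaded:
            return Status.OK, self.loaded[uid]
        # Not in memory: note that a read is wanted and return at once
        # instead of blocking on the disk.
        self.requested.add(uid)
        return Status.TRY_AGAIN, None

    def complete_read(self, uid, data):
        # Called when the (simulated) asynchronous read has finished.
        self.requested.discard(uid)
        self.loaded[uid] = data
```

A caller that receives Status.TRY_AGAIN simply returns to its main loop
and calls get_body() again once the read has completed.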
...
> If you don't use clone/fork or threads, how else can you get decent
> parallel client scalability?  Are you looking at nested fast looping
> serial code, and using AIO to minimize stalling the loops due to blocked
> I/O?
Yes to looping serial code, but if by nested you mean some kind of
recursive looping, no. Recursive looping would just make everything
break.
So there's just a single loop that waits for new things to do and then
does as much of it as it can, and when it finds that it can't do more
without waiting on disk I/O, it returns to the loop to find more
stuff to do. Dovecot already does non-blocking network I/O that way, so
things already support that well. It's just the disk I/O part that now
needs to be made non-blocking. Previously I thought it would make
using the mailbox accessing APIs horribly difficult and ugly, but
nowadays I've figured out a plan that doesn't change the API much
(described in those two mails).
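
The single-loop flow described above could be sketched roughly as
follows. This is an illustrative Python toy, not Dovecot's ioloop: the
names Command, run_loop, ready, parked and pending are assumptions, and
the "disk" is a plain dict whose reads complete one at a time whenever
the loop has nothing else to do.

```python
from collections import deque

DONE, WAIT = "done", "wait"


class Command:
    """Hypothetical client command that needs one block of mail data."""

    def __init__(self, name, block, memory, log):
        self.name, self.block = name, block
        self.memory, self.log = memory, log

    def step(self):
        if self.block not in self.memory:
            return WAIT, self.block   # would block on disk: try again later
        self.log.append(f"{self.name}: {self.memory[self.block]}")
        return DONE, None


def run_loop(commands, disk, memory):
    ready = deque(commands)   # work that can make progress right now
    parked = {}               # block id -> commands waiting for that block
    pending = deque()         # requested reads, in order (simulated AIO)
    while ready or pending:
        # Do as much work as possible without waiting on disk.
        while ready:
            cmd = ready.popleft()
            status, block = cmd.step()
            if status == WAIT:
                if block not in parked:
                    pending.append(block)
                parked.setdefault(block, []).append(cmd)
        # Nothing left to do: stand-in for waiting on one I/O completion.
        if pending:
            block = pending.popleft()
            memory[block] = disk[block]
            ready.extend(parked.pop(block))
```

A command whose data is already in memory finishes immediately, while
one that needs the disk is parked and resumed later, so a slow read
never stalls the other connections served by the same process.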