[Dovecot] Apple patch 10
Patch #10 allows the pop3 and imap mail processes to handle multiple
clients. We know this weakens the security model but it greatly
increases scalability especially when clients are idle.
Here's how it works. When there are no mail processes, or none
serving fewer than mail_max_connections clients, the master creates a
new mail process pretty much like it does now, but with some new
environment variables (PERSISTENT_MAIL_PROCESS and CONNECTION_ID, and
ADVISE_SET[GU]ID instead of RESTRICT_SET[GU]ID) and a unix-domain
socket connection back to the master. For future clients the master
sends the client socket and a full dump of the environment via that
connection. Persistent mail processes multiplex I/O for their
connected clients and switch the whole environment and the effective
user ID, group ID, and supplemental groups each time. The result is
that most of dovecot's code and assumptions remain unchanged: the
mail process executes imap/pop commands with the permissions of the
user and with nearly the same environment so getenv() still works as
expected. Some assumptions do change, for instance global state
variables like last_partial and quota_set are forbidden, and must be
stored per-client or per-user. Also, it's no longer OK just to exit or
panic on error, because that disconnects multiple users, not just one.
Some key entry points in the patch are create_mail_process() in
mail-process.c and io_env_switch() in ioloop.c.
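To illustrate the hand-off described above, here is a minimal sketch of passing an already-accepted client socket, together with a small payload, over a unix-domain socket using SCM_RIGHTS. The function names send_client_fd() and recv_client_fd() are hypothetical and the payload format is an assumption; the patch's actual wire protocol is not shown in this message.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_control {
	struct cmsghdr align;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Master side (hypothetical helper): send descriptor `fd` plus `len`
   bytes of payload over the unix socket `sock`. `len` must be at least 1,
   since ancillary data has to travel with some real data.
   Returns 0 on success, -1 on error. */
static int send_client_fd(int sock, int fd, const void *data, size_t len)
{
	struct msghdr msg;
	struct iovec iov;
	union fd_control ctl;
	struct cmsghdr *cmsg;

	memset(&msg, 0, sizeof(msg));
	memset(&ctl, 0, sizeof(ctl));
	iov.iov_base = (void *)data;
	iov.iov_len = len;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == (ssize_t)len ? 0 : -1;
}

/* Mail process side (hypothetical helper): receive the payload into
   `data` and return the passed descriptor, or -1 on error. The number
   of payload bytes read is stored in *len_r. */
static int recv_client_fd(int sock, void *data, size_t size, ssize_t *len_r)
{
	struct msghdr msg;
	struct iovec iov;
	union fd_control ctl;
	struct cmsghdr *cmsg;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = data;
	iov.iov_len = size;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	*len_r = recvmsg(sock, &msg, 0);
	if (*len_r <= 0)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In this sketch the master would call send_client_fd() on its connection to a persistent mail process, and the mail process would call recv_client_fd() when that connection becomes readable. A real implementation would also have to frame an environment dump that is larger than one read.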
Notes about this patch:
- The base for this patch is dovecot-1.1.7 + Apple patch 9, not
because this patch needs Open Directory but simply because it adds one
line to the Apple attributions in COPYING.
- This patch depends on Apple patch 7 (hash_table_create/destroy).
- This patch introduces the following new config options for the pop3
and imap protocols:
    - mail_process_per_connection = yes
    - mail_max_connections = 20
The default value of mail_process_per_connection (yes) preserves the
current secure model. Changing it to no allows multiplexing, with up
to mail_max_connections simultaneous clients per process. See
dovecot-example.conf for more info.
- It restricts the mail_executable config option to minimize
unpleasant surprises. When mail_process_per_connection = no the
communication protocol between the master process and the mail
processes changes as described above. A naive third-party
mail_executable might break. See dovecot-example.conf for more info.
- Some changes are intentionally awkward in order to minimize code
deltas to simplify merges (of new dovecot releases into our source
tree). These are so marked. Feel free to clean them up.
- Changes tagged with "APPLE" are ours, including the whole new
directory mail-common. All the untagged changes, including the entire
contents of the new files mail-user.[ch], are copied straight from
dovecot-1.2 (alpha3, I believe). We needed dovecot-1.2's support for
multiple users in order to handle quotas correctly, and we needed it
in 1.1. Specifically, we copied these changes from dovecot-1.2 (and a
couple others too I think):
    8082 http://hg.dovecot.org/dovecot-1.2/rev/db66611fd195
        Added struct mail_user and fixed the code to support multiple
        users per process.
    8084 http://hg.dovecot.org/dovecot-1.2/rev/f12f8c1da0bf
        Forgot to add mail-user.* files in previous struct mail_user
        commit.
    8085 http://hg.dovecot.org/dovecot-1.2/rev/bf83aa9c3f4a
        Removed pool parameter from mail_namespaces_init*(). Use
        mail_user's pool instead.
    8091 http://hg.dovecot.org/dovecot-1.2/rev/ceca59aaae89
        quota-fs: compile fix for previous changes.
    8096 http://hg.dovecot.org/dovecot-1.2/rev/f35a8a3dc06d
        Fixed FS quota compiling and Maildir++ quota with multiple users.
    8109 http://hg.dovecot.org/dovecot-1.2/rev/e7929190cd32
        fts-solr: Fixed compiling with recent struct mail_user changes.
    8137 http://hg.dovecot.org/dovecot-1.2/rev/b2a258213ee0
        Created mail_user_[try_]home_expand(). Used them for expanding
        mail directories.
    8294 http://hg.dovecot.org/dovecot-1.2/rev/8aa69e3d27ef
        Trash plugin: Assign storage to all mailboxes at startup so
        errors are caught immediately. Also previous optimization
        change broke trash plugin when using multiple mail_users. This
        change fixes it to work again.
If a change in the patch is not marked somehow with APPLE then it
comes from dovecot-1.2.
- Some other data structures also needed to be made per-client/user,
for instance last_partial from imap-fetch-body.c, quota_set from
quota-plugin.c, and process_primary_gid etc. from restrict-access.c.
- Sending the master process SIGINFO (on platforms that support that
signal) makes it print its view of all the clients connected to all
the mail processes. Sending SIGINFO to an individual mail process
makes it print its view. These are not necessarily always the same.
For instance, on a config reload (SIGHUP), the master disconnects from
the mail processes but leaves them running. The output is admittedly
kinda geeky.
- The logging tag for a persistent mail process is just "*" instead of
the user name, since a process can serve multiple users. Some
individual messages (such as connect/disconnect) identify the user.
- The expire plugin uses global state variables which are incompatible
with persistent mail processes. The patch detects trouble but does
not fix it because we don't need it. We leave that as an exercise for
you :).
- The dict client interface is blocking which is unfortunate for
persistent mail processes. The patch does not address this, since the
delay is probably small and we don't use it anyway.
- Some of the accounting is gnarly because of the need to honor
mail_max_userip_connections in a world where a single user can have
any number of active connections on any number of mail processes.
Finally, here's the patch.
On Tue, 2009-01-06 at 11:57 -0600, Mike Abbott wrote:
> Here's how it works. When there are no mail processes, or none
> serving fewer than mail_max_connections clients, the master creates a
> new mail process pretty much like it does now, but with some new
> environment variables (PERSISTENT_MAIL_PROCESS and CONNECTION_ID, and
> ADVISE_SET[GU]ID instead of RESTRICT_SET[GU]ID) and a unix-domain
> socket connection back to the master. For future clients the master
> sends the client socket and a full dump of the environment via that
> connection. Persistent mail processes multiplex I/O for their
> connected clients and switch the whole environment and the effective
> user ID, group ID, and supplemental groups each time.
Interesting. I hadn't thought of the possibility of doing this in ioloop. Eventually, perhaps in Dovecot v1.3, I hope to get rid of this configuration-in-environment and the environment switching can be dropped then.
Also I thought that your patch would only put the same user's connections to the same process. If done like that it possibly wouldn't even need environment switching and at least wouldn't need uid/gid switching. Also since it would be safer it could be done without a specific configuration option.
Perhaps mail_process_per_connection=yes would work like you've done currently and =no would still connect the same user to the same processes. Hmm. Actually even with =yes it should try to do that when possible.
> - Sending the master process SIGINFO (on platforms that support that
> signal) makes it print its view of all the clients connected to all
> the mail processes. Sending SIGINFO to an individual mail process
> makes it print its view. These are not necessarily always the same.
> For instance, on a config reload (SIGHUP), the master disconnects from
> the mail processes but leaves them running. The output is admittedly
> kinda geeky.
Maybe just SIGUSR2 that's available everywhere?
> - The logging tag for a persistent mail process is just "*" instead of
> the user name, since a process can serve multiple users. Some
> individual messages (such as connect/disconnect) identify the user.
Couldn't this also be done the same way as environment switching?
Actually the context switching details could probably be abstracted out of ioloop and just made into configurable callbacks where imap/pop3 would do the environment/etc. switch.
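A minimal sketch of that callback idea, under stated assumptions: the generic loop only calls an activate hook before a client's handler and a deactivate hook after it, and the imap/pop3 side supplies hooks that swap environment variables. All names here (io_context_callbacks, io_dispatch(), mail_env_activate() and so on) are hypothetical and are not Dovecot's ioloop API. The sketch switches only the environment; the patch also switches the effective user ID, group ID, and supplemental groups.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hook type: called with the client's opaque context. */
typedef void io_context_switch_t(void *context);

struct io_context_callbacks {
	io_context_switch_t *activate;   /* runs before the client's handler */
	io_context_switch_t *deactivate; /* runs after the handler returns */
};

struct io_client {
	void *context;                  /* opaque per-client state */
	void (*handler)(void *context); /* the client's I/O handler */
};

/* Generic loop code: knows nothing about users or environments. */
static void io_dispatch(const struct io_context_callbacks *cb,
			struct io_client *client)
{
	if (cb->activate != NULL)
		cb->activate(client->context);
	client->handler(client->context);
	if (cb->deactivate != NULL)
		cb->deactivate(client->context);
}

/* imap/pop3 side: per-client state instead of process-wide globals. */
struct mail_client {
	const char *user;
	const char *home;
	char seen_user[32]; /* what the handler observed via getenv() */
};

static void mail_env_activate(void *context)
{
	struct mail_client *c = context;

	setenv("USER", c->user, 1);
	setenv("HOME", c->home, 1);
}

static void mail_env_deactivate(void *context)
{
	(void)context;
	unsetenv("USER");
	unsetenv("HOME");
}

/* Stand-in for command execution: existing code keeps using getenv(). */
static void mail_handler(void *context)
{
	struct mail_client *c = context;
	const char *user = getenv("USER");

	snprintf(c->seen_user, sizeof(c->seen_user), "%s",
		 user != NULL ? user : "");
}
```

With hooks like these the loop stays generic, and a process that serves only one user could simply register no callbacks at all.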
> - The dict client interface is blocking which is unfortunate for
> persistent mail processes. The patch does not address this, since the
> delay is probably small and we don't use it anyway.
Even if it was non-blocking, its callers would eventually have to block because the calls always(?) come via lib-storage API, which is blocking in most places. Non-blocking lib-storage API would be nice in theory but I haven't figured out how to do it without making the API horrible to use.
And a small implementation issue: mail-processes.c:mail_connections should probably be an array instead of a hash table.
> Also I thought that your patch would only put the same user's
> connections to the same process.
Our aim was high scalability so we had to mix users on processes. We
are aware of the security implications.
I could see a trinary state for this: "off" for the current behavior,
"safe" for the behavior you describe, and "max" for the behavior in
this patch.
> Maybe just SIGUSR2 that's available everywhere?
Sure. We chose SIGINFO because libraries on some systems hijack
SIGUSR[12] for their own purposes and we didn't want to step on any
toes.
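One way to reconcile the two positions is to pick the signal at compile time. The sketch below is an assumption about how that could look, not code from the patch: it uses SIGINFO where the platform defines it and falls back to SIGUSR2 elsewhere, and the handler only sets a flag that the main loop polls. The names DUMP_SIGNAL, install_dump_signal() and dump_check_and_clear() are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Prefer SIGINFO where it exists (e.g. BSD-derived systems),
   otherwise fall back to SIGUSR2. */
#ifdef SIGINFO
#  define DUMP_SIGNAL SIGINFO
#else
#  define DUMP_SIGNAL SIGUSR2
#endif

static volatile sig_atomic_t dump_requested = 0;

/* Async-signal-safe handler: only records that a dump was requested. */
static void dump_signal_handler(int signo)
{
	(void)signo;
	dump_requested = 1;
}

/* Install the handler. Returns 0 on success, -1 on error. */
static int install_dump_signal(void)
{
	struct sigaction act;

	memset(&act, 0, sizeof(act));
	act.sa_handler = dump_signal_handler;
	sigemptyset(&act.sa_mask);
	return sigaction(DUMP_SIGNAL, &act, NULL);
}

/* Called from the main loop: returns 1 if a dump was pending and
   clears the request, 0 otherwise. The actual printing of the
   connection table would happen here, outside signal context. */
static int dump_check_and_clear(void)
{
	if (!dump_requested)
		return 0;
	dump_requested = 0;
	return 1;
}
```

Deferring the printing to the main loop avoids calling non-reentrant logging functions from inside the signal handler.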
>> - The logging tag for a persistent mail process is just "*" instead
>> of the user name
> Couldn't this also be done the same way as environment switching?
No because the master process adds the tag, not the mail process.
> mail-processes.c:mail_connections should probably be an array instead
> of a hash table.
Sure, either way. The hash table affords quick lookups when
mail_max_connections is large, but scanning an array for an int is
fast too.
On Jan 6, 2009, at 6:51 PM, Mike Abbott wrote:
>>> - The logging tag for a persistent mail process is just "*"
>>> instead of the user name
>> Couldn't this also be done the same way as environment switching?
> No because the master process adds the tag, not the mail process.
Ah, right. But that could be changed. I think login process also does
that, or at least I've thought that at some point it should do that.
So that with process_per_connection=yes the prefix can't be changed
but with =no it could be empty and the child process could set it at
will.
>> mail-processes.c:mail_connections should probably be an array instead
>> of a hash table.
> Sure, either way. The hash table affords quick lookups when
> mail_max_connections is large, but scanning an array for an int is
> fast too.
But the current code looks like it finds the first free connection_id
in any case, which can basically be thought of as the first available
connection index number. And that's somewhat faster to find in array
than in hash. And lookup for the index number is also at least as fast
in array. So I don't really see any advantages for hash here.
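The array argument can be sketched as follows, assuming the connection_id is used directly as the array index. The struct layout and the function names are hypothetical, and the fixed size of 20 simply mirrors the mail_max_connections value shown in the announcement.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CONNECTIONS 20 /* mirrors mail_max_connections = 20 */

/* Hypothetical per-connection record; the slot index is the id. */
struct mail_connection {
	int in_use;
	int fd;
};

static struct mail_connection connections[MAX_CONNECTIONS];

/* Claim the lowest free connection id for `fd`.
   Returns the id, or -1 if every slot is taken. */
static int connection_alloc(int fd)
{
	for (int id = 0; id < MAX_CONNECTIONS; id++) {
		if (!connections[id].in_use) {
			connections[id].in_use = 1;
			connections[id].fd = fd;
			return id;
		}
	}
	return -1;
}

/* Constant-time lookup by id. Returns NULL for unknown or free ids. */
static struct mail_connection *connection_lookup(int id)
{
	if (id < 0 || id >= MAX_CONNECTIONS || !connections[id].in_use)
		return NULL;
	return &connections[id];
}

/* Release a slot so the id can be reused. */
static void connection_free(int id)
{
	if (id >= 0 && id < MAX_CONNECTIONS)
		connections[id].in_use = 0;
}
```

Finding the first free id is a linear scan over at most mail_max_connections slots, and looking up an id is a single index operation, which is the point made above.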
Here is a feature I wish Apple might consider implementing:
When Dovecot is compiled for Mac OS X and using Maildir, have all mail files be written to disk with a dedicated OSType (equivalent to the .eml extension).
This would make existing QuickLook generators and Spotlight importers (e.g. I'm happily using http://www.mew.org/feature/spotlight.html.en , which works based on OSType) automatically available to index the Dovecot Maildir.
[I haven't looked, but could this be a matter of modifying a single write function in Dovecot?]
FZiegler
On Jan 6, 2009, at 5:51 PM, I wrote:
>> Also I thought that your patch would only put the same user's
>> connections to the same process.
> I could see a trinary state for this: "off" for the current
> behavior, "safe" for the behavior you describe, and "max" for the
> behavior in this patch.
FYI: I just implemented the trinary state, so mail processes can
share no, one user's, or many users' connections. For the next round
of patches, someday.
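For readers following along, the three states could drive process selection roughly as in this sketch. It is an illustration under assumptions, not the implementation mentioned above: the enum, the struct and mail_process_pick() are hypothetical names, and a return value of -1 stands for "the master should create a new mail process".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for the "off" / "safe" / "max" states. */
enum mail_share_mode {
	MAIL_SHARE_OFF,  /* one process per connection (current behavior) */
	MAIL_SHARE_SAFE, /* a process serves only one user's connections */
	MAIL_SHARE_MAX   /* a process may mix users (this patch) */
};

struct mail_process {
	const char *user;     /* user served; only consulted in SAFE mode */
	unsigned int clients; /* clients currently attached */
};

/* Return the index of an existing process that may take a new
   connection for `user`, or -1 if a new process is needed. */
static int mail_process_pick(const struct mail_process *procs, size_t count,
			     enum mail_share_mode mode,
			     unsigned int max_connections, const char *user)
{
	if (mode == MAIL_SHARE_OFF)
		return -1;
	for (size_t i = 0; i < count; i++) {
		/* Skip processes already at mail_max_connections. */
		if (procs[i].clients >= max_connections)
			continue;
		/* In SAFE mode, never mix users on one process. */
		if (mode == MAIL_SHARE_SAFE &&
		    strcmp(procs[i].user, user) != 0)
			continue;
		return (int)i;
	}
	return -1;
}
```

In "safe" mode a process never holds more than one user's connections, so it would not need to switch uid/gid between clients, which matches the point raised earlier in the thread.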
participants (3): fz.2003@klacto.net, Mike Abbott, Timo Sirainen