[Dovecot] Dovecot v2.2 plans

list at airstreamcomm.net list at airstreamcomm.net
Wed Feb 15 05:08:05 EET 2012


On Mon, 13 Feb 2012 13:47:06 +0200, Timo Sirainen <tss at iki.fi> wrote:
> Here's a list of things I've been thinking about implementing for
> Dovecot v2.2. Probably not all of them will make it, but I'm at least
> interested in working on these if I have time.
> 
> Previously I've mostly been working on things that different companies
> were paying me to work on. This is the first time I have my own company,
> but the prioritization still works pretty much the same way:
> 
>  - 1. priority: If your company is highly interested in getting
>  something implemented, we can do it as a project via my company. This
>  guarantees that you'll get the feature implemented in a way that
>  integrates well into your system.
>  - 2. priority: Companies who have bought a Dovecot support contract
>  can let me know what they're interested in getting implemented. It's
>  not a guarantee that it gets implemented, but it does affect my
>  priorities. :)
>  - 3. priority: Things other people want to get implemented.
> 
> There are also a lot of other things I have to spend my time on, which
> are before the 2. priority above. I guess we'll see how things work out.
> 
> dsync-based replication
> -----------------------
> 
> I'll write a separate post about this later. Besides, it's coming for
> Dovecot v2.1 so it's a bit off topic, but I thought I'd mention it
> anyway.
> 
> Shared mailbox improvements
> ---------------------------
> 
> Support for private flags for all mailbox formats:
> 
> namespace {
>   type = public
>   prefix = Public/
>   mail_location = mdbox:/var/vmail/public:PVTINDEX=~/mdbox/indexes-public
> }
> 
>  - dsync needs to be able to replicate the private flags as well as
>  shared flags.
>  - might as well add a common way for all mailbox formats to specify
>  which flags are shared and which aren't. $controldir/dovecot-flags
>  would say which is the default (private or shared) and what
>  flags/keywords are the opposite.
>  - easy way to configure shared mailboxes to be accessed via imapc
>  backend, which would allow easy shared mailbox accesses across servers
>  or simply between two system users in same server. (this may be tricky
>  to dsync.)
>  - global ACLs read from a single file supporting wildcards, instead of
>  multiple different files
>  - default ACLs for each namespace/storage root (maybe implemented using
>  the above..)
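
To make the dovecot-flags idea above concrete, here is a minimal Python sketch of the "default plus exceptions" rule. It is only an illustration: the function name and its parameters are hypothetical, the file format itself is not specified in the post, and the case-insensitive comparison is an assumption.

```python
def is_private_flag(flag, default_private, exceptions):
    """Decide whether a flag/keyword is private under a default-plus-exceptions rule.

    default_private: what unlisted flags are (True = private, False = shared).
    exceptions: flags/keywords that are the opposite of the default.
    Names are compared case-insensitively (an assumption of this sketch).
    """
    normalized = {name.lower() for name in exceptions}
    is_exception = flag.lower() in normalized
    # An exception flips the default; everything else keeps it.
    return default_private != is_exception


# Shared by default, but \Seen and $Label1 are per-user.
exceptions = ["\\Seen", "$Label1"]
print(is_private_flag("\\seen", False, exceptions))     # exception -> private
print(is_private_flag("\\Flagged", False, exceptions))  # default -> shared
```

With this shape a single file only has to list the minority case, whichever way the default goes.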
> 
> Metadata / annotations
> ----------------------
> 
> Add support for server, mailbox and mail annotations. These need to be
> dsyncable, so their changes need to be stored in various .log files:
> 
> 1. Per-server metadata. This is similar to subscriptions: Add changes to
> dovecot.mailbox.log file, with each entry name a hash of the metadata
> key that was changed.
> 
> 2. Per-mailbox metadata. Changes to this belong inside
> mailbox_transaction_context, which writes the changes to mailbox's
> dovecot.index.log files. Each log record contains a list of changed
> annotation keys. This gives each change a modseq, and also allows easily
> finding out what changes other clients have done, so if a client has
> done ENABLE METADATA Dovecot can easily push metadata changes to client
> by only reading the dovecot.index.log file.
> 
> 3. Per-mail metadata. This is pretty much equivalent to per-mailbox
> metadata, except changes are associated to specific message UIDs.
> 
> The permanent storage is in dict. The dict keys have components:
>  - priv/ vs. shared/ for specifying private vs. shared metadata
>  - server/ vs mailbox/<mailbox guid>/ vs. mail/<mailbox guid>/<uid>
>  - the metadata key name
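
As a rough sketch of how the three components listed above could combine into one dict key: the component order and the "/" joining are assumptions taken from the order of the list, and metadata_dict_key() is a hypothetical helper, not an existing Dovecot function.

```python
def metadata_dict_key(key_name, *, private, mailbox_guid=None, uid=None):
    """Build a dict key from visibility, scope and metadata key name.

    No mailbox_guid   -> server scope
    mailbox_guid only -> mailbox scope
    mailbox_guid+uid  -> mail scope
    """
    visibility = "priv" if private else "shared"
    if mailbox_guid is None:
        if uid is not None:
            raise ValueError("a mail UID needs a mailbox GUID")
        scope = "server"
    elif uid is None:
        scope = f"mailbox/{mailbox_guid}"
    else:
        scope = f"mail/{mailbox_guid}/{uid}"
    return f"{visibility}/{scope}/{key_name}"


print(metadata_dict_key("comment", private=True))
print(metadata_dict_key("comment", private=False, mailbox_guid="abc123"))
print(metadata_dict_key("comment", private=True, mailbox_guid="abc123", uid=42))
```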
> 
> This would be a good time to improve the dict configuration to allow
> things like:
>  - mixed backends for different hierarchies (e.g. priv/mailbox/* goes to
>  a file, while the rest goes to sql)
>  - allow sql dict to be used in more relational way, so that mail
>  annotations could be stored with tables: mailbox (id, guid) and
>  mail_annotation (mailbox_id, key, value), i.e. avoid duplicating the
>  guid everywhere.
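
The "mixed backends for different hierarchies" item could be read as longest-prefix routing on the dict key. A small Python sketch under that assumption follows; pick_backend() and the backend labels are hypothetical and do not reflect any real Dovecot configuration syntax.

```python
def pick_backend(key, routes, default):
    """Return the backend whose prefix is the longest match for key.

    routes maps a key prefix (e.g. "priv/mailbox/") to a backend label.
    Keys that match no prefix go to the default backend.
    """
    best_prefix = None
    for prefix in routes:
        if key.startswith(prefix):
            if best_prefix is None or len(prefix) > len(best_prefix):
                best_prefix = prefix
    return routes[best_prefix] if best_prefix is not None else default


# priv/mailbox/* goes to a file, while the rest goes to sql.
routes = {"priv/mailbox/": "file"}
print(pick_backend("priv/mailbox/abc123/comment", routes, "sql"))
print(pick_backend("shared/server/motd", routes, "sql"))
```

Longest-prefix matching keeps the rules order-independent, so a more specific hierarchy can override a broader one without reordering anything.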
> 
> Things to think through:
>  - How to handle quota? Probably needs to be different from regular mail
>  quota. Probably some per-user "metadata quota bytes" counter/limit.
>  - Dict lookups should be done asynchronously and prefetched as much as
>  possible. For per-mail annotation lookups mail_alloc() needs to include
>  a list of annotations that are wanted.
> 
> Configuration
> -------------
> 
> Copy all mail settings to namespaces, so it'll be possible to use
> per-namespace mailbox settings. Especially important for imapc_*
> settings, but can be useful for others as well. Those settings that
> aren't explicitly defined in the namespace will use the global defaults.
> (Should doveconf -a show all of these values, or simply the explicitly
> set values?)
> 
> Get rid of *.conf.ext files. Make everything part of dovecot.conf, so
> doveconf -n outputs ALL of the configuration. There are mainly 3 config
> files I'm thinking about: dict-sql, passdb/userdb sql, passdb/userdb
> ldap. The dict-sql is something I think needs a bigger redesign
> (mentioned above in "Metadata" section), but the sql/ldap auth configs
> could be merged. One way could be:
> 
> sql_db sqlmails {
>   # most settings from dovecot-sql.conf.ext, except for queries
>   driver = mysql
>   connect = ...
> }
> 
> ldap_db ldapmails {
>   # most settings from dovecot-ldap.conf.ext, except attributes/filters
> }
> 
> passdb {
>   driver = sql
>   db = sqlmails
>   sql_query = select password from users where username = '%u'
> }
> passdb {
>   driver = ldap
>   db = ldapmails
>   ldap_attributes {
>     password = %{ldap:userPassword}
>   }
>   ldap_filter = ...
> }
> 
> The sql_db {} and ldap_db {} would be generic enough to be used
> everywhere (e.g. dict-sql), not just for passdb/userdb.
> 
> Some problems:
>  - Similar to the per-namespace mail settings, doveconf -a would output
>  all sql_query, ldap_attributes, ldap_filter, etc. settings for all
>  passdbs/userdbs. Perhaps a similar solution?
>  - The database configs contain passwords, so they should be readable
>  only by root. This makes running dovecot-lda and maybe doveadm
>  difficult, since they fail at "permission denied" when trying to open
>  the config. There are probably only two solutions: a) The db configs
>  need to be !include_try'd or b) the configs can be world-readable, but
>  only passwords are placed to only-root-readable files by using
>  "password = <db.password"
> 
> IMAP state saving/restoring
> ---------------------------
> 
> IMAP connections are often long running. Problems with this:
> 
> 1. Currently each connection requires a separate process (at least to
> work reliably), which means each connection also uses quite a lot of
> memory even when they aren't doing anything for a long time.
> 2. Some clients don't handle lost connections very nicely. So Dovecot
> can't be upgraded without causing some user annoyance. Also in a cluster
> if you want to bring down one server, the connections have to be
> disconnected before they can be moved to another server.
> 
> If IMAP session state could be reliably saved and later restored to
> another process, both of the above problems could be avoided entirely.
> Typically when a connection is IDLEing there are really just 4 things
> that need to be remembered: username, selected mailbox name, its
> UIDVALIDITY and HIGHESTMODSEQ. With this information the IMAP session
> can be fully restored in another process without losing any state. So,
> what we could do is:
> 
> 1. When an IMAP connection has been IDLEing for a while (configurable
> initial time, could be dynamically adjusted):
>  - move the IMAP state and the connection fd to imap-idle process
>  - the old imap process is destroyed
>  - imap-idle process can handle lots of IMAP connections
>  - imap-idle process also uses inotify/etc. to watch for changes in the
>  specified mailbox
>  - if any mailbox changes happen or IMAP client sends a command, start
>  up a new imap process, restore the state and continue from where we
>  left off
>  - This could save quite a lot of memory at the expense of some CPU
>  usage
> 
> 2. Dovecot proxy <-> backend protocol could be improved to support
> moving connection to another backend. Possibly using a separate control
> connection to avoid making the proxying less efficient in normal
> operation.
> 
> 3. When restarting Dovecot, move all the connections to a process that
> keeps the connections open for a while. When Dovecot starts up, create
> imap processes back to the connections. This allows changing
> configuration for existing client connections (which sometimes may be
> bad! need to add checks against client-visible config conflicts),
> upgrading Dovecot, etc. without being visible to clients. The only
> problem is SSL connections: OpenSSL doesn't provide a way to
> save/restore state, so either you need to set shutdown_clients=no (and
> possibly keep some imap-login processes doing SSL proxying for a long
> time), or SSL connections need to be killed. Of course the SSL handling
> could be outsourced to some other software/hardware outside Dovecot.
> 
> The IMAP state saving isn't always easy. Initially it could be
> implemented only for the simple cases (which are a majority) and later
> extended to cover more.
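
For the simple IDLE case described above, the saved state is just the four values named in the post. A minimal Python sketch of saving and restoring such a record follows. The JSON encoding, the IdleState class and both function names are assumptions for illustration only; nothing here describes how Dovecot would actually serialize or hand over the state.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class IdleState:
    """The four values an IDLEing session needs to remember."""
    username: str
    mailbox: str
    uidvalidity: int
    highestmodseq: int


def save_state(state):
    """Serialize the state so it can be handed to another process."""
    return json.dumps(asdict(state), sort_keys=True).encode("utf-8")


def restore_state(blob):
    """Rebuild the state in the process that takes over the connection."""
    return IdleState(**json.loads(blob.decode("utf-8")))


original = IdleState("user@example.com", "INBOX", 1329271685, 1042)
restored = restore_state(save_state(original))
print(restored == original)
```

The connection fd itself would have to be passed separately (for example over a UNIX socket), which this sketch does not cover.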
> 
> IMAP extensions
> ---------------
> 
>  - CATENATE is already implemented by Stephan
>  - URLAUTH is also planned to be implemented, somewhat differently than
>  in Apple's patch. The idea is to create a separate imap-urlauth service
>  that provides extra security.
>  - NOTIFY extension could be implemented efficiently using mailbox list
>  indexes, which already exist in v2.1.
>  - FILTERS extension can be easily implemented once METADATA is
>  implemented
>  - There are also other missing extensions, but they're probably less
>  important: BINARY & URLAUTH=BINARY, CONVERT, CONTEXT=SORT,
>  CREATE-SPECIAL-USE, MULTISEARCH, UTF8=* and some i18n stuff.
> 
> Backups
> -------
> 
> Filesystem based backups have worked well enough with Dovecot in the
> past. But with new features like single instance storage it's becoming
> more difficult. There's no 100% consistent way to even get filesystem
> level backups with SIS enabled, because deleting both the message file
> and its attachment files can't be done atomically (although usually this
> isn't a real problem). Restoring SIS mails is more difficult though:
> first you need to restore the dbox mail files, then you need to figure
> out what attachment files from SIS need to be restored, and finally
> you'll need to do doveadm import to put them into their final
> destination.
> 
> I don't have much experience with backup software, but other people in
> my company do. The initial idea is to implement a Dovecot backup agent
> to one (commercial) backup software, which allows doing online backups
> and restoring mails one user/mailbox/mail at a time. I don't know the
> details yet how exactly this is going to be implemented, but the basic
> plan is probably to implement a "backup" mail storage backend, which is
> a PostgreSQL pg_dump-like flat file containing mails from all mailboxes.
> doveadm backup/import can then export/import this format via
> stdout/stdin. Incremental backups could possibly be done by giving a
> timestamp of previous backup run (I'm not sure about this yet).
> 
> Once I've managed to implement the first fully functional backup agent,
> it should become clearer how to implement it to other backup solutions.
> 
> Random things
> -------------
> 
>  - dovecot.index.cache file writing is too complex, should be simplified
>  - Enable auth_debug[_passwords]=yes on-the-fly for some specific
>  users/IPs via doveadm
>  - Optimize virtual mailboxes using mailbox list indexes. It wouldn't
>  anymore need to keep all the backend mailboxes' index files open.
>  - Would be nice to go forward with supporting key-value databases as
>  mail storage backends.

Timo,

I know you mentioned you would cover this in a coming post, but we were
curious what the new dsync replication will be capable of.  Would it
monitor changes to mailboxes and push automatic replication to the remote
mail store, and if this is the case could it be an N-way replication setup
in which any host in a cluster can participate in the replication?  Do you
consider this to be a high availability solution?

Thanks,

Michael



