Moving a large number of messages into an IMAP folder on a dovecot
server takes quite a long time. This is a popular operation for users
migrating their stored messages from their old mail servers through a
mail user agent, and it causes a poor first impression of dovecot.
I'd like your help to speed this up.
I have studied this problem in recent versions, including 1.1.[17-19]
and 1.2.[4-6] but I'll focus on 1.2.6, using Maildir++. It seems to
be caused by O(n^2) behavior in maintaining the UID list, where n is
the number of messages. The client selects a folder then sends
thousands of individual APPEND commands (no use of multiappend
unfortunately). These cause dovecot to behave as follows:
- For every other APPENDed message, dovecot appends the new UID to
the list quickly. No problem here, this is fast. - For every other other APPENDed message, dovecot scans the entire
UID list. This is an O(n) algorithm. Since it happens every n/2
times it causes O(n^2) behavior across n consecutive APPENDs.
More specifically, the uidlist->files hash table with all the UIDs, checking for
- APPEND a message to a folder that already contains 30,000
messages. maildir_uidlist_refresh_fast_init() takes the fast path. - APPEND a second message to the same folder.
maildir_uidlist_refresh_fast_init() takes the slow path: it populates
duplicates. At the end of the command maildir_uidlist_deinit()
destroys the hash table. Note how this APPEND takes much longer than
#1.
3. APPEND a third message to the same folder. Just like #1.
4. APPEND a fourth message to the same folder. Just like #2, and
since #2 destroyed the hash, it has to do all that work over again.
5. And so on.
What can be done to make maildir_uidlist_refresh_fast_init() choose
the fast path more often?