[Dovecot] Overly long email of miscellaneous Dovecot migration questions

Mark Moseley moseleymark at gmail.com
Wed Mar 17 21:15:18 EET 2010


First off, thanks for the reply. I appreciate it greatly!

On Tue, Mar 16, 2010 at 4:36 PM, Timo Sirainen <tss at iki.fi> wrote:
> On 17.3.2010, at 1.01, Mark Moseley wrote:
>
>> * Since Dovecot 2.0 seems like it's just around the corner, that's all
>> I've been testing, and indeed all I've even looked at.
>
> Yes, hopefully it's coming soon :)
>
>> * Our #1 main motivation for looking at Dovecot is relief for our
>> currently overtaxed NFS servers, mostly in the form of the index
>> files. Benchmarking dovecot looks great, even with the index files in
>> the maildir.
>
> Have you read the thread starting from http://dovecot.org/list/dovecot/2010-January/046106.html and spanning a month or so? That provides a good view of potential problems with NFS.

I'd read through that thread and subsequently forgot the implications.
Presumably the same pitfalls apply to 2.0? From the wiki and from the
thread, it sounds like this just affects index files. One thing I
didn't see in the thread (though it'd be easy to miss in a thread that
long) is whether performance suffered horribly with 'noac' on a
*separate* index mount (the mentions I can find where they tried
'noac', it was on the entire mail store, not a separate index mount)
-- though doing 'noac' on anything seems like a recipe for disaster.
Another extreme option might be to just keep indexes on local disk
only. Obviously a user hitting a second server wouldn't benefit from
the indexes already built on the first, and the second server would
have to update its own copy, but those updates seem somewhat minimal
-- in sane cases at least, where the number of
newer-than-the-index messages is reasonable. With aggressive load
balancer stickiness at maybe a /24 netblock level, I'd be able to keep
people localized for a little while (with no way around people
accessing the same mailbox from home and work). Am I underestimating
the costs of incremental index updates on local disks? In testing, it
seemed like dovecot was pretty conservative in disk accesses/reads
when updating. I'm hoping the 95%-of-the-time worst case will be
someone hitting the same mailbox from two locations, so they'd be
hitting at most two servers. With local index caches, I'm mildly sure
I could live with that (esp if I can talk my way into an SSD on each
one). Am I being too naive?
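
To make the /24 stickiness idea a bit more concrete, here's a rough
Python sketch of what I have in mind. It's only an illustration: the
backend names are made up, and a real load balancer would do the
hashing itself rather than run anything like this.

```python
import hashlib
import ipaddress

# Hypothetical backend names, purely for illustration.
BACKENDS = ["imap1", "imap2", "imap3", "imap4"]


def netblock(client_ip, prefix_len=24):
    """Return the network (a /24 by default) that contains client_ip."""
    return ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)


def pick_backend(client_ip, backends=BACKENDS):
    """Hash the client's netblock so the whole block maps to one backend."""
    block = netblock(client_ip)
    digest = hashlib.sha1(str(block).encode("ascii")).digest()
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]
```

Two clients in the same /24 always land on the same server, so that
server's local indexes stay warm. Home and work addresses usually
fall into different blocks and may map to different servers, which is
the two-server case mentioned above.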


>> * Exim: We currently deliver all of our mail via Exim on separate
>> servers. Our POP3/IMAP servers only do POP3/IMAP and the Exim mail
>> servers delivering to maildirs only do Exim. From what I've seen in
>> the docs and various threads, the best thing
>> to do in that case would be to use Exim's built-in maildir handling,
>> instead of using 'deliver'. That would be my preference anyway, but I
>> wanted to make sure I didn't misinterpret things.
>
> v2.0 also has an LMTP server, so you could deliver to Dovecot that way.
>
>> * Any problems running Courier POP3 and Dovecot IMAP for a while,
>> possibly Courier IMAP and Dovecot IMAP concurrently?
>
> Courier POP + Dovecot IMAP is fine. But concurrently running both POPs or both IMAPs is just going to cause trouble because of conflicting UIDs. You might be able to make both Dovecot and Courier use the same POP3 UIDLs, but I wouldn't really trust it.
>
> One possibility would be to just run the migration script on login, so users would migrate to Dovecot as they log in.

Good to know. I'll definitely limit everything IMAP to just dovecot
then (POP3 was a given). Looking back, I forgot about the fact that
folder subscriptions could get out of whack too, in a mixed IMAP
environment, if someone updated their subscriptions on one platform.


>> * Union mailboxes: I'm pretty sure in a fairly recent thread that Timo
>> said that something like a 'union' mailbox (at least with maildir)
>> wasn't possible.
>
> I also had a thought where you could do that if you wrote some scripts and such that copied the mails to the other storage and replaced the original file with a symlink. But that of course has some potential race conditions and other problems with either side of the symlink disappearing but the other not.
>
> Single-dbox would really be the best solution for this in the future. It's currently somewhat broken in the v2.0 tree, but it probably won't be too long until I start doing a migration from Maildir to dbox for an NFS system similar to yours.

My main concern with doing anything with symlinks is how easy it'd be
for users to inadvertently undo, by moving messages around or
deleting/recreating folders.


>> I tried messing with multiple 'private' namespaces
>> (i.e. a namespace called "ARCHIVE" with a "location" different than
>> the INBOX location, ideally placed on slower but denser NFS servers)
>> but even with 'hidden=no' and 'list=yes', only the main INBOX folders
>> would show up, so I'm guessing that's not going to work.
>
> You can create more namespaces, so I guess that was some kind of a configuration problem.

I'll play around with that more. So it should be possible to have more
than one private 'hidden=no; list=yes' namespace that'd show up in a
mail client's subscribable folders list?


>> That would be
>> a killer feature, to be able to serve an alternate namespace that
>> would show up in a mail client's subscribable list that could be on
>> slower storage than the main inbox (though I'm not sure a mail client
>> can even handle multiple namespaces).
>
> The clients don't need to be aware of namespaces. And you should be able to do that already. But do you think users would actually move their mails to there?

Good to know. As far as getting people to use it, I'll let marketing
worry about that one :)   Though one thought would be to set quotas on
the ARCHIVE namespace to be much higher than the regular INBOX one
(with the assumption of us returning regular INBOX quotas to a
reasonable level). Or alternatively, we could move unused mail there
and it'd still be accessible with the same mail account (though
obviously with user intervention to subscribe to the new folder). Or
we could start sticking Spam there. Offhand, before I start sifting
through source code, is Dovecot on a 32-bit system subject to a 32-bit
-- i.e. 2 gig -- limit on quota size, like Courier is?
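
For reference, this is the arithmetic behind the "2 gig" figure. It
assumes the limit comes from a quota byte count held in a signed
32-bit integer, which I haven't verified for either server; the
helper name is just mine.

```python
# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1
GIB = 1024**3


def fits_in_int32(quota_bytes):
    """True if a quota given in bytes fits in a signed 32-bit counter."""
    return 0 <= quota_bytes <= INT32_MAX


# A 1 GiB quota fits; a full 2 GiB is one byte too large.
small_ok = fits_in_int32(1 * GIB)
two_gib_ok = fits_in_int32(2 * GIB)
```

Any limit at or above 2 GiB, stored as bytes, would need a wider
type (or a coarser unit such as kilobytes) to be represented.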


>> * Any problems with keeping only quota limits in db and not current
>> quota numbers? Our limits come out of a SQL table but the current
>> counts just live in the maildir file.
>
> That's how most people do it.

Excellent, thanks


>> * Any problem with leaving the namespace in "Courier compatibility"
>> mode? I.e. in namespace 'private', leaving "prefix = INBOX.". With 4
>> million mailboxes, FAQs all over the place, and support reps trained
>> in a particular way of doing things in IMAP, it'd be hellish to try to
>> change the prefix (I know I could leave the courier namespace around
>> with 'hidden=yes' but retraining support staff is perhaps better left
>> to phase #2). Do I lose anything besides tidiness by not changing it
>> to "prefix =" as if I was deploying dovecot from scratch? Does it hurt
>> performance in any significant way? Benchmarking doesn't look any
>> different, so I'm guessing not.
>
> Keeping the INBOX. prefix for the initial migration is a good idea. Once everything has worked perfectly for a few months, moving to hidden=yes could be a good idea too. There are some clients that don't like the INBOX. prefix.

Good to know. Yeah, it'd be a good phase 2 thing. There'll already be
so many moving parts that it'd be good to do the optional ones later
on.


>> * One thing that threw me and might be good for a FAQ (unless it was
>> just me misconfiguring things) was when I started playing with putting
>> the index files in an alternate location. I was utterly perplexed why
>> it'd create the directories for the indexes but they'd be empty. Based
>> on their location and names when they're in the maildir, I was just
>> looking for the same dovecot.index* files right in the alternate
>> directory. It wasn't till I started strace'ing that I noticed that the
>> index files were indeed getting created but in a subdirectory called
>> .INBOX (which I'd missed because I was just doing a plain 'ls').
>
> I try to avoid having a FAQ. Instead I try to change Dovecot so that the question itself goes away. I haven't heard of people having this problem before, but one possible solution could be to just create a file in the directory containing some text that makes it clear. Then again, that is slightly bloaty too.

It's definitely a thing to just chalk up to me being a newbie and
probably isn't worth intervention. As far as a marker file on the same
level as .INBOX, I know for my part I'd rather save the 4 million
extra inodes instead of having that placeholder.


>> If we're going to have to live with
>> users complaining about a one-time redownload of just post-conversion
>> mail, I'll need to get started convincing the higher-ups that that's
>> life.
>
> Check what the new Courier POP3 UIDLs look like. If all of them are the maildir base filename, setting pop3_uidl_format=%f should make this conversion easy. Just run it through once to get old UIDLs converted and then you don't need to worry about it, because both Courier and Dovecot give the same UIDLs. But I never really understood how Courier assigns new UIDLs; sometimes it seems to use the filename and sometimes not.

So do you mean keep pop3_uidl_format=%f basically forever? That'd work
for me, but are there any big implications, performance-wise or other,
of not using the default "%08Xu%08Xv"? Does that require a
corresponding change to the conversion script? So I'd have Dovecot use
the Courier-style UIDLs, run the conversion to get the old UIDLs (i.e.
anything not using the Courier "%f", which is a fair number of them)
into dovecot-uidlist, and then not worry about any new mail, since
clients will see the same UIDLs (and dovecot will just update
dovecot-uidlist with those UIDLs the first time they log into
Dovecot's POP3)?
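
Just to check my understanding of the two formats, here's a small
Python sketch of what I believe each one produces. The assumptions
are mine: that %u is the IMAP UID, %v is the UIDVALIDITY, the "08X"
modifier means zero-padded 8-digit uppercase hex, and %f is the
maildir filename with everything from the ":" flags separator
onwards removed. The function names are hypothetical.

```python
def default_uidl(uid, uidvalidity):
    """Mimic "%08Xu%08Xv": UID, then UIDVALIDITY, each as 8 hex digits."""
    return f"{uid:08X}{uidvalidity:08X}"


def filename_uidl(maildir_filename):
    """Mimic "%f" as assumed above: drop the ":2,<flags>" suffix, if any."""
    return maildir_filename.split(":", 1)[0]


# The same message gets very different UIDLs under the two formats,
# which is why a client would redownload mail if the format changed.
uidl_default = default_uidl(1, 255)
uidl_filename = filename_uidl("1268859318.M1P2.host:2,S")
```

If that's right, the "%f" form doesn't change when a message is
flagged as seen, since only the part after the ":" changes then.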

A good portion of these are generated by our Courier 4.x, with lots
more still kicking around from Courier 3.x. The main wrinkle is that
we've acquired many many mailboxes from other companies running
different servers, so the filenames can be a mixed bag. I'll hope for
the best :)

