[Dovecot] some questions on dovecot or rather a mail system setup

Stan Hoeppner stan at hardwarefreak.com
Tue Oct 9 10:57:43 EEST 2012


On 10/8/2012 4:37 PM, Christoph Anton Mitterer wrote:

Here is the proper way to accomplish your goals, or at least the big ones.

> - I generally want to have _all_ mail (which is not sorted out because
> of being spam) to be archived at the local server.

http://www.postfix.org/postconf.5.html#always_bcc

> - But(!) I want to selectively keep (in addition) mail at the internet
> server.
> For example I may want to select the folder that contains all mail from
> some friend to be kept online completely.

See above.

> But I may want to decide that mailinglists keep only the last 10 days
> and/or 1000 messages of mail.

http://wiki2.dovecot.org/Plugins/Expire

The expire plugin does age-based deletion, but not deletion based on a
folder's message count.  You must use your MUA, TBird, for the latter.
It's far easier to configure this in TBird than in Dovecot config files.
You seem like the type who wants flexibility so you can change things
often, so use TBird to be happy here.
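
If you would rather script the retention rule than click through TBird,
something along these lines might do it over plain IMAP.  This is only a
sketch, not anything Dovecot ships: the function names, the defaults
(1000 messages, 10 days) and the choice to delete a message when EITHER
limit is exceeded are my assumptions based on what you described.

```python
from datetime import date, timedelta

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(d):
    """Format a date the way IMAP SEARCH expects it, e.g. 09-Oct-2012."""
    return "%02d-%s-%d" % (d.day, MONTHS[d.month - 1], d.year)


def uids_over_limit(uids, keep_last):
    """Return the oldest UIDs, leaving only the newest `keep_last`.

    UIDs only grow within a folder, so the smallest ones arrived first.
    """
    ordered = sorted(uids)
    surplus = len(ordered) - keep_last
    return ordered[:surplus] if surplus > 0 else []


def prune_folder(conn, folder, keep_last=1000, max_age_days=10, today=None):
    """Delete mail that is older than `max_age_days` OR beyond `keep_last`.

    `conn` is an already authenticated imaplib.IMAP4 or IMAP4_SSL object
    (hypothetical setup, not shown here).  Returns the expunged UIDs.
    """
    today = today or date.today()
    conn.select(folder)

    # Count rule: look at every UID currently in the folder.
    _, data = conn.uid("SEARCH", None, "ALL")
    all_uids = [int(u) for u in (data[0] or b"").split()]

    # Age rule: everything that arrived before the cutoff date.
    cutoff = imap_date(today - timedelta(days=max_age_days))
    _, data = conn.uid("SEARCH", None, "BEFORE", cutoff)
    old_uids = [int(u) for u in (data[0] or b"").split()]

    doomed = sorted(set(old_uids) | set(uids_over_limit(all_uids, keep_last)))
    if doomed:
        # One STORE for the whole set; very large folders may need batching.
        uid_set = ",".join(str(u) for u in doomed)
        conn.uid("STORE", uid_set, "+FLAGS", r"(\Deleted)")
        # Note: EXPUNGE also removes anything else already flagged \Deleted.
        conn.expunge()
    return doomed
```

Run it only against the colo/VPS account, never against the archive.
Folder names containing spaces would need to be quoted before being
passed to select().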

> - The idea is, that the local server regularly (when it is
> online/running) catches new mail from the internet server... and stores
> it in the archive.

This is not an option.  The system must be up and connected to the
internet 24x7x365.  It must have an MX record associated and a valid
domain, or a VPN tunnel and entries in both systems' hosts files, along
with a Postfix transport table, and other tweaks.

http://www.postfix.org/transport.5.html

If you refuse to run this "local server" 24x7x365 then you will have to
use a fetchmail based solution, which will not work well, and whose
configuration will prompt you to kill yourself.  I cannot help you with
any of that.

> - So apart from new mail that has not yet been read, that local archive
> contains always all mails that are also on the internet server... the
> latter may contain (for specific directories) the same, or just parts of.

No.  Mail arriving at the colo/VPS host is immediately sent to the
always_bcc address, an address and mailbox on your home server.  You
will create a duplicate IMAP folder structure on the home server by hand
in your MUA.  Once this is completed you will write individual user
sieve scripts that sort the mail into folders just as it is sorted on
the colo/VPS server.  Basically, the home server Dovecot IMAP config is
identical in structure to the colo/VPS setup; only the mailbox account
names differ.  Folder tree, folders, and sieve scripts are identical;
the retention policy is different.
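
If the folder tree is large, creating the duplicate by hand gets
tedious.  A script could copy the folder names from one account to the
other instead.  The following is a rough sketch under several
assumptions of mine: both servers use the same hierarchy separator,
folder names contain no quotes or backslashes, and `src` and `dst` are
imaplib connections you have already logged in.  The function names are
made up for illustration.

```python
import re

# Matches one LIST response line: (flags) "delimiter" name
_LIST_LINE = re.compile(
    rb'^\((?P<flags>[^)]*)\) (?P<delim>"(?:[^"\\]|\\.)*"|NIL) (?P<name>.+)$'
)


def folder_names(list_data):
    """Extract folder names from the data part of an imaplib list() reply."""
    names = []
    for line in list_data:
        if not isinstance(line, bytes):
            continue  # names sent as literals arrive as tuples; skipped here
        m = _LIST_LINE.match(line)
        if not m:
            continue
        name = m.group("name").decode("ascii")
        if name.startswith('"') and name.endswith('"'):
            name = name[1:-1]
        names.append(name)
    return names


def mirror_folders(src, dst):
    """Create on `dst` every folder that exists on `src` but not on `dst`.

    Returns the names that were created.  Only folders are touched, no
    mail is copied.
    """
    _, src_data = src.list()
    _, dst_data = dst.list()
    existing = set(folder_names(dst_data))
    created = []
    # Sorting puts parent folders ahead of their children.
    for name in sorted(folder_names(src_data)):
        if name.upper() == "INBOX" or name in existing:
            continue
        typ, _ = dst.create('"%s"' % name)
        if typ == "OK":
            created.append(name)
    return created
```

The sieve scripts still have to be copied over separately; this only
takes care of the empty folder tree.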

> - The MUAs will then have two imap accounts, one to the internet server
> and one to the local archive,... each one being usable, depending on
> where I am.

Yep.

> 
> 
> 1) This is where my first problem arises:
> How can I implement that mail flow, especially:
> - How do I secure that all mail is read from the internet server (i.e.
> that nothing is "forgotten")?

Done:  always_bcc

> - How do I make sure that no mails are retrieved twice (or more)? A
> problem which I often had with pop, when the mail client crashed during
> sync?

Done:  always_bcc

> - Further it must be secured, that when I delete something on the
> internet server, it is NOT deleted on the local server (on the next
> mail-fetching).... this is why I don't use the word "sync".

Done:  always_bcc

> a) One stupid solution would be, that I duplicate all mail on the online
> server,... one part is for staying online, one part is for being fetched
> to the local archive.

Done:  always_bcc

And yes that is stupid.

> As soon as it was fetched... that copy gets removed (always).
> That solution would give a clean and secured separation of both?
> b) I don't think offlineimap or any other caching-like solution is the
> right thing... especially as one must always fear that such a cache may
> be accidentally wiped.
> 
> Are there better solutions than (a)?

Yes.  Already done:  always_bcc

> 2) Problem would be already a refinement of a working solution for (1)
> (but obviously not when using (1).(a) ).
> When e.g. reply to or forward a mail using the online server,... and
> that mail had already been fetched,... can I make the flag synced?

No.  Your stated goal is that the local server is a mail archive put
into service due to limited space on your colo/VPS server.  An archive
is an archive, not a secondary online server.  It should only be
accessed, read only, when you want to search and read an old message.
And in fact, since this is an archive, you should implement the zlib
plugin with dbox so all this archived mail is compressed in real time.

Make up your mind.  You can't have it both ways.  I hear the iPhone5 can
do anything automatically, no setup.  Get one of those, problem solved. ;)

> 3) Is dovecot suitable for the local server?

Yes.  Probably more than any other IMAP server.

> - I couldn't use maildir locally, because I lose just too much space to
> the block fragmentation.

Maildir causes the least filesystem fragmentation.  You must be thinking
of mbox, which causes heavy fragmentation due to constant appends past
EOF.  As I said you need dbox.  One email per file, similar to maildir,
but better integration and performance with Dovecot.

> - I'd prefer not to use dbox (the thing that the indices are crucial
> scares me a bit off).

Are you designing/building this home server to be unreliable?  Does it
crash often?  If so, fix that problem and dbox is fine.  If you can't
make it reliable, use maildir, which has expendable indexes.

> a) When using mbox... is dovecot able to manage a really big folder
> hierarchy that basically ever keeps growing... with easily several 100k
> mails per folder... and that is in total already over 100GB?

You have 100K emails in a single Dovecot mbox file?  Or are you talking
about an IMAP folder in TB that has no email in it, but many more IMAP
folders whose combined email total is 100K?

If you're worried about dbox index corruption, then you should be far
more worried about mbox file corruption.  With mbox files that large I'm
surprised you've not hit it already.  This would suggest that the system
is pretty stable.

> - I would prefer to have fast full text search. Does dovecot provide
> this?

Yes.  The problem with speed is twofold:

1.  You must run full text searches often to keep the search indexes up
to date.  Wait a week between searches, after many new emails have been
added to the IMAP folder, and your search crawls, as the file contents
must be reindexed before the search starts.  So you need to have a
cron'd script that searches daily to keep the indexes up to date.

2.  The mailbox file formats that best avoid fragmentation also have the
slowest FTS times as the OS must open every file, 100K of them.  If you
use mbox or mdbox, you have far fewer files to open.  mbox has the
fastest FTS times of any format when indexes aren't fully up to date.
It's also the fastest when updating the indexes.  Your home server
probably has a single SATA disk.  mbox wins hands down for FTS due to
very low IOPS load on the disk.  The downside here is lack of good
compression support--once you compress an mbox file you can't add new
mail to it.  This is where mdbox with compression comes in handy.  With
your 100K emails declaration, I think you're best served by mdbox with
zlib compression.
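
The daily search from point 1 could look roughly like this.  It is a
sketch only: it assumes, as described above, that an ordinary IMAP body
search is enough to make the server bring its index up to date, and the
function name and the throwaway search term are made up.  `conn` is an
imaplib connection you have already logged in.

```python
def warm_search_indexes(conn, folders, term="warmup"):
    """Run one BODY search per folder so the server refreshes its index.

    Returns a dict mapping each folder to the number of hits, or to None
    if the folder could not be opened or searched.  The hits themselves
    do not matter; the search is run for its side effect.
    """
    results = {}
    for folder in folders:
        # Read-only select so the job never changes any flags.
        typ, _ = conn.select(folder, readonly=True)
        if typ != "OK":
            results[folder] = None
            continue
        typ, data = conn.search(None, "BODY", term)
        if typ != "OK":
            results[folder] = None
            continue
        results[folder] = len((data[0] or b"").split())
    return results
```

Wrap it in a small script that logs in to the archive account and put
that script in cron to run once a day, at a time when the box is
otherwise idle.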

> I was looking into database backed mail systems (again,... just for the
> local archive)... namely dbmail and archiveopteryx (are there other open
> source solutions?)...
> Not sure which of the two... or whether it's a good idea at all.
> I remember some dovecot wiki page that showed a comparison which said
> that both do not perfectly implement imap.
> 
> Any suggestions with respect to that?

If you're worried about fragmentation, or performance, I'd steer clear
of a database driven mail store.


Please, please, do not reply to each of my points here, and do not make
this thread 100 replies.  I'm not here to hold your hand.  I don't have
the time (nor patience) to engage in these lengthy emails.  I gave you
the architectural overview to build the correct solution to your
problem.  It's up to you to choose to use it or not, and if so, to do
your own homework and self education, asking here only if something is
unclear to you.

In closing, you need real time bcc delivery which solves a ton of your
mentioned problems.  I'm not open to debating the merits of this.  If
you're not willing to meet the requirements for always_bcc, and you're
determined to power the home server down most of the time, then you need
assistance from someone else, as I simply have never used fetchmail,
period, and have no idea if it can meet your needs.  My guess is no,
simply because, AFAIK, it doesn't work with LDA, which means you can't
use sieve scripts and Dovecot's automatic sorting and indexing.

Good luck.

-- 
Stan


