POP/IMAP HA solution for 5k users

Sinergizmas Sin ergizmas sinergizmas at gmail.com
Thu Aug 28 13:29:14 UTC 2014


Hello, I want to build a simple, cheap HA solution for 5000 users with
postfix/dovecot. Each user will have a 1.3 GB mailbox (maildir) storage
quota. All of the users will use POP, and about half of them will also use
IMAP from smartphones. Sometimes they will use webmail (SquirrelMail). I
expect every user to send 20 and receive 20 mails per day.
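
As a rough check of the load these numbers imply, the daily volume can be
turned into an average rate. This is only a back-of-the-envelope sketch;
the assumption that traffic is spread evenly over 24 hours is not part of
the plan, and real mail traffic will peak during working hours.

```python
# Back-of-the-envelope message rate from the figures above.
USERS = 5000
SENT_PER_USER_PER_DAY = 20
RECEIVED_PER_USER_PER_DAY = 20

messages_per_day = USERS * (SENT_PER_USER_PER_DAY + RECEIVED_PER_USER_PER_DAY)

# Assumption (not from the plan): traffic is spread evenly over 24 hours.
SECONDS_PER_DAY = 24 * 60 * 60
average_per_second = messages_per_day / SECONDS_PER_DAY

print(f"{messages_per_day} messages/day, "
      f"about {average_per_second:.1f} per second on average")
```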

I have 3 locations, but the electricity is not very stable there. We have
outages of a couple of hours 3-4 times per year. So I plan to distribute
all servers (they are more like good PCs) across the locations. I'm also on
a very tight budget. Or maybe I should say I don't have one, and I'm forced
to use the hardware I am given. We are talking about quite a poor HA
solution here. Please don't judge me.

I have started to plan the architecture and would be very thankful for all
your thoughts. I'm not an expert in this area, but I need to learn, set up
the email system and manage it.


I will have an SMTP server (smtp.example.com) which will filter mail with
ClamAV, SpamAssassin and greylisting, and forward emails to the
IMAP/POP3/webmail server (mail.example.com). mail.example.com will sync
all user and mail changes to the third server, mail2.example.com.
mail2.example.com is just a standby, hot backup server in case the main
mail server or the SMTP server is not reachable.

The servers are located in different cities:

Location #1 - SMTP servers as virtual machines (VMware Server or
VirtualBox): CPU i3, 6GB RAM, SSD, CentOS 6, Virtualmin

Location #2 - Dovecot/webmail server: CPU i5, 16GB RAM, disks: 1 SSD for
OS/dovecot indexes and 4x2TB SATA in LVM (total 8TB) for Maildirs,
CentOS 6, Virtualmin

Location #3 - HOT BACKUP (always online) server in case of
mail.example.com or smtp.example.com failure, electricity outage, etc.
Dovecot/webmail server: CPU i3, 8GB RAM, disks: 1 SSD for OS/dovecot
indexes and 2x4TB SATA in LVM (total 8TB), CentOS 6, Virtualmin
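
To see how the planned quota compares with the 8TB volumes, here is a
small worked calculation. It is a sketch only: it assumes decimal units
(1 TB = 1000 GB), ignores filesystem overhead, and takes the worst case in
which every user fills the whole quota.

```python
# Worst-case mailbox storage versus the 8TB LVM volume on each mail server.
USERS = 5000
QUOTA_GB = 1.3
VOLUME_TB = 8  # 4 x 2TB at location #2, 2 x 4TB at location #3

# Assumptions: 1 TB = 1000 GB, no filesystem overhead, every quota full.
full_quota_tb = USERS * QUOTA_GB / 1000
headroom_tb = VOLUME_TB - full_quota_tb

print(f"full quotas: {full_quota_tb:.1f} TB, headroom: {headroom_tb:.1f} TB")
```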


DNS configuration (and location number):

IN  MX  10  smtp.example.com.   (#1)
IN  MX  20  mail.example.com.     (#2)
IN  MX  30  mail2.example.com.   (#3)
....
smtp  IN  A       aa.aa.aa.aa
mail  IN  A       bb.bb.bb.bb
mail2 IN  A       cc.cc.cc.cc


So I need to solve high availability for SMTP, POP and IMAP.

I think SMTP will be OK for external senders, because all three servers
accept SMTP.
If smtp.example.com is offline, mail is sent to mail.example.com.
If mail.example.com is offline, mail is sent to mail2.example.com.
If mail2.example.com is offline too, the sender's SMTP server will hold the
email in its queue (in most cases for 3 days by default) and will retry
delivery at intervals.
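
The fallback order described above can be sketched as follows. This is a
simplified model of what a sending server does with the three MX hosts, not
the behaviour of any particular MTA; the function name `pick_mx` and the
`reachable` set are made up for the illustration.

```python
from typing import Iterable, Optional, Set, Tuple

# (preference, host) pairs for the three MX hosts; lower preference first.
MX_RECORDS = [
    (10, "smtp.example.com"),
    (20, "mail.example.com"),
    (30, "mail2.example.com"),
]

def pick_mx(records: Iterable[Tuple[int, str]],
            reachable: Set[str]) -> Optional[str]:
    """Return the reachable host with the lowest preference value.

    None means no host answered, so the sender keeps the message
    in its queue and retries later.
    """
    for _, host in sorted(records):
        if host in reachable:
            return host
    return None

# Example: location #1 is without power, the other two are up.
print(pick_mx(MX_RECORDS, {"mail.example.com", "mail2.example.com"}))
```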

Local users (users of the domain example.com) of SMTP/POP/IMAP will have
problems, because their server will be dead. Connections from user mail
clients (e.g. Outlook over POP) will fail because the DNS A record points
to a dead server.

When there is a connectivity problem with mail.example.com
(POP/IMAP/webmail), I plan to log in to the DNS server and point the A
record to a working server. So if mail.example.com in location #2 is dead,
I will change
mail  IN  A       bb.bb.bb.bb  ----> cc.cc.cc.cc
The same goes for smtp.example.com. If it's down, I will change
smtp  IN  A       aa.aa.aa.aa  ----> cc.cc.cc.cc

I know that this method is not perfect because of ISP DNS caching around
the globe, but this is the only option I have. Most of my users use ISPs
that refresh DNS within 30 minutes, so waiting half an hour for the
connection won't be too terrible.
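
The decision behind the manual A record switch could be sketched like this.
It only checks whether the POP3 and IMAP ports answer and says which address
the record should point to; it does not touch DNS, since how the record is
updated depends on the DNS server in use. The function names and the choice
of ports 110 and 143 are assumptions for the illustration.

```python
import socket

def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def choose_a_record(primary_ip: str, backup_ip: str,
                    ports=(110, 143), is_open=port_open) -> str:
    """Return the IP the A record should point to.

    The primary is kept only if every checked port answers;
    otherwise the backup address is returned.
    """
    if all(is_open(primary_ip, port) for port in ports):
        return primary_ip
    return backup_ip
```

Passing a different `is_open` function makes it possible to try the logic
without real servers, for example
`choose_a_record("bb.bb.bb.bb", "cc.cc.cc.cc", is_open=lambda h, p: False)`.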

My questions:

1. Is this infrastructure OK for 5000 users? If you think it's not,
please tell me how many users it will serve "normally". Or maybe it could
even handle a load of 10 000 users?

2. Is the logic of the whole setup right for a situation like mine? What
other problems could arise?

3. I prefer to create users from one server. So is LDAP the best option
for me?
With LDAP, it would run on server #2 and I would need to replicate it to
#3. Is that right? When server #2 is offline, will the LDAP on #3 take
over? Or would it be better to use a batch command that creates the user on
each server in a single step? Actually, I do not want to manage LDAP as an
extra service. I don't have experience with it.

4. The main problem is synchronizing the maildirs of the POP/IMAP/webmail
servers. Because of the distance I do not want to use DRBD. It's not
suitable for WAN connections.
GlusterFS seems to be an option, but I have seen users on the internet
reporting problems with its performance. Sometimes Gluster hangs and the
whole cluster must be restarted. Maybe I can sync /home, where the maildirs
reside, with Gluster, but we also need to take care of synchronizing the
dovecot index and control files. The performance of retrieving indexes by
several thousand users every 2-5 minutes must be excellent.
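
To put a number on that polling load, here is a rough calculation. It
assumes, as a worst case that is not stated in the plan, that all 5000
users have a client checking mail at a fixed interval and that the checks
are spread evenly over time.

```python
# Rough mail-check rate if every user polls at a fixed interval.
USERS = 5000  # assumption: all users have a client polling

def polls_per_second(users: int, interval_minutes: float) -> float:
    """Average mail checks per second, spread evenly over the interval."""
    return users / (interval_minutes * 60)

for minutes in (2, 5):
    rate = polls_per_second(USERS, minutes)
    print(f"every {minutes} min: about {rate:.1f} checks per second")
```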

The only option I found is Dovecot's dsync command (doveadm sync in newer
versions). I made initial tests on two servers, but I only got it working
over SSH. The method using a TCP connection doesn't work for me. So my
question: if I use the SSH method in a system with several thousand users,
will it cause me trouble? SSH adds some overhead compared to TCP. But I
could make a permanent tunnel, and then TCP would also be quite secure.
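
For reference, the per-user SSH invocation can be sketched as below. The
command form follows the example in the doveadm-sync manual page for
Dovecot 2.2 as far as I know; older versions use a separate dsync command
with different syntax, so check the installed version first. The remote
account name `vmail` and the helper name are assumptions. The sketch only
builds and prints the command, it does not run it.

```python
import shlex
from typing import List

def dsync_over_ssh(user: str, remote_host: str,
                   ssh_user: str = "vmail") -> List[str]:
    """Build a two-way 'doveadm sync' command that runs over SSH.

    ssh_user is a hypothetical account on the remote server that is
    allowed to run doveadm for the given mail user.
    """
    return [
        "doveadm", "sync", "-u", user,
        "ssh", f"{ssh_user}@{remote_host}",
        "doveadm", "dsync-server", "-u", user,
    ]

# Print the command for one mailbox (requires Python 3.8+ for shlex.join).
print(shlex.join(dsync_over_ssh("user1@example.com", "mail2.example.com")))
```

With several thousand users, each run opens its own SSH connection, which
is where the overhead mentioned above would come from.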

So please tell me what methods you are using to sync two mail servers
(especially for POP accounts) in different locations.

5. File system. Is it OK to use the ext4 file system? I know that XFS
performs better with small files (maildir = a lot of small files), but in a
power outage it can become corrupted without a journal.

Thanks.

