Hello, I want to build a simple, cheap HA solution for 5000 users with Postfix/Dovecot. Each user will have a 1.3 GB mailbox (Maildir) storage quota. All of the users will use POP, and about half of them will also use IMAP from smartphones. Occasionally they will use webmail (SquirrelMail). I expect every user to send 20 and receive 20 mails per day.
I have 3 locations, but the electricity is not very stable there: we have outages lasting a couple of hours 3-4 times per year. So I plan to distribute all servers (they are more like good PCs) across the locations. I am also on a very tight budget. Or rather, I don't have one at all and am forced to use the hardware I have been given. We are talking about a quite poor HA solution here. Please don't judge me.
I have started planning the architecture and would be very thankful for all your thoughts. I'm not an expert in this area, but I need to learn, bring up the email system and manage it.
I will have an SMTP server (smtp.example.com) which will filter mail with ClamAV, SpamAssassin and greylisting, and forward emails to the IMAP/POP3/webmail server (mail.example.com). mail.example.com will sync all user and mail changes to a third server, mail2.example.com. mail2.example.com is just a standby, hot backup server for cases when the main mail server or the SMTP server is not reachable.
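To make the sizing concrete, here is a rough back-of-the-envelope calculation from the numbers above. It is only a sketch: it assumes decimal units (1 TB = 1000 GB), that every mailbox could fill up to its quota, and that traffic is spread evenly over the day, which real mail traffic is not. The function names are made up for illustration.

```python
# Back-of-the-envelope sizing from the numbers in the post.
USERS = 5000
QUOTA_GB = 1.3          # per-user Maildir quota
SENT_PER_DAY = 20
RECEIVED_PER_DAY = 20
SECONDS_PER_DAY = 24 * 60 * 60


def worst_case_storage_tb(users, quota_gb):
    """Storage needed if every mailbox is filled to its quota (decimal TB)."""
    return users * quota_gb / 1000


def messages_per_day(users, sent, received):
    """Total messages handled per day, counting sent and received separately."""
    return users * (sent + received)


def average_messages_per_second(per_day):
    """Daily total spread evenly over 24 hours (peaks will be higher)."""
    return per_day / SECONDS_PER_DAY


storage_tb = worst_case_storage_tb(USERS, QUOTA_GB)
per_day = messages_per_day(USERS, SENT_PER_DAY, RECEIVED_PER_DAY)
per_second = average_messages_per_second(per_day)

print(f"worst-case storage: {storage_tb:.1f} TB")
print(f"messages per day:   {per_day}")
print(f"average rate:       {per_second:.2f} msg/s")
```

The worst case comes close to the 8 TB volumes planned for the mail servers, so there is little headroom left for indexes, file system overhead and growth if the quotas are really used.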
Locations of servers are in different cities:
Location #1 - SMTP servers as virtual machines (VMware Server or VirtualBox), with CPU i3, 6 GB RAM, SSD, CentOS 6, Virtualmin
Location #2 - Dovecot/webmail server: CPU i5, 16 GB RAM, 1 SSD for OS/Dovecot indexes and 4 x 2 TB SATA in LVM (8 TB total) for Maildirs, CentOS 6, Virtualmin
Location #3 - HOT BACKUP (always online) server in case of mail.example.com or smtp.example.com failure, electricity outage, etc. It contains a Dovecot/webmail server: CPU i3, 8 GB RAM, 1 SSD for OS/Dovecot indexes and 2 x 4 TB SATA in LVM (8 TB total), CentOS 6, Virtualmin
DNS configuration (and location number):
IN MX 10 smtp.example.com. (#1)
IN MX 20 mail.example.com. (#2)
IN MX 30 mail2.example.com. (#3)
....
smtp IN A aa.aa.aa.aa
mail IN A bb.bb.bb.bb
mail2 IN A cc.cc.cc.cc
So I need to solve high availability for SMTP, POP and IMAP.
SMTP should be fine for external senders, I think, because all three servers accept SMTP. If smtp.example.com is offline, mail is sent to mail.example.com. If mail.example.com is offline, mail is sent to mail2.example.com. If mail2.example.com is offline too, the sender's SMTP server will hold the email in its queue (by default for 3 days in most cases) and retry delivery at intervals.
Local users (users of the domain example.com) will have problems with SMTP/POP/IMAP, because the server their clients point to will be dead. Connections from user mail clients (for example Outlook over POP) will fail because the DNS A record points to the failed server.
When there is a connectivity problem with mail.example.com (POP/IMAP/webmail), I plan to log in to the DNS server and point the A record to a working server. So if the server in location #2 (mail.example.com) is dead, I will change mail IN A bb.bb.bb.bb ----> cc.cc.cc.cc. The same goes for smtp.example.com: if it is down, I will change smtp IN A aa.aa.aa.aa ----> cc.cc.cc.cc.
I know that this method is not perfect because of ISP DNS caching around the globe, but this is the only option I have. Most of my users are with ISPs that refresh DNS within 30 minutes, so waiting half an hour for the connection to come back won't be too terrible.
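The fallback order I expect from sending servers can be sketched like this. It is a simplified model, not how any particular MTA is implemented: it just tries the hosts in ascending MX preference and returns nothing when all are down, which stands for "the sender keeps the mail in its queue". The function name and the reachability check are made up for illustration.

```python
def pick_delivery_host(mx_records, is_reachable):
    """Return the first reachable host in ascending MX preference order.

    mx_records   -- list of (preference, hostname) tuples
    is_reachable -- callable taking a hostname and returning True/False
    Returns None when no host is reachable (the sender would queue and retry).
    """
    for _preference, host in sorted(mx_records):
        if is_reachable(host):
            return host
    return None


# MX records from the DNS configuration above.
MX = [
    (10, "smtp.example.com"),
    (20, "mail.example.com"),
    (30, "mail2.example.com"),
]

# Simulate location #1 being without power.
down = {"smtp.example.com"}
print(pick_delivery_host(MX, lambda host: host not in down))
```

This covers only mail coming in from outside. My own users' clients connect to a fixed host name, so they do not get this fallback.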
My questions:
Is this infrastructure OK for 5000 users? If you think it is not, please tell me how many users it could serve "normally". Or could it even handle a load of 10,000 users?
Is the logic of the whole setup right for a situation like mine? What other problems could arise?
I would prefer to create users from one server. So is LDAP the best option for me? With LDAP, it would run on server #2 and I would need to replicate it to #3. Is that right? When server #2 is offline, will the LDAP on #3 take over? Or would it be better to use a batch command that creates the user on each server in a single step? Actually, I do not want to manage LDAP as an extra service. I don't have experience with it.
The main problem is synchronizing the Maildirs of the POP/IMAP/webmail servers. Because of the distance I do not want to use DRBD; it is not suitable for WAN connections. GlusterFS seems like an option, but I have seen reports online of users having performance problems with it. Sometimes Gluster hangs and the whole cluster must be restarted. Maybe I could sync /home, where the Maildirs reside, with Gluster, but we would also need to take care of synchronizing the Dovecot index and control files. Index retrieval by several thousand users every 2-5 minutes must perform excellently.
The only option I have found is the Dovecot dsync command (doveadm in newer versions). I ran initial tests on two servers, but only got it working over SSH. The TCP connection method doesn't work for me. So my question is: if I use the SSH method on a system with several thousand users, will it cause me trouble? SSH adds some overhead compared to plain TCP. But I could set up a permanent tunnel, and then TCP would also be reasonably secure.
So please tell me what methods you use to sync two mail servers (especially for POP accounts) in different locations.
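For reference, this is roughly how I imagine driving the SSH method for many users. It is only a sketch under assumptions: the command line follows the doveadm sync over SSH form as I understand it from the Dovecot 2.2 documentation (older versions use a separate dsync command with different syntax, so check the man page of the installed version), and the `vmail` account, the key path and the function names are hypothetical.

```python
import shlex


def build_sync_command(user, remote_host,
                       ssh_user="vmail",                    # hypothetical account
                       ssh_key="/home/vmail/.ssh/id_rsa"):  # hypothetical key path
    """Build a two-way doveadm sync command for one user over SSH.

    Assumes the Dovecot 2.2-style syntax:
      doveadm sync -u USER ssh ... doveadm dsync-server -u USER
    """
    return [
        "doveadm", "sync", "-u", user,
        "ssh", "-i", ssh_key, f"{ssh_user}@{remote_host}",
        "doveadm", "dsync-server", "-u", user,
    ]


def sync_all(users, remote_host, run):
    """Sync users one at a time; return the users whose sync failed.

    run -- callable that executes a command list and returns its exit code,
           so the loop can be tested without touching a real server.
    """
    failed = []
    for user in users:
        if run(build_sync_command(user, remote_host)) != 0:
            failed.append(user)
    return failed


# Dry run: only print the commands that would be executed.
for name in ["alice@example.com", "bob@example.com"]:
    print(shlex.join(build_sync_command(name, "mail2.example.com")))
```

To run it for real, `run` could be `subprocess.call`. Each user then costs one SSH connection per pass, so with several thousand users it is the number of SSH handshakes per sync interval that I am worried about.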
- File system. Is it OK to use the ext4 file system? As far as I know, XFS performs better with small files (Maildir = a lot of small files), but I have read that it can become corrupted after a power outage.
Thanks.