[Dovecot] Dovecot design-question
Hi everybody,
we're currently in the process of drafting our new mailserver-setup. Instead of a single-server-setup we'd like to have two equal servers behind a loadbalancer like LVS and shared mailhomes on NFS.
We'd like to use dovecot for POP/IMAP, dovecot-deliver as LDA.
It's probably the best idea to direct SMTP and POP/IMAP always to the same server behind the loadbalancer (because dovecot-deliver is used which updates indexes?)
If we think of a "active/passive" setup: dovecot index-files locally or on the nfs-share?
Is it possible to activate loadbalancing for SMTP in this environment if dovecot index-files are held locally on each machine (POP/IMAP is directed only to one of the two servers?)
Any comments welcome :-) Werner
Hi,
- It's probably the best idea to direct SMTP and POP/IMAP always to the same server behind the loadbalancer (because dovecot-deliver is used which updates indexes?)
like mentioned on the website, but "would it be possible without nasty errors" ?
- If we think of a "active/passive" setup: dovecot index-files locally or on the nfs-share?
what's the better approach here?
thank you, Werner
On Thu, 2010-02-04 at 17:19 +0100, Werner wrote:
Hi,
- It's probably the best idea to direct SMTP and POP/IMAP always to the same server behind the loadbalancer (because dovecot-deliver is used which updates indexes?)
like mentioned on the website, but "would it be possible without nasty errors" ?
If both servers access index files at the same time, it's going to cause errors.
- If we think of a "active/passive" setup: dovecot index-files locally or on the nfs-share?
what's the better approach here?
active/passive is guaranteed to work at least. Or any setup where a user's indexes aren't accessed at the same time by two machines.
Hi,
- It's probably the best idea to direct SMTP and POP/IMAP always to the same server behind the loadbalancer (because dovecot-deliver is used which updates indexes?)
like mentioned on the website, but "would it be possible without nasty errors" ?
If both servers access index files at the same time, it's going to cause errors.
I'm currently running tests with postal for concurrent delivery via dovecot-deliver to a mailbox on an NFS share. So far I have not noticed any problems when incoming SMTP is directed via LVS to two mail servers that deliver messages in parallel with dovecot-deliver to the user's mailbox and update Dovecot's index files (btw. Dovecot 1.2.10 in use, mail_nfs_storage = yes, mail_nfs_index = yes).
The only thing I've found so far:
Feb 16 17:33:46 cmx2 postfix/pipe[24221]: DD3F118A22A: to=<werner@example.com>, relay=dovecot, delay=3.4, delays=0.75/0/0/2.6, dsn=4.3.0, status=deferred (temporary failure. Command output: Internal error occurred. Refer to server log for more information. [2010-02-16 17:33:43])
This message was delivered in the next run to the users mailbox.
active/passive is guaranteed to work at least. Or any setup where a user's indexes aren't accessed at the same time by two machines.
We'd like to direct IMAP/IMAPS/POP/POPS/HTTP/HTTPS to one machine only, but SMTP should be directed via LVS to both servers in parallel. Based on the tests mentioned above, I assume this will work.
Any notes/annotations from you guys ? Why don't I experience NFS/Index-issues?
Regards, Werner
On 16.2.2010, at 18.49, Werner wrote:
I'm currently running tests with postal for concurrent delivery via dovecot-deliver to a mailbox on an NFS share. So far I have not noticed any problems when incoming SMTP is directed via LVS to two mail servers that deliver messages in parallel with dovecot-deliver to the user's mailbox and update Dovecot's index files (btw. Dovecot 1.2.10 in use, mail_nfs_storage = yes, mail_nfs_index = yes).
How heavily were you stress testing it?
The only thing I've found so far:
Feb 16 17:33:46 cmx2 postfix/pipe[24221]: DD3F118A22A: to=<werner@example.com>, relay=dovecot, delay=3.4, delays=0.75/0/0/2.6, dsn=4.3.0, status=deferred (temporary failure. Command output: Internal error occurred. Refer to server log for more information. [2010-02-16 17:33:43])
That looks like exactly the kind of error I was talking about. Looking at Dovecot's log would show what the internal error was.
Any notes/annotations from you guys ? Why don't I experience NFS/Index-issues?
They're not that common. With the mail_nfs_* settings Dovecot tries to avoid them. But with heavy enough load they'll keep happening randomly.
Hi Timo,
I'm currently running tests with postal for concurrent delivery via dovecot-deliver to a mailbox on an NFS share. So far I have not noticed any problems when incoming SMTP is directed via LVS to two mail servers that deliver messages in parallel with dovecot-deliver to the user's mailbox and update Dovecot's index files (btw. Dovecot 1.2.10 in use, mail_nfs_storage = yes, mail_nfs_index = yes).
How heavily were you stress testing it?
Well, I initiated the test with double the amount of email that the current system handles. The tests were run from external servers with postal:
srv1: postal -t 20 -r 100 -m 4192 -s 10 lvsmail.example.com user-list
srv2: postal -t 20 -r 100 -m 4192 -s 10 lvsmail.example.com user-list
(20 parallel sessions, 100 messages per minute, maximum size of one email 4 MB, one mailbox.) After the test I ran pflogsumm on the mail server nodes behind the LVS (the logs had been cleared before I started the benchmark). Bandwidth on the storage side was quite heavy (83 Mbit/s in, 37 Mbit/s out), but load (max 0.5) and CPU usage (max 10%) on the storage were fine.
In summary, the two nodes accepted about 19841 messages for the user's mailbox. dovecot-deliver had a problem only 5 times (as mentioned above) and deferred those messages. More detailed pflogsumm output for the two MX nodes is under [1] and [2].
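As a rough cross-check of these figures, the deferral rate can be worked out from the pflogsumm totals in [1] and [2]. This is only a sketch: the variable and function names are made up for illustration, and it assumes the "delivered" counts are a reasonable denominator (the deferred messages were delivered on a later run).

```python
# Delivered and deferred counts per node, copied from the pflogsumm
# "Grand Totals" in [1] and [2].
nodes = {
    "node1": {"delivered": 4821, "deferred": 4},
    "node2": {"delivered": 15020, "deferred": 1},
}

def deferral_rate(stats):
    """Return (total delivered, total deferred, deferred/delivered)."""
    delivered = sum(n["delivered"] for n in stats.values())
    deferred = sum(n["deferred"] for n in stats.values())
    return delivered, deferred, deferred / delivered

delivered, deferred, rate = deferral_rate(nodes)
# Prints: 5 deferrals in 19841 deliveries = 0.0252%
print(f"{deferred} deferrals in {delivered} deliveries = {rate:.4%}")
```

So roughly one delivery in four thousand hit a temporary failure during this test run; a different load pattern could of course give a different rate.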
The only thing I've found so far:
Feb 16 17:33:46 cmx2 postfix/pipe[24221]: DD3F118A22A: to=<werner@example.com>, relay=dovecot, delay=3.4, delays=0.75/0/0/2.6, dsn=4.3.0, status=deferred (temporary failure. Command output: Internal error occurred. Refer to server log for more information. [2010-02-16 17:33:43])
That looks like exactly the kind of error I was talking about. Looking at Dovecot's log would show what the internal error was.
Sadly, I did not find any more specific hints in the mail log (mail_debug=yes) than this line :-( So I assume it could work to have two servers active for incoming SMTP, even if dovecot-deliver is used as LDA?
kind regards, Werner
[1] Node1: (pflogsumm counts incoming mails twice because of smtpd_proxy_filter)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Grand Totals

messages

     9656   received
     4821   delivered
        0   forwarded
        4   deferred  (4 deferrals)

    9898m   bytes received
    9902m   bytes delivered

Per-Hour Traffic Summary

    time         received  delivered  deferred  bounced  rejected
    1700-1800        2872       1434         2        0        64
    1800-1900        3622       1809         1        0        42
    1900-2000        3162       1578         1        0        37

message deferral detail

    pipe (total: 4)
        1   04:03]
        1   33:43]
        1   43:46]
        1   49:00]
[2] Node2: (pflogsumm counts incoming mails twice because of smtpd_proxy_filter)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Grand Totals

messages

    30052   received
    15020   delivered
        0   forwarded
        1   deferred  (1 deferrals)

   30576m   bytes received
   30614m   bytes delivered

Per-Hour Traffic Summary

    time         received  delivered  deferred  bounced  rejected
    1700-1800        9166       4573         0        0        46
    1800-1900       11297       5651         0        0        98
    1900-2000        9589       4796         1        0        79

message deferral detail

    pipe (total: 1)
        1   24:10]
On 17.2.2010, at 16.05, Werner wrote:
Feb 16 17:33:46 cmx2 postfix/pipe[24221]: DD3F118A22A: to=<werner@example.com>, relay=dovecot, delay=3.4, delays=0.75/0/0/2.6, dsn=4.3.0, status=deferred (temporary failure. Command output: Internal error occurred. Refer to server log for more information. [2010-02-16 17:33:43])
That looks like exactly the kind of error I was talking about. Looking at Dovecot's log would show what the internal error was.
sadly, I did not find any more specific hints in the Maillog (mail_debug=yes) than this line :-(
It really should have logged something. http://wiki.dovecot.org/LDA#logging and/or http://wiki.dovecot.org/Logging may give hints.
So I assume it could work to have two servers active for incoming SMTP, even if dovecot-deliver is used as LDA?
Well, it's not error free, as you said you got 5 errors already. But if you can live with random errors, sure..
Hi,
It really should have logged something. http://wiki.dovecot.org/LDA#logging and/or http://wiki.dovecot.org/Logging may give hints.
I will set "log_path = /var/log/dovecot-deliver-errors.log" within the LDA-Section and will produce the temporary failure again. I will get back to you.
So I assume it could work to have two servers active for incoming SMTP, even if dovecot-deliver is used as LDA?
Well, it's not error free, as you said you got 5 errors already. But if you can live with random errors, sure..
I think, it's not really a "bad, nasty" error - the email is just deferred - and by the next delivery-attempt it's successfully delivered.
Regards, Werner
Hi,
I will set "log_path = /var/log/dovecot-deliver-errors.log" within the LDA-Section and will produce the temporary failure again. I will get back to you.
Done - I've reproduced the error, here's the output from dovecot-deliver for those temporary Failures in the maillog:
2010-02-17 17:52:21 deliver(werner@example.com): Error: Corrupted transaction log file /mailhome/wernertest/dovecot.index.log seq 24: Invalid transaction log size (67988 vs 68080): /mailhome/wernertest/dovecot.index.log (sync_offset=67988)
Is this something to worry about ?
Thanks, Werner
On Wed, 2010-02-17 at 17:55 +0100, Werner wrote:
2010-02-17 17:52:21 deliver(werner@example.com): Error: Corrupted transaction log file /mailhome/wernertest/dovecot.index.log seq 24: Invalid transaction log size (67988 vs 68080): /mailhome/wernertest/dovecot.index.log (sync_offset=67988)
Is this something to worry about ?
Probably not. You might lose the latest change from index, but then again since you're using Maildir, Dovecot finds out about the change soon anyway.
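To keep an eye on how often this happens under real load, the deliver log could be scanned for these errors. The following is a minimal sketch, not a Dovecot tool: the regular expression is modelled only on the single log line quoted in this thread (other versions may word it differently), and the function name is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the one error line quoted in this thread (assumption:
# other occurrences follow the same wording).
CORRUPT_RE = re.compile(
    r"deliver\((?P<user>[^)]+)\): Error: Corrupted transaction log file "
    r"(?P<path>\S+) seq (?P<seq>\d+): "
    r"Invalid transaction log size \((?P<first>\d+) vs (?P<second>\d+)\)"
)

def count_corruptions(lines):
    """Count corrupted-transaction-log errors per index log path."""
    hits = Counter()
    for line in lines:
        match = CORRUPT_RE.search(line)
        if match:
            hits[match.group("path")] += 1
    return hits

sample = [
    "2010-02-17 17:52:21 deliver(werner@example.com): Error: Corrupted "
    "transaction log file /mailhome/wernertest/dovecot.index.log seq 24: "
    "Invalid transaction log size (67988 vs 68080): "
    "/mailhome/wernertest/dovecot.index.log (sync_offset=67988)",
    "2010-02-17 17:52:22 some unrelated log line",
]
result = count_corruptions(sample)
print(result)
```

On a live system the same function could be fed the file configured via log_path, for example `count_corruptions(open("/var/log/dovecot-deliver-errors.log"))`.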
Hi Timo,
2010-02-17 17:52:21 deliver(werner@example.com): Error: Corrupted transaction log file /mailhome/wernertest/dovecot.index.log seq 24: Invalid transaction log size (67988 vs 68080): /mailhome/wernertest/dovecot.index.log (sync_offset=67988)
Is this something to worry about ?
Probably not. You might lose the latest change from index, but then again since you're using Maildir, Dovecot finds out about the change soon anyway.
Great news - I think all problems/points are now solved so far and we can start the migration =))
Seize the day, Werner
On 01/02/2010 10:05, Werner wrote:
Hi everybody,
we're currently in the process of drafting our new mailserver-setup. Instead of a single-server-setup we'd like to have two equal servers behind a loadbalancer like LVS and shared mailhomes on NFS.
We'd like to use dovecot for POP/IMAP, dovecot-deliver as LDA.
It's probably the best idea to direct SMTP and POP/IMAP always to the same server behind the loadbalancer (because dovecot-deliver is used which updates indexes?)
If we think of a "active/passive" setup: dovecot index-files locally or on the nfs-share?
At least one other user on the list had success using Dovecot proxy in a "backend servers are the frontend servers" setup. Basically the user comes into a random frontend server; the Dovecot proxy has a 50:50 chance of discovering that the user is already on the right machine, in which case it gets out of the way, otherwise it proxies the connection to the other machine.
I guess this will waste 25% internal bandwidth on average (external b/w should remain the same). Apparently CPU requirements for proxying are very low; the additional memory requirements are measurable, but may be acceptable.
If one server fails then you have to update the loadbalancer to redirect only to the working server AND update the proxy not to try and send users to the other machine. Depending how you configure things this extra step can be done very easily though.
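The routing decision described above can be sketched in a few lines. This is an illustration only, not how Dovecot implements it: the host names, the hash-based user-to-server mapping and the function names are all assumptions, and in a real setup the mapping would come from Dovecot's own configuration.

```python
import hashlib

SERVERS = ("mail1", "mail2")  # hypothetical host names

def home_server(user):
    """Map a user to one fixed server with a stable hash (illustrative choice)."""
    digest = hashlib.sha256(user.encode("utf-8")).digest()
    return SERVERS[digest[0] % len(SERVERS)]

def route(user, frontend, alive=SERVERS):
    """Return ("local", host) if the frontend serves the user itself,
    or ("proxy", host) if the connection is passed to the other machine."""
    target = home_server(user)
    if target not in alive:
        # The extra failover step: stop sending users to the dead machine.
        target = alive[0]
    return ("local" if target == frontend else "proxy", target)

# Every user can arrive on either frontend; exactly one of the two is "home".
users = [f"user{i}" for i in range(1000)]
proxied = sum(route(u, f)[0] == "proxy" for u in users for f in SERVERS)
fraction = proxied / (len(users) * len(SERVERS))
print(fraction)  # 0.5
```

With one server down and the load balancer pointing only at the survivor, `route(user, "mail1", alive=("mail1",))` always comes back as "local", which matches the two updates needed on failure.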
Good luck
Ed W
participants (3): Ed W, Timo Sirainen, Werner