[Dovecot] Proxy login failures
Hi,
I'm using two dovecot pop3/imap proxies in front of our dovecot servers. Since some days I see many of the following errors in the logs of the two proxy-servers:
... dovecot: pop3-login: Error: proxy: Remote "IPV6-IP":110 disconnected: Connection closed: Connection reset by peer (state=0): user=<myuser>, method=PLAIN, rip=remote-ip, lip=localip ... dovecot: imap-login: Error: proxy: Remote "IPV6-IP":143 disconnected: Connection closed: Connection reset by peer (state=0): user=<myuser>, method=PLAIN, rip=remote-ip, lip=localip
...
When this happens the Client gets the following error from the proxy: -ERR [IN-USE] Account is temporarily unavailable.
System-details: OS: Debian Linux Proxy: 2.0.5-0~auto+23 Backend: 2.0.13-0~auto+54
Have you any idea what could cause this type of error?
Thanks and regards Urban Loesch
doveconf -n from one of our backendservers:
# 2.0.13 (02d97fb66047): /etc/dovecot/dovecot.conf # OS: Linux 2.6.38.8-vs2.3.0.37-rc17-rol-em64t-timerp x86_64 Debian 6.0.2 ext4 auth_cache_negative_ttl = 0 auth_cache_size = 40 M auth_cache_ttl = 12 hours auth_mechanisms = plain login auth_username_format = %Lu auth_verbose = yes deliver_log_format = msgid=%m: %$ %p %w disable_plaintext_auth = no login_trusted_networks = our Proxy IP's (v4 and v6) mail_gid = mailstore mail_location = mdbox:/home/vmail/%d/%n:INDEX=/home/dovecotindex/%d/%n mail_plugins = " quota mail_log notify zlib" mail_uid = mailstore managesieve_notify_capability = mailto managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date ihave imapflags notify mdbox_rotate_size = 5 M passdb { args = /etc/dovecot/dovecot-sql-account.conf driver = sql } plugin { mail_log_events = delete undelete expunge copy mailbox_delete mailbox_rename mail_log_fields = uid box msgid size from mail_log_group_events = no quota = dict:Storage used::file:%h/dovecot-quota sieve = ~/.dovecot.sieve sieve_dir = ~/sieve sieve_extensions = +notify +imapflags sieve_max_redirects = 10 zlib_save = gz zlib_save_level = 5 } protocols = imap pop3 lmtp sieve service imap-login { inet_listener imap { port = 143 } service_count = 0 vsz_limit = 256 M } service lmtp { inet_listener lmtp { address = * port = 24 } unix_listener /var/spool/postfix/private/dovecot-lmtp { group = postfix mode = 0666 user = postfix } vsz_limit = 512 M } service pop3-login { inet_listener pop3 { port = 110 } service_count = 0 vsz_limit = 256 M } ssl = no ssl_cert =
On 9.1.2012, at 12.18, Urban Loesch wrote:
I'm using two dovecot pop3/imap proxies in front of our dovecot servers. Since some days I see many of the following errors in the logs of the two proxy-servers:
dovecot: pop3-login: Error: proxy: Remote "IPV6-IP":110 disconnected: Connection closed: Connection reset by peer (state=0): user=<myuser>, method=PLAIN, rip=remote-ip, lip=localip
When this happens the Client gets the following error from the proxy: -ERR [IN-USE] Account is temporarily unavailable.
The connection to remote server dies before authentication finishes. The reason for why that happens should be logged by the backend server. Sounds like it crashes. Check for ANY error messages in backend servers.
Am 09.01.2012 19:40, schrieb Timo Sirainen:
On 9.1.2012, at 12.18, Urban Loesch wrote:
I'm using two dovecot pop3/imap proxies in front of our dovecot servers. Since some days I see many of the following errors in the logs of the two proxy-servers:
dovecot: pop3-login: Error: proxy: Remote "IPV6-IP":110 disconnected: Connection closed: Connection reset by peer (state=0): user=<myuser>, method=PLAIN, rip=remote-ip, lip=localip
When this happens the Client gets the following error from the proxy: -ERR [IN-USE] Account is temporarily unavailable. The connection to remote server dies before authentication finishes. The reason for why that happens should be logged by the backend server. Sounds like it crashes. Check for ANY error messages in backend servers.
I still did that, but I found nothing in the logs. The only thing I could think about is that all 7 backend servers are virtual servers (using technology from http://linux-vserver.org) and they all are running on the same physical machine (DELL PER610 with 32GB RAM, RAID 10 SAS - load between 0.5 and 2.0, iowait about 1-5%). So they are sharing the same kernel. Also all servers are connected to a mysql server, running on a different machine in the same subnet. Could it be that either the kernel needs some tcp tuning ore perhaps the answers from the remote mysql server could be to slow in some cases?
Now I switched 2 of the 7 backend servers to the backup mysql slave server. Should be no problem because dovecot is only reading from it. If it helps I will see tomorrow an I let you know.
thanks Urban
On 9.1.2012, at 22.23, Urban Loesch wrote:
I'm using two dovecot pop3/imap proxies in front of our dovecot servers. Since some days I see many of the following errors in the logs of the two proxy-servers:
dovecot: pop3-login: Error: proxy: Remote "IPV6-IP":110 disconnected: Connection closed: Connection reset by peer (state=0): user=<myuser>, method=PLAIN, rip=remote-ip, lip=localip
When this happens the Client gets the following error from the proxy: -ERR [IN-USE] Account is temporarily unavailable. The connection to remote server dies before authentication finishes. The reason for why that happens should be logged by the backend server. Sounds like it crashes. Check for ANY error messages in backend servers.
I still did that, but I found nothing in the logs.
It's difficult to guess then. At the very least there should be an "Info" message about a new connection at the time when this failure happened. If there's not even that, then maybe the problem is network related.
The only thing I could think about is that all 7 backend servers are virtual servers (using technology from http://linux-vserver.org) and they all are running on the same physical machine (DELL PER610 with 32GB RAM, RAID 10 SAS - load between 0.5 and 2.0, iowait about 1-5%). So they are sharing the same kernel.
For testing, or what's the point in doing that? :) But the load is low enough that I doubt it has anything to do with it.
Also all servers are connected to a mysql server, running on a different machine in the same subnet. Could it be that either the kernel needs some tcp tuning ore perhaps the answers from the remote mysql server could be to slow in some cases?
MySQL server problem would show up with a different error message. TCP tuning is also unlikely to help, since the connection probably dies within a second. Actually it would be a good idea to log the duration. This patch adds it: http://hg.dovecot.org/dovecot-2.0/raw-rev/8438f66433a6
These are the only explanations that I can think of for the error:
- Remote Dovecot crashes / kills the connection (it would log an error message)
- Remote Dovecot server is full of handling existing connections (It would log a warning)
- Network trouble, something in the middle disconnecting the connection
- Source/destination OS trouble, disconnecting the connection
- Some hang that results in eventual disconnection. The duration patch would show if this is the case.
On 09.01.2012 23:39, Timo Sirainen wrote:
On 9.1.2012, at 22.23, Urban Loesch wrote:
I'm using two dovecot pop3/imap proxies in front of our dovecot servers. Since some days I see many of the following errors in the logs of the two proxy-servers:
dovecot: pop3-login: Error: proxy: Remote "IPV6-IP":110 disconnected: Connection closed: Connection reset by peer (state=0): user=<myuser>, method=PLAIN, rip=remote-ip, lip=localip
When this happens the Client gets the following error from the proxy: -ERR [IN-USE] Account is temporarily unavailable. The connection to remote server dies before authentication finishes. The reason for why that happens should be logged by the backend server. Sounds like it crashes. Check for ANY error messages in backend servers.
I still did that, but I found nothing in the logs.
It's difficult to guess then. At the very least there should be an "Info" message about a new connection at the time when this failure happened. If there's not even that, then maybe the problem is network related.
No, there is nothing.
The only thing I could think about is that all 7 backend servers are virtual servers (using technology from http://linux-vserver.org) and they all are running on the same physical machine (DELL PER610 with 32GB RAM, RAID 10 SAS - load between 0.5 and 2.0, iowait about 1-5%). So they are sharing the same kernel.
For testing, or what's the point in doing that? :) But the load is low enough that I doubt it has anything to do with it.
This because the hardware is fast enough for handling about 40.000 Mailaccounts (both IMAP and POP). That tells me that dovecot is a really good piece of software. Very performant in my eyes.
Also all servers are connected to a mysql server, running on a different machine in the same subnet. Could it be that either the kernel needs some tcp tuning ore perhaps the answers from the remote mysql server could be to slow in some cases?
MySQL server problem would show up with a different error message. TCP tuning is also unlikely to help, since the connection probably dies within a second. Actually it would be a good idea to log the duration. This patch adds it: http://hg.dovecot.org/dovecot-2.0/raw-rev/8438f66433a6
I installed the patch on my proxies and I got this:
... Jan 10 09:30:45 imap2 dovecot: pop3-login: Error: proxy: Remote "IPV6-IP":110 disconnected: Connection closed: Connection reset by peer (state=0, duration=0s): user=<myuser>, method=PLAIN, rip=remote-ip, lip=local-ip
Jan 10 09:45:21 imap2 dovecot: pop3-login: Error: proxy: Remote "IPV6-IP":110 disconnected: Connection closed: Connection reset by peer (state=0, duration=1s): user=<myuser>, method=PLAIN, rip=remote-ip, lip=local-ip ...
As you can see the duration is between 0 and 1 seconds.
During this errors there was a tcpdump running on proxy #2 (imap2 in the above logs). In the time range of "09:30:45:00 - 09:30:46:00" I got an error that the backend server has resetted the connection (RST Flag set). The fact that dovecot on the backend server writes nothing in the log I think that the connection will be resetted on a lower level.
Here is what whireshark tells me about that:
No. Source Time Destination Protocol Info 101235 IPv6-Proxy-Server 2012-01-10 09:29:38.015073 IPv6-Backend-Server TCP 35341 > pop3 [SYN] Seq=0 Win=14400 Len=0 MSS=1440 SACK_PERM=1 TSV=1925901864 TSER=0 WS=7 101236 IPv6-Backend-Server 2012-01-10 09:29:38.015157 IPv6-Proxy-Server TCP pop3 > 35341 [SYN, ACK] Seq=0 Ack=1 Win=14280 Len=0 MSS=1440 SACK_PERM=1 TSV=309225565 TSER=1925901864 WS=7 101248 IPv6-Proxy-Server 2012-01-10 09:29:38.233046 IPv6-Backend-Server POP [TCP ACKed lost segment] [TCP Previous segment lost] C: UIDL 101249 IPv6-Backend-Server 2012-01-10 09:29:38.233312 IPv6-Proxy-Server POP S: +OK 101250 IPv6-Proxy-Server 2012-01-10 09:29:38.233328 IPv6-Backend-Server TCP 35341 > pop3 [ACK] Seq=57 Ack=50 Win=14464 Len=0 TSV=1925901886 TSER=309225587 101263 IPv6-Proxy-Server 2012-01-10 09:29:38.452210 IPv6-Backend-Server POP C: LIST 101264 IPv6-Backend-Server 2012-01-10 09:29:38.452403 IPv6-Proxy-Server POP S: +OK 0 messages: 101265 IPv6-Proxy-Server 2012-01-10 09:29:38.452426 IPv6-Backend-Server TCP 35341 > pop3 [ACK] Seq=63 Ack=70 Win=14464 Len=0 TSV=1925901908 TSER=309225609 101324 IPv6-Proxy-Server 2012-01-10 09:29:38.671209 IPv6-Backend-Server POP C: QUIT 101325 IPv6-Backend-Server 2012-01-10 09:29:38.671566 IPv6-Proxy-Server POP S: +OK Logging out. 101326 IPv6-Proxy-Server 2012-01-10 09:29:38.671678 IPv6-Backend-Server TCP 35341 > pop3 [FIN, ACK] Seq=69 Ack=89 Win=14464 Len=0 TSV=1925901930 TSER=309225631 101327 IPv6-Backend-Server 2012-01-10 09:29:38.671759 IPv6-Proxy-Server TCP pop3 > 35341 [ACK] Seq=89 Ack=70 Win=14336 Len=0 TSV=309225631 TSER=1925901930
134205 IPv6-Proxy-Server 2012-01-10 09:30:45.477314 IPv6-Backend-Server TCP [TCP Port numbers reused] 35341 > pop3 [SYN] Seq=0 Win=14400 Len=0 MSS=1440 SACK_PERM=1 TSV=1925908610 TSER=0 WS=7 134206 IPv6-Backend-Server 2012-01-10 09:30:45.477458 IPv6-Proxy-Server TCP pop3 > 35341 [SYN, ACK] Seq=0 Ack=1 Win=14280 Len=0 MSS=1440 SACK_PERM=1 TSV=309232311 TSER=1925908610 WS=7 134207 IPv6-Proxy-Server 2012-01-10 09:30:45.477499 IPv6-Backend-Server TCP 35341 > pop3 [ACK] Seq=1 Ack=1 Win=14464 Len=0 TSV=1925908610 TSER=309232311 134208 IPv6-Backend-Server 2012-01-10 09:30:45.477589 IPv6-Proxy-Server TCP pop3 > 35341 [RST] Seq=1 Win=0 Len=0 136052 IPv6-Backend-Server 2012-01-10 09:30:49.477950 IPv6-Proxy-Server TCP pop3 > 35341 [SYN, ACK] Seq=0 Ack=1 Win=14280 Len=0 MSS=1440 SACK_PERM=1 TSV=309232712 TSER=1925908610 WS=7 136053 IPv6-Proxy-Server 2012-01-10 09:30:49.477978 IPv6-Backend-Server TCP 35341 > pop3 [RST] Seq=1 Win=0 Len=0 138363 IPv6-Backend-Server 2012-01-10 09:30:55.877899 IPv6-Proxy-Server TCP pop3 > 35341 [SYN, ACK] Seq=0 Ack=1 Win=14280 Len=0 MSS=1440 SACK_PERM=1 TSV=309233352 TSER=1925908610 WS=7 138364 IPv6-Proxy-Server 2012-01-10 09:30:55.877925 IPv6-Backend-Server TCP 35341 > pop3 [RST] Seq=1 Win=0 Len=0 143154 IPv6-Backend-Server 2012-01-10 09:31:08.678005 IPv6-Proxy-Server TCP pop3 > 35341 [SYN, ACK] Seq=0 Ack=1 Win=14280 Len=0 MSS=1440 SACK_PERM=1 TSV=309234632 TSER=1925908610 WS=7 152353 IPv6-Backend-Server 2012-01-10 09:31:32.678103 IPv6-Proxy-Server TCP pop3 > 35341 [SYN, ACK] Seq=0 Ack=1 Win=14280 Len=0 MSS=1440 SACK_PERM=1 TSV=309237032 TSER=1925908610 WS=7 165891 IPv6-Backend-Server 2012-01-10 09:32:20.688324 IPv6-Proxy-Server TCP pop3 > 35341 [SYN, ACK] Seq=0 Ack=1 Win=14280 Len=0 MSS=1440 SACK_PERM=1 TSV=309241833 TSER=1925908610 WS=7
From Seq-No. 101235 - 101327 the session looks ok for me. But on Seq-No. 134205 whireshark tells me that the TCP port with source port number "35341" will be reused and on Seq-No. 34208 (after the TCP Session has been established correctly - see Seq-No. 134205 to 134207) the backend server sends a RST Packet for the session and the proxy logs the error message, that the connection has been resetted by the peer.
I have no idea if dovecot is sendig the TCP reset or the kernel by himself.
About 1,5 hours ago I changed the kernel flag "/proc/sys/net/ipv4/tcp_tw_recycle" to "1" on the physical backend machine. Since that I got no more error messages on the proxies.
Changing the default values in "tcp_fin-timeout" or "tcp_tw_reuse" have had no effect. Only "tcp_tw_recycle" seems to help.
Thanks Urban
These are the only explanations that I can think of for the error:
- Remote Dovecot crashes / kills the connection (it would log an error message)
- Remote Dovecot server is full of handling existing connections (It would log a warning)
- Network trouble, something in the middle disconnecting the connection
- Source/destination OS trouble, disconnecting the connection
- Some hang that results in eventual disconnection. The duration patch would show if this is the case.
participants (2)
-
Timo Sirainen
-
Urban Loesch