Proxy problem: "imap-login: Error: proxy(USERNAME): connect(10.x.x.178, 993) failed: Cannot assign requested address (after 0 secs, local=10.x.x.104)"
After fixing the "duplicate compression" problem, we're now encountering
"imap-login: Error: proxy(USERNAME): connect(10.x.x.178, 993) failed: Cannot assign requested address (after 0 secs, local=10.x.x.100)"
in the logs. We already tried raising the ulimit, the max number of open files. Once we reach about 25k connections, we're getting the error above... for all local addresses. It seems as if the system cannot create any more outgoing connections.
We already optimized:
---- snip ----
net.ipv4.tcp_fin_timeout=5 # down from 30s
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_tw_recycle=1 # http://redis4you.com/articles.php?id=012&name=Redis+high+traffic+connection+issue
net.ipv4.ip_local_port_range=10000 65000 # http://www.fromdual.com/huge-amount-of-time-wait-connections
net.ipv4.netfilter.ip_conntrack_max=524288
---- snip ----
But still we get (once the load is rising beyond some point):
# fgrep "Cannot assign requested address" /var/log/dovecot/dovecot.log | awk '{print $NF}' | sort | uniq -c | sort -n
    142 local=10.x.x.100)
    147 local=10.x.x.107)
    148 local=10.x.x.106)
    151 local=10.x.x.104)
    151 local=10.x.x.109)
    152 local=10.x.x.105)
    156 local=10.x.x.110)
    162 local=10.x.x.102)
    165 local=10.x.x.101)
    178 local=10.x.x.103)
    189 local=10.x.x.108)
We're using multiple local addresses when proxying to the backends.
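For anyone who prefers a script over the shell pipeline above, here is a minimal Python sketch of the same per-address count. The function name `count_errors_by_local` and the regular expression are illustrative assumptions, not part of Dovecot; the only thing taken from the thread is the shape of the log line.

```python
import re
from collections import Counter

# Matches the trailing "local=ADDRESS)" field of a proxy error line.
LOCAL_RE = re.compile(r"local=([^)\s]+)\)")

def count_errors_by_local(lines, needle="Cannot assign requested address"):
    """Count matching error lines per local (source) address."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        match = LOCAL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It can be fed an open file object, for example `count_errors_by_local(open("/var/log/dovecot/dovecot.log", errors="replace"))`, and the resulting counter sorted by value to mirror `sort -n`.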
-- [*] sys4 AG
http://sys4.de, +49 (89) 30 90 46 64 Franziskanerstraße 15, 81669 München
Sitz der Gesellschaft: München, Amtsgericht München: HRB 199263 Vorstand: Patrick Ben Koetter, Marc Schiffbauer Aufsichtsratsvorsitzender: Florian Kirstein
On 16 Oct 2014, at 06:26, Ralf Hildebrandt <r@sys4.de> wrote:
> After fixing the "duplicate compression" problem, we're now encountering
> "imap-login: Error: proxy(USERNAME): connect(10.x.x.178, 993) failed: Cannot assign requested address (after 0 secs, local=10.x.x.100)"
> in the logs. We already tried raising the ulimit, the max number of open files. Once we reach about 25k connections, we're getting the error above... for all local addresses. It seems as if the system cannot create any more outgoing connections.
I'd guess you're running out of TCP ports.
> We're using multiple local addresses when proxying to the backends
How are you doing the multiple local addresses? In v2.2.14 there's a login_source_ips setting intended to solve this problem. http://wiki2.dovecot.org/PasswordDatabase/ExtraFields/Proxy
>> in the logs. We already tried raising the ulimit, the max number of open files. Once we reach about 25k connections, we're getting the error above... for all local addresses. It seems as if the system cannot create any more outgoing connections.
> I'd guess you're running out of TCP ports.
I think so too, but it's somewhat unlikely! We're using 10 outbound IP addresses, chosen at random (and I'm seeing this on the backend server!)
>> We're using multiple local addresses when proxying to the backends
> How are you doing the multiple local addresses? In v2.2.14 there's login_source_ips setting intended to solve this problem.
Exactly like that!
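To put the "somewhat unlikely" in numbers, here is a back-of-the-envelope sketch. It assumes the ip_local_port_range of 10000 to 65000 from the first message and the 10 outbound addresses mentioned above, and it treats every local port as usable once per source address toward a single backend address and port. The function names are made up for illustration, and the result is a theoretical upper bound, not a measured limit.

```python
def ports_in_range(low, high):
    """Number of local ports in an inclusive ip_local_port_range."""
    return high - low + 1

def max_outgoing(low, high, n_source_ips):
    """Upper bound on connections to ONE backend ip:port.

    A TCP connection is identified by (source IP, source port,
    destination IP, destination port), so each source IP can use
    each local port once per destination.
    """
    return ports_in_range(low, high) * n_source_ips

per_source_ip = ports_in_range(10000, 65000)          # 55001
upper_bound = max_outgoing(10000, 65000, 10)          # 550010
print(per_source_ip, upper_bound)
```

Under these assumptions even a single source address should allow roughly twice the 25k connections at which the errors start, which is why plain port exhaustion looks like an incomplete explanation.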
- Ralf Hildebrandt <r@sys4.de>:
>> I'd guess you're running out of TCP ports.
> I think so too, but it's somewhat unlikely! We're using 10 outbound IP addresses, chosen at random (and I'm seeing this on the backend server!)
FYI: It was a kernel bug.
Quoting Ralf Hildebrandt <r@sys4.de>:
> - Ralf Hildebrandt <r@sys4.de>:
>>> I'd guess you're running out of TCP ports.
>> I think so too, but it's somewhat unlikely! We're using 10 outbound IP addresses, chosen at random (and I'm seeing this on the backend server!)
> FYI: It was a kernel bug.
Do you mind sharing the actual technical background? Which kernel was affected, and how?
Thanks
Andreas
>> FYI: It was a kernel bug.
> Do you mind sharing the actual technical background. Which kernel was affected and how?
We didn't track it down to a specific bug, but we finally decided that our setup using multiple IPs for source and destination was OK and should work as intended. It didn't.
So we switched from Debian's 3.2 kernel to a 3.14 kernel from backports and, surprise, it worked as designed. Something in the 3.2 kernel limited the number of established connections, even incoming ones, to the size of the local port range. I still don't know what, but I am quite sure it's not a new feature in 3.14 that makes our setup work, as it should be possible to have many connections from different source IPs on basically every Linux kernel.
There are other limits (such as some TCP hash table sizes) which can be tuned, but those were not the limit we were hitting...
Old (not working): linux-image-3.2.0-4-amd64 3.2.63-2
New (working OK): linux-image-3.14-0.bpo.2-rt-amd64 3.14.15-2~bpo70+1
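One way to check whether established connections plateau near the size of the local port range is to count them per local address from /proc/net/tcp and compare against /proc/sys/net/ipv4/ip_local_port_range. The sketch below is an illustration, not the tooling used in this thread: the function names are made up, it handles IPv4 only, and the address decoding assumes a little-endian host such as amd64.

```python
import socket
import struct
from collections import Counter

TCP_ESTABLISHED = "01"  # state code for ESTABLISHED in /proc/net/tcp

def decode_ipv4(hex_addr):
    """Decode the hex address field as printed on a little-endian host."""
    return socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))

def established_per_local_ip(lines):
    """Count ESTABLISHED sockets per local IPv4 address."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip the header row and anything malformed.
        if len(fields) < 4 or ":" not in fields[1]:
            continue
        local_hex = fields[1].partition(":")[0]
        if fields[3] == TCP_ESTABLISHED:
            counts[decode_ipv4(local_hex)] += 1
    return counts

def port_range_size(text):
    """Size of the range given as 'LOW HIGH' (the sysctl file's format)."""
    low, high = (int(value) for value in text.split())
    return high - low + 1
```

On a live system, pass `open("/proc/net/tcp")` to `established_per_local_ip` and the contents of `/proc/sys/net/ipv4/ip_local_port_range` to `port_range_size`, then compare the per-address counts with the range size.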
Ralf Hildebrandt wrote:
> [...]
> We already optimized:
> ---- snip ----
> net.ipv4.tcp_fin_timeout=5 # down from 30s
> net.ipv4.tcp_tw_reuse=1
> net.ipv4.tcp_tw_recycle=1 # http://redis4you.com/articles.php?id=012&name=Redis+high+traffic+connection+issue
Just a note on enabling tcp_tw_recycle: it is known to have side effects and cause issues when you have lots of connections from the same source IP, such as many clients behind the same NAT IP or a reverse proxy.
See http://vincent.bernat.im/en/blog/2014-tcp-time-wait-state-linux.html
-brd
- brd <barraudu@tiscali.it>:
> Ralf Hildebrandt wrote:
>> [...]
>> We already optimized:
>> ---- snip ----
>> net.ipv4.tcp_fin_timeout=5 # down from 30s
>> net.ipv4.tcp_tw_reuse=1
>> net.ipv4.tcp_tw_recycle=1 # http://redis4you.com/articles.php?id=012&name=Redis+high+traffic+connection+issue
> just a note on enabling tcp_tw_recycle, it is known to have side-effects and issues when you have lots of connections from the same source IP, such as many clients behind same NAT IP or a reverse proxy
> see http://vincent.bernat.im/en/blog/2014-tcp-time-wait-state-linux.html
Yes, we might want to disable that again.
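As a quick way to verify whether tcp_tw_recycle is still switched on, the setting can be read straight from /proc/sys. This is a minimal sketch with made-up helper names; the `root` parameter exists only so the function can be pointed at a test directory. On newer kernels the file may not exist at all (the option was removed in Linux 4.12), in which case the helper returns None.

```python
from pathlib import Path

def read_sysctl(name, root="/proc/sys"):
    """Return a sysctl value as a stripped string, or None if unreadable."""
    path = Path(root, *name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None

def tw_recycle_enabled(root="/proc/sys"):
    """True only if net.ipv4.tcp_tw_recycle exists and is set to 1."""
    return read_sysctl("net.ipv4.tcp_tw_recycle", root) == "1"
```

Calling `tw_recycle_enabled()` with no arguments checks the running system; setting the value back to 0 with sysctl disables the option again.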
participants (4)
- brd
- lst_hoe02@kwsoft.de
- Ralf Hildebrandt
- Timo Sirainen