[Dovecot] v2.0.13 problems after kernel patch for CVE-2011-1083 applied on CentOS 5
Greetings,
This email is both a request for assistance/help and a heads-up.
[8irgehuq] CVE-2011-1083: Algorithmic denial of service in epoll.
After ksplice automatically installed the above patch on our mail servers, most/all IMAP/POP3 connections began experiencing time-outs trying to connect, or extreme timeouts in the auth procedure.
dovecot: imap-login: Disconnected (no auth attempts): rip=a.a.a.a, lip=b.b.b.b, TLS handshaking: Disconnected
dovecot: pop3-login: Disconnected (no auth attempts): rip=a.a.a.a, lip=b.b.b.b, TLS handshaking: Disconnected
dovecot: pop3-login: Panic: epoll_ctl(add, 6) failed: Invalid argument
dovecot: pop3-login: Error: Raw backtrace: /usr/lib64/dovecot/libdovecot.so.0 [0x3cb543baa0] -> /usr/lib64/dovecot/libdovecot.so.0 [0x3cb543baf6] -> /usr/lib64/dovecot/libdovecot.so.0 [0x3cb543afb3] -> /usr/lib64/dovecot/libdovecot.so.0(io_loop_handle_add+0x118) [0x3cb5447708] -> /usr/lib64/dovecot/libdovecot.so.0(io_add+0xa5) [0x3cb5446e15] -> /usr/lib64/dovecot/libdovecot.so.0(master_service_init_finish+0x1c6) [0x3cb54355a6] -> /usr/lib64/dovecot/libdovecot-login.so.0(main+0x136) [0x37a000bdf6] -> /lib64/libc.so.6(__libc_start_main+0xf4) [0x3cb301d994] -> dovecot/pop3-login(main+0x49) [0x401b99]
dovecot: master: Error: service(pop3-login): child 27603 killed with signal 6 (core not dumped - add -D parameter to service pop3-login { executable }
dovecot: master: Error: service(pop3-login): command startup failed, throttling
dovecot: imap-login: Panic: epoll_ctl(add, 6) failed: Invalid argument
dovecot: imap-login: Error: Raw backtrace: /usr/lib64/dovecot/libdovecot.so.0 [0x3cb543baa0] -> /usr/lib64/dovecot/libdovecot.so.0 [0x3cb543baf6] -> /usr/lib64/dovecot/libdovecot.so.0 [0x3cb543afb3] -> /usr/lib64/dovecot/libdovecot.so.0(io_loop_handle_add+0x118) [0x3cb5447708] -> /usr/lib64/dovecot/libdovecot.so.0(io_add+0xa5) [0x3cb5446e15] -> /usr/lib64/dovecot/libdovecot.so.0(master_service_init_finish+0x1c6) [0x3cb54355a6] -> /usr/lib64/dovecot/libdovecot-login.so.0(main+0x136) [0x37a000bdf6] -> /lib64/libc.so.6(__libc_start_main+0xf4) [0x3cb301d994] -> dovecot/imap-login(main+0x39) [0x402069]
dovecot: master: Error: service(imap-login): child 27604 killed with signal 6 (core not dumped - add -D parameter to service imap-login { executable }
Once this patch was removed, everything started working again.
Is it possible that dovecot is trying to re-add already-added connections to the polling list - which this specific 'patch' prevents?
We haven't dug deeper yet, but the panic is raised from the function io_loop_handle_add() in ioloop-epoll.c:
http://hg.dovecot.org/dovecot-2.0/file/aa8dfa085a99/src/lib/ioloop-epoll.c
Thanks,
Doug
On 25.2.2012, at 0.49, Doug Henderson wrote:
> [8irgehuq] CVE-2011-1083: Algorithmic denial of service in epoll.
> After ksplice automatically installed the above patch on our mail servers, most/all IMAP/POP3 connections began experiencing time-outs trying to connect, or extreme timeouts in the auth procedure.
I'd guess this patch is already in new Linux kernel versions, so other people should have seen any problems caused by it?
> dovecot: pop3-login: Panic: epoll_ctl(add, 6) failed: Invalid argument
> ..
> Once this patch was removed, everything started working again.
> Is it possible that dovecot is trying to re-add already-added connections to the polling list - which this specific 'patch' prevents?
It shouldn't be possible .. EPOLL_CTL_ADD is done only once, EPOLL_CTL_MOD is done afterwards. And if the same fd is attempted to be added/modded twice, Dovecot should assert-crash first in ioloop_iolist_add().
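The add-once, modify-afterwards pattern described above can be sketched as follows. This is only an illustration using Python's `select.epoll` wrapper around the same system calls, not Dovecot's C code, and the function name `add_then_modify` is hypothetical. It also shows that, on Linux, a duplicate registration of the same descriptor is reported as EEXIST ("File exists"), which is a different error from the EINVAL ("Invalid argument") seen in the panic.

```python
import errno
import os
import select


def add_then_modify():
    """Register a pipe fd once, modify it, then try to register it again.

    Returns the errno name raised by the duplicate registration, or None
    if the duplicate registration unexpectedly succeeds.
    Linux-only: select.epoll is not available on other platforms.
    """
    read_fd, write_fd = os.pipe()
    ep = select.epoll()
    try:
        # First registration: corresponds to epoll_ctl(EPOLL_CTL_ADD).
        ep.register(read_fd, select.EPOLLIN)
        # Later event-mask changes: correspond to epoll_ctl(EPOLL_CTL_MOD).
        ep.modify(read_fd, select.EPOLLIN | select.EPOLLPRI)
        try:
            # A second ADD of the same fd is rejected by the kernel.
            ep.register(read_fd, select.EPOLLIN)
        except OSError as exc:
            return errno.errorcode[exc.errno]
        return None
    finally:
        ep.close()
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(add_then_modify())
```

If a double add were the cause, one would therefore expect "File exists" in the log rather than "Invalid argument".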
On Feb 24, 2012, at 4:39 PM, Timo Sirainen wrote:
> On 25.2.2012, at 0.49, Doug Henderson wrote:
>> [8irgehuq] CVE-2011-1083: Algorithmic denial of service in epoll.
>> After ksplice automatically installed the above patch on our mail servers, most/all IMAP/POP3 connections began experiencing time-outs trying to connect, or extreme timeouts in the auth procedure.
> I'd guess this patch is already in new Linux kernel versions, so other people should have seen any problems caused by it?
Actually, it was only released a couple of days ago (2/21) by Red Hat for EL 5.8; see: https://rhn.redhat.com/errata/RHSA-2012-0150.html
"A flaw was found in the way the Linux kernel's Event Poll (epoll) subsystem handled large, nested epoll structures. A local, unprivileged user could use this flaw to cause a denial of service. (CVE-2011-1083, Moderate)"
Our automated patching (ksplice) installed it at around 10am PST today.
Other distributions may vary.
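For readers unfamiliar with the term, the "nested epoll structures" in the erratum text are epoll instances that watch other epoll instances. The following is a minimal, Linux-only sketch of a single level of nesting, written with Python's `select.epoll`; the function name `nested_epoll_ready` is hypothetical, and the sketch does not attempt to reproduce the flaw or the behavior of the patch.

```python
import os
import select


def nested_epoll_ready():
    """Nest one epoll instance inside another and poll the outer one.

    Returns True if the outer instance reports the inner one as readable.
    """
    read_fd, write_fd = os.pipe()
    inner = select.epoll()
    outer = select.epoll()
    try:
        # Make the pipe readable before anything watches it.
        os.write(write_fd, b"x")
        # The inner instance watches the pipe.
        inner.register(read_fd, select.EPOLLIN)
        # The outer instance watches the inner instance's own descriptor.
        outer.register(inner.fileno(), select.EPOLLIN)
        events = outer.poll(1.0)
        return any(fd == inner.fileno() and (mask & select.EPOLLIN)
                   for fd, mask in events)
    finally:
        outer.close()
        inner.close()
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(nested_epoll_ready())
```

An ordinary server that registers sockets in a single epoll instance, as in the earlier description of Dovecot's behavior, involves no nesting of this kind.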
>> dovecot: pop3-login: Panic: epoll_ctl(add, 6) failed: Invalid argument
>> ..
>> Once this patch was removed, everything started working again.
>> Is it possible that dovecot is trying to re-add already-added connections to the polling list - which this specific 'patch' prevents?
> It shouldn't be possible .. EPOLL_CTL_ADD is done only once, EPOLL_CTL_MOD is done afterwards. And if the same fd is attempted to be added/modded twice, Dovecot should assert-crash first in ioloop_iolist_add().
We haven't spent enough time investigating to be sure, but epoll_ctl was certainly "in the thick of it". The only outward evidence (in logs, even with debug turned on) that there was anything wrong with Dovecot at all was the Panic shown for that method.
Dovecot may have been an innocent bystander in this case - but something was causing it to fail on inbound IMAP/POP3 connections, and when the patch was removed everything started working again.
On 25.2.2012, at 8.32, Doug Henderson wrote:
>>> [8irgehuq] CVE-2011-1083: Algorithmic denial of service in epoll.
>>> After ksplice automatically installed the above patch on our mail servers, most/all IMAP/POP3 connections began experiencing time-outs trying to connect, or extreme timeouts in the auth procedure.
>> I'd guess this patch is already in new Linux kernel versions, so other people should have seen any problems caused by it?
> Actually, it was only released a couple of days ago (2/21) by redhat for EL 5.8 see: https://rhn.redhat.com/errata/RHSA-2012-0150.html
Yes, but CVE-2011-1083 shows it was reported almost a year ago, so I'd think it was fixed in the upstream kernel a long time ago. On my desktop I'm running a kernel (from git) that is about two months old, and I don't see any problems with it. But yeah, maybe Red Hat's patches did it differently than the upstream kernel, and it broke because of that.
On 25.02.2012 07:32, Doug Henderson wrote:
> On Feb 24, 2012, at 4:39 PM, Timo Sirainen wrote:
>> On 25.2.2012, at 0.49, Doug Henderson wrote:
>>> [8irgehuq] CVE-2011-1083: Algorithmic denial of service in epoll.
>>> After ksplice automatically installed the above patch on our mail servers, most/all IMAP/POP3 connections began experiencing time-outs trying to connect, or extreme timeouts in the auth procedure.
>> I'd guess this patch is already in new Linux kernel versions, so other people should have seen any problems caused by it?
> Actually, it was only released a couple of days ago (2/21) by redhat for EL 5.8 see: https://rhn.redhat.com/errata/RHSA-2012-0150.html
> "A flaw was found in the way the Linux kernel's Event Poll (epoll) subsystem handled large, nested epoll structures. A local, unprivileged user could use this flaw to cause a denial of service. (CVE-2011-1083, Moderate)"
> Our automated patching (ksplice) installed it at around 10am PST today.
> Other distributions may vary.
Try it without ksplice. (yum update and reboot)
Which kernel is running exactly?
Best regards,
Morten
On Feb 25, 2012, at 3:15 AM, Morten Stevens wrote:
> Try it without ksplice. (yum update and reboot)
I don't know if I'll be permitted to do that in a production environment - possibly a test one. I'll need to get some opinions from our Ops people as to if/how they might want to go about it.
> Which kernel is running exactly?
2.6.18-274.3.1.el5
> Best regards,
> Morten
On 26.02.2012 03:55, Doug Henderson wrote:
> On Feb 25, 2012, at 3:15 AM, Morten Stevens wrote:
>> Try it without ksplice. (yum update and reboot)
> I don't know if I'll be permitted to do that in a production environment - possibly a test one. I'll need to get some opinions from our Ops people as to if/how they might want to go about it.
>> Which kernel is running exactly?
> 2.6.18-274.3.1.el5
That is probably the problem. The current RHEL 5.8 kernel is 2.6.18-308.el5. There are many changes between 2.6.18-274 (EL 5.7) and 2.6.18-308 (EL 5.8). So I do not know if it is a good idea to apply ksplice patches between minor 5.x releases.
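The comparison between the two release strings above can be checked mechanically. The sketch below is an assumption-laden illustration: the helper names `el_kernel_key` and `is_older` are hypothetical, and the plain tuple comparison is a simplification that is not the same as RPM's own version-comparison rules.

```python
import re


def el_kernel_key(release):
    """Turn a release string such as '2.6.18-274.3.1.el5' into a tuple of ints.

    Only the leading numeric components are used; the distribution suffix
    ('.el5') and anything after it are ignored.
    """
    numeric_part = re.match(r"[0-9.\-]+", release).group(0)
    return tuple(int(n) for n in re.findall(r"\d+", numeric_part))


def is_older(running, reference):
    """Return True if `running` sorts before `reference` (naive comparison)."""
    return el_kernel_key(running) < el_kernel_key(reference)


if __name__ == "__main__":
    import platform

    # Compare the kernel of the current machine with the EL 5.8 kernel
    # named in the message above.
    print(platform.release(), is_older(platform.release(), "2.6.18-308.el5"))
```

For the two releases mentioned in this thread, `is_older("2.6.18-274.3.1.el5", "2.6.18-308.el5")` evaluates to True.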
Best regards,
Morten
On Feb 26, 2012, at 2:44 AM, Morten Stevens wrote:
> On 26.02.2012 03:55, Doug Henderson wrote:
>> On Feb 25, 2012, at 3:15 AM, Morten Stevens wrote:
>>> Try it without ksplice. (yum update and reboot)
>> I don't know if I'll be permitted to do that in a production environment - possibly a test one. I'll need to get some opinions from our Ops people as to if/how they might want to go about it.
>>> Which kernel is running exactly?
>> 2.6.18-274.3.1.el5
> That is probably the problem. The current RHEL 5.8 kernel is 2.6.18-308.el5. There are many changes between 2.6.18-274 (EL 5.7) and 2.6.18-308 (EL 5.8). So I do not know if it is a good idea to apply ksplice patches between minor 5.x releases.
> Best regards,
> Morten
Thanks, Morten. We'll install the latest kernel on a test machine tomorrow and see how things go. We'll probably also attempt to reinstall the patch (if appropriate) and see if it still breaks things.
Doug
participants (3)
- Doug Henderson
- Morten Stevens
- Timo Sirainen