[Dovecot] RE: epoll error when running as proxy
Also, for testing I patched src/lib/ioloop-epoll.c so that the error message shows the "ret" and "errno" values along with their variable names, which is why there is a bit of extra data in this error message...
imap-login: io_loop_handle_add: epoll_ctl(op=3, fd=10, ret=-1, errno=2): No such file or directory
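For readers who want similar diagnostics, here is a minimal sketch of how an epoll_ctl() failure could be formatted in the style of the log line above. It is not the actual patch to src/lib/ioloop-epoll.c; the names format_epoll_ctl_error() and checked_epoll_ctl() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>

/* Hypothetical helper (not Dovecot code): format an epoll_ctl() failure
 * with the operation, descriptor, return value and errno spelled out. */
static int format_epoll_ctl_error(char *buf, size_t size, const char *caller,
                                  int op, int fd, int ret, int err)
{
    return snprintf(buf, size,
                    "%s: epoll_ctl(op=%d, fd=%d, ret=%d, errno=%d): %s",
                    caller, op, fd, ret, err, strerror(err));
}

/* Hypothetical wrapper: call epoll_ctl() and report a failure on stderr. */
static int checked_epoll_ctl(const char *caller, int epfd, int op, int fd,
                             struct epoll_event *event)
{
    int ret = epoll_ctl(epfd, op, fd, event);

    if (ret < 0) {
        int saved_errno = errno; /* later library calls may change errno */
        char msg[256];

        format_epoll_ctl_error(msg, sizeof(msg), caller, op, fd, ret,
                               saved_errno);
        fprintf(stderr, "%s\n", msg);
        errno = saved_errno;
    }
    return ret;
}
```

Saving errno before formatting matters because snprintf() and fprintf() are allowed to modify it.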
Bill
On Thursday, February 9, 2006 16:32, Bill Boebel said:
I am running dovecot 1.0 beta3 as a front-end proxy, and getting the following error when using epoll. The error goes away if I switch to poll. On the backend IMAP server I see that the login was successful. The error occurs when I try to retrieve an email or get a folder listing via IMAP after logging in successfully...
Feb 9 15:35:57 director4 dovecot: imap-login: proxy(bboebel@webmail.us): started: user=<bboebel@webmail.us>, method=plain, rip=204.119.252.7, lip=192.168.1.68
Feb 9 15:09:50 director4 dovecot: imap-login: io_loop_handle_add: epoll_ctl(op=3, fd=10, ret=-1, errno=2): No such file or directory
Feb 9 14:58:23 director4 dovecot: child 9688 (login) returned error 89
I can't find anything in the mailing list archives about this error, so here are some details. I hope somebody can point me in the right direction to fix this.
- I am running this on Red Hat ES4 (32-bit x86), compiling with the following options:
./configure
--prefix=/usr
--sysconfdir=/etc
--localstatedir=/var
--with-ssldir=/usr/share/ssl
--disable-ipv6
--with-file-offset-size=32
--with-mem-align=4
--with-ioloop=epoll
--without-passwd
--without-passwd-file
--without-shadow
--without-pam
--without-checkpassword
--without-bsdauth
--without-gssapi
--with-ldap
--without-vpopmail
--with-static-userdb
--with-prefetch-userdb
--without-pgsql
--with-mysql
--without-sqlite
--without-ssl
--with-storages=maildir
- I have also compiled with these options and got the same error:
./configure
--prefix=/usr
--sysconfdir=/etc
--localstatedir=/var
--with-ioloop=epoll
--with-mysql
I have also tried the latest CVS snapshot (dovecot-20060209.tar.gz) and 1.0 beta2, and I get the same error.
If I switch to poll instead of epoll, everything works fine.
Any ideas?
Bill
Bill Boebel wrote:
Any ideas?
Bill
Could it be that, similarly to kqueue on *BSD, the Linux kernel unregisters/removes the handle automatically when it gets closed? And that dovecot closes it before unregistering it from epoll?
Vaclav Haisman
imap-login: io_loop_handle_add: epoll_ctl(op=3, fd=10, ret=-1, errno=2): No such file or directory
Could it be that similarly to kqueue on *BSD the Linux kernel unregisters/removes the handle automatically when it gets closed? And that dovecot closes it before unregistering it from epoll?
I don't know enough about epoll to answer that. I do see that in src/lib/ioloop-epoll.c, the "io_loop_handler_init()" function calls this, which sounds related:
fd_close_on_exec(ctx->epfd, TRUE);
But again, I am not very familiar with the specifics of epoll. Maybe somebody else on this list knows?
Thanks, Bill
Bill Boebel wrote:
But again, I am not very familiar with the specifics of epoll. Maybe somebody else on this list knows?
Thanks, Bill
That fd_close_on_exec() call means the fd will be closed on exec(), but I mean the case where the fd is closed by hand without exec().
And to answer my own question (question/answer 6): http://www.die.net/doc/linux/man/man4/epoll.4.html
I think it might be the cause of the error message.
VH
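The behaviour described in that man page answer can be checked with a small sketch like the one below. It registers a pipe end with epoll, closes it without EPOLL_CTL_DEL, lets a new pipe reuse the descriptor number, and then attempts EPOLL_CTL_MOD on that number. The function name mod_after_close_errno() is hypothetical, and this only illustrates the kernel behaviour; it does not claim to reproduce what dovecot itself does.

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Returns the errno of EPOLL_CTL_MOD on a descriptor number whose original
 * file was closed after registration, 0 if MOD unexpectedly succeeds,
 * or -1 if the setup failed. */
static int mod_after_close_errno(void)
{
    int first[2], second[2];
    int stale_fd, result = -1;
    struct epoll_event ev = { .events = EPOLLIN };
    int epfd = epoll_create(8);

    if (epfd < 0)
        return -1;
    if (pipe(first) < 0) {
        close(epfd);
        return -1;
    }

    stale_fd = first[0];
    ev.data.fd = stale_fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, stale_fd, &ev) == 0) {
        /* Close without EPOLL_CTL_DEL: this was the only descriptor for
         * that pipe end, so the kernel drops it from the epoll set. */
        close(stale_fd);
        first[0] = -1;

        /* The lowest free number is handed out next, so the new read end
         * normally gets the number that was just closed. */
        if (pipe(second) == 0) {
            if (second[0] == stale_fd) {
                if (epoll_ctl(epfd, EPOLL_CTL_MOD, stale_fd, &ev) < 0)
                    result = errno;
                else
                    result = 0;
            }
            close(second[0]);
            close(second[1]);
        }
    }

    if (first[0] >= 0)
        close(first[0]);
    close(first[1]);
    close(epfd);
    return result;
}
```

On Linux the call should return ENOENT, the same errno=2 that appears in the log line, because the new descriptor with the reused number was never added to the epoll set.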
Interesting. The error is occurring during an EPOLL_CTL_MOD operation within the io_loop_handle_add() function. So I am not positive that what you point out is in fact the problem. If it was occurring within the io_loop_handle_remove() function then what you point out might be the case, but that's not where it's happening. Unless maybe it is referencing the wrong file descriptor in its io_list because of a previous file descriptor getting closed unexpectedly. Hmm...
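As a quick aid for reading the log line, the sketch below maps the numeric op value back to its symbolic name and shows that EPOLL_CTL_MOD on a descriptor that is not in the epoll set fails with ENOENT. The helper names epoll_op_name() and mod_unregistered_errno() are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: translate the numeric "op" from the log line. */
static const char *epoll_op_name(int op)
{
    switch (op) {
    case EPOLL_CTL_ADD: return "EPOLL_CTL_ADD";
    case EPOLL_CTL_DEL: return "EPOLL_CTL_DEL";
    case EPOLL_CTL_MOD: return "EPOLL_CTL_MOD";
    default:            return "unknown";
    }
}

/* Returns the errno of EPOLL_CTL_MOD on an open descriptor that was never
 * added to the epoll set, 0 if MOD unexpectedly succeeds, -1 on setup
 * failure. */
static int mod_unregistered_errno(void)
{
    int fds[2], result;
    struct epoll_event ev = { .events = EPOLLIN };
    int epfd = epoll_create(8);

    if (epfd < 0)
        return -1;
    if (pipe(fds) < 0) {
        close(epfd);
        return -1;
    }

    ev.data.fd = fds[0];
    /* No EPOLL_CTL_ADD was issued for fds[0], so MOD has nothing to modify. */
    result = epoll_ctl(epfd, EPOLL_CTL_MOD, fds[0], &ev) < 0 ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    close(epfd);
    return result;
}
```

With the Linux headers, op=3 decodes to EPOLL_CTL_MOD and errno=2 is ENOENT, which fits a MOD request for a descriptor that epoll does not currently track.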
I am surprised nobody else has mentioned this error on the list before. It seems like it would be rather common.
Bill
It has been mentioned I think:
http://dovecot.org/list/dovecot/2005-October/thread.html#9626 with the thread "[Dovecot] errors after 1.0a3 -> 1.0a4"
It was attributed to SSL at the time, although you've suggested that it may not in fact have anything to do with SSL at all.
[I'm seeing it after 5 or so mins using Thunderbird, but I'm just building without epoll for now]
reuben
Bill Boebel wrote:
- If I switch to poll instead of epoll, everything works fine. Any ideas?
epoll is broken since 1alpha4. By chance, I looked into the code tonight and found that io_loop_handle_add calls epoll_ctl sometimes with MOD on fds that were just DELeted before. This is caused by iolist_del, which always returns TRUE. I don't fully understand the code (and it looks a little overcomplicated to me, considering the epoll code I wrote myself), but the attached mini patch makes epoll work again in my setup. Don't know if this is a complete fix, though.
Timo, could you look into this and confirm/negate, please?
--- dovecot/src/lib/ioloop-epoll.c
+++ dovecot.epoll/src/lib/ioloop-epoll.c
@@ -139,7 +139,7 @@
       if (list->ios[i] == io)
         list->ios[i] = NULL;
       else
-        last = TRUE;
+        last = FALSE;
     }
   }
   return last;
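To make the effect of the one-line change easier to follow, here is a simplified, self-contained model of the bookkeeping. It is a sketch under stated assumptions, not the real Dovecot code: the slot count IOS_PER_FD, the struct layouts, the surrounding loop and the op_after_remove() helper are all guesses made for illustration, and only the if/else body mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IOS_PER_FD 3 /* assumption: a few io handlers can share one fd */

struct io { int condition; };                   /* hypothetical stand-in */
struct io_list { struct io *ios[IOS_PER_FD]; }; /* hypothetical stand-in */

/* Remove io from the list and return true only if no other io is left. */
static bool iolist_del(struct io_list *list, struct io *io)
{
    bool last = true;

    for (size_t i = 0; i < IOS_PER_FD; i++) {
        if (list->ios[i] != NULL) {
            if (list->ios[i] == io)
                list->ios[i] = NULL;
            else
                last = false; /* the patched line: another io remains */
        }
    }
    return last;
}

enum ctl_op { CTL_DEL, CTL_MOD };

/* Assumed caller logic: drop the fd from epoll only when the last io is
 * removed, otherwise just modify the registered event mask. */
static enum ctl_op op_after_remove(struct io_list *list, struct io *io)
{
    return iolist_del(list, io) ? CTL_DEL : CTL_MOD;
}
```

In this model, if iolist_del() always returned true, the fd would be removed from epoll while another io was still listed for it, and a later add for the same fd would choose MOD on a descriptor that epoll no longer knows about, which matches the ENOENT seen in the logs.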
On Thursday, February 9, 2006 20:39, Jakob Hirsch said:
By chance, I looked into the code tonight and found that io_loop_handle_add calls epoll_ctl sometimes with MOD on fds that were just DELeted before. This is caused by iolist_del, which always returns TRUE. I don't fully understand the code (and it looks a little overcomplicated to me, considering the epoll code I wrote myself), but the attached mini patch makes epoll work again in my setup. Don't know if this is a complete fix, though.
Timo, could you look into this and confirm/negate, please?
This patch worked in our environment too - thanks Jakob. The "io_loop_handle_add: epoll_ctl" errors went away and the IMAP proxy works as expected now.
I don't fully understand what this patch does to the epoll data structures though... Can somebody confirm that this change is safe and that it's not going to create any leaks or other weirdness?
Bill
On 10.2.2006 03:39, "Jakob Hirsch" <jh@plonk.de> wrote:
Timo, could you look into this and confirm/negate, please?
Looks fine, committed to CVS.
participants (5)
- Bill Boebel
- Jakob Hirsch
- Reuben Farrelly
- Timo Sirainen
- Václav Haisman