Auth process sometimes stops responding after upgrade

Timo Sirainen tss at iki.fi
Fri Sep 7 20:11:24 EEST 2018


On 7 Sep 2018, at 19.43, Timo Sirainen <tss at iki.fi> wrote:
> 
> On 7 Sep 2018, at 16.50, Simone Lazzaris <s.lazzaris at interactive.eu> wrote:
>> 
>> Some more information: the issue has just occurred, again on an instance without the "service_count = 0" configuration directive on pop3-login.
>>  
>> I've observed that while the issue is occurring, the director process goes 100% CPU. I've straced the process. It is seemingly looping:
>>  
>> ...
>> ...
>> epoll_ctl(13, EPOLL_CTL_ADD, 78, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=149035320, u64=149035320}}) = 0
>> epoll_ctl(13, EPOLL_CTL_DEL, 78, {0, {u32=149035320, u64=149035320}}) = 0
>> epoll_ctl(13, EPOLL_CTL_ADD, 78, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=149035320, u64=149035320}}) = 0
>> epoll_ctl(13, EPOLL_CTL_DEL, 78, {0, {u32=149035320, u64=149035320}}) = 0
>> epoll_ctl(13, EPOLL_CTL_ADD, 78, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=149035320, u64=149035320}}) = 0
>> epoll_ctl(13, EPOLL_CTL_DEL, 78, {0, {u32=149035320, u64=149035320}}) = 0
> 
> Nothing else but these epoll_ctl() calls? So it has entered some loop where it keeps calling io_add() and io_remove().
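
One quick way to check this from a saved strace log is to count the calls by name. This is only a minimal sketch, assuming the log lines look like the excerpt above (syscall name at the start of each line, so no -f or -tt prefixes). The names syscall_histogram and strace.log are just illustrative:

```shell
# Hypothetical helper: count syscalls by name in strace output read from stdin.
# Lines that do not start with "name(" (such as "...") are ignored.
syscall_histogram() {
    sed -n 's/^\([a-z_0-9]*\)(.*/\1/p' | sort | uniq -c | sort -rn
}

# Example: syscall_histogram < strace.log
```

If epoll_ctl is the only name listed, the process is doing nothing but adding and removing the fd.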

I'm guessing it's because of doveadm command handling issues, since there's some weirdness in the code, although I couldn't figure out exactly why it would go into an infinite loop there. I've attached a patch that may fix it, if you're able to test. We haven't noticed such infinite looping in other installations or in automated director stress tests, though.


>> FD 13 is "anon_inode:[eventpoll]"
> 
> What about fd 78? I guess some socket.
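
On Linux the fd can usually be resolved through /proc. A small sketch, where fd_target is a hypothetical helper name and the first argument is the pid of the director process. It needs to run as root or as the user that owns the process:

```shell
# Hypothetical helper: print what file descriptor $2 of process $1 points to.
# Relies on the Linux /proc filesystem.
fd_target() {
    readlink "/proc/$1/fd/$2"
}

# Example: fd_target "$director_pid" 78
```

A socket shows up as socket:[inode], in the same way that fd 13 shows up as anon_inode:[eventpoll].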
> 
> Could you also try two more things when it happens again:
> 
> ltrace -tt -e '*' -o ltrace.log -p <pid>
> (My guess is that this isn't going to be very useful, but just in case it might be.)
> 
> gdb -p <pid>
> bt full
> quit
> 
> Preferably install dovecot-dbg package also so the gdb backtrace output will be better.

These would still be useful to verify whether I'm even on the right track.
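
If attaching interactively is inconvenient while the process is spinning, the same gdb backtrace can be captured in one shot. This is only a sketch: dump_backtrace and the default log file name are made up for illustration, and it assumes gdb is installed and you have permission to attach to the process:

```shell
# Hypothetical helper: attach gdb to process $1, run "bt full", write the
# output to $2 (default: gdb-bt.<pid>.log), then let gdb exit.
dump_backtrace() {
    pid=$1
    out=${2:-gdb-bt.$pid.log}
    gdb -p "$pid" -batch -ex 'bt full' > "$out" 2>&1
}

# Example: dump_backtrace "$director_pid"
```

In batch mode gdb exits after running the commands, so the process is stopped only for as long as the backtrace takes.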

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2417.patch
Type: application/octet-stream
Size: 1481 bytes
Desc: not available
URL: <https://dovecot.org/pipermail/dovecot/attachments/20180907/14a8118c/attachment.obj>