Hi,

We've experienced this too, happens once in a few months
i'm reply-ing now because it happend this morning on one of our multiple nodes, other nodes were fine.

debian 9.4
dovecot-core                         1:2.2.34-2~bpo9+1

Feb  1 06:24:58 machine61 dovecot: auth: Error: plain(EDITED@EDITED.nl,10.0.10.182,<zHU8VM6ATrIKAAq2>): Request 890.37948303 timed out after 150 secs, state=1

Restart of the daemon fixed it.
unable to reproduce.

I'm willing to enable specific debug options to help debug this.
if you could point me in the correct settings for the dovecot: auth: daemon.


On Wed, May 30, 2018 at 3:32 PM Hajo Locke <Hajo.Locke@gmx.de> wrote:
Hello,


Am 29.05.2018 um 11:30 schrieb Gerald Galster:
>
>> Am 29.05.2018 um 11:00 schrieb Aki Tuomi <aki.tuomi@dovecot.fi>:
>>
>>
>>
>> On 29.05.2018 11:35, Hajo Locke wrote:
>>> Hello,
>>>
>>>
>>> Am 29.05.2018 um 09:22 schrieb Aki Tuomi:
>>>> On 29.05.2018 09:54, Hajo Locke wrote:
>>>>> Hello List,
>>>>>
>>>>> i use dovecot 2.2.22 and have the same problem described here:
>>>>> https://dovecot.org/pipermail/dovecot/2017-November/110020.html
>>>>>
>>>>> I can confirm that sometimes there is a problem with connection to
>>>>> mysql-db, but sometimes not.
>>>>> Reasons for failing are still under investigation by my mates.
>>>>>
>>>>> My current main problem is, that this fail seems to be a one way
>>>>> ticket for dovecot. Even if mysql is verifyable working again and
>>>>> waiting for connection dovecot stucks furthermore with errors like
>>>>> this:
>>>>>
>>>>> May 29 07:00:49 hostname dovecot: auth: Error:
>>>>> plain(mail@example.com,xxx.xxx.xx.xxx,<HGI4FVFt2Oy80j3P>): Request
>>>>> 999.7 timed out after 150 secs, state=1
>>>>>
>>>>> When restarting dovecot all is immediately working again.
>>>>> Is there a way to tell dovecot to restart auth services or
>>>>> reinitialize mysql-connection after these hard fails? I could insert
>>>>> "idle_kill = 1 mins" into service auth and service auth-worker, but i
>>>>> dont know if this would work. Unfortunately i am not able to reproduce
>>>>> this error and there are always a couple of days between errors.
>>>>>
>>>>> Thanks,
>>>>> Hajo
>>>>>
>>>>>
>>>> Hi!
>>>>
>>>> I was not able to repeat this problem using 2.2.36. Can you provide
>>>> steps to reproduce?
>>>>
>>>> May 29 10:20:24 auth: Debug: client in: AUTH    1    PLAIN
>>>> service=imap    secured    session=XtpgEVNtQeUAAAAAAAAAAAAAAAAAAAAB
>>>> lip=::1    rip=::1    lport=143    rport=58689    resp=<hidden>
>>>> May 29 10:20:24 auth-worker(31098): Debug:
>>>> sql(test@domain.org,::1,<XtpgEVNtQeUAAAAAAAAAAAAAAAAAAAAB>): query:
>>>> SELECT userid AS username, domain, password FROM users WHERE userid =
>>>> 'test' AND domain = 'domain.org'
>>>> May 29 10:20:54 auth-worker(31098): Warning: mysql: Query failed,
>>>> retrying: Lost connection to MySQL server during query (idled for 28
>>>> secs)
>>>> May 29 10:20:59 auth-worker(31098): Error: mysql(127.0.0.1): Connect
>>>> failed to database (dovecot): Can't connect to MySQL server on
>>>> '127.0.0.1' (4) - waiting for 5 seconds before retry
>>>> May 29 10:21:04 auth-worker(31098): Error: mysql(127.0.0.1): Connect
>>>> failed to database (dovecot): Can't connect to MySQL server on
>>>> '127.0.0.1' (4) - waiting for 5 seconds before retry
>>>> May 29 10:21:14 auth: Debug: auth client connected (pid=31134)
>>>> May 29 10:21:14 imap-login: Warning: Growing pool 'imap login commands'
>>>> with: 1024
>>>> May 29 10:21:14 auth-worker(31098): Error: mysql(127.0.0.1): Connect
>>>> failed to database (dovecot): Can't connect to MySQL server on
>>>> '127.0.0.1' (4) - waiting for 25 seconds before retry
>>>>
>>>> This is what it looks like for me and after restoring connectivity, it
>>>> started working normally.
>>> Unfortunately i can not reproduce. Servers running well for days or
>>> sometimes weeks and then it happens one time. I can provide some more
>>> logs.
>>>
>>> This is an error with mysql involvement:
>>>
>>> May 29 06:56:59 hostname dovecot: auth-worker(1099): Error:
>>> mysql(xx.xx.xx.xx): Connect failed to database (mysql): Can't connect
>>> to MySQL server on 'xx.xx.xx.xx' (111) - waiting for 1 seconds before
>>> retry
>>> .
>>> . some more of above line
>>> .
>>> May 29 06:56:59 hostname dovecot: auth-worker(1110): Error:
>>> sql(mail@example.com,xx.xx.xx.xx): Password query failed: Not
>>> connected to database
>>> May 29 06:56:59 hostname dovecot: auth: Error: auth worker: Aborted
>>> PASSV request for mail@example.com: Internal auth worker failure
>>> May 29 06:57:59 hostname dovecot: auth-worker(1099): Error: mysql:
>>> Query timed out (no free connections for 60 secs): SELECT `inbox` as
>>> `user`, `password` FROM `mail_users` WHERE `login` = 'username' AND
>>> `active`='Y'
>>> May 29 06:59:30 hostname dovecot: auth: Error:
>>> plain(username,xx.xx.xx.xx,<jc9yEFFtXcJXmx0F>): Request 999.2 timed
>>> out after 151 secs, state=1
>>> .
>>> . much more of these lines with Request timed out
>>> .
>>>
>>> At this point my mates restartet dovecot and all worked well
>>> immediately. Mysql performed a short restart at 6:56 and dovecot was
>>> not able to reconnect for about 10 mins until my mates did the
>>> restart. I could not reproduce the problem by manually restarting of
>>> mysql, this worked well.
>>>
>>> This is an error without visible mysql involvement:
>>> .
>>> . lines of normal imap/pop activity
>>> .
>>> May 29 05:43:03 hostname dovecot: imap-login: Error: master(imap):
>>> Auth request timed out (received 0/12 bytes)
>>> May 29 05:43:03 hostname dovecot: imap-login: Internal login failure
>>> (pid=1014 id=16814) (internal failure, 1 successful auths):
>>> user=<username>, method=PLAIN, rip=xx.xx.xx.xx, lip=xx.xx.xx.xx, TLS
>>> May 29 05:43:03 hostname dovecot: imap: Error: Login client
>>> disconnected too early
>>> May 29 05:44:03 hostname dovecot: auth: Error:
>>> plain(username,xx.xx.xx.xx,<Wry1AlBt7couU7FV>): Request 1014.16815
>>> timed out after 150 secs, state=1
>>> May 29 05:44:08 hostname dovecot: imap: Error: Auth server request
>>> timed out after 155 secs (client-pid=1014 client-id=16814)
>>> May 29 05:44:33 hostname dovecot: imap-login: Disconnected: Inactivity
>>> during authentication (disconnected while authenticating, waited 180
>>> secs): user=<>, method=PLAIN, rip=xx.xx.xx.xx, lip=xx.xx.xx.xx, TLS:
>>> Disconnected
>>> May 29 05:46:07 hostname dovecot: auth: Error:
>>> plain(mail@example.com,xx.xx.xx.xx,<j2cWClBtVf3Z+t+p>): Request
>>> 1014.16819 timed out after 150 secs, state=1
>>> .
>>> . and so on until restart
>>> .
>>>
>>> There is no errorlog in relation to mysql written by dovecot.
>>>
>>> We use 2.2.22 because this version is provided with Ubuntu 16.04 and
>>> Ubuntu Team is maintaining software and providing security patches,
>>> which happens 2017 and 2018 a couple of times. By manually compiling
>>> and using newer versions you have to check updates by yourself and you
>>> are forced to run every single update. Our goal is reliability which
>>> contradicts to use of bleeding edge software.
>>>
>>>
>>> This is output of current doveconf -n
>>>
>>> # 2.2.22 (fe789d2): /etc/dovecot/dovecot.conf
>>> # Pigeonhole version 0.4.13 (7b14904)
>>> # OS: Linux 4.4.0-127-generic x86_64 Ubuntu 16.04.4 LTS
>>> auth_mechanisms = plain login
>>> auth_username_chars =
>>> abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890.-_@üÜöÖäÄ
>>> auth_worker_max_count = 60
>>> disable_plaintext_auth = no
>>> imap_capability = +XLIST
>>> lock_method = flock
>>> login_greeting = mailserver ready.
>>> login_log_format_elements = user=<%u> method=%m rip=%r lip=%l %c
>>> mail_cache_min_mail_count = 30
>>> mail_location = mbox:~/mail:INBOX=/var/mail/%u
>>> mail_max_userip_connections = 30
>>> mbox_read_locks = dotlock
>>> mbox_very_dirty_syncs = yes
>>> namespace inbox {
>>>    inbox = yes
>>>    location =
>>>    mailbox Archiv {
>>>      auto = no
>>>      special_use =
>>>    }
>>>    mailbox Drafts {
>>>      auto = no
>>>      special_use =
>>>    }
>>>    mailbox Junk {
>>>      auto = no
>>>      special_use =
>>>    }
>>>    mailbox Sent {
>>>      auto = no
>>>      special_use =
>>>    }
>>>    mailbox Trash {
>>>      auto = no
>>>      special_use =
>>>    }
>>>    prefix =
>>> }
>>> passdb {
>>>    args = /etc/dovecot/dovecot-sql.conf
>>>    driver = sql
>>> }
>>> protocols = imap pop3
>>> service auth {
>>>    unix_listener /var/spool/postfix/private/auth {
>>>      group = postfix
>>>      mode = 0660
>>>      user = postfix
>>>    }
>>>    user = root
>>> }
>>> service imap-login {
>>>    executable = /usr/lib/dovecot/imap-login
>>>    process_min_avail = 1
>>>    service_count = 0
>>>    user = dovecot
>>> }
>>> service imap {
>>>    process_limit = 1024
>>> }
>>> service pop3-login {
>>>    executable = /usr/lib/dovecot/pop3-login
>>>    process_min_avail = 1
>>>    service_count = 0
>>>    user = dovecot
>>> }
>>> service pop3 {
>>>    executable = /usr/lib/dovecot/pop3
>>>    process_limit = 1024
>>> }
>>> shutdown_clients = no
>>> ssl_ca = </path/path/cert.pem
>>> ssl_cert = </path/path/cert.pem
>>> ssl_cipher_list =
>>> ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:!DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA
>>> ssl_dh_parameters_length = 2048
>>> ssl_key = </path/path/cert.pem
>>> ssl_prefer_server_ciphers = yes
>>> ssl_protocols = !SSLv2 !SSLv3
>>> userdb {
>>>    args = /etc/dovecot/dovecot-sql.conf
>>>    driver = sql
>>> }
>>> verbose_proctitle = yes
>>> protocol imap {
>>>    imap_client_workarounds = delay-newmail
>>>    mail_plugins = mail_log notify
>>>    ssl = yes
>>> }
>>> protocol pop3 {
>>>    pop3_client_workarounds = oe-ns-eoh outlook-no-nuls
>>>    pop3_no_flag_updates = no
>>>    pop3_uidl_format = %08Xu%08Xv
>>>    ssl = yes
>>> }
>>>
>>> I hope you have some hints. I dislike to monitor this with fail2ban
>>> and do hard restarts.
>>>> Aki
>>>>
>>> Thanks,
>>> Hajo
>> Can you try doing successful auth MySQL, then, for short while, blocking
>> access to MySQL server using iptables (no idea how to do this with
>> socket based), and try authenticating while access is blocked. It
>> should, hopefully, reproduce this error.
> You could try to remove the socket file when mysql is running and then using socat:
>
> socat UNIX-LISTEN:/var/lib/mysql/mysql.sock,fork,reuseaddr,unlink-early,mode=777 TCP:127.0.0.1:3306
>
> Best regards
> Gerald
>
>
Thanks for your help. Unfortunately iam not able to reproduce this
manually. Bei blocking port with iptables or manipulating the socket
dovecot always log correctly the disconnection.
After unblock port or reestablishing the socket dovecot reconnects
succsessful and auth works immediately.
I think my report must have other nature of invalid connection/bug.

A few minutes ago i noticed one server having this problem.  dovecot was
logging just lines about timed out connections but in parallel i was
able to connect to local mysql and run userdb/passdb queries.

It starts like this:

May 30 14:51:44 hostname dovecot: imap-login: Error: master(imap): Auth
request timed out (received 0/12 bytes)
May 30 14:51:44 hostname dovecot: imap-login: Internal login failure
(pid=29595 id=1556) (internal failure, 1 successful auths):
user=<username>, method=PLAIN, rip=xx.xx.xx.xx, lip=xx.xx.xx.xx, TLS
May 30 14:51:44 hostname dovecot: imap: Error: Login client disconnected
too early

After that only lines with error, timeouts or other fails:

May 30 14:52:44 hostname dovecot: auth: Error:
plain(username,xx.xx.xx.xx,<3ObCymttnsZNu+bd>): Request 29595.1557 timed
out after 150 secs, state=1
May 30 14:52:49 hostname dovecot: imap: Error: Auth server request timed
out after 155 secs (client-pid=29595 client-id=1556)
May 30 14:52:59 hostname dovecot: auth: Error:
plain(mail@example.com,xx.xx.xx.xx,<dqOoy2ttm/pe2WEH>): Request
29595.1558 timed out after 150 secs, state=1
May 30 14:53:13 hostname dovecot: imap-login: Disconnected: Inactivity
during authentication (disconnected while authenticating, waited 179
secs): user=<>, method=PLAIN, rip=xx.xx.xx.xx, lip=xx.xx.xx.xx, TLS:
Disconnected

Unfortunately dovecot was not able to reestablish connection to local
mysql.
ps axu showed stack of jamed passdb queries:

root      1055  0.0  0.0  40984  6120 ?        S    Mai29   0:09
dovecot/auth [0 wait, 84 passdb, 1 userdb]
root      9057  0.0  0.0  40084  5144 ?        S    15:03   0:00
dovecot/auth [0 wait, 0 passdb, 0 userdb]
root     11823  0.0  0.0  40084  5036 ?        S    15:04   0:00
dovecot/auth [0 wait, 0 passdb, 0 userdb]
root     12975  0.0  0.0  40084  5080 ?        S    15:05   0:00
dovecot/auth worker: idling


By killing PID 1055 dovecot was spawning new auth process and normal
processing startet again. Until restart no Login was possible for any user.

Other people reported limitproblems with systemd, but affected servers
have no higher load than usual. Hmm, iam running out of ideas now.

Thanks,
Hajo