[Dovecot] MySQL server has gone away

Mark Moseley moseleymark at gmail.com
Fri Jan 13 22:45:03 EET 2012


On Fri, Jan 13, 2012 at 11:38 AM, Robert Schetterer
<robert at schetterer.org> wrote:
> On 13.01.2012 19:29, Mark Moseley wrote:
>> On Fri, Jan 13, 2012 at 1:36 AM, Timo Sirainen <tss at iki.fi> wrote:
>>> On 13.1.2012, at 4.00, Mark Moseley wrote:
>>>
>>>> I'm running 2.0.17 and I'm still seeing a decent amount of "MySQL
>>>> server has gone away" errors, despite having multiple hosts defined in
>>>> my auth userdb 'connect'. This is Debian Lenny 32-bit and I'm seeing
>>>> the same thing with 2.0.16 on Debian Squeeze 64-bit.
>>>>
>>>> E.g.:
>>>>
>>>> Jan 12 20:30:33 auth-worker: Error: mysql: Query failed, retrying:
>>>> MySQL server has gone away
>>>>
>>>> Our mail mysql servers are busy enough that wait_timeout is set to a
>>>> whopping 30 seconds. On my regular boxes, I see a good deal of these
>>>> in the logs. I've been doing a lot of mucking with doveadm/dsync
>>>> (working on maildir->mdbox migration finally, yay!) on test boxes
>>>> (same dovecot package & version) and when I get this error, despite
>>>> the log saying it's retrying, it doesn't seem to be. Instead I get:
>>>>
>>>> dsync(root): Error: user ...: Auth USER lookup failed
>>>
>>> Try with only one host in the "connect" string? My guess: Both the connections have timed out, and the retrying fails as well (there is only one retry). Although if the retrying lookup fails, there should be an error logged about it also (you don't see one?)
>>>
>>> Also another idea to avoid them in the first place:
>>>
>>> service auth-worker {
>>>  idle_kill = 20
>>> }
>>>
>>
>> With just one 'connect' host, it seems to reconnect just fine (using
>> the same tests as above) and I'm not seeing the same error. It worked
>> every time that I tried, with no complaints of "MySQL server has gone
>> away".
>>
>> If there are multiple hosts, it seems like the most robust thing to do
>> would be to exhaust the existing connections and if none of those
>> succeed, then start a new connection to one of them. It will probably
>> result in much more convoluted logic but it'd probably match better
>> what people expect from a retry.
>>
>> Alternatively, since in all my tests the mysql server had already closed
>> the connection before the query, is the auth worker failing to recognize
>> that its connection is half-closed? If so, it probably shouldn't treat it
>> as a legitimate connection at all and should just reconnect automatically
>> as part of try #1, leaving the retry for a subsequent failure.
>>
>> I'll give the idle_kill a try too. I kind of like the idea of
>> idle_kill for auth processes anyway, just to free up some connections
>> on the mysql server.
>
> By the way, if you use SQL for auth, have you tried auth caching?
>
> http://wiki.dovecot.org/Authentication/Caching
>
> i.e.
>
> # Authentication cache size (e.g. 10M). 0 means it's disabled. Note that
> # bsdauth, PAM and vpopmail require cache_key to be set for caching to
> # be used.
>
> auth_cache_size = 10M
>
> # Time to live for cached data. After TTL expires the cached record is no
> # longer used, *except* if the main database lookup returns internal
> # failure.
> # We also try to handle password changes automatically: If user's previous
> # authentication was successful, but this one wasn't, the cache isn't used.
> # For now this works only with plaintext authentication.
>
> auth_cache_ttl = 1 hour
>
> # TTL for negative hits (user not found, password mismatch).
> # 0 disables caching them completely.
>
> auth_cache_negative_ttl = 0


Yup, we have caching turned on for our production boxes. On this
particular box, I'd just shut off caching so that I could work on a
script for converting from maildir->mdbox and run it repeatedly on the
same mailbox. I got tired of restarting dovecot between each test :)
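
A rough sketch of the retry order suggested in the quoted text above
(exhaust the existing connections first, then open a fresh one), in case
it helps make the idea concrete. This is Python rather than anything from
Dovecot's source; FakeConn, GoneAway, lookup and the db1/db2 host names
are all made up for illustration, and it makes no claim about how the
auth worker's mysql driver is actually structured.

```python
class GoneAway(Exception):
    """Stands in for the "MySQL server has gone away" error."""


class FakeConn:
    """Hypothetical pooled connection; alive=False mimics a server-side timeout."""

    def __init__(self, host, alive=True):
        self.host = host
        self.alive = alive

    def query(self, sql):
        if not self.alive:
            raise GoneAway(self.host)
        return f"{self.host}: ok"


def lookup(pool, hosts, connect, sql):
    """Try each pooled connection once, then fall back to fresh connections."""
    for conn in list(pool):  # iterate over a copy since stale entries are removed
        try:
            return conn.query(sql)
        except GoneAway:
            pool.remove(conn)  # the server already closed it, so drop it
    # Every existing connection was stale: open a new one, host by host.
    for host in hosts:
        conn = connect(host)
        try:
            result = conn.query(sql)
        except GoneAway:
            continue
        pool.append(conn)  # keep the working connection for later lookups
        return result
    raise GoneAway("no host reachable")


# Both pooled connections have timed out, as in the multi-host case above.
pool = [FakeConn("db1", alive=False), FakeConn("db2", alive=False)]
result = lookup(pool, ["db1", "db2"], FakeConn, "SELECT 1")
print(result)
```

With this ordering a lookup only fails when every pooled connection is
stale and no host accepts a new connection, which is closer to what I'd
expect from a retry than a single second attempt.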


