On 1/20/2012 1:06 PM, Timo Sirainen wrote:
> Hmh. Still doesn't work 100%:
>
> auth-worker(28788): Error: mysql: Query failed, retrying: MySQL server has gone away (idled for 181 secs)
> auth-worker(7413): Error: mysql: Query failed, retrying: MySQL server has gone away (idled for 298 secs)
>
> I'm not really sure why it's not killing itself after 60 seconds of idling. Probably related to how mysql code tracks idle time and how idle_kill tracks it.. Anyway, those errors are much more rare now.
The MySQL server measures idle time from its last network communication with the client. So presumably, if anything that doesn't involve talking to the MySQL server marks the auth worker as not idle, the two idle timers could get out of sync.
Before you posted a potential fix to the idle timeout, I was looking at other possible ways to resolve the issue. Currently, an authentication request is tried exactly twice -- one initial try, and one retry.
Looking at driver-sqlpool.c:
if (result->failed_try_retry && !request->retried) {
Currently, retried is a boolean. What if retried were an integer instead, and a new configuration variable let you specify how many times an authentication request should be tried in total? The default could be 2 (one initial try plus one retry), which would give exactly the current behavior. But then you could set it to 3 or 4, so that a request that hits a timed-out connection twice doesn't fail completely.
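As a rough sketch of the counter idea (every name below is made up for illustration; these are not the actual structures or settings in driver-sqlpool.c), the check could look something like this:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the request; "tries" replaces the
   boolean "retried" and counts the attempts that have failed so far. */
struct sketch_request {
	unsigned int tries;
};

/* Hypothetical setting: total attempts allowed per request.
   2 means one initial try plus one retry, i.e. today's behavior. */
struct sketch_settings {
	unsigned int max_tries;
};

/* Called after an attempt has failed. Returns true if the request
   should be tried again, false if it should fail for good. */
static inline bool
sketch_should_retry(struct sketch_request *request,
		    const struct sketch_settings *set,
		    bool failed_try_retry)
{
	if (!failed_try_retry)
		return false;	/* not a retryable failure */
	request->tries++;
	return request->tries < set->max_tries;
}
```

With max_tries set to 2 this allows exactly one retry, same as the boolean does now. With 3, a request can survive two stale connections in a row before it fails.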
Ideally, a better fix would be for the client not to consider a "MySQL server has gone away" return as a failure, but instead immediately reconnect and try again without marking it as a retry. However, from reviewing the code, that would be a much more difficult and invasive change. Changing the existing retried variable to an integer count rather than a boolean is pretty simple.
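To make the "reconnect without marking it as a retry" idea concrete, here is a minimal sketch of the decision logic only. It is not how the existing code is structured, and all names are hypothetical. The one real value is 2006, which is CR_SERVER_GONE_ERROR in the MySQL client library. The cap on free reconnects is my own assumption, added so that a server that really is down can't cause an endless loop:

```c
/* Value of CR_SERVER_GONE_ERROR ("MySQL server has gone away")
   in the MySQL client library's errmsg.h. */
#define SKETCH_CR_SERVER_GONE_ERROR 2006

/* Hypothetical per-request bookkeeping. */
struct sketch_attempt_state {
	unsigned int tries;	 /* failures charged against the retry limit */
	unsigned int reconnects; /* "free" reconnects already used */
};

enum sketch_action {
	SKETCH_GIVE_UP,
	SKETCH_RETRY,
	SKETCH_RECONNECT_AND_RETRY
};

/* Decide what to do after a failed query. A gone-away error triggers
   a reconnect that does not count as a retry, up to max_reconnects
   (an assumed safety cap). Anything else counts against max_tries. */
static inline enum sketch_action
sketch_on_failure(struct sketch_attempt_state *st, unsigned int mysql_err,
		  unsigned int max_tries, unsigned int max_reconnects)
{
	if (mysql_err == SKETCH_CR_SERVER_GONE_ERROR &&
	    st->reconnects < max_reconnects) {
		st->reconnects++;
		return SKETCH_RECONNECT_AND_RETRY;
	}
	st->tries++;
	return st->tries < max_tries ? SKETCH_RETRY : SKETCH_GIVE_UP;
}
```

The decision itself is small. The invasive part is everything around it, since the layer that sees the MySQL error code isn't the layer that owns the retry state.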
-- 
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | henson@csupomona.edu
California State Polytechnic University | Pomona CA 91768