[Dovecot] Ring SYNC appears to have got lost, resending after upgrade
Hi @all,
On Saturday I upgraded two Dovecot servers from squeeze to wheezy and Dovecot from 2.1.x to 2.2.5 (compiled from source). After the upgrade everything worked fine at first.
On Sunday morning I noticed these errors on one server (they occurred after the midnight reload that is done for logging purposes):
director: Error: Ring SYNC appears to have got lost, resending
After reloading/restarting both Dovecot services the error occurred on both servers. After some research I deleted a "zlib" file which isn't needed anymore in Dovecot 2.2.x and reinstalled Dovecot. The error message disappeared.
Today the error occurred again (after the midnight reload), and again on one node only, until the other node was reloaded/restarted too. This time, however, there is an additional error message:
Sep 09 10:27:07 director: Error: Ring SYNC appears to have got lost, resending
Sep 09 10:27:10 director: Panic: file login-connection.c: line 88 (login_host_callback): assertion failed: (strncmp(request->line, "OK\t", 3) == 0)
Any ideas?
Patrick
node1:
# 2.2.5: /usr/local/etc/dovecot/dovecot.conf
# OS: Linux 3.2.0-4-amd64 x86_64 Debian 7.1
auth_mechanisms = plain login
director_mail_servers = 172.17.1.2 172.17.1.1
director_servers = 172.17.1.3 172.17.1.4
director_user_expire = 5 mins
lmtp_proxy = yes
log_path = /var/log/dovecot.log
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date ihave
protocols = imap pop3 lmtp sieve
service auth {
  unix_listener /var/spool/postfix/private/auth {
    group = postfix
    mode = 0666
    user = postfix
  }
  unix_listener auth-userdb {
    user = dovecot
  }
}
service director {
  fifo_listener login/proxy-notify {
    mode = 0666
  }
  inet_listener {
    address = 172.17.1.3
    port = 9090
  }
  unix_listener director-userdb {
    mode = 0600
  }
  unix_listener login/director {
    mode = 0666
  }
}
service imap-login {
  executable = imap-login director
}
service lmtp {
  inet_listener lmtp {
    address = 172.17.1.3
    port = 24
  }
}
service managesieve-login {
  executable = managesieve-login director
  inet_listener sieve {
    port = 4190
  }
}
service pop3-login {
  executable = pop3-login director
}
ssl_cert = </etc/ssl/certs/wildcard.xxx.crt
ssl_key = </etc/ssl/private/wildcard.xxx.key
protocol !smtp {
  passdb {
    args = proxy=y nopassword=y starttls=any-cert
    driver = static
  }
}
protocol smtp {
  passdb {
    args = /usr/local/etc/dovecot/dovecot-sql.conf.ext
    driver = sql
  }
  userdb {
    args = /usr/local/etc/dovecot/dovecot-sql.conf.ext
    driver = sql
  }
}
protocol lmtp {
  auth_socket_path = director-userdb
}
node2:
# 2.2.5: /usr/local/etc/dovecot/dovecot.conf
# OS: Linux 3.2.0-4-amd64 x86_64 Debian 7.1
auth_mechanisms = plain login
director_mail_servers = 172.17.1.2 172.17.1.1
director_servers = 172.17.1.4 172.17.1.3
director_user_expire = 5 mins
lmtp_proxy = yes
log_path = /var/log/dovecot.log
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date ihave
protocols = imap pop3 lmtp sieve
service auth {
  unix_listener /var/spool/postfix/private/auth {
    group = postfix
    mode = 0666
    user = postfix
  }
  unix_listener auth-userdb {
    user = dovecot
  }
}
service director {
  fifo_listener login/proxy-notify {
    mode = 0666
  }
  inet_listener {
    address = 172.17.1.4
    port = 9090
  }
  unix_listener director-userdb {
    mode = 0600
  }
  unix_listener login/director {
    mode = 0666
  }
}
service imap-login {
  executable = imap-login director
  process_min_avail = 2
  service_count = 0
  vsz_limit = 128 M
}
service lmtp {
  inet_listener lmtp {
    address = 172.17.1.4
    port = 24
  }
}
service managesieve-login {
  executable = managesieve-login director
  inet_listener sieve {
    port = 4190
  }
}
service pop3-login {
  executable = pop3-login director
}
ssl_cert = </etc/ssl/certs/wildcard.xxx.crt
ssl_key = </etc/ssl/private/wildcard.xxx.key
protocol !smtp {
  passdb {
    args = proxy=y nopassword=y starttls=any-cert
    driver = static
  }
}
protocol smtp {
  passdb {
    args = /usr/local/etc/dovecot/dovecot-sql.conf.ext
    driver = sql
  }
  userdb {
    args = /usr/local/etc/dovecot/dovecot-sql.conf.ext
    driver = sql
  }
}
protocol lmtp {
  auth_socket_path = director-userdb
}
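For anyone comparing nodes, it may help to count how often each director logs the lost-SYNC error and the panic, and when each first appeared after the midnight reload. The following is only a rough sketch in Python: it assumes the log line layout quoted in this message (timestamp, "director:", level, text), and the function name summarize_director_log is made up for illustration, not part of Dovecot.

```python
import re
from collections import Counter

# Matches lines shaped like the ones quoted in this thread, e.g.
#   Sep 09 10:27:07 director: Error: Ring SYNC appears to have got lost, resending
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) director: "
    r"(?P<level>Error|Panic): (?P<msg>.*)$"
)


def summarize_director_log(lines):
    """Count lost-SYNC errors and panics; remember the first timestamp of each."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a director Error/Panic line
        if m["level"] == "Error" and "Ring SYNC appears to have got lost" in m["msg"]:
            kind = "sync_lost"
        elif m["level"] == "Panic":
            kind = "panic"
        else:
            continue  # some other director error we are not counting
        counts[kind] += 1
        first_seen.setdefault(kind, m["ts"])
    return counts, first_seen
```

It could be fed the lines of /var/log/dovecot.log (the log_path from the configs above) on each node, and the results compared side by side.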
On Mon, Sep 09, 2013 at 11:13:36AM +0200, Patrick Westenberg wrote:
> Today the error occurred again (after the reload at midnight) and again on one node only until reloading/restarting the other node too. However, there is an additional error message:
> Sep 09 10:27:07 director: Error: Ring SYNC appears to have got lost, resending
> Sep 09 10:27:10 director: Panic: file login-connection.c: line 88 (login_host_callback): assertion failed: (strncmp(request->line, "OK\t", 3) == 0)
I had the same issue (CentOS 6.4 upgraded with third-party RPMs) on Thu/Fri, and I asked Timo about it on IRC. Apparently a 2.2.6 release is due soon. He gave me two hg links that are supposed to fix it:
http://hg.dovecot.org/dovecot-2.2/rev/f7a37b169f4a
http://hg.dovecot.org/dovecot-2.2/rev/9531ec8afe8b
However, the lost ring SYNC error did recur after the cluster was upgraded to the RPM packages currently in Dovecot's EE repo (non-free, pay for access), which do include these fixes.
Restarting all director instances worked for me. More precisely, I stopped them all, then started them all.
So far so good. We're going to go live with this cluster soon, I hope.
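The point of "stopped all, then started all" is the ordering: no director is started again until every one of them is down. A minimal sketch of that ordering follows, in Python. The host names and the default stop/start commands are placeholders I am assuming for illustration (the right commands depend on how Dovecot was installed), and the function only builds the plan; it does not run anything.

```python
def full_restart_plan(hosts,
                      stop_cmd="service dovecot stop",
                      start_cmd="service dovecot start"):
    """Return (host, command) steps: stop every director first, then start each.

    stop_cmd and start_cmd are assumed defaults, not taken from this thread;
    replace them with whatever your installation uses.
    """
    stops = [(host, stop_cmd) for host in hosts]
    starts = [(host, start_cmd) for host in hosts]
    # All stops come before any start, unlike a rolling restart,
    # which would stop and start one host at a time.
    return stops + starts


# Hypothetical host names for a four-director ring.
for host, command in full_restart_plan(["d1", "d2", "d3", "d4"]):
    print(f"{host}: {command}")
```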
--
http://rob0.nodns4.us/ -- system administration and consulting
Offlist GMX mail is seen only if "/dev/rob0" is in the Subject:
This issue still occurs. It varies which of the four director instances gets it, but it seems that once one of them does, the only fix is to restart all four.
On Mon, Sep 09, 2013 at 07:41:10AM -0500, /dev/rob0 wrote:
> Restart of all director instances worked for me. Actually I stopped all, then started all.
On 20.9.2013, at 9.07, /dev/rob0 <rob0@gmx.co.uk> wrote:
> This issue still occurs. It varies which of the four director instances gets it, but it seems that once one of them does, the only fix is to restart all four.
Do you see any other errors or warnings besides this one? Do any of the directors get restarted, or do they for any reason get disconnected from each other, before this happens? Are the clocks synchronized on all the directors?
I'm guessing the directors did get restarted at some point and some of the other directors didn't notice because of this bug:
http://hg.dovecot.org/dovecot-2.2/rev/b78c705bbb8d
I think with 3 directors this error wouldn't happen, because all directors have direct connections to each other and this bug doesn't affect them.
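To illustrate why the number of directors matters here, the following Python sketch models the ring under one simplifying assumption: each director holds direct connections only to its left and right neighbour in the ring. The director names d1..d4 are placeholders, and this is a toy model of the topology, not Dovecot's actual code.

```python
def ring_neighbors(directors, i):
    """Return the directors adjacent to directors[i] in the ring."""
    n = len(directors)
    left = directors[(i - 1) % n]
    right = directors[(i + 1) % n]
    # With one or two directors, left/right may be the same host or itself.
    return {left, right} - {directors[i]}


def indirect_peers(directors):
    """Map each director to the peers it can reach only via another director."""
    result = {}
    for i, director in enumerate(directors):
        others = set(directors) - {director}
        result[director] = sorted(others - ring_neighbors(directors, i))
    return result


print(indirect_peers(["d1", "d2", "d3"]))
print(indirect_peers(["d1", "d2", "d3", "d4"]))
```

Under that assumption, a three-director ring has no indirect peers at all, while in a four-director ring every director has one peer it only hears about second-hand, which is where a missed restart could go unnoticed.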
I'm fixing bugs for the rest of this week and I'll hopefully make a new Dovecot release next week, and probably an -ee release with this fix sooner. Maybe I'll even have time to go through this mailing list at some point. :)
participants (3)
- /dev/rob0
- Patrick Westenberg
- Timo Sirainen