On 29.05.2012 12:23, Cor Bosman wrote:
At first I thought maybe one of our 35 imap servers was having issues sending data, but all individual servers show this pattern. Here is a bunch of individual servers: http://grab.by/dReC Anyone have any idea what could cause such a pattern? Maybe dovecot does some cleaning up of idle sessions at specific intervals? Or maybe our loadbalancers do, or the imapdirectors.
A shot in the dark ...
Maybe some kind of TCP or session timeout on a packet filtering device or loadbalancer? If that timeout is shorter than the IMAP idle timeout, the device drops its state for the TCP connection, so the connection is effectively "killed". Such a stateful device may not send any RST packet to the client, so it is up to the client to recognize the broken TCP connection. That may only happen when the client believes it's time to renew the IMAP idle and then finds the TCP connection gone.
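To make the timing concrete, here is a rough Python sketch. The function name and the numbers are made up for illustration, and it assumes no packets at all cross the device between two IDLE renewals (no server keepalives, no new mail), which may not hold in your setup.

```python
def silent_drop_window(device_idle_timeout_s, client_renew_interval_s):
    """Return (dropped_at, noticed_at) in seconds after the last packet,
    or None if the device's state outlives the client's renew interval.

    Assumption: the connection is completely silent between renewals.
    Equal values are treated as a drop, since that case is a race.
    """
    if device_idle_timeout_s > client_renew_interval_s:
        return None
    return (device_idle_timeout_s, client_renew_interval_s)


# Hypothetical values: a 15 minute device timeout and a client that
# renews IDLE every 29 minutes.
window = silent_drop_window(15 * 60, 29 * 60)
if window is not None:
    dropped_at, noticed_at = window
    print(f"state dropped after {dropped_at}s, client notices after {noticed_at}s")
    print(f"connection is dead but unnoticed for {noticed_at - dropped_at}s")
```

If many clients use the same renew interval, the moments at which they notice would tend to cluster, which might produce a regular pattern in the graphs.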
Check the config/logs of any >= layer 4 devices for "session teardown" or "session timeout".
Check the dovecot logs and sort certain patterns by the minute. You may find dovecot logging more "client timeout" or "connection reset by peer" at certain minutes than at others. Maybe also group by other parts of the log entries, such as usernames, to find any patterns.
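The per-minute grouping could look roughly like the Python sketch below. It assumes each log line carries an HH:MM:SS timestamp, as syslog-style lines do, and the sample lines are invented only to show the shape of the input; your real log format and messages may differ.

```python
import re
from collections import Counter

# Substrings to look for, matched case-insensitively.
PATTERNS = ("client timeout", "connection reset by peer")

# Assumption: every relevant line contains an HH:MM:SS timestamp.
TIME_RE = re.compile(r"\b(\d{2}):(\d{2}):(\d{2})\b")


def count_by_minute(lines, patterns=PATTERNS):
    """Count matching log lines per minute of the hour (0-59)."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        if not any(p in lowered for p in patterns):
            continue
        match = TIME_RE.search(line)
        if match:
            counts[int(match.group(2))] += 1
    return counts


# Made-up sample lines, not real dovecot output.
sample = [
    "May 29 12:00:03 imap1 dovecot: imap(alice): Connection reset by peer",
    "May 29 12:00:41 imap1 dovecot: imap(bob): Client timeout",
    "May 29 12:17:09 imap1 dovecot: imap(carol): Logged out",
    "May 29 13:00:12 imap1 dovecot: imap(dave): Connection reset by peer",
]

for minute, n in sorted(count_by_minute(sample).items()):
    print(f"minute {minute:02d}: {n}")
```

To run it on a real log, pass an open file instead of the sample list, e.g. count_by_minute(open(path)) with path pointing at wherever your dovecot log lives. Grouping by username would work the same way with a different regular expression.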
Connect to the imap server yourself and sniff an IMAP IDLE session with wireshark. Make sure you use the same path as the users out there and do not bypass the loadbalancer or whatever else sits in between.
Regards
Christian