[Dovecot] interesting stats pattern
Hey all, I'm experimenting with the Dovecot stats service and graphing the results. My initial results are kind of interesting. Check out this graph showing connected sessions and users:
At first I thought maybe one of our 35 imap servers was having issues sending data, but all individual servers show this pattern. Here is a bunch of individual servers:
Anyone have any idea what could cause such a pattern? Maybe dovecot does some cleaning up of idle sessions at specific intervals? Or maybe our loadbalancers do, or the imapdirectors.
regards,
Cor
On 29.5.2012, at 13.23, Cor Bosman wrote:
Hey all, I'm experimenting with the Dovecot stats service and graphing the results. My initial results are kind of interesting. Check out this graph showing connected sessions and users:
How do you get the list? Are you periodically just getting a list of sessions/users with doveadm stats dump?
Anyone have any idea what could cause such a pattern? Maybe dovecot does some cleaning up of idle sessions at specific intervals? Or maybe our loadbalancers do, or the imapdirectors.
doveadm stats dump by default dumps a lot of historical data as well. If you want to see only the currently connected sessions/users, add the "connected" parameter.
Note that I'm not entirely sure what would be the best API for getting the stats. I'm also thinking that for best behavior the stats process should simply be dumping the data to some permanent database, and you'd do the lookups from there. Otherwise data is lost when Dovecot restarts. The dumping to db could already be done with a "doveadm stats dump" cronjob that runs e.g. once a minute.
And/or perhaps stats process should be saving its state permanently to /var/lib/dovecot/ and loading it at startup. Still, a permanent DB would probably be better for some purposes.
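As a rough illustration of the cronjob idea above, here is a minimal Python sketch that runs the dump command mentioned in this thread and appends the number of connected sessions or users to a SQLite file. It assumes the dump output is tab-separated with one header line followed by one row per session/user; the table name, column names and the function names are hypothetical, chosen only for this example.

```python
import sqlite3
import subprocess
import time


def count_rows(tsv_text):
    """Count data rows in tab-separated output, skipping the header line.

    Assumption: the first non-empty line is a header naming the columns.
    """
    lines = [line for line in tsv_text.splitlines() if line.strip()]
    return max(len(lines) - 1, 0)


def record_count(db, kind, timestamp, count):
    """Append one sample to a small SQLite table (hypothetical schema)."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS samples "
        "(ts INTEGER, kind TEXT, connected INTEGER)"
    )
    db.execute(
        "INSERT INTO samples VALUES (?, ?, ?)",
        (int(timestamp), kind, count),
    )
    db.commit()


def collect(db, kind):
    """Run the dump for kind 'session' or 'user' and store the row count."""
    out = subprocess.run(
        ["doveadm", "stats", "dump", kind, "connected"],
        check=True, capture_output=True, text=True,
    ).stdout
    record_count(db, kind, time.time(), count_rows(out))
```

A cron entry could then call `collect(sqlite3.connect("/some/path/stats.db"), "session")` once a minute (the path is a placeholder), so the history survives a Dovecot restart.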
On May 29, 2012, at 2:21 PM, Timo Sirainen wrote:
How do you get the list? Are you periodically just getting a list of sessions/users with doveadm stats dump?
doveadm stats dump by default dumps a lot of historical data as well. If you want to see only the currently connected sessions/users, add the "connected" parameter.
Note that I'm not entirely sure what would be the best API for getting the stats. I'm also thinking that for best behavior the stats process should simply be dumping the data to some permanent database, and you'd do the lookups from there. Otherwise data is lost when Dovecot restarts. The dumping to db could already be done with a "doveadm stats dump" cronjob that runs e.g. once a minute.
And/or perhaps stats process should be saving its state permanently to /var/lib/dovecot/ and loading it at startup. Still, a permanent DB would probably be better for some purposes.
Yes, I am getting a list of sessions/users every 5 minutes through cron. I'm already using "doveadm stats dump session/user connected"
It's not a big deal or anything, just wondering about the weird patterns. If it's really dropping/gaining connections, I'd like to figure out why.
Cor
On 29.5.2012, at 21.03, Cor Bosman wrote:
Yes, I am getting a list of sessions/users every 5 minutes through cron. I'm already using "doveadm stats dump session/user connected"
Actually that's not really correct behavior either, since it ignores all the connections that happened during the 5 minutes if they don't exist at the time when you're asking for them. I'm not sure what the most correct way to do this kind of a graph would be :)
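To make the point about polling concrete, here is a small Python sketch with made-up session intervals (start and end in seconds). It compares what a single poll at the end of a 5-minute interval reports with how many sessions actually existed at some point during that interval. The function names and the data are hypothetical.

```python
def connected_at(sessions, t):
    """Number of sessions open at instant t; each session is (start, end)."""
    return sum(1 for start, end in sessions if start <= t < end)


def seen_during(sessions, window_start, window_end):
    """Number of sessions that overlapped the window at any point."""
    return sum(
        1 for start, end in sessions
        if start < window_end and end > window_start
    )


# Made-up sessions around one 300-second polling interval.
sessions = [(0, 400), (10, 40), (100, 130), (250, 320)]

at_poll = connected_at(sessions, 300)      # what a poll at t=300 reports
in_window = seen_during(sessions, 0, 300)  # sessions that existed in the window
```

Here the poll sees 2 sessions while 4 existed during the interval; the two short sessions are invisible to the sampler. For a graph of concurrency this may be acceptable, but it undercounts activity.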
It's not a big deal or anything, just wondering about the weird patterns. If it's really dropping/gaining connections, I'd like to figure out why.
Are you only counting imap/pop3 sessions or also others? Anything that touches mailboxes is counted as a session (lda, lmtp, doveadm, indexer, ..)
On 29.5.2012, at 21.03, Cor Bosman wrote:
Yes, I am getting a list of sessions/users every 5 minutes through cron. I'm already using "doveadm stats dump session/user connected"
Actually that's not really correct behavior either, since it ignores all the connections that happened during the 5 minutes if they don't exist at the time when you're asking for them. I'm not sure what the most correct way to do this kind of a graph would be :)
I don't really need to know how many total connections I've had, more an idea of the number of concurrent sessions/users. Wouldn't this be pretty accurate?
It's not a big deal or anything, just wondering about the weird patterns. If it's really dropping/gaining connections, I'd like to figure out why.
Are you only counting imap/pop3 sessions or also others? Anything that touches mailboxes is counted as a session (lda, lmtp, doveadm, indexer, ..)
We only have imap connections for now, nothing else.
Cor
On 29/05/2012 19:13, Timo Sirainen wrote:
On 29.5.2012, at 21.03, Cor Bosman wrote:
Yes, I am getting a list of sessions/users every 5 minutes through cron. I'm already using "doveadm stats dump session/user connected"
Actually that's not really correct behaviour either, since it ignores all the connections that happened during the 5 minutes if they don't exist at the time when you're asking for them. I'm not sure what the most correct way to do this kind of a graph would be :)
Just to share in case it helps, what we do in telecommunications equipment is to count both:
sessions (a simple increment, sometimes termed 'peg-counts')
session-seconds (a cumulative measure of the number of seconds that all sessions have endured during the reporting period)
Sometimes, for an analysis, the 'transaction rate' is important - so the session counts are helpful.
Other times, we want to know the overall demand for 'active presence' (actually, telephone call occupancy in our industry) so that we deploy sufficient equipment for the total 'presence' needed, irrespective of whatever the join/logoff rate might be.
If the Doveadm report is sufficiently frequent, then might these two measures help to capture the picture?
Ron
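The two counters described above can be sketched in a few lines of Python. This assumes each session is available as a (start, end) pair in seconds, which is an assumption about the input rather than something the stats service is known to export in this form; the function name and the sample data are hypothetical.

```python
def period_counters(sessions, period_start, period_end):
    """Return (peg_count, session_seconds) for one reporting period.

    peg_count: sessions that started within the period.
    session_seconds: total time sessions were open, clipped to the period.
    """
    peg_count = 0
    session_seconds = 0
    for start, end in sessions:
        if period_start <= start < period_end:
            peg_count += 1
        overlap = min(end, period_end) - max(start, period_start)
        if overlap > 0:
            session_seconds += overlap
    return peg_count, session_seconds


# Made-up sessions; the first one began before the period started.
sessions = [(-120, 90), (10, 40), (100, 130), (250, 320)]
pegs, secs = period_counters(sessions, 0, 300)

# Session-seconds divided by the period length gives average occupancy.
average_concurrency = secs / 300
```

With this data the period has 3 new sessions and 200 session-seconds, so on average about 0.67 sessions were open. The peg count tracks the login rate, while session-seconds tracks the "presence" that capacity has to cover.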
On 29/05/2012 19:13, Timo Sirainen wrote:
On 29.5.2012, at 21.03, Cor Bosman wrote:
Yes, I am getting a list of sessions/users every 5 minutes through cron. I'm already using "doveadm stats dump session/user connected"
Actually that's not really correct behavior either, since it ignores all the connections that happened during the 5 minutes if they don't exist at the time when you're asking for them. I'm not sure what the most correct way to do this kind of a graph would be :)
I muttered about some ideas for enhanced login/logout tracking some months back. Perhaps this would be another example of a motivation to use it for something? Could either the login scripting or a plugin be used to build this type of login tracking?
(My goal is to eventually do per user "are you logged in" tracking)
Just a thought
Ed W
On 29.05.2012 12:23, Cor Bosman wrote:
At first I thought maybe one of our 35 imap servers was having issues sending data, but all individual servers show this pattern. Here is a bunch of individual servers:
http://grab.by/dReC
Anyone have any idea what could cause such a pattern? Maybe dovecot does some cleaning up of idle sessions at specific intervals? Or maybe our loadbalancers do, or the imapdirectors.
A shot in the dark ...
Maybe some kind of TCP or session timeout on a packet filtering device or loadbalancer? Maybe that time is shorter than the IMAP idle timeout. So TCP connections are "killed". Such a TCP stateful device may not send any active RST packets to the client. This way it's up to the client to recognize a broken TCP connection. This may then only occur when the client believes it's time to renew the IMAP idle and then finds the TCP connection gone.
Check the config/logs of any >= layer 4 devices for "session teardown" or "session timeout"
Check the dovecot logs and sort certain patterns by the minute. Maybe you find dovecot logging more "client timeout" or "connection reset by peer" at certain minutes than others. Maybe also group by other parts of the log entries such as usernames to find any patterns.
Connect to the imap server yourself and sniff an IMAP IDLE session with wireshark. Make sure you use the same path as the users out there and don't bypass the loadbalancer or whatever.
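The "sort by the minute" check could be sketched roughly as follows in Python. It assumes each log line contains an HH:MM:SS timestamp somewhere, which depends on how logging is configured; the two phrases are the ones named above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Phrases taken from the suggestion above; adjust to what the logs contain.
PATTERNS = ("client timeout", "connection reset by peer")

# Assumption: the first HH:MM:SS in a line is that line's timestamp.
TIME_RE = re.compile(r"\b(\d{2}):(\d{2}):(\d{2})\b")


def matches_per_minute(lines, patterns=PATTERNS):
    """Count matching log lines per minute-of-hour (0-59)."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        if not any(p in lowered for p in patterns):
            continue
        m = TIME_RE.search(line)
        if m:
            counts[int(m.group(2))] += 1
    return counts
```

Feeding it the lines of a log file and printing `sorted(counts.items())` would show whether disconnects cluster at particular minutes, which would point at a fixed timeout somewhere in the path.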
Regards
Christian
participants (5)
- Christian Rohmann
- Cor Bosman
- Ed W
- Ron Leach
- Timo Sirainen