Hi all,

I think this is a race condition, so instead of using only one current_ioloop:
I tested my patch with both "doveadm quota get -A" and "doveadm quota get -u xxx" many times. No errors occurred (timeout leaks, segmentation faults, etc.).
If you are interested in my patch, any comments are highly appreciated. I modified source files that may be shared with other doveadm commands, so I'm not sure it's 100% safe.

Thanks,
Anh Do

On Wed, 27 Jan 2021 at 16:20, Duc Anh Do <doducanh2710@gmail.com> wrote:
Hi all,

I have one Dovecot Director, two Dovecot Backends and one LDAP server with about 500 users. I would like to run doveadm quota get -A from the Director.
On each Backend, the command runs without problems:
# doveadm quota get -A
user1     User quota STORAGE     0 1048576000000000                                          0
user1     User quota MESSAGE     0                -                                          0

user500   User quota STORAGE     0 1048576000000000                                          0
user500   User quota MESSAGE     0                -                                          0


However, when I run it from the Director, the command may get stuck in an infinite loop (I have to interrupt it to quit):
# doveadm quota get -A
user1     User quota STORAGE     0 1048576000000000                                          0
user1     User quota MESSAGE     0                -                                          0

user49    User quota STORAGE     0 1048576000000000                   0
user49    User quota MESSAGE     0                -                   0
user66    User quota STORAGE     0 1048576000000000                   0
user66    User quota MESSAGE     0                -                   0
^Cdoveadm(user86): Error: doveadm server failure
doveadm: Error: Failed to iterate through some users
doveadm: Error: backend2.local:24245: Command quota get failed for user53: EOF
doveadm: Error: backend1.local:24245: Command quota get failed for user66: EOF
doveadm: Error: Aborted


This problem occurs in Dovecot 2.2.36, 2.3.11, and 2.3.13 (I build Dovecot from source). Getting the quota of a single user from the Director works fine:
# doveadm quota get -u user1
Quota name Type    Value            Limit                        %
User quota STORAGE     0 1048576000000000                        0
User quota MESSAGE     0                -                        0
And if there's only one Backend, doveadm quota get -A from the Director works well too.

After investigating, I found the infinite loop:
File src/doveadm/doveadm-mail-server.c:
static void doveadm_server_flush_one(struct doveadm_server *server)
{
    unsigned int count = array_count(&server->queue);

    do {
        io_loop_run(current_ioloop);
    } while (array_count(&server->queue) == count &&
             doveadm_server_have_used_connections(server) &&
             !DOVEADM_MAIL_SERVER_FAILED());
}


When there are multiple Backends, I see that only the global variable current_ioloop is used for notification in the callback function. Could this be a race condition?
I understand there is a workaround:
  • Run doveadm user '*' to get all users
  • Loop through all users and run doveadm quota get -u xxx
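
The two workaround steps above could be scripted roughly as follows. This is only a minimal sketch: the function name quota_for_all_users and the DOVEADM override variable are hypothetical names chosen for illustration, and it assumes that doveadm user '*' prints one username per line.

```shell
#!/bin/sh
# Hypothetical override so the doveadm binary can be swapped out
# (for example, for a stub when trying the script without a server).
DOVEADM="${DOVEADM:-doveadm}"

# Query quota user by user instead of using "quota get -A".
# Assumes "doveadm user '*'" prints one username per line.
quota_for_all_users() {
    "$DOVEADM" user '*' | while IFS= read -r user; do
        # Skip empty lines in the user listing.
        [ -n "$user" ] || continue
        # Per-user output has no username column, so print it first.
        printf '%s\n' "$user"
        # stdin is redirected so the command cannot consume the user list.
        "$DOVEADM" quota get -u "$user" < /dev/null ||
            echo "quota get failed for $user" >&2
    done
}
```

Calling quota_for_all_users on the Director then issues one "quota get -u" per user, which is slower than -A but avoids the code path that hangs.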

Thanks,
Anh Do

