Authentication/Penalty disabled (socket mode=0) introduces constant 5 sec delays (2.27 on debian 9)

Stephan Bosch stephan at rename-it.nl
Sun Jan 20 15:05:27 EET 2019



Op 08/01/2019 om 11:05 schreef Ludovic Pouzenc:
>
> Hi,
>
> I can confirm that in the bad behavior, the 5 second delay occurs at 
> each AUTH in our case. I think the configuration we have kill auth 
> process at each end of AUTH (and fork a new one for next next AUTH). I 
> think the "disable" flag is local to the process that is killed / 
> respawned.
>

Can you explain this further? The auth process is supposed to be a 
long-running process. It is not killed after the first client and I 
don't see anything in your config that would do that.





> A collegue of mine, Laurent Guerby, has found a workaround for us and 
> it's findings seems very valuable for this ticket.
>
> We now know how to not having the troubles in our setup : by manual 
> removing the auth-penalty socket with "rm" (or by rebooting the whole 
> VM because the socket is in a tmpfs). The rm is needed only one time.
>
> I think now a clean reproducer of the problem is :
>
> (I go through a fairly similar one but not this exact one)
>
>   * take a vanilla Debian 9 (probably not limited this particular
>     version/distro)
>   * make sure use have non-root local unix user account (adduser
>     testimap for instance)
>   * apt install dovecot-imapd
>   * keep default config (it may need TLS certificate addition, default
>     config make uses of auth-penalty and system auth, don't touch that
>     now)
>   * let it start at least one time (should be already started after
>     install)
>   * try at least 1 auth via IMAP (against a local unix user account
>   * verify that auth-penalty socket exists
>   * check with lsof if some process have opened it
>       o Should see 1 dovecot process
>   * change dovecot config to not use penalty
>       o service anvil {    unix_listener anvil-auth-penalty { mode =
>         00    } }
>   * restart dovecot
>   * check if the auth-penalty socket exists
>       o For me it persists and it seems now to be the root problem
>   * (here you can do anything like dovecot stop / start, dovecot will
>     never remove the auth-penalty socket by itself)
>   * check with lsof if some process have opened it
>       o No process
>   * try at least 2 auth (against a a local unix user account) via IMAP
>       o You should have 5 second penalty twice
>
> The socket open retries loop happens if the socket exists but noone 
> "listen" on it.
> The socket open retries loop don't happen if the socket does not 
> exists at all.
>
> (I think there is 1 try, the system returns non existing error, this 
> bails out)
>
> Regards,
> Ludovic
>
> On 06/01/2019 20:04, Stephan Bosch wrote:
>>
>> Op 20/12/2018 om 18:09 schreef Ludovic Pouzenc:
>>> Hi,
>>>
>>> I hit a bizare problem with dovecot 2.2.7 on debian 9 with LMTP 
>>> enabled and auth/penalty disabled as documented here :
>>> https://wiki.dovecot.org/Authentication/Penalty
>>>
>>> Use case : I run a swaks command to send an email to an exim4 that 
>>> tries to make a callout to dovecot-lmtp.
>>> At RCPT TO: swaks hangs 5.0<something-small> seconds then process 
>>> normally (exim is waiting for callout completion).
>>>
>>> with strace, I see 5 second delay with many tries to socket connection.
>>
>> The sources confirm this behavior, but only for the first connection 
>> attempt (i.e. the first 5 seconds of reconnections), which happens at 
>> auth startup (which then takes 5 seconds longer). After that, it will 
>> mark auth-penalty as disabled and it will not try connecting to it 
>> again. Is that what you're referring to?
>>
>>
>> Regards,
>>
>> Stephan.
>>
>>
>>
>>
>>>
>>> (PID 9652 was an auth process freshly forked)
>>>
>>> [pid  9652] socket(AF_UNIX, SOCK_STREAM, 0) = 14
>>> [pid  9652] fcntl(14, F_GETFL)          = 0x2 (flags O_RDWR)
>>> [pid  9652] fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0
>>> [pid  9652] connect(14, {sa_family=AF_UNIX, 
>>> sun_path="anvil-auth-penalty"}, 110) = -1 ECONNREFUSED (Connection 
>>> refused)
>>> [pid  9652] close(14)                   = 0
>>> [pid  9652] nanosleep({tv_sec=0, tv_nsec=20000000}, NULL) = 0
>>> [pid  9652] socket(AF_UNIX, SOCK_STREAM, 0) = 14
>>> [pid  9652] fcntl(14, F_GETFL)          = 0x2 (flags O_RDWR)
>>> [pid  9652] fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0
>>> [pid  9652] connect(14, {sa_family=AF_UNIX, 
>>> sun_path="anvil-auth-penalty"}, 110) = -1 ECONNREFUSED (Connection 
>>> refused)
>>> [pid  9652] close(14)                   = 0
>>> [pid  9652] nanosleep({tv_sec=0, tv_nsec=50000000}, NULL) = 0
>>> [pid  9652] socket(AF_UNIX, SOCK_STREAM, 0) = 14
>>> [pid  9652] fcntl(14, F_GETFL)          = 0x2 (flags O_RDWR)
>>> [pid  9652] fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0
>>> [pid  9652] connect(14, {sa_family=AF_UNIX, 
>>> sun_path="anvil-auth-penalty"}, 110) = -1 ECONNREFUSED (Connection 
>>> refused)
>>> [pid  9652] close(14)                   = 0
>>> [pid  9652] nanosleep({tv_sec=0, tv_nsec=90000000}, NULL) = 0
>>>
>>>
>>> with ddd I get to:
>>> src/lib-master/anvil-client.c: int anvil_client_connect(struct 
>>> anvil_client *client, bool retry) {
>>> //[...]
>>> fd = retry ? net_connect_unix_with_retries(client->path, 5000) : 
>>> net_connect_unix(client->path);
>>> //[...]
>>>
>>> and retry is forced to TRUE by the caller and 
>>> net_connect_unix_with_retries retries in case of ECONNREFUSED.
>>>
>>>
>>> How I get into it :
>>>
>>> ssh -X root at telegraphe-test
>>>
>>> apt install dovecot-dbg
>>> cd /dev/shm
>>> apt source dovecot
>>> cd dovecot-2.2.27
>>> pidof dovecot
>>> ddd /usr/sbin/dovecot
>>> set follow-fork-mode child
>>> attach 11833
>>> cont
>>>
>>> # Send an e-mail from swaks to exim that use transport=smtp in LMTP 
>>> mode to dovecot-lmtp, press interrupt while exim is blocking after 
>>> RCPT TO:
>>>
>>>
>>> (gdb) bt
>>> #0  0x00007f81528af270 in __nanosleep_nocancel () at 
>>> ../sysdeps/unix/syscall-template.S:84
>>> #1  0x00007f81528d8b84 in usleep (useconds=<optimized out>) at 
>>> ../sysdeps/posix/usleep.c:32
>>> #2  0x00007f815329368e in net_connect_unix_with_retries 
>>> (path=0x562dc200ce10 <error: Cannot access memory at address 
>>> 0x562dc200ce10>, msecs=msecs at entry=5000) at net.c:347
>>> #3  0x00007f815320ce0d in anvil_client_connect 
>>> (client=0x562dc200cda0, retry=retry at entry=true) at anvil-client.c:136
>>> #4  0x0000562dc1429f5d in auth_penalty_init (path=0x562dc1451846 
>>> <error: Cannot access memory at address 0x562dc1451846>) at 
>>> auth-penalty.c:37
>>> #5  0x0000562dc1424166 in main_preinit () at main.c:202
>>> #6  main (argc=<optimized out>, argv=<optimized out>) at main.c:396
>>>
>>> (gdb) bt full
>>> #1  0x00007f81528d8b84 in usleep (useconds=<optimized out>) at 
>>> ../sysdeps/posix/usleep.c:32
>>>          ts = {tv_sec = 0, tv_nsec = 80000000}
>>> #2  0x00007f815329368e in net_connect_unix_with_retries 
>>> (path=0x562dc200ce10 <error: Cannot access memory at address 
>>> 0x562dc200ce10>, msecs=msecs at entry=5000) at net.c:347
>>>          start = {tv_sec = 1545319486, tv_usec = 241864}
>>>          now = {tv_sec = 1545319490, tv_usec = 198061}
>>>          fd = -1
>>> #3  0x00007f815320ce0d in anvil_client_connect 
>>> (client=0x562dc200cda0, retry=retry at entry=true) at anvil-client.c:136
>>>          fd = <optimized out>
>>>          __FUNCTION__ = <error reading variable __FUNCTION__ (Cannot 
>>> access memory at address 0x7f81532a8260)>
>>> #4  0x0000562dc1429f5d in auth_penalty_init (path=0x562dc1451846 
>>> <error: Cannot access memory at address 0x562dc1451846>) at 
>>> auth-penalty.c:37
>>> No locals.
>>> #5  0x0000562dc1424166 in main_preinit () at main.c:202
>>>          mod_set = <error reading variable mod_set (Cannot access 
>>> memory at address 0x7ffda885db80)>
>>> #6  main (argc=<optimized out>, argv=<optimized out>) at main.c:396
>>>          c = <optimized out>
>>> Backtrace stopped: Cannot access memory at address 0x7ffda885dbf8
>>> (gdb)
>>>
>>> Code I've read :
>>>
>>> src/auth/main.c: static void main_preinit(void)
>>>
>>>          if (!worker)
>>>                  auth_penalty = 
>>> auth_penalty_init(AUTH_PENALTY_ANVIL_PATH);
>>>          auth_request_stats_init();
>>>
>>> src/auth/auth-penalty.c: struct auth_penalty 
>>> *auth_penalty_init(const char *path)
>>>
>>>          if (anvil_client_connect(penalty->client, TRUE) < 0)
>>>                  penalty->disabled = TRUE;
>>>          else {
>>>
>>> src/lib-master/anvil-client.c: int anvil_client_connect(struct 
>>> anvil_client *client, bool retry)
>>>
>>>          fd = retry ? net_connect_unix_with_retries(client->path, 
>>> 5000) :
>>>                  net_connect_unix(client->path);
>>>
>>> src/lib/net.c : int net_connect_unix_with_retries(const char *path, 
>>> unsigned int msecs) {
>>>          struct timeval start, now;
>>>          int fd;
>>>               if (gettimeofday(&start, NULL) < 0)
>>>                  i_panic("gettimeofday() failed: %m");
>>>               do {
>>>                  fd = net_connect_unix(path);
>>>                  if (fd != -1 || (errno != EAGAIN && errno != 
>>> ECONNREFUSED))  // <------------------ HERE, I go into ECONNREFUSED 
>>> case during 5 seconds
>>>                          break;
>>>                       /* busy. wait for a while. */
>>>                  usleep(((rand() % 10) + 1) * 10000);
>>>                  if (gettimeofday(&now, NULL) < 0)
>>>                          i_panic("gettimeofday() failed: %m");
>>>          } while (timeval_diff_msecs(&now, &start) < (int)msecs);
>>>          return fd;
>>> }
>>>
>>> Additionnal environment infos (mailboxes are NFS-backed) :
>>>
>>> lsb_release -a
>>> No LSB modules are available.
>>> Distributor ID:    Debian
>>> Description:    Debian GNU/Linux 9.2 (stretch)
>>> Release:    9.2
>>> Codename:    stretch
>>>
>>>
>>> mount | grep ^/
>>> /dev/sda1 on / type ext4 (rw,relatime,errors=remount-ro,data=ordered)
>>>
>>> mount | grep nfs
>>> nfs1:/mail/maildir on /var/maildir type nfs 
>>> (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.2.28,mountvers=3,mountport=635,mountproto=udp,local_lock=none,addr=172.16.2.28)
>>> nfs1:/mail on /var/mail type nfs 
>>> (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.2.28,mountvers=3,mountport=635,mountproto=udp,local_lock=none,addr=172.16.2.28)
>>> cifs1:/homepermanents/info/lpouzenc on /home/lpouzenc type nfs 
>>> (rw,nosuid,nodev,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.2.29,mountvers=3,mountport=635,mountproto=udp,local_lock=none,addr=172.16.2.29)
>>>
>>>
>>>
>>> telegraphe-test:~# apt-cache policy dovecot-core
>>> dovecot-core:
>>>    Installé : 1:2.2.27-3+deb9u2
>>>    Candidat : 1:2.2.27-3+deb9u2
>>>   Table de version :
>>>   *** 1:2.2.27-3+deb9u2 500
>>>          500http://miroir/debian  stretch/main amd64 Packages
>>>          500http://security.debian.org/debian-security 
>>> stretch/updates/main amd64 Packages
>>>          100 /var/lib/dpkg/status
>>>
>>> telegraphe-test:~# dovecot -n
>>> # 2.2.27 (c0f36b0): /etc/dovecot/dovecot.conf
>>> # Pigeonhole version 0.4.16 (fed8554)
>>> doveconf: Warning: service auth { client_limit=10000 } is lower than 
>>> required under max. load (10240)
>>> # OS: Linux 4.9.0-4-amd64 x86_64 Debian 9.2 nfs
>>> auth_verbose = yes
>>> default_client_limit = 10000
>>> default_process_limit = 2048
>>> lda_mailbox_autocreate = yes
>>> lda_mailbox_autosubscribe = yes
>>> mail_debug = yes
>>> mail_fsync = always
>>> mail_location = maildir:/var/mail/maildir/%u
>>> managesieve_notify_capability = mailto
>>> managesieve_sieve_capability = fileinto reject envelope 
>>> encoded-character vacation subaddress comparator-i;ascii-numeric 
>>> relational regex imap4flags copy include variables body enotify 
>>> environment mailbox date index ihave duplicate mime foreverypart 
>>> extracttext
>>> mmap_disable = yes
>>> namespace inbox {
>>>    inbox = yes
>>>    location =
>>>    mailbox Drafts {
>>>      special_use = \Drafts
>>>    }
>>>    mailbox Junk {
>>>      special_use = \Junk
>>>    }
>>>    mailbox Sent {
>>>      special_use = \Sent
>>>    }
>>>    mailbox "Sent Messages" {
>>>      special_use = \Sent
>>>    }
>>>    mailbox Trash {
>>>      special_use = \Trash
>>>    }
>>>    prefix =
>>>    separator = /
>>> }
>>> passdb {
>>>    driver = pam
>>> }
>>> plugin {
>>>    sieve 
>>> =file:/var/mail/mailconf/%1u/%u/sieve;active=/var/mail/mailconf/%1u/%u/.dovecot.sieve
>>>    sieve_before = /var/mail/mailconf/global/sieve/before.d
>>>    sieve_global = /var/mail/mailconf/global/sieve
>>>    sieve_vacation_dont_check_recipient = yes
>>> }
>>> protocols = " imap lmtp sieve sieve"
>>> service anvil {
>>>    unix_listener anvil-auth-penalty {
>>>      mode = 00
>>>    }
>>> }
>>> service lmtp {
>>>    inet_listener lmtp {
>>>      port = 24
>>>    }
>>> }
>>> service managesieve {
>>>    process_limit = 16
>>> }
>>> ssl_cert = </etc/ssl/certs/telegraphe7_cert.pem
>>> ssl_key =  # hidden, use -P to show it
>>> syslog_facility = local0
>>> userdb {
>>>    driver = passwd
>>> }
>>> verbose_proctitle = yes
>>> protocol lmtp {
>>>    auth_username_format = %n
>>>    mail_plugins = " sieve"
>>> }
>>> protocol imap {
>>>    mail_max_userip_connections = 20
>>> }
>>>
>>>
>>>
>>> Could you see a config tuning that keeps auth-penalty out of the way 
>>> (it seems to behaves badly here, because we have perdition IMAP 
>>> proxy in front, and it keeps track only of his IP and adds 2/4/8/15s 
>>> delays to "random" legit clients.
>>>
>>> Could it be saved with a dedicated config entry to disable 
>>> auth-penalty ?
>>>
>>> Is there solution to have auth-penalty working while having 
>>> perdition IMAP proxy ?
>>>
>>> Regards,
>>> Ludovic Pouzenc
>>> -- 
>>> Ludovic POUZENC - Ingénieur Système
>>> IMT Mines Albi  (http://www.imt-mines-albi.fr)
>>> 81013 ALBI CT Cedex 09 - Tel : 05 63 49 33 56
> -- 
> Ludovic POUZENC - Ingénieur Système
> IMT Mines Albi  (http://www.imt-mines-albi.fr)
> 81013 ALBI CT Cedex 09 - Tel : 05 63 49 33 56



More information about the dovecot mailing list