Op 08/01/2019 om 11:05 schreef Ludovic Pouzenc:
Hi,
I can confirm that in the bad behavior, the 5 second delay occurs at each AUTH in our case. I think the configuration we have kill auth process at each end of AUTH (and fork a new one for next next AUTH). I think the "disable" flag is local to the process that is killed / respawned.
Can you explain this further? The auth process is supposed to be a long-running process. It is not killed after the first client and I don't see anything in your config that would do that.
A collegue of mine, Laurent Guerby, has found a workaround for us and it's findings seems very valuable for this ticket.
We now know how to not having the troubles in our setup : by manual removing the auth-penalty socket with "rm" (or by rebooting the whole VM because the socket is in a tmpfs). The rm is needed only one time.
I think now a clean reproducer of the problem is :
(I go through a fairly similar one but not this exact one)
- take a vanilla Debian 9 (probably not limited this particular version/distro)
- make sure use have non-root local unix user account (adduser testimap for instance)
- apt install dovecot-imapd
- keep default config (it may need TLS certificate addition, default config make uses of auth-penalty and system auth, don't touch that now)
- let it start at least one time (should be already started after install)
- try at least 1 auth via IMAP (against a local unix user account
- verify that auth-penalty socket exists
- check with lsof if some process have opened it o Should see 1 dovecot process
- change dovecot config to not use penalty o service anvil { unix_listener anvil-auth-penalty { mode = 00 } }
- restart dovecot
- check if the auth-penalty socket exists o For me it persists and it seems now to be the root problem
- (here you can do anything like dovecot stop / start, dovecot will never remove the auth-penalty socket by itself)
- check with lsof if some process have opened it o No process
- try at least 2 auth (against a a local unix user account) via IMAP o You should have 5 second penalty twice
The socket open retries loop happens if the socket exists but noone "listen" on it. The socket open retries loop don't happen if the socket does not exists at all.
(I think there is 1 try, the system returns non existing error, this bails out)
Regards, Ludovic
On 06/01/2019 20:04, Stephan Bosch wrote:
Op 20/12/2018 om 18:09 schreef Ludovic Pouzenc:
Hi,
I hit a bizare problem with dovecot 2.2.7 on debian 9 with LMTP enabled and auth/penalty disabled as documented here : https://wiki.dovecot.org/Authentication/Penalty
Use case : I run a swaks command to send an email to an exim4 that tries to make a callout to dovecot-lmtp. At RCPT TO: swaks hangs 5.0<something-small> seconds then process normally (exim is waiting for callout completion).
with strace, I see 5 second delay with many tries to socket connection.
The sources confirm this behavior, but only for the first connection attempt (i.e. the first 5 seconds of reconnections), which happens at auth startup (which then takes 5 seconds longer). After that, it will mark auth-penalty as disabled and it will not try connecting to it again. Is that what you're referring to?
Regards,
Stephan.
(PID 9652 was an auth process freshly forked)
[pid 9652] socket(AF_UNIX, SOCK_STREAM, 0) = 14 [pid 9652] fcntl(14, F_GETFL) = 0x2 (flags O_RDWR) [pid 9652] fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0 [pid 9652] connect(14, {sa_family=AF_UNIX, sun_path="anvil-auth-penalty"}, 110) = -1 ECONNREFUSED (Connection refused) [pid 9652] close(14) = 0 [pid 9652] nanosleep({tv_sec=0, tv_nsec=20000000}, NULL) = 0 [pid 9652] socket(AF_UNIX, SOCK_STREAM, 0) = 14 [pid 9652] fcntl(14, F_GETFL) = 0x2 (flags O_RDWR) [pid 9652] fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0 [pid 9652] connect(14, {sa_family=AF_UNIX, sun_path="anvil-auth-penalty"}, 110) = -1 ECONNREFUSED (Connection refused) [pid 9652] close(14) = 0 [pid 9652] nanosleep({tv_sec=0, tv_nsec=50000000}, NULL) = 0 [pid 9652] socket(AF_UNIX, SOCK_STREAM, 0) = 14 [pid 9652] fcntl(14, F_GETFL) = 0x2 (flags O_RDWR) [pid 9652] fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0 [pid 9652] connect(14, {sa_family=AF_UNIX, sun_path="anvil-auth-penalty"}, 110) = -1 ECONNREFUSED (Connection refused) [pid 9652] close(14) = 0 [pid 9652] nanosleep({tv_sec=0, tv_nsec=90000000}, NULL) = 0
with ddd I get to: src/lib-master/anvil-client.c: int anvil_client_connect(struct anvil_client *client, bool retry) { //[...] fd = retry ? net_connect_unix_with_retries(client->path, 5000) : net_connect_unix(client->path); //[...]
and retry is forced to TRUE by the caller and net_connect_unix_with_retries retries in case of ECONNREFUSED.
How I get into it :
ssh -X root@telegraphe-test
apt install dovecot-dbg cd /dev/shm apt source dovecot cd dovecot-2.2.27 pidof dovecot ddd /usr/sbin/dovecot set follow-fork-mode child attach 11833 cont
# Send an e-mail from swaks to exim that use transport=smtp in LMTP mode to dovecot-lmtp, press interrupt while exim is blocking after RCPT TO:
(gdb) bt #0 0x00007f81528af270 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:84 #1 0x00007f81528d8b84 in usleep (useconds=<optimized out>) at ../sysdeps/posix/usleep.c:32 #2 0x00007f815329368e in net_connect_unix_with_retries (path=0x562dc200ce10 <error: Cannot access memory at address 0x562dc200ce10>, msecs=msecs@entry=5000) at net.c:347 #3 0x00007f815320ce0d in anvil_client_connect (client=0x562dc200cda0, retry=retry@entry=true) at anvil-client.c:136 #4 0x0000562dc1429f5d in auth_penalty_init (path=0x562dc1451846 <error: Cannot access memory at address 0x562dc1451846>) at auth-penalty.c:37 #5 0x0000562dc1424166 in main_preinit () at main.c:202 #6 main (argc=<optimized out>, argv=<optimized out>) at main.c:396
(gdb) bt full #1 0x00007f81528d8b84 in usleep (useconds=<optimized out>) at ../sysdeps/posix/usleep.c:32 ts = {tv_sec = 0, tv_nsec = 80000000} #2 0x00007f815329368e in net_connect_unix_with_retries (path=0x562dc200ce10 <error: Cannot access memory at address 0x562dc200ce10>, msecs=msecs@entry=5000) at net.c:347 start = {tv_sec = 1545319486, tv_usec = 241864} now = {tv_sec = 1545319490, tv_usec = 198061} fd = -1 #3 0x00007f815320ce0d in anvil_client_connect (client=0x562dc200cda0, retry=retry@entry=true) at anvil-client.c:136 fd = <optimized out> __FUNCTION__ = <error reading variable __FUNCTION__ (Cannot access memory at address 0x7f81532a8260)> #4 0x0000562dc1429f5d in auth_penalty_init (path=0x562dc1451846 <error: Cannot access memory at address 0x562dc1451846>) at auth-penalty.c:37 No locals. #5 0x0000562dc1424166 in main_preinit () at main.c:202 mod_set = <error reading variable mod_set (Cannot access memory at address 0x7ffda885db80)> #6 main (argc=<optimized out>, argv=<optimized out>) at main.c:396 c = <optimized out> Backtrace stopped: Cannot access memory at address 0x7ffda885dbf8 (gdb)
Code I've read :
src/auth/main.c: static void main_preinit(void)
if (!worker) auth_penalty = auth_penalty_init(AUTH_PENALTY_ANVIL_PATH); auth_request_stats_init();
src/auth/auth-penalty.c: struct auth_penalty *auth_penalty_init(const char *path)
if (anvil_client_connect(penalty->client, TRUE) < 0) penalty->disabled = TRUE; else {
src/lib-master/anvil-client.c: int anvil_client_connect(struct anvil_client *client, bool retry)
fd = retry ? net_connect_unix_with_retries(client->path, 5000) : net_connect_unix(client->path);
src/lib/net.c : int net_connect_unix_with_retries(const char *path, unsigned int msecs) { struct timeval start, now; int fd; if (gettimeofday(&start, NULL) < 0) i_panic("gettimeofday() failed: %m"); do { fd = net_connect_unix(path); if (fd != -1 || (errno != EAGAIN && errno != ECONNREFUSED)) // <------------------ HERE, I go into ECONNREFUSED case during 5 seconds break; /* busy. wait for a while. */ usleep(((rand() % 10) + 1) * 10000); if (gettimeofday(&now, NULL) < 0) i_panic("gettimeofday() failed: %m"); } while (timeval_diff_msecs(&now, &start) < (int)msecs); return fd; }
Additionnal environment infos (mailboxes are NFS-backed) :
lsb_release -a No LSB modules are available. Distributor ID: Debian Description: Debian GNU/Linux 9.2 (stretch) Release: 9.2 Codename: stretch
mount | grep ^/ /dev/sda1 on / type ext4 (rw,relatime,errors=remount-ro,data=ordered)
mount | grep nfs nfs1:/mail/maildir on /var/maildir type nfs (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.2.28,mountvers=3,mountport=635,mountproto=udp,local_lock=none,addr=172.16.2.28) nfs1:/mail on /var/mail type nfs (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.2.28,mountvers=3,mountport=635,mountproto=udp,local_lock=none,addr=172.16.2.28) cifs1:/homepermanents/info/lpouzenc on /home/lpouzenc type nfs (rw,nosuid,nodev,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.2.29,mountvers=3,mountport=635,mountproto=udp,local_lock=none,addr=172.16.2.29)
telegraphe-test:~# apt-cache policy dovecot-core dovecot-core: Installé : 1:2.2.27-3+deb9u2 Candidat : 1:2.2.27-3+deb9u2 Table de version : *** 1:2.2.27-3+deb9u2 500 500http://miroir/debian stretch/main amd64 Packages 500http://security.debian.org/debian-security stretch/updates/main amd64 Packages 100 /var/lib/dpkg/status
telegraphe-test:~# dovecot -n # 2.2.27 (c0f36b0): /etc/dovecot/dovecot.conf # Pigeonhole version 0.4.16 (fed8554) doveconf: Warning: service auth { client_limit=10000 } is lower than required under max. load (10240) # OS: Linux 4.9.0-4-amd64 x86_64 Debian 9.2 nfs auth_verbose = yes default_client_limit = 10000 default_process_limit = 2048 lda_mailbox_autocreate = yes lda_mailbox_autosubscribe = yes mail_debug = yes mail_fsync = always mail_location = maildir:/var/mail/maildir/%u managesieve_notify_capability = mailto managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext mmap_disable = yes namespace inbox { inbox = yes location = mailbox Drafts { special_use = \Drafts } mailbox Junk { special_use = \Junk } mailbox Sent { special_use = \Sent } mailbox "Sent Messages" { special_use = \Sent } mailbox Trash { special_use = \Trash } prefix = separator = / } passdb { driver = pam } plugin { sieve =file:/var/mail/mailconf/%1u/%u/sieve;active=/var/mail/mailconf/%1u/%u/.dovecot.sieve sieve_before = /var/mail/mailconf/global/sieve/before.d sieve_global = /var/mail/mailconf/global/sieve sieve_vacation_dont_check_recipient = yes } protocols = " imap lmtp sieve sieve" service anvil { unix_listener anvil-auth-penalty { mode = 00 } } service lmtp { inet_listener lmtp { port = 24 } } service managesieve { process_limit = 16 } ssl_cert = </etc/ssl/certs/telegraphe7_cert.pem ssl_key = # hidden, use -P to show it syslog_facility = local0 userdb { driver = passwd } verbose_proctitle = yes protocol lmtp { auth_username_format = %n mail_plugins = " sieve" } protocol imap { mail_max_userip_connections = 20 }
Could you see a config tuning that keeps auth-penalty out of the way (it seems to behaves badly here, because we have perdition IMAP proxy in front, and it keeps track only of his IP and adds 2/4/8/15s delays to "random" legit clients.
Could it be saved with a dedicated config entry to disable auth-penalty ?
Is there solution to have auth-penalty working while having perdition IMAP proxy ?
Regards, Ludovic Pouzenc
Ludovic POUZENC - Ingénieur Système IMT Mines Albi (http://www.imt-mines-albi.fr) 81013 ALBI CT Cedex 09 - Tel : 05 63 49 33 56
-- Ludovic POUZENC - Ingénieur Système IMT Mines Albi (http://www.imt-mines-albi.fr) 81013 ALBI CT Cedex 09 - Tel : 05 63 49 33 56