Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable
Eirik Rye
rye at domeneshop.no
Thu Jan 9 16:56:49 EET 2020
As a workaround for the titular issue, I have tried enabling the
"imap-hibernate" service on a couple of servers to reduce the number of
running imap processes.
Since ~50-60% of clients are in IDLE at any one time, this allows us to
reduce the number of running imap processes to less than half of what it
was.
After this I have yet to see the "net_connect_unix(imap) failed" errors
on these servers, however instead we are now instead seeing a few of these:
imap([redacted])<30266><[redacted]>: Error: imap-master: Failed to
import client state: Mailbox INBOX: Couldn't get recently expunged UIDs
(uidnext=13479 highest_modseq=1716)
and
imap([redacted])<20055><[redacted]>: Error: imap-master: Failed to
import client state: Mailbox INBOX: Couldn't send flag changes
imap([redacted])<20055><[redacted]>: NOTIFY failed in the middle of
FETCH reply in=0 out=0 deleted=0 expunged=0 trashed=0 hdr_count=0
hdr_bytes=0 body_count=0 body_bytes=0
The former I suspect may be related to our NFS-backed store and
simultaneous access from different, since many of the errors are
accompanied by other corruption related errors:
imap([redacted])<30266><[redacted]>: Error: Mailbox INBOX: vsize-hdr
has invalid size: 36
This whole upgrade is part of a migration to a director-backed system,
because we are seeing a too large amount of corruption errors
(locking/indexes, etc) assumed to be caused by NFS.
Once we introduce the director cluster in front, these errors should
hopefully be strongly reduced. :-)
--
Eirik
On 08/01/2020 15:14, Eirik Rye wrote:
> With proctitles below.
>
> Strangely, Dovecot is reporting 12679 imap-login sockets in "pre-login"
> state, which is about 50% more than there are active TCP connections to
> the server (8271, of which 7546 are in "imap-login" process according to
> `ss -tp`).
>
> On another server, not currently experiencing this issue, the number of
> TCP sockets appears to more closely match the number of imap-login
> sockets used (either in pre-login or TLS proxing).
>
> imap:~# ps -f --ppid `pidof dovecot` | egrep -v "dovecot/(imap|pop3) "
> UID PID PPID C STIME TTY TIME CMD
> dovenull 5262 5260 0 11:48 ? 00:00:00 dovecot/pop3-login
> dovenull 5263 5260 2 11:48 ? 00:03:53 dovecot/imap-login [724
> pre-login + 113 TLS proxies]
> dovecot 5264 5260 0 11:48 ? 00:01:15 dovecot/anvil [21
> connections]
> root 5265 5260 1 11:48 ? 00:02:18 dovecot/log
> dovenull 5266 5260 0 11:48 ? 00:00:00 dovecot/pop3-login
> dovenull 5267 5260 0 11:48 ? 00:00:00 dovecot/pop3-login
> dovenull 5268 5260 0 11:48 ? 00:00:00 dovecot/pop3-login
> dovenull 5269 5260 20 11:48 ? 00:35:34 dovecot/imap-login [1073
> pre-login + 1199 TLS proxies]
> dovenull 5270 5260 1 11:48 ? 00:02:25 dovecot/imap-login [615
> pre-login + 67 TLS proxies]
> dovenull 5271 5260 0 11:48 ? 00:01:39 dovecot/imap-login [489
> pre-login + 44 TLS proxies]
> dovenull 5272 5260 14 11:48 ? 00:24:44 dovecot/imap-login [938
> pre-login + 720 TLS proxies]
> dovenull 5273 5260 6 11:48 ? 00:10:30 dovecot/imap-login [845
> pre-login + 242 TLS proxies]
> dovenull 5274 5260 4 11:48 ? 00:06:54 dovecot/imap-login [817
> pre-login + 209 TLS proxies]
> dovenull 5275 5260 0 11:48 ? 00:01:23 dovecot/imap-login [445
> pre-login + 36 TLS proxies]
> dovenull 5276 5260 25 11:48 ? 00:43:14 dovecot/imap-login [1064
> pre-login + 1434 TLS proxies]
> dovenull 5277 5260 1 11:48 ? 00:02:19 dovecot/imap-login [523
> pre-login + 58 TLS proxies]
> dovenull 5278 5260 9 11:48 ? 00:16:19 dovecot/imap-login [937
> pre-login + 462 TLS proxies]
> dovenull 5279 5260 19 11:48 ? 00:33:24 dovecot/imap-login [937
> pre-login + 823 TLS proxies]
> dovenull 5280 5260 1 11:48 ? 00:03:04 dovecot/imap-login [655
> pre-login + 92 TLS proxies]
> dovenull 5281 5260 26 11:48 ? 00:45:20 dovecot/imap-login [969
> pre-login + 1450 TLS proxies]
> dovenull 5282 5260 6 11:48 ? 00:10:55 dovecot/imap-login [917
> pre-login + 303 TLS proxies]
> dovenull 5283 5260 4 11:48 ? 00:08:36 dovecot/imap-login [731
> pre-login + 128 TLS proxies]
> root 5284 5260 2 11:48 ? 00:03:55 dovecot/config
> dovecot 5285 5260 1 11:48 ? 00:02:19 dovecot/stats [7968
> connections]
> dovecot 5286 5260 3 11:48 ? 00:05:53 dovecot/auth [138 wait,
> 0 passdb, 0 userdb]
> 528246 10322 5260 0 14:36 ? 00:00:00 [imap]
> root 16688 5260 0 14:40 ? 00:00:00 dovecot/imap
> imap02:~# doveadm process status | grep "^imap-login " | awk "{sum +=
> 2500-\$3} END {print sum}"
> 20081
> imap:~# ps -f --ppid `pidof dovecot` | grep "dovecot/imap " | wc -l
> 7937
> imap:~# doveadm process status | grep "^imap " | wc -l
> 7937
> imap:~# ss -ntp "( sport = :143 or sport = :993 )" | grep "\"imap\"" |
> wc -l
> 559
> imap:~# ss -ntp "( sport = :143 or sport = :993 )" | grep
> "\"imap-login\"" | wc -l
> 7546
> imap:~# ss -ntp "( sport = :143 or sport = :993 )" | wc -l
> 8271
>
> On 08/01/2020 11:33, Eirik Rye wrote:
>> On 08/01/2020 11:09, Aki Tuomi wrote:
>>> Can you enable 'verbose_proctitle=yes' and maybe compare with ss -tp
>>> output?
>>
>> Thanks for the response!
>>
>> What exactly did you want to compare? `ss -tp` does not appear to show
>> cmdline/process title. I enabled it for one server, but will have to
>> wait for users to reconnect and for the issue to reappear there.
>>
>> In the meantime, on another server experiencing the same issue at this
>> moment (same configuration), but no proctitles:
>>
>> root at imap03:~# doveadm process status | grep "^imap " | wc -l
>> 7564
>> root at imap03:~# ps aux | grep "dovecot/imap$" | wc -l
>> 7570
>> root at imap03:~# ss -tp | grep '"imap-login"' | wc -l
>> 8009
>> root at imap03:~# ss -tp | grep '"imap"' | wc -l
>> 536
>> root at imap03:~# ss -nt "( sport = :143 or sport = :993 )" | wc -l
>> 8739
>> root at imap03:~# doveadm who -1 | wc -l
>> 7581
>>
>> The `ss -tp` output appears normal (e.g. 8009+536 ~= 8739).
>>
>
>
More information about the dovecot
mailing list