dsync replication fails with No space left on device / Out of memory
Hi All
Since I configured dsync replication, I get strange errors in the maillog on my two Dovecot mail nodes:
PRIMARY:
Jul 2 01:21:42 mx01.example.com dovecot: doveadm: Error: read(mx02.example.com) failed: read(size=3148) failed: Connection reset by peer (last sent=mail, last recv=mail (EOL))
The secondary is more interesting:
SECONDARY:
Jul 2 01:21:42 mx02 dovecot: doveadm: Error: close(-1[istream-seekable.c:237]) failed: No space left on device
Jul 2 01:21:43 mx02 dovecot: doveadm: Fatal: pool_system_realloc(268435456): Out of memory
Jul 2 01:21:43 mx02 dovecot: doveadm: Error: Raw backtrace: /usr/lib64/dovecot/libdovecot.so.0(+0xa192e) [0x7f2e9be4c92e] -> /usr/lib64/dovecot/libdovecot.so.0(+0xa1a0e) [0x7f2e9be4ca0e] -> /usr/lib64/dovecot/libdovecot.so.0(i_error+0) [0x7f2e9bddc3d3] -> /usr/lib64/dovecot/libdo
Jul 2 01:21:43 mx02 dovecot: doveadm: Fatal: master: service(doveadm): child 2876 returned error 83 (Out of memory (service doveadm { vsz_limit=256 MB }, you may need to increase it) - set CORE_OUTOFMEM=1 environment to get core dump)
Jul 2 01:21:51 mx02 dovecot: dsync-local(user@example.com): Error: Raw backtrace: /usr/lib64/dovecot/libdovecot.so.0(+0xa192e) [0x7fd56e17e92e] -> /usr/lib64/dovecot/libdovecot.so.0(+0xa1a0e) [0x7fd56e17ea0e] -> /usr/lib64/dovecot/libdovecot.so.0(i_error+0) [0x7fd56e10e3d3] -> /us
Jul 2 01:21:51 mx02 dovecot: dsync-local(user@example.com): Fatal: master: service(doveadm): child 2882 returned error 83 (Out of memory (service doveadm { vsz_limit=256 MB }, you may need to increase it) - set CORE_OUTOFMEM=1 environment to get core dump)
The error messages say that disk space and/or memory is the problem, but there is plenty of both available:
mx02 [~] # df -h /srv/mail/
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/system-mail   10G  5.7G  4.3G  58% /srv/mail

mx02 [~] # free -m
              total        used        free      shared  buff/cache   available
Mem:           3789        1602        1088         199        1097        1759
Swap:           471          93         378
I also tried to increase vsz_limit from 256 MB to 512 MB, which did not help.
And for the sake of completeness: the connection to the doveadm port also works fine from both nodes:
mx01-prod [~] # telnet mx02 14310
Trying 172.20.19.225...
Connected to mx02.
Escape character is '^]'.
^]

mx02 [~] # telnet mx01 14310
Trying 172.20.19.251...
Connected to mx01.
Escape character is '^]'.
^]
Although mail replication seems to be working properly and mails are in sync on both nodes (as far as I can see), I would like to find the cause of these messages, as this definitely doesn’t look normal…
I’d be grateful for any help, as I’m rather stuck at the moment…
Steven
Here’s my config:
# doveconf -n
# 2.2.36 (1f10bfa63): /etc/dovecot/dovecot.conf
# Pigeonhole version 0.4.24 (124e06aa)
# OS: Linux 3.10.0-1160.31.1.el7.x86_64 x86_64 CentOS Linux release 7.9.2009 (Core)
# Hostname: mx01.example.com
auth_mechanisms = plain login
auth_verbose = yes
dict {
  sqlquota = mysql:/etc/dovecot/dict-sqlquota.conf.ext
}
doveadm_password = # hidden, use -P to show it
doveadm_port = 14310
first_valid_uid = 1000
mail_plugins = quota notify replication
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext
mbox_write_locks = fcntl
namespace inbox {
  inbox = yes
  location =
  mailbox Drafts {
    special_use = \Drafts
  }
  mailbox Junk {
    special_use = \Junk
  }
  mailbox Sent {
    special_use = \Sent
  }
  mailbox "Sent Messages" {
    special_use = \Sent
  }
  mailbox Trash {
    special_use = \Trash
  }
  prefix =
  separator = /
  type = private
}
passdb {
  args = /etc/dovecot/dovecot-sql.conf.ext
  driver = sql
}
plugin {
  mail_replica = tcp:mx02.example.com
  quota = maildir:User quota
  quota_exceeded_message = Quota exceeded, please go to http://www.example.com/over_quota_help for instructions on how to fix this.
  quota_rule2 = INBOX.Trash:storage=+100M
  quota_status_nouser = DUNNO
  quota_status_overquota = 552 5.2.2 Mailbox is full / Mailbox ist voll
  quota_status_success = DUNNO
  quota_warning = storage=90%% quota-warning 90 %u
  quota_warning2 = -storage=90%% quota-warning below %u
  sieve = file:~/sieve;active=~/.dovecot.sieve
}
postmaster_address = postmaster@example.com
protocols = imap pop3 lmtp sieve
replication_dsync_parameters = -d -l 30 -U
service aggregator {
  fifo_listener replication-notify-fifo {
    user = vmail
  }
  unix_listener replication-notify {
    user = vmail
  }
}
service auth {
  unix_listener /var/spool/postfix/private/auth {
    group = postfix
    mode = 0660
    user = postfix
  }
  unix_listener auth-userdb {
    user = vmail
  }
}
service dict {
  unix_listener dict {
    user = vmail
  }
}
service doveadm {
  inet_listener {
    port = 14310
    ssl = no
  }
}
service managesieve-login {
  inet_listener sieve {
    port = 4190
  }
}
service quota-status {
  client_limit = 1
  executable = quota-status -p postfix
  inet_listener {
    port = 14340
  }
}
service quota-warning {
  executable = script /usr/local/libexec/dovecot/quota-warning.sh
  unix_listener quota-warning {
    user = vmail
  }
  user = vmail
}
service replicator {
  process_min_avail = 1
  unix_listener replicator-doveadm {
    mode = 0600
    user = vmail
  }
}
ssl = required
ssl_cert = </etc/ssl/acme/certs/mail.example.com.chain.crt
ssl_key = # hidden, use -P to show it
userdb {
  args = /etc/dovecot/dovecot-sql.conf.ext
  driver = sql
}
verbose_proctitle = yes
protocol lmtp {
  mail_plugins = quota notify replication sieve
}
protocol lda {
  mail_plugins = quota notify replication sieve
}
protocol imap {
  mail_max_userip_connections = 20
  mail_plugins = quota notify replication imap_quota
}
mx02.example.com has exactly the same config, except for:
plugin {
  mail_replica = tcp:mx01.example.com
Inodes? df -i
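A minimal sketch of that inode check, in case it helps others reading along. It assumes a Linux box with a `df` that accepts `-P` and `-i` together (GNU coreutils does); the helper name `inode_use_percent` is made up for illustration, and the two paths are simply the mail store from the original post plus /tmp.

```shell
# Print the inode usage (IUse%) of the filesystem holding a path,
# without the trailing percent sign. A "-" means the filesystem does
# not report inode counts.
# inode_use_percent is a hypothetical helper name.
inode_use_percent() {
    df -Pi "$1" | awk 'NR==2 { sub(/%$/, "", $5); print $5 }'
}

# Check the mail store and the temp filesystem (paths are assumptions
# taken from the original post; adjust as needed).
for dir in /srv/mail /tmp; do
    if [ -d "$dir" ]; then
        printf '%s: %s%% of inodes used\n' "$dir" "$(inode_use_percent "$dir")"
    fi
done
```

A filesystem can report "No space left on device" with free blocks but no free inodes, which is why this is worth ruling out.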
On 7/1/2021 5:07 PM, Steven Varco wrote:
> Since I configured dsync replication I get strange errors in the maillog on my two mail dovecot nodes:
> [...]
Hi,
The memory issue has already been reported, but it is not resolved yet:
https://www.mail-archive.com/dovecot@dovecot.org/msg83763.html
The disk-space issue is something different. Increasing the memory parameters doesn't help; the sync only crashes later.
Here, everything seems to be synced fine nevertheless.
On 02.07.21 at 02:56, Harlan Stenn wrote:
> Inodes? df -i
Hi!
The disk issue is likely that disk space on mail_temp_dir runs out, which is usually /tmp.
Aki
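A quick way to check this, sketched under a few assumptions: `doveconf -h <setting>` prints just the value of a setting, and /tmp is used as the fallback when `doveconf` is not installed or prints nothing. The function name `dovecot_temp_dir` is hypothetical.

```shell
# Print the directory Dovecot uses for temporary files.
# Falls back to /tmp when doveconf is missing or returns nothing.
# dovecot_temp_dir is a hypothetical helper name.
dovecot_temp_dir() {
    dir=""
    if command -v doveconf >/dev/null 2>&1; then
        dir=$(doveconf -h mail_temp_dir 2>/dev/null) || dir=""
    fi
    printf '%s\n' "${dir:-/tmp}"
}

# Show free space and free inodes on the filesystem behind that directory.
tmpdir=$(dovecot_temp_dir)
df -h "$tmpdir"
df -i "$tmpdir"
```

Note that the `df -h /srv/mail/` output in the original post only covers the mail store, so it says nothing about the filesystem that holds the temp directory.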
On 02/07/2021 08:43 Jörg Faudin Schulz <js@faudin.de> wrote:
> [...]
> the disk-free issue is something different.
> [...]
Aki Tuomi (aki.tuomi at open-xchange.com) wrote on Fri Jul 2 09:14:47 EEST 2021:
> The disk issue is likely that disk space on mail_temp_dir runs out, which is usually /tmp.
Hi Aki
Many thanks for that hint; it actually led me to the root cause of the problem! :)
During the sync, the /tmp filesystem fills up and empties again so quickly that I could not see it happening, even while actively monitoring it with the watch command. For a split second I could just make out that /tmp usage rose and then immediately dropped again. That's why I did not notice this in the first place.
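For anyone trying to catch such a short spike: `watch` refreshes every 2 seconds by default, which is too coarse here. Below is a rough sketch of a tighter polling loop that remembers the peak usage. The function name `peak_used_kb` and the default interval are made up for illustration, fractional `sleep` values are assumed to be supported (GNU coreutils accepts them), and a very short spike can still fall between two samples.

```shell
# Sample the used space (in KiB) of a filesystem in a tight loop and
# print the highest value seen.
# usage: peak_used_kb PATH SAMPLES [INTERVAL_SECONDS]
# peak_used_kb is a hypothetical helper name.
peak_used_kb() {
    path=$1
    samples=$2
    interval=${3:-0.05}   # fractional sleep is not POSIX; assumed available
    peak=0
    i=0
    while [ "$i" -lt "$samples" ]; do
        # Third column of POSIX-format df output is the used space in KiB.
        used=$(df -Pk "$path" | awk 'NR==2 { print $3 }')
        if [ "$used" -gt "$peak" ]; then
            peak=$used
        fi
        sleep "$interval"
        i=$((i + 1))
    done
    printf '%s\n' "$peak"
}
```

Running something like `peak_used_kb /tmp 1200 0.05` while a replication run is in progress samples for at least a minute and prints the highest usage it saw, which can then be compared with the size of the filesystem.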
I then increased the filesystem size and all the problems suddenly vanished. Not just the "No space left on device" error: surprisingly, the "Out of memory" log message is gone now as well, so the two were somehow connected to each other.
cheers, Steven
-- https://steven.varco.ch/ https://www.tech-island.com/
On 2021 Jul 05, at 02:00, Steven Varco <dovecot.org@bbs.varco.ch> wrote:
> I then increased the filesystem size and all the problems suddenly vanished.
How large was your tmp before and after the change, out of curiosity?
-- -=> <http://xkcd.com/241/> <http://xkcd.com/304/> <http://xkcd.com/635/> <=-
On 07.07.2021 at 10:34, @lbutlr <kremels@kreme.com> wrote:
> How large was your tmp before and after the change, out of curiosity?
Before, it was 128 MB, which is admittedly quite low. However, usually no component ever reaches this limit, as temporary files are generally quite small, so I never had a problem with postfix, amavis, dovecot, or even a LAMP stack, and therefore I did not expect this in the first place.
Afterwards I extended the /tmp volume to a more reasonable 1 GB, which should be plenty for the future. :)
best regards, Steven
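To avoid being surprised by a small temp filesystem again, a simple free-space check can be run from cron or a monitoring system. This is only a sketch: the helper name `has_free_kb` is hypothetical, and the 512 MiB threshold is an arbitrary illustration, not a Dovecot recommendation.

```shell
# Succeed if the filesystem holding PATH has at least MIN_KB KiB available.
# usage: has_free_kb PATH MIN_KB
# has_free_kb is a hypothetical helper name.
has_free_kb() {
    # Fourth column of POSIX-format df output is the available space in KiB.
    avail=$(df -Pk "$1" | awk 'NR==2 { print $4 }')
    [ "$avail" -ge "$2" ]
}

# Warn when /tmp has less than 512 MiB free (threshold chosen arbitrarily).
if ! has_free_kb /tmp 524288; then
    echo "warning: less than 512 MiB free on /tmp" >&2
fi
```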
participants (5): @lbutlr, Aki Tuomi, Harlan Stenn, Jörg Faudin Schulz, Steven Varco