Hey guys,
overall I have a working Dovecot replication setup between two servers running in the Amazon cloud. Sadly I got several error messages saying that my server ran out of memory. After investigating a little further I realized that some mails did not get replicated, but I'm not sure whether this is related to the memory exhaustion. I was expecting the full sync to catch them up, but sadly it doesn't.
Attached I'm adding:
* /etc/dovecot/dovecot.conf from both servers
* one sample of my memory-exhaustion exception
* the Maildir directory listing of one mailbox on both servers
* commands + output of a manual attempt at a full replication
* grep output for the missing mail inside the Maildir on both servers
Here is my configuration from both servers. The configuration is 1:1 the same except for the mail_replica setting (the exact difference is spelled out right after the config dump). Please note that one server runs on Debian 8.7 and the other one on Debian 7.11.
---- SERVER A
> # dovecot -n
> # 2.2.13: /etc/dovecot/dovecot.conf
> # OS: Linux 3.2.0-4-amd64 x86_64 Debian 8.7
---- SERVER B
> # dovecot -n
> # 2.2.13: /etc/dovecot/dovecot.conf
> # OS: Linux 2.6.32-34-pve i686 Debian 7.11
> auth_mechanisms = plain login
> disable_plaintext_auth = no
> doveadm_password = ****
> doveadm_port = 12345
> listen = *,[::]
> log_timestamp = "%Y-%m-%d %H:%M:%S "
> mail_max_userip_connections = 100
> mail_plugins = notify replication quota
> mail_privileged_group = vmail
> passdb {
> args = /etc/dovecot/dovecot-sql.conf
> driver = sql
> }
> plugin {
> mail_replica = tcp:*.****.de
> quota = dict:user::file:/var/vmail/%d/%n/.quotausage
> replication_full_sync_interval = 1 hours
> sieve = /var/vmail/%d/%n/.sieve
> sieve_max_redirects = 25
> }
> protocols = imap
> replication_max_conns = 2
> service aggregator {
> fifo_listener replication-notify-fifo {
> mode = 0666
> user = vmail
> }
> unix_listener replication-notify {
> mode = 0666
> user = vmail
> }
> }
> service auth {
> unix_listener /var/spool/postfix/private/auth {
> group = postfix
> mode = 0660
> user = postfix
> }
> unix_listener auth-userdb {
> group = vmail
> mode = 0600
> user = vmail
> }
> user = root
> }
> service config {
> unix_listener config {
> user = vmail
> }
> }
> service doveadm {
> inet_listener {
> port = 12345
> }
> user = vmail
> }
> service imap-login {
> client_limit = 1000
> process_limit = 512
> }
> service lmtp {
> unix_listener /var/spool/postfix/private/dovecot-lmtp {
> group = postfix
> mode = 0600
> user = postfix
> }
> }
> service replicator {
> process_min_avail = 1
> unix_listener replicator-doveadm {
> mode = 0666
> }
> }
> ssl_cert = </etc/postfix/smtpd.cert
> ssl_key = </etc/postfix/smtpd.key
> ssl_protocols = !SSLv2 !SSLv3
> userdb {
> driver = prefetch
> }
> userdb {
> args = /etc/dovecot/dovecot-sql.conf
> driver = sql
> }
> protocol imap {
> mail_plugins = notify replication quota imap_quota
> }
> protocol pop3 {
> mail_plugins = quota
> pop3_uidl_format = %08Xu%08Xv
> }
> protocol lda {
> mail_plugins = notify replication quota sieve
> postmaster_address = webmaster@localhost
> }
> protocol lmtp {
> mail_plugins = notify replication quota sieve
> postmaster_address = webmaster@localhost
> }
This is the exception which I got several times:
> Feb 26 16:16:39 mx dovecot: replicator: Panic: data stack: Out of memory
> when allocating 268435496 bytes
> Feb 26 16:16:39 mx dovecot: replicator: Error: Raw backtrace:
> /usr/lib/dovecot/libdovecot.so.0(+0x6b6fe) [0x7f7ca2b0a6fe] ->
> /usr/lib/dovecot/libdovecot.so.0(+0x6b7ec) [0x7f7ca2b0a7ec] ->
> /usr/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7f7ca2ac18fb] ->
> /usr/lib/dovecot/libdovecot.so.0(+0x6977e) [0x7f7ca2b0877e] ->
> /usr/lib/dovecot/libdovecot.so.0(+0x699db) [0x7f7ca2b089db] ->
> /usr/lib/dovecot/libdovecot.so.0(+0x82198) [0x7f7ca2b21198] ->
> /usr/lib/dovecot/libdovecot.so.0(+0x6776d) [0x7f7ca2b0676d] ->
> /usr/lib/dovecot/libdovecot.so.0(buffer_write+0x6c) [0x7f7ca2b069dc] ->
> dovecot/replicator(replicator_queue_push+0x14e) [0x7f7ca2fa17ae] ->
> dovecot/replicator(+0x4f9e) [0x7f7ca2fa0f9e] -> dovecot/replicator(+0x4618)
> [0x7f7ca2fa0618] -> dovecot/replicator(+0x4805) [0x7f7ca2fa0805] ->
> /usr/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x3f) [0x7f7ca2b1bd0f]
> -> /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0xf9)
> [0x7f7ca2b1cd09] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x9)
> [0x7f7ca2b1bd79] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_run+0x38)
> [0x7f7ca2b1bdf8] -> /usr/lib/dovecot/libdovecot.so.0(master_service_run+0x13)
> [0x7f7ca2ac6dc3] -> dovecot/replicator(main+0x195) [0x7f7ca2f9f8b5] ->
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f7ca2715b45]
> -> dovecot/replicator(+0x395d) [0x7f7ca2f9f95d]
> Feb 26 16:16:39 mx dovecot: imap(***.com): Warning: replication(***.com):
> Sync failure:
> Feb 26 16:16:39 mx dovecot: replicator: Fatal: master:
> service(replicator): child 24012 killed with signal 6 (core dumps disabled)
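The failed allocation is just over 256 MB, which I assume is the default_vsz_limit the replicator process runs with. As a workaround I was wondering whether it would be reasonable to raise the limit for the replicator service only, e.g. something like this (untested, just a guess on my side):
service replicator {
  vsz_limit = 1G
}
Or does the replicator queue growing to 256 MB already indicate that something else is wrong?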
This is the current Maildir listing on Server A:
# ls -la /var/vmail/*.eu/*h/Maildir/new/
> total 24
> drwx------ 2 vmail vmail 4096 Feb 27 18:12 .
> drwx------ 15 vmail vmail 4096 Feb 27 21:47 ..
> -rw------- 1 vmail vmail 3600 Feb 27 14:49 1488206976.M277562P25620.mail,S=3600,W=3671
> -rw------- 1 vmail vmail 4390 Feb 27 15:17 1488208642.M513542P27111.mail,S=4390,W=4478:2,S
> -rw------- 1 vmail vmail 3577 Feb 27 16:32 1488213157.M307300P30773.mail,S=3577,W=3648:2,S
This is the current Maildir listing on Server B:
# ls -la /var/vmail/*.eu/*h/Maildir/new/
> total 16
> drwx------ 2 vmail vmail 12288 Feb 27 16:45 .
> drwx------ 15 vmail vmail 4096 Feb 27 21:47 ..
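As the listings show, the mails sitting in new/ on server A are not present in new/ on server B at all. I also wondered whether I should force a rebuild of the index files on server B before trying another sync, roughly like this (I have not run it yet since I'm not sure it is safe in this situation):
doveadm force-resync -u *h@*.eu '*'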
This is how I tried to sync it manually:
doveadm -v sync -u *h@*.eu -f tcp:mx.***.de:12345
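Would it also be an option to push a one-way copy from the server that still has the mails with doveadm backup (run on that server), or is that too risky because it removes everything on the destination that is missing on the source? I was thinking of something like this, with the other server's address in place of <destination>:
doveadm backup -u *h@*.eu tcp:<destination>:12345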
This is the user's sync status:
# doveadm replicator status '*h@*.eu'
> username priority fast sync full sync failed
> *h@*.eu none 00:24:47 10:57:04 -
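Would it make sense to queue the user for replication manually again and then re-check the status? Based on the doveadm-replicator man page I would try something like:
doveadm replicator replicate '*h@*.eu'
doveadm replicator status '*h@*.eu'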
Then I tried to look up the mail ID, which is also the same on both servers:
# grep -ri "M277562P25620" /var/vmail/*.eu/*h/
> /var/vmail/*.eu/*h/Maildir/dovecot-uidlist:493 :1488206976.M277562P25620.mail,S=3600,W=3671
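If it helps, I can additionally run a filename-based search on both servers to rule out that the message was just moved into another folder, e.g.:
find /var/vmail/*.eu/*h/Maildir -name '*M277562P25620*'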
I have no idea what else I could do. I could also post the output of "doveadm -Dv sync", but it is really huge.
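If anyone wants to take a look at it, I would capture it roughly like this and attach the compressed file:
doveadm -Dv sync -u *h@*.eu -f tcp:mx.***.de:12345 2>&1 | gzip > sync-debug.log.gz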
Best Regards
Christoph Kluge