Fwd: Some mails do not get replicated anymore after memory-exhaust

Christoph Kluge me at christoph-kluge.eu
Mon Feb 27 22:36:07 UTC 2017


Hey guys,

overall I have a working dovecot replication between 2 servers running on
amazon cloud. Sadly I got some messages that my server ran out of memory.
After investigating a little bit further I realized that some mails didn't
get replicated, but I'm not sure if this is related to the memory exhaustion.
I was expecting that the full sync would catch them up, but sadly it doesn't.

Attached I'm adding:
* /etc/dovecot/dovecot.conf from both servers
* one sample of my memory-exhaust exception
* maildir directory listing of one mailbox on both servers
* commands + output of manual attempt for full-replication
* grep information of missing mail inside Maildir on both servers

Here is my configuration from both servers. The configuration is 1:1 the
same except for the mail_replica server. Please note that one server runs on
Debian 8.7 and the other one on 7.11.

---- SERVER A
> # dovecot -n
> # 2.2.13: /etc/dovecot/dovecot.conf
> # OS: Linux 3.2.0-4-amd64 x86_64 Debian 8.7
> ---- SERVER B
> # dovecot -n
> # 2.2.13: /etc/dovecot/dovecot.conf
> # OS: Linux 2.6.32-34-pve i686 Debian 7.11
> auth_mechanisms = plain login
> disable_plaintext_auth = no
> doveadm_password = ****
> doveadm_port = 12345
> listen = *,[::]
> log_timestamp = "%Y-%m-%d %H:%M:%S "
> mail_max_userip_connections = 100
> mail_plugins = notify replication quota
> mail_privileged_group = vmail
> passdb {
>   args = /etc/dovecot/dovecot-sql.conf
>   driver = sql
> }
> plugin {
>   mail_replica = tcp:*.****.de
>   quota = dict:user::file:/var/vmail/%d/%n/.quotausage
>   replication_full_sync_interval = 1 hours
>   sieve = /var/vmail/%d/%n/.sieve
>   sieve_max_redirects = 25
> }
> protocols = imap
> replication_max_conns = 2
> service aggregator {
>   fifo_listener replication-notify-fifo {
>     mode = 0666
>     user = vmail
>   }
>   unix_listener replication-notify {
>     mode = 0666
>     user = vmail
>   }
> }
> service auth {
>   unix_listener /var/spool/postfix/private/auth {
>     group = postfix
>     mode = 0660
>     user = postfix
>   }
>   unix_listener auth-userdb {
>     group = vmail
>     mode = 0600
>     user = vmail
>   }
>   user = root
> }
> service config {
>   unix_listener config {
>     user = vmail
>   }
> }
> service doveadm {
>   inet_listener {
>     port = 12345
>   }
>   user = vmail
> }
> service imap-login {
>   client_limit = 1000
>   process_limit = 512
> }
> service lmtp {
>   unix_listener /var/spool/postfix/private/dovecot-lmtp {
>     group = postfix
>     mode = 0600
>     user = postfix
>   }
> }
> service replicator {
>   process_min_avail = 1
>   unix_listener replicator-doveadm {
>     mode = 0666
>   }
> }
> ssl_cert = </etc/postfix/smtpd.cert
> ssl_key = </etc/postfix/smtpd.key
> ssl_protocols = !SSLv2 !SSLv3
> userdb {
>   driver = prefetch
> }
> userdb {
>   args = /etc/dovecot/dovecot-sql.conf
>   driver = sql
> }
> protocol imap {
>   mail_plugins = notify replication quota imap_quota
> }
> protocol pop3 {
>   mail_plugins = quota
>   pop3_uidl_format = %08Xu%08Xv
> }
> protocol lda {
>   mail_plugins = notify replication quota sieve
>   postmaster_address = webmaster@localhost
> }
> protocol lmtp {
>   mail_plugins = notify replication quota sieve
>   postmaster_address = webmaster@localhost
> }
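
The claim that both configurations match except for mail_replica can be checked mechanically. Below is a minimal Python sketch, assuming the `dovecot -n` output of each server has been saved as text. The function names and the host names in the usage lines are hypothetical, since the real host names are redacted in this mail.

```python
import difflib


def setting_lines(dovecot_n_output):
    """Return the non-empty, non-comment lines of `dovecot -n` output, stripped."""
    lines = []
    for raw in dovecot_n_output.splitlines():
        line = raw.strip()
        # Comment lines carry the version and OS banner, which is expected to differ.
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


def config_differences(output_a, output_b):
    """List settings that differ, prefixed with '- ' (only in A) or '+ ' (only in B)."""
    a = setting_lines(output_a)
    b = setting_lines(output_b)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.extend("- " + s for s in a[i1:i2])
            changes.extend("+ " + s for s in b[j1:j2])
    return changes


# Hypothetical, shortened outputs; the host names are placeholders.
server_a = "# OS: Debian 8.7\nplugin {\n  mail_replica = tcp:b.example.invalid\n}\n"
server_b = "# OS: Debian 7.11\nplugin {\n  mail_replica = tcp:a.example.invalid\n}\n"

for change in config_differences(server_a, server_b):
    print(change)
```

If anything other than the mail_replica line shows up, the two servers are not configured as identically as assumed.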


This is the exception which I got several times:

Feb 26 16:16:39 mx dovecot: replicator: Panic: data stack: Out of memory
> when allocating 268435496 bytes
> Feb 26 16:16:39 mx dovecot: replicator: Error: Raw backtrace:
> /usr/lib/dovecot/libdovecot.so.0(+0x6b6fe) [0x7f7ca2b0a6fe] ->
> /usr/lib/dovecot/libdovecot.so.0(+0x6b7ec) [0x7f7ca2b0a7ec] ->
> /usr/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7f7ca2ac18fb] ->
> /usr/lib/dovecot/libdovecot.so.0(+0x6977e) [0x7f7ca2b0877e] ->
> /usr/lib/dovecot/libdovecot.so.0(+0x699db) [0x7f7ca2b089db] ->
> /usr/lib/dovecot/libdovecot.so.0(+0x82198) [0x7f7ca2b21198] ->
> /usr/lib/dovecot/libdovecot.so.0(+0x6776d) [0x7f7ca2b0676d] ->
> /usr/lib/dovecot/libdovecot.so.0(buffer_write+0x6c) [0x7f7ca2b069dc] ->
> dovecot/replicator(replicator_queue_push+0x14e) [0x7f7ca2fa17ae] ->
> dovecot/replicator(+0x4f9e) [0x7f7ca2fa0f9e] -> dovecot/replicator(+0x4618)
> [0x7f7ca2fa0618] -> dovecot/replicator(+0x4805) [0x7f7ca2fa0805] ->
> /usr/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x3f) [0x7f7ca2b1bd0f]
> -> /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0xf9)
> [0x7f7ca2b1cd09] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x9)
> [0x7f7ca2b1bd79] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_run+0x38)
> [0x7f7ca2b1bdf8] -> /usr/lib/dovecot/libdovecot.so.0(master_service_run+0x13)
> [0x7f7ca2ac6dc3] -> dovecot/replicator(main+0x195) [0x7f7ca2f9f8b5] ->
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f7ca2715b45]
> -> dovecot/replicator(+0x395d) [0x7f7ca2f9f95d]
> Feb 26 16:16:39 mx dovecot: imap(***.com): Warning: replication(***.com):
> Sync failure:
> Feb 26 16:16:39 mx dovecot: replicator: Fatal: master:
> service(replicator): child 24012 killed with signal 6 (core dumps disabled)
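
The size in the panic message is easier to judge once converted. The short Python sketch below only does the arithmetic on the number from the log. Reading it as one very large buffer request is an inference from the buffer_write frame in the backtrace, not something the log confirms.

```python
# Size taken from the "Out of memory when allocating ..." panic line.
requested_bytes = 268435496

MIB = 1024 * 1024
whole_mib, extra_bytes = divmod(requested_bytes, MIB)

print(f"{whole_mib} MiB + {extra_bytes} bytes")  # 256 MiB + 40 bytes
```

So the replicator failed on a single request of just over 256 MiB, which may be worth comparing against whatever per-process memory limit applies on that server (not shown in the configuration above).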


This is the current maildir listing on Server A

# ls -la /var/vmail/*.eu/*h/Maildir/new/
> total 24
> drwx------  2 vmail vmail 4096 Feb 27 18:12 .
> drwx------ 15 vmail vmail 4096 Feb 27 21:47 ..
> -rw-------  1 vmail vmail 3600 Feb 27 14:49 1488206976.M277562P25620.mail,S=3600,W=3671
> -rw-------  1 vmail vmail 4390 Feb 27 15:17 1488208642.M513542P27111.mail,S=4390,W=4478:2,S
> -rw-------  1 vmail vmail 3577 Feb 27 16:32 1488213157.M307300P30773.mail,S=3577,W=3648:2,S


This is the current maildir listing on Server B

# ls -la /var/vmail/*.eu/*h/Maildir/new/
> total 16
> drwx------  2 vmail vmail 12288 Feb 27 16:45 .
> drwx------ 15 vmail vmail  4096 Feb 27 21:47 ..
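
The two listings can also be compared programmatically. Below is a rough Python sketch, assuming the file names of each directory have been collected into lists. The helper names are made up for illustration. It strips the Maildir flag suffix (":2,S") before comparing, because flags may differ between servers for the same message. Only new/ is shown in this mail, so a real comparison should probably include cur/ as well.

```python
def base_name(filename):
    """Strip the Maildir info suffix (':2,<flags>') from a file name."""
    return filename.split(":2,", 1)[0]


def missing_on_replica(local_names, replica_names):
    """Return base names present locally but absent on the replica, sorted."""
    replica = {base_name(name) for name in replica_names}
    local = {base_name(name) for name in local_names}
    return sorted(local - replica)


# File names from the Server A listing; new/ on Server B is empty.
server_a = [
    "1488206976.M277562P25620.mail,S=3600,W=3671",
    "1488208642.M513542P27111.mail,S=4390,W=4478:2,S",
    "1488213157.M307300P30773.mail,S=3577,W=3648:2,S",
]
server_b = []

for name in missing_on_replica(server_a, server_b):
    print(name)
```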


This is how I tried to manually sync it

doveadm -v sync -u *h@*.eu -f tcp:mx.***.de:12345


This is the user's sync status

# doveadm replicator status 'cheecoh at ragequit.eu'
> username  priority  fast sync  full sync  failed
> *h@*.eu   none      00:24:47   10:57:04   -


Then I tried to look up the mail ID, which is also the same on both
servers

# grep -ri "M277562P25620" /var/vmail/*.eu/*h/
> /var/vmail/*.eu/*h/Maildir/dovecot-uidlist:493 :1488206976.M277562P25620.mail,S=3600,W=3671
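
The grep hit shows that dovecot-uidlist knows the message, but not whether the file itself exists. Below is a minimal Python sketch of that check. It assumes each uidlist record looks like "<uid> [extensions] :<filename>", which matches the line above. The function names are hypothetical.

```python
from pathlib import Path


def parse_uidlist_record(line):
    """Return (uid, filename) for a uidlist record, or None if the line does not match."""
    head, sep, filename = line.rstrip("\n").partition(" :")
    fields = head.split()
    if not sep or not fields or not fields[0].isdigit():
        return None
    return int(fields[0]), filename


def find_message_files(maildir, filename):
    """Return files in new/ and cur/ whose name starts with the uidlist file name."""
    hits = []
    for sub in ("new", "cur"):
        directory = Path(maildir) / sub
        if directory.is_dir():
            # Prefix match so that a flag suffix such as ':2,S' is tolerated.
            hits.extend(sorted(p for p in directory.iterdir()
                               if p.name.startswith(filename)))
    return hits


# Record copied from the grep output above.
record = "493 :1488206976.M277562P25620.mail,S=3600,W=3671"
print(parse_uidlist_record(record))
```

Running find_message_files against the Maildir on each server with the parsed file name would show whether a uidlist entry exists without a matching file on disk, which is what the listings suggest for Server B.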


I have no idea what else I could do. I could also provide the "doveadm -Dv
sync" output, but it is really huge.

Best Regards
Christoph Kluge

