Well, setting replication_sync_timeout to 60 or 15 leads to a crash of the replicator; 10 doesn't.
It is Dovecot v2.3.21.1 (d492236fa0) on OpenBSD.
From the mail log:
Sep 24 15:58:20 mx2 dovecot: lmtp(70545): Connect from local
Sep 24 15:58:25 mx2 dovecot: replicator: Panic: data stack: Out of memory when allocating 17179869224 bytes
Sep 24 15:58:30 mx2 dovecot: lmtp(k@catap.net)<70545><4IQPAXzF8maREwEAU0lCkg>: Warning: replication(k@catap.net): Sync failure:
Sep 24 15:58:30 mx2 dovecot: lmtp(k@catap.net)<70545><4IQPAXzF8maREwEAU0lCkg>: Warning: replication(k@catap.net): Remote sent invalid input: -
Sep 24 15:58:30 mx2 dovecot: lmtp(k@catap.net)<70545><4IQPAXzF8maREwEAU0lCkg>: sieve: msgid=<2bb5ed0ad8291eed@mx2.catap.net>: stored mail into mailbox 'INBOX'
Sep 24 15:58:30 mx2 dovecot: replicator: Fatal: master: service(replicator): child 21532 killed with signal 6 (core dumped)
Sep 24 15:58:30 mx2 dovecot: lmtp(70545): Disconnect from local: Logged out (state=READY)
and the stack trace:
(gdb) bt f
#0 thrkill () at /tmp/-:2
No locals.
#1 0xfc8e9aa8d08a5c82 in ?? ()
No symbol table info available.
#2 0x00000c7c2b59121b in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:51
        sa = {__sigaction_u = {__sa_handler = 0x0, __sa_sigaction = 0x0}, sa_mask = 4294967295, sa_flags = 0}
        mask = 4294967263
#3 0x00000c7bec3b6015 in default_fatal_finish (type=LOG_TYPE_PANIC, status=0) at failures.c:465
        recursed = 0
        backtrace = <optimized out>
#4 0x00000c7bec3b4487 in fatal_handler_real (ctx=0x7bec0b423480, format=<optimized out>, args=<optimized out>) at failures.c:477
        status = 0
#5 0x00000c7bec3b53c5 in i_internal_fatal_handler (ctx=0x0, format=0x6 <error: Cannot access memory at address 0x6>, args=0x0) at failures.c:879
No locals.
#6 0x00000c7bec3b46b0 in i_panic (format=0xc7bec2a1f7b "data stack: Out of memory when allocating %zu bytes") at failures.c:530
        ctx = {type = LOG_TYPE_PANIC, exit_status = 0, timestamp = 0x0, timestamp_usecs = 0, log_prefix = 0x0, log_prefix_type_pos = 0}
        args = {{gp_offset = 16, fp_offset = 48, overflow_arg_area = 0x7bec0b4234e0, reg_save_area = 0x7bec0b4233d0}}
#7 0x00000c7bec3ad250 in mem_block_alloc (min_size=8589934592) at data-stack.c:386
        prev_size = <optimized out>
        alloc_size = 17179869184
        block = <optimized out>
#8 0x00000c7bec3aca34 in t_malloc_real (size=<optimized out>, permanent=<optimized out>) at data-stack.c:492
        block = <optimized out>
        warn = <error reading variable warn (Cannot access memory at address 0x0)>
        alloc_size = 8589934592
        ret = <optimized out>
#9 0x00000c7bec3e1054 in pool_data_stack_realloc (pool=<optimized out>, mem=0xc7d0cfeb028, old_size=4294967296, new_size=8589934592) at mempool-datastack.c:173
        dpool = <optimized out>
        new_mem = <optimized out>
#10 0x00000c7bec3a775c in p_realloc (pool=0x0, mem=0x6, old_size=0, new_size=8589934592) at ./mempool.h:120
No locals.
#11 buffer_alloc (buf=0xc7c601b18f8, size=8589934592) at buffer.c:40
No locals.
#12 0x00000c7bec3a7a4b in buffer_check_limits (buf=0xc7c601b18f8, pos=<optimized out>, data_size=<optimized out>) at buffer.c:85
        new_alloc_size = 6
        new_size = 4294967296
#13 0x00000c7bec3a7b7a in buffer_check_append_limits (buf=0xc7c601b18f8, data_size=32) at buffer.c:117
No locals.
#14 buffer_append (_buf=0xc7c601b18f8, data=0xc7c67be4000, data_size=32) at buffer.c:234
        pos = 4294967264
        buf = 0xc7c601b18f8
#15 0x00000c79cd739786 in array_append_i (data=0x6, count=1, array=<optimized out>) at ../../../src/lib/array.h:210
No locals.
#16 replicator_queue_handle_sync_lookups (queue=0xc7c67bb0b40, user=0xc7c67bc70a0) at replicator-queue.c:334
        callbacks = <error reading variable callbacks (Cannot access memory at address 0x20)>
        lookups = 0xc7c67be4000
        i = 0
        count = 1
        success = <optimized out>
        lookups_end = <optimized out>
#17 replicator_queue_push (queue=0xc7c67bb0b40, user=0xc7c67bc70a0) at replicator-queue.c:352
        _data_stack_cur_id = <optimized out>
#18 0x00000c79cd738b9e in dsync_callback (reply=DSYNC_REPLY_OK, state=0xc7c601b1740 "AQAAANnUUhPeOYtlDEgBAHnjHAEI1YplAwAAAA0", 'A' <repeats 25 times>, "rYoAEtk5i2WIdgAAeeMcAQbVimUmBwAA3Qw", 'A' <repeats 18 times>, "KoDAAAFBWQVA9WKZQaEAQB54xwBA9WKZccQAAA+M", 'A' <repeats 19 times>, "ZQAAABs4QRHMOotlNmABAHnj"..., context=0xc7c67be7980) at replicator-brain.c:134
        ctx = 0xc7c67be7980
        user = 0x0
#19 0x00000c79cd7382c8 in dsync_input_line (client=0xc7c67bbc000, line=<optimized out>) at dsync-client.c:64
        state = <optimized out>
#20 dsync_input (client=0xc7c67bbc000) at dsync-client.c:154
        line = <optimized out>
#21 0x00000c7bec3d1f3f in io_loop_call_io (io=0xc7c67bdb3c0) at ioloop.c:737
        ioloop = 0xc7c67bb4be0
        t_id = 0
#22 0x00000c7bec3d50a4 in io_loop_handler_run_internal (ioloop=0xc7c67bb4be0) at ioloop-kqueue.c:164
        tv = {tv_sec = 188, tv_usec = 270946}
        ts = {tv_sec = 188, tv_nsec = 270946000}
        ctx = 0xc7c67bd4460
        msecs = <optimized out>
        events = <optimized out>
        events_count = <optimized out>
        ret = <optimized out>
        i = 0
        io = 0xc7c67bdb3c0
        event = <optimized out>
#23 0x00000c7bec3d26a0 in io_loop_handler_run (ioloop=0xc7c67bb4be0) at ioloop.c:789
No locals.
#24 0x00000c7bec3d24a8 in io_loop_run (ioloop=0xc7c67bb4be0) at ioloop.c:762
No locals.
#25 0x00000c7bec3276b9 in master_service_run (service=0xc7c67bca8c0, callback=0x6) at master-service.c:878
No locals.
#26 0x00000c79cd738504 in main (argc=1, argv=0x7bec0b423918) at replicator.c:112
        set_roots = {0xc79cd73c960 <replicator_setting_parser_info>, 0x0}
        service_flags = <error reading variable service_flags (Cannot access memory at address 0x80)>
        error = 0xc7c4b8b0290 <_dl_dtors> "\363\017\036\372UH\211\345\350\063\273\377\377\350\236\367\377\377H\307D$\370"
(gdb)
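For what it's worth, the sizes in the trace line up as exact powers of two, which suggests the array appended to in replicator_queue_handle_sync_lookups had already grown to just under 4 GiB before this append. A small Python sketch of that arithmetic follows. The variable and function names are mine, the numbers are copied from the gdb output and the panic line, and the element count assumes every entry in that array is 32 bytes (the data_size of the single append in frame #14), which I have not verified against the source.

```python
# Numbers copied from the gdb output and the log line; names are illustrative only.
pos = 4294967264            # frame #14, buffer_append: local "pos"
data_size = 32              # frame #14, buffer_append: size of the appended element
new_size = 4294967296       # frame #12, buffer_check_limits: local "new_size"
realloc_size = 8589934592   # frame #9, pool_data_stack_realloc: new_size argument
block_size = 17179869184    # frame #7, mem_block_alloc: local "alloc_size"
panic_size = 17179869224    # size reported in the replicator panic message


def summarize():
    """Return simple derived facts about the sizes seen in the trace."""
    return {
        # Assumption: all earlier entries are the same 32-byte size.
        "elements_before_append": pos // data_size,
        "new_size_is_4GiB": pos + data_size == new_size == 2**32,
        "realloc_is_8GiB": realloc_size == 2**33,
        "block_is_16GiB": block_size == 2**34,
        # Difference between the panic size and the block size.
        "panic_overhead_bytes": panic_size - block_size,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

So, under that assumption, roughly 2^27 entries had accumulated in the array by the time of the crash, and the request that failed is the next doubling step.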
-- wbr, Kirill