Bug - doveadm backup out-of-memory kill/crash when no separators set
Hi,
I hit a fun issue with doveadm when migrating from dovecot 2.2.36 (1f10bfa63) to 2.3.19.1 (9b53102964) (CentOS 7 to Debian 12).
When running doveadm -v -D backup -R -u "user@name" tcp:localhost:1234, I found that the first sync would always work, but subsequent runs of the command would cause doveadm to reach a subfolder (Archives/2008 in the example below) and then silently mmap() increasing powers of 2 before the OOM killer finally got it. Deleting the source folder did not help: doveadm then got stuck on the same folder, this time while processing the deletion event.
I switched memory overcommit off to force a crash, and got a gdb backtrace:
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x00007ffff786bd9f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 0x00007ffff781cf32 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007ffff7807472 in __GI_abort () at ./stdlib/abort.c:79
#4 0x00007ffff7b4ffae in default_fatal_finish (status=0, type=LOG_TYPE_PANIC) at ../lib/failures.c:465
#5 fatal_handler_real (ctx=<optimized out>, format=<optimized out>, args=<optimized out>) at ../lib/failures.c:477
#6 0x00007ffff7bfa081 in default_fatal_handler (ctx=<optimized out>, format=<optimized out>, args=<optimized out>) at ../lib/failures.c:485
#7 0x00007ffff7b5017c in i_panic (format=0x7ffff7c5d748 "data stack: Out of memory when allocating %zu bytes") at ../lib/failures.c:530
#8 0x00007ffff7b4f67f in mem_block_alloc (min_size=min_size@entry=16) at ../lib/data-stack.c:386
#9 0x00007ffff7bf8b60 in t_malloc_real (size=<optimized out>, permanent=<optimized out>) at ../lib/data-stack.c:492
#10 0x00007ffff7c348f1 in t_strdup_until (start=start@entry=0x55555565f440, end=end@entry=0x55555565f448) at ../lib/strfuncs.c:270
#11 0x00005555555adc62 in convert_name_to_remote_sep (name=0x55555565f440 "Archives/2008", tree=0x55555564e188) at dsync/dsync-mailbox-tree.c:270
#12 dsync_mailbox_tree_build_name128_remotesep_hash (tree=0x55555564e188) at dsync/dsync-mailbox-tree.c:315
#13 dsync_mailbox_tree_find_delete (tree=0x55555564e188, del=0x5555556469f0) at dsync/dsync-mailbox-tree.c:405
#14 0x00005555555a4195 in dsync_brain_mailbox_tree_add_delete (tree=0x55555564e188, other_tree=0x55555564f5f8, other_del=0x5555556469f0, node_r=0x7fffffffe350, status_r=0x7fffffffe348) at dsync/dsync-brain-mailbox-tree.c:504
#15 0x00005555555a44cd in dsync_brain_recv_mailbox_tree_deletes (brain=0x55555564b2d8) at dsync/dsync-brain-mailbox-tree.c:590
#16 0x00005555555a5365 in dsync_brain_run_real (brain=brain@entry=0x55555564b2d8, changed_r=changed_r@entry=0x7fffffffe453) at dsync/dsync-brain.c:709
#17 0x00005555555a59f9 in dsync_brain_run (changed_r=0x7fffffffe453, brain=0x55555564b2d8) at dsync/dsync-brain.c:752
#18 dsync_brain_run (changed_r=0x7fffffffe453, brain=0x55555564b2d8) at dsync/dsync-brain.c:740
#19 dsync_brain_run_io (context=<optimized out>) at dsync/dsync-brain.c:113
#20 dsync_brain_run_io (context=0x55555564b2d8) at dsync/dsync-brain.c:100
#21 0x00005555555b23df in dsync_ibc_stream_input (ibc=0x555555646720) at dsync/dsync-ibc-stream.c:232
#22 0x00007ffff7c11cd9 in io_loop_call_io (io=0x5555556418f0) at ../lib/ioloop.c:737
#23 0x00007ffff7c13aa2 in io_loop_handler_run_internal (ioloop=ioloop@entry=0x55555560bf40) at ../lib/ioloop-epoll.c:222
#24 0x00007ffff7c13b50 in io_loop_handler_run (ioloop=ioloop@entry=0x55555560bf40) at ../lib/ioloop.c:789
#25 0x00007ffff7c13d10 in io_loop_run (ioloop=0x55555560bf40) at ../lib/ioloop.c:762
#26 0x000055555558b22e in cmd_dsync_run_remote (user=0x555555637248) at ./src/doveadm/doveadm-dsync.c:543
#27 cmd_dsync_run (_ctx=0x55555561f288, user=0x555555637248) at ./src/doveadm/doveadm-dsync.c:750
#28 0x000055555558bd12 in doveadm_mail_next_user (ctx=0x55555561f288, error_r=0x7fffffffe818) at ./src/doveadm/doveadm-mail.c:464
#29 0x000055555558cf45 in doveadm_mail_cmd_exec (wildcard_user=0x0, ctx=0x55555561f288) at ./src/doveadm/doveadm-mail.c:659
#30 doveadm_cmd_ver2_to_mail_cmd_wrapper (cctx=0x7fffffffe970) at ./src/doveadm/doveadm-mail.c:988
#31 0x0000555555597622 in doveadm_cmd_run_ver2 (argc=5, argv=0x55555560ba38, cctx=cctx@entry=0x7fffffffe970) at ./src/doveadm/doveadm-cmd.c:465
#32 0x0000555555597697 in doveadm_cmd_try_run_ver2 (cmd_name=<optimized out>, argc=<optimized out>, argv=<optimized out>, cctx=0x7fffffffe970) at ./src/doveadm/doveadm-cmd.c:363
#33 0x000055555557919a in main (argc=<optimized out>, argv=<optimized out>) at ./src/doveadm/doveadm.c:361
Neither server had a separator set in the inbox namespace. Having identified the convert_name_to_remote_sep function as the likely trigger point I set the separator to / on both ends, and doveadm backup now runs without issue.
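For anyone wanting to try the same workaround, an explicit separator in the inbox namespace might look roughly like the fragment below. This is only a sketch: the "separator = /" line is the change described above, while the surrounding block (and the "inbox = yes" line) is assumed to match the stock default namespace and should be checked against the local configuration on both servers.

```
namespace inbox {
  inbox = yes
  # Explicit hierarchy separator; set the same value on both ends.
  separator = /
}
```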
I can provide my config if helpful, but I think it's probably irrelevant to this one: the namespace config is the default other than the separator item. It may be worth mentioning in the doveadm backup docs that setting the separator avoids this condition. This took me a while to figure out!
-- Cheers, James Harrison
James Harrison wrote:
I hit a fun issue with doveadm when migrating from dovecot 2.2.36 (1f10bfa63) to 2.3.19.1 (9b53102964) (CentOS 7 to Debian 12). When running doveadm -v -D backup -R -u "user@name" tcp:localhost:1234, I found that the first sync would always work, but subsequent runs of the command would cause doveadm to reach a subfolder (Archives/2008 in the example below) and then silently mmap() increasing powers of 2 before the OOM killer finally got it.
I'm seeing a similar issue. I've also recently upgraded to Debian 12, and one of my dovecot servers is using Maildir storage but the other is using sdbox, which means that they also have different separator characters. That means that my servers are hitting the code change in commit 596c5a52 that James identified; I suspect most other users aren't.

Looking at the new code, I see that "name" is never updated, so the for loop finds the first separator, copies the string up to that onto the heap, and then does it again on the same part of the path. I see strings like "INBOX.INBOX.INBOX.INBOX..." in my heap.

I think the answer is to set "name" to "end" somewhere between where "name_part" is assigned and the end of the loop. Here's a diff that seems to work for me:

--- dovecot-2.3.19.1+dfsg1.orig/src/doveadm/dsync/dsync-mailbox-tree.c
+++ dovecot-2.3.19.1+dfsg1/src/doveadm/dsync/dsync-mailbox-tree.c
@@ -268,6 +268,7 @@ convert_name_to_remote_sep(struct dsync_
 		const char *end = strchr(name, tree->sep);
 		const char *name_part = end == NULL ? name :
 			t_strdup_until(name, end++);
+		name = end;
 		if (tree->escape_char != '\0')
 			mailbox_list_name_unescape(&name_part, tree->escape_char);
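To illustrate the loop behaviour described above outside of Dovecot, here is a minimal standalone C sketch. It is not the Dovecot source: convert_sep(), its apply_fix flag and the fixed-size output buffer are all hypothetical stand-ins, and the buffer limit is there only so that the non-advancing case stops instead of allocating until memory runs out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the loop in convert_name_to_remote_sep().
 * Copies `name` into `out`, replacing each `sep` with `remote_sep`.
 * With apply_fix == 0 the scan position is never advanced, mimicking the
 * behaviour described in the thread. Returns the number of name parts
 * copied, or -1 if `out` filled up before the end of `name` was reached.
 */
static int convert_sep(const char *name, char sep, char remote_sep,
                       int apply_fix, char *out, size_t out_size)
{
    size_t used = 0;
    int parts = 0;

    assert(out != NULL && out_size > 0);
    for (;;) {
        const char *end = strchr(name, sep);
        size_t len = end == NULL ? strlen(name) : (size_t)(end - name);

        /* Need room for this part, one separator and the final NUL. */
        if (used + len + 2 > out_size) {
            out[used] = '\0';
            return -1;
        }
        memcpy(out + used, name, len);
        used += len;
        parts++;
        if (end == NULL) {
            out[used] = '\0';
            return parts;
        }
        out[used++] = remote_sep;
        if (apply_fix)
            name = end + 1; /* plays the role of "name = end;" in the diff */
    }
}
```

Called as convert_sep("Archives/2008", '/', '.', 1, buf, sizeof buf) with a 64-byte buf, this returns 2 and leaves "Archives.2008" in buf. With apply_fix set to 0 the same call keeps re-copying "Archives." until the buffer is full and returns -1, which is the same pattern as the "INBOX.INBOX.INBOX..." strings seen on the heap.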
participants (2)
- James Harrison
- phelps@pobox.com