json_parse_number broken by compiler optimization

Christian Ehrhardt christian.ehrhardt at canonical.com
Tue Mar 30 18:53:27 EEST 2021


Hi,
the recent Ubuntu (re)builds uncovered an issue with dovecot 1:2.3.13+dfsg1-1
build log: https://launchpadlibrarian.net/529849650/buildlog_ubuntu-hirsute-amd64.dovecot_1%3A2.3.13+dfsg1-1build1_BUILDING.txt.gz
A coworker tried 2.3.14 but got the same result.

What fails is the json_parser build time test like:
  test-json-parser.c:161: Assert(#25) failed:
null_strcmp(json_output[pos].value, value) == 0

I was looking into that a bit more and what I found is that it is
dependent on the new toolchain
of gcc 10.2.0-1.

Not all calls to json_parse_* fail, e.g. the first one looks all good and passes
I was iterating the tests using a report function defined like

(gdb) define repcon
>c
>p pos
>p json_output[pos].type
>p type
>p json_output[pos].value
>p value
>call null_strcmp(json_output[pos].value, value)
>end

The first one to be bad was:
Breakpoint 2, test_json_parser_success (full_size=<optimized out>) at
test-json-parser.c:161
161 test_assert_idx(null_strcmp(json_output[pos].value, value) == 0, pos);
$84 = 25
$85 = JSON_TYPE_NUMBER
$86 = JSON_TYPE_NUMBER
$87 = 0x555555633b25 "-12.456"
$88 = 0x555555693110 ""
$89 = 45

Earlier and later parsing was happy, for example

Breakpoint 2, test_json_parser_success (full_size=<optimized out>) at
test-json-parser.c:161
161 test_assert_idx(null_strcmp(json_output[pos].value, value) == 0, pos);
$90 = 27
$91 = JSON_TYPE_NUMBER
$92 = JSON_TYPE_NUMBER
$93 = 0x555555633b32 "12.456e9"
$94 = 0x555555693110 "12.456e9"
$95 = 0
(gdb)


We have two things we compare here.
1. json_output[] which is a static define and for this value is
   67 »···{ JSON_TYPE_NUMBER, "-12.456" },
2. the return value that json_parse_next returns.
   25 »···"  \"sub2\":-12.456,\n"
  148 »···»···»···»···ret = json_parse_next(parser, &type, &value);

I tried a non negative number and got success which is suspicious

Breakpoint 5, test_json_parser_success (full_size=<optimized out>) at
test-json-parser.c:164
164 test_assert_idx(null_strcmp(json_output[pos].value, value) == 0, pos);
$122 = 25
$123 = JSON_TYPE_NUMBER
$124 = JSON_TYPE_NUMBER
$125 = 0x555555633b2c "12.456"
$126 = 0x555555693110 "12.456"
$127 = 0
(gdb)


Also the return value otherwise LGTM, it is recognized as a number:
540 } else if ((ret = json_parse_number(parser, value_r)) >= 0) {
(gdb) n
541 *type_r = JSON_TYPE_NUMBER;

So parsing worked one might think, but the value was not set.

(gdb) p *value_r
$833 = 0x555555693110 ""

It turns out json_parse_number is too optimized.
Too optimized for debugging further into json_parse_number.
But also optimization is causing the issue.
With -O0 or -O1 it works fine, even putting the lower optimization
level just around json_parse_number it is enough to avoid the issue.


Not a fix but "avoidance" for now:

--- a/src/lib/json-parser.c
+++ b/src/lib/json-parser.c
@@ -386,6 +386,8 @@ static int json_parse_int(struct json_pa
  return 1;
 }

+#pragma GCC push_options
+#pragma GCC optimize ("-O0")
 static int json_parse_number(struct json_parser *parser, const char **value_r)
 {
  int ret;
@@ -415,6 +417,7 @@ static int json_parse_number(struct json
  *value_r = str_c(parser->value);
  return 1;
 }
+#pragma GCC pop_options

 static int json_parse_atom(struct json_parser *parser, const char *atom)
 {


I was trying to see what happened and tracked each str_append_c
that happened inside json_parse_number.
A breakpoint on str_append_c and backtraces showed that we clearly do the
"right" parsing. From the backtraces I've seen it does (just from positions):

For example
(gdb) bt
#0  str_append_c (chr=<optimized out>, str=0x5555556930d0) at str.h:44
#1  json_parse_digits (parser=<optimized out>) at json-parser.c:366
=> that is a digit

It does "- digit digit dot digit digit digit"

This perfectly matches the input value, but still the result stays empty:
(gdb) p (char*)(str->data)
$738 = 0x555555693110 ""

In comparison - when parsing the latter 12.456e9 which works I see two
odd things:
1. the value isn't cleared despite
    str_truncate(parser->value, 0);
   at the beginning of json_parse_number
2. the value (this time) is read exactly as I assumed

Breakpoint 17, json_parse_number (value_r=0x7fffffffe1a0,
parser=0x555555692ee0) at json-parser.c:394
394 str_truncate(parser->value, 0);
Breakpoint 18, str_append_c (chr=<optimized out>, str=0x5555556930d0)
at str.h:44
44 buffer_append_c(str, chr);
(gdb) p (char*)(str->data)
$741 = 0x555555693110 "sub3"

^^ not cleared, still has "sub3" even later on

Breakpoint 18, str_append_c (chr=<optimized out>, str=0x5555556930d0)
at str.h:44
44 buffer_append_c(str, chr);
(gdb) p (char*)(str->data)
$742 = 0x555555693110 "1ub3"

^^ reading left to right one char at a time "over the old value",
which didn't update the data.

So it seems in json_parse_number the ordering of
a) str_truncate(parser->value, 0);
b) the parsing itself
is messed up by optimization.

We already know from the above that reducing just this function to -O1 or less
avoids the issue. But what is it really - I don't know?

Maybe you have a better idea what is going on, the issue should be
reproducible when building
dovecot with the toolchain present on the latest Ubuntu being 21.04
(Hirsute) right now.

--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd


More information about the dovecot mailing list