Hi!
Looking at the code, I think the bug is that it just copies pointer to the value, it should, instead, duplicate the value to some memory region. Can you see if this following patch fixes it?
Aki
On October 17, 2016 at 4:14 PM Pierre Jaury pierre@jaury.eu wrote:
Okay, it seems to me that the bug is due to "t_str_tabunescape" using the unsafe datastack ("t_strdup_noconst") while the string is actually returned in an async callback.
Before it is handled by "client_dict_lookup", "client_dict_wait" actually fires some IO loops that are more than likely to call "t_pop" and free/flush the result string (I checked, it does call "t_pop" a couple times indeed). Maybe it would be safer to use a standard datastack pool when unescaping a string in that context.
I could work on a patch that would:
- add an optional "pool" attribute to the "client_dict_cmd" structure,
- pass the pool to the async lookup function,
- use the pool when escaping strings that should survive the callback chain
What do you think?
Regards, kaiyou
On 10/17/2016 09:52 AM, Pierre Jaury wrote:
While trying to isolate properly and reproduce, I was able to trigger the same bug with the following code:
struct dict *dict; char* dict_uri = "proxy::sieve"; char* key = "priv/key"; char* username = "admin@domain.tld"; char* value, error;
dict_drivers_register_builtin(); dict_init(dict_uri, DICT_DATA_TYPE_STRING, username, doveadm_settings->base_dir, &dict, &error); dict_lookup(dict, pool_datastack_create(), key, &value); printf(">%s\n", value); // outputs an empty string dict_deinit(&dict);
I trimmed it to the bare minimal string manipulation functions involved but cannot reproduce in that case:
pool_t pool = pool_datastack_create();
char* s1 = "test\001n\001rtest"; char* s2 = t_str_tabunescape(s1); char* s3 = p_strdup(pool, s2);
printf("1>%s\n", s1); printf("2>%s\n", s2); printf("3>%s\n", s3); // all three output the string with NL and CR
Maybe I am missing a function call in the process or maybe the issue is related to the way unescaping is performed in the async callback function in "dict-client.c", or maybe even some other edge case.
Finally, I was able to run the first snippet without bug by removing the string duplication in "t_str_tabunescape" (which I realize is not a proper solution), or by explicitely using the following pool:
return str_tabunescape(p_strdup(pool_datastack_create(), str));
Hope this helps.
kaiyou
On 10/17/2016 07:51 AM, Aki Tuomi wrote:
Hi!
This does sound like a bug, we'll have look.
Aki
On 17.10.2016 01:26, Pierre Jaury wrote:
I dived a little bit further into the rabbit hole, up to the point where debugging has become unpracticle but I still haven't found the root cause for sure.
I read most of the code for "p_strdup" based on datastack memory pools (which are used for dictionary lookups both with doveadm and by extdata) and it seems ok. Still, after "t_malloc_real" is called in "t_malloc0", the allocated buffer has the same address as the source string.
The only sensible explanation I can come up with is that during unescaping, strings are not allocated properly, leading to the memory pool reusing the string address and zeroing it in the process before the string copy operation.
I will follow on this path tomorrow, any lead is more than welcome.
kaiyou.
On 10/16/2016 11:16 PM, Pierre Jaury wrote:
Hello,
I am using a dict proxy for my sieve extdata plugin to access some fields from an SQLite database (autoreply text and other database-configured items).
All tests are performed against version 2.2.25.
$ dovecot --version 2.2.25 (7be1766)
My configuration looks like:
dict { sieve = sqlite:/etc/dovecot/pigeonhole-sieve.dict }
[...] sieve_extdata_dict_uri = proxy::sieve
I am able to read pretty much any attribute without any issue, except when the value contains a special character like "\r" or "\n". By using the doveadm dict client, I narrowed it down to the dictionary management part (either server, protocol or client).
I was suspecting escaping functions from "lib/strescape.c" (mostly str_tabescape and its counterpart, used by "lib-dict/client.c"), so I monitored socket communications. It seems that escaping is done properly on the server and the socket is not an issue either.
The following strace dump results from running "doveadm dict get" against the dict socket:
connect(8, {sa_family=AF_UNIX, sun_path="..."}, 110) = 0 fstat(8, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0 [...] write(8, "H2\t0\t0\tadmin@domain.tld\tsieve\n", 30) = 30 [...] read(8, "Otest\1r\1ntest\n", 8192) = 14
Indeed "\1r" and "\1n" are the escape sequences used by "lib/strescape.c". I went deeped and debugged the call to "dict_lookup" performed by doveadm. Indeed the client gets the proper string from the socket and to my surprise, it is properly unescaped.
Then, in "client_dict_lookup" ("lib-dict/dict-client.c"), the call to "p_strdup" returns an empty string (null byte set at the target address).
Before the call to the dict "->lookup" attribute (client_dict_lookup):
RAX: 0x7ffff73a37c0 (push r14) RBX: 0x6831b8 ("priv/reply_body") RCX: 0x7fffffffe240 --> 0x682a60 --> 0x6831b8 ("priv/reply_body") RDX: 0x6831b8 ("priv/reply_body") RSI: 0x683288 --> 0x7ffff7653120 --> 0x7ffff73ea620 ([...]) RDI: 0x690ad0 --> 0x7ffff7400713 --> 0x75250079786f7270 ('proxy')
0x7ffff73a1f10
: mov rcx,r11 (value_r) 0x7ffff73a1f13 : mov rdx,r8 (key) 0x7ffff73a1f16 : mov rsi,r10 (pool) 0x7ffff73a1f19 : mov rdi,r9 (dict) 0x7ffff73a1f1c : add rsp,0x8 => 0x7ffff73a1f20 : jmp rax Before the call to p_strdup in "client_dict_lookup":
RSI: 0x6832d8 ("test\r\ntest") (lookup.result.value) RDI: 0x683288 --> 0x7ffff7653120 --> [...] (pool) RAX: 0x0 (result)
0x7ffff73a384f: nop 0x7ffff73a3850: mov rsi,QWORD PTR [rsp+0x8] 0x7ffff73a3855: mov rdi,r14 => 0x7ffff73a3858: call 0x7ffff736d3c0 p_strdup@plt 0x7ffff73a385d: mov QWORD PTR [r13+0x0],rax 0x7ffff73a3861: mov rsi,QWORD PTR [rsp+0x18] 0x7ffff73a3866: xor rsi,QWORD PTR fs:0x28 0x7ffff73a386f: mov eax,ebx
After the call:
0x7ffff73a3850: mov rsi,QWORD PTR [rsp+0x8] 0x7ffff73a3855: mov rdi,r14 0x7ffff73a3858: call 0x7ffff736d3c0 p_strdup@plt => 0x7ffff73a385d: mov QWORD PTR [r13+0x0],rax 0x7ffff73a3861: mov rsi,QWORD PTR [rsp+0x18] 0x7ffff73a3866: xor rsi,QWORD PTR fs:0x28 0x7ffff73a386f: mov eax,ebx 0x7ffff73a3871: jne 0x7ffff73a38da
RSI: 0x0 RDI: 0x6832d8 --> 0x0 RAX: 0x6832d8 --> 0x0 (result)
It is worth noting that I can reproduce the exact same execution flow with a non-multiline result string (lookup.result.value) that is properly copied by "p_strdup" and returned in RAX, then displayed by doveadm.
I am not familiar with the pooling mechanism hidden behind the call to p_strdump and not quite sure why this behaviour is emerging. Maybe I am even miles away from an understanding of the issue here, but it sounds to me like something is wrong in the way "p_strdup" performs the copy.
Hope this helps, kaiyou.