[Dovecot] Panic in Dovecot 1.1.3: index-mail.c: line 1091: assertion failed: (!mail->data.destroying_stream)
Dovecot 1.1.3
Solaris 10 SPARC (Sun Fire T1000)
Compiled with Sun Studio 12 compilers.
Maildir on NFS
Indexes on local disk (UFS).
'dovecot -n' output attached.
IMAP process crashes for certain (many, but not all) users when accessing certain folders (in the example below, it crashes when accessing my INBOX, about 1700 mails). I could access other mailboxes without problems. And a simple telnet to the imap port followed by a login works fine.
Erased the coredumps (they were filling up the system too quickly) so I can't really produce a backtrace right now (had to back down to 1.0.13 again even though that one has another core-dump-generating problem - but that is at least limited to only one specific user so far). Going to set up a separate test server to test things more without disturbing our normal operations...
I started with an empty INDEX directory to make sure there weren't problems with old corrupted indexes.
Output from syslog:
Sep 9 12:34:44 ifm.liu.se dovecot: [ID 107833 mail.info] imap-login: Login: user=<peter>, method=GSSAPI, rip=130.236.160.102, lip=130.236.160.6, secured
Sep 9 12:34:44 ifm.liu.se dovecot: [ID 107833 mail.crit] Panic: IMAP(peter): file index-mail.c: line 1091: assertion failed: (!mail->data.destroying_stream)
Sep 9 12:34:44 ifm.liu.se dovecot: [ID 107833 mail.error] IMAP(peter): Raw backtrace: 0x1000b5188 -> 0x1000605ac -> 0x1000605e4 -> 0x100060734 -> 0x1000657b8 -> 0x100070470 -> 0x1000204e0 -> 0x100020868 -> 0x100016e78 -> 0x10001dab8 -> 0x10001debc -> 0x10001e0a4 -> 0x1000bf4b4 -> 0x1000bea68 -> 0x10002933c -> 0x100014c9c
Sep 9 12:34:48 ifm.liu.se dovecot: [ID 107833 mail.info] auth(default): client out: OK 1 user=peter
Sep 9 12:34:48 ifm.liu.se dovecot: [ID 107833 mail.info] auth(default): passwd(peter,130.236.160.102): lookup
Sep 9 12:34:48 ifm.liu.se dovecot: [ID 107833 mail.info] auth(default): master out: USER 241 peter system_user=peter uid=103 gid=10 home=/home/peter
Sep 9 12:34:48 ifm.liu.se dovecot: [ID 107833 mail.info] imap-login: Login: user=<peter>, method=GSSAPI, rip=130.236.160.102, lip=130.236.160.6, secured
Sep 9 12:34:48 ifm.liu.se dovecot: [ID 107833 mail.crit] Panic: IMAP(peter): file index-mail.c: line 1091: assertion failed: (!mail->data.destroying_stream)
Sep 9 12:34:48 ifm.liu.se dovecot: [ID 107833 mail.error] IMAP(peter): Raw backtrace: 0x1000b5188 -> 0x1000605ac -> 0x1000605e4 -> 0x100060734 -> 0x1000657b8 -> 0x100070470 -> 0x1000204e0 -> 0x100020868 -> 0x100016e78 -> 0x10001dab8 -> 0x10001debc -> 0x10001e0a4 -> 0x1000bf4b4 -> 0x1000bea68 -> 0x10002933c -> 0x100014c9c
- Peter
1.1.3: /etc/dovecot.conf.new
base_dir: /var/run/dovecot/
protocols: imap pop3 imaps pop3s
ssl_ca_file: /ifm/etc/certs/dovecot-CA.cert.pem
ssl_cert_file: /ifm/etc/certs/dovecot-ifm.cert.pem
ssl_key_file: /ifm/etc/certs/dovecot-ifm.key.pem
login_dir: /var/run/dovecot/login
login_executable(default): /ifm/libexec/dovecot/imap-login
login_executable(imap): /ifm/libexec/dovecot/imap-login
login_executable(pop3): /ifm/libexec/dovecot/pop3-login
login_greeting: Welcome to the IFM Dovecot Server.
first_valid_uid: 100
mail_location: maildir:Maildir:INDEX=/var/indexes/%u
mmap_disable: yes
mail_nfs_storage: yes
mail_nfs_index: yes
mbox_write_locks: fcntl
mail_executable(default): /ifm/libexec/dovecot/imap
mail_executable(imap): /ifm/libexec/dovecot/imap
mail_executable(pop3): /ifm/libexec/dovecot/pop3
mail_plugin_dir(default): /ifm/lib/dovecot/imap
mail_plugin_dir(imap): /ifm/lib/dovecot/imap
mail_plugin_dir(pop3): /ifm/lib/dovecot/pop3
imap_client_workarounds(default): outlook-idle delay-newmail
imap_client_workarounds(imap): outlook-idle delay-newmail
imap_client_workarounds(pop3):
pop3_client_workarounds(default):
pop3_client_workarounds(imap):
pop3_client_workarounds(pop3): outlook-no-nuls oe-ns-eoh
namespace:
  type: private
  separator: /
  inbox: yes
  list: yes
  subscriptions: yes
namespace:
  type: private
  separator: /
  prefix: mail/
  hidden: yes
  subscriptions: yes
namespace:
  type: private
  separator: /
  prefix: Mail/
  hidden: yes
  subscriptions: yes
namespace:
  type: private
  separator: /
  prefix: ~/mail/
  hidden: yes
  subscriptions: yes
namespace:
  type: private
  separator: /
  prefix: ~/Mail/
  hidden: yes
  subscriptions: yes
namespace:
  type: private
  separator: /
  prefix: ~%u/mail/
  hidden: yes
  subscriptions: yes
namespace:
  type: private
  separator: /
  prefix: ~%u/Mail/
  hidden: yes
  subscriptions: yes
auth default:
  mechanisms: plain login gssapi
  executable: /ifm/libexec/dovecot/dovecot-auth
  verbose: yes
  debug: yes
  passdb:
    driver: pam
  userdb:
    driver: passwd
On Tue, 2008-09-09 at 13:23 +0200, Peter Eriksson wrote:
Maildir on NFS
This is the first time I've heard this happening with maildir. It's always been with mboxes before.
IMAP process crashes for certain (many, but not all) users when accessing certain folders (in the example below, it crashes when accessing my INBOX, about 1700 mails). I could access other mailboxes without problems. And a simple telnet to the imap port followed by a login works fine.
Any idea what the users were doing when it crashed? Was it just opening the mailbox or opening some mail or deleting/copying mails?
Erased the coredumps (they were filling up the system too quickly) so I can't really produce a backtrace right now (had to back down to 1.0.13 again even though that one has another core-dump-generating problem - but that is at least limited to only one specific user so far). Going to set up a separate test server to test things more without disturbing our normal operations...
Some kind of a way to reproduce this would be helpful. Or I guess in your case even a backtrace, since all the previous ones have been with COPY command and mbox.
mmap_disable: yes mail_nfs_index: yes
BTW. these can be "no" if you're storing indexes locally.
Timo Sirainen wrote:
IMAP process crashes for certain (many, but not all) users when accessing certain folders (in the example below, it crashes when accessing my INBOX, about 1700 mails). I could access other mailboxes without problems. And a simple telnet to the imap port followed by a login works fine.
Any idea what the users were doing when it crashed? Was it just opening the mailbox or opening some mail or deleting/copying mails?
I have now set up the isolated test server so that I can test this a bit better. It crashes when I start up a fresh Dovecot instance, start up Thunderbird and then click on the INBOX on that server. (I rsync-copied the whole tree with my mails to a separate, local directory so I can test things without disturbing my normal mailbox.)
I have noticed one more thing since I sent that mail, though: I didn't have debug info in the binaries (originally compiled with "-fast -m64" for aggressive machine-specific optimization), so I recompiled with "-g" added to the compiler flags and retested. Same problem.
Then I removed all optimization flags (compiling with only "-g -m64") and now it stopped crashing... So my current theory is that something in that code gets optimized wrongly (or that optimization exposes some bug) for some reason.
Going to try some variants of optimization flags and compilers (have some patches for the Sun Studio 12 compilers that I'm going to apply too) and see if I can narrow things down a bit more.
Erased the coredumps (they were filling up the system too quickly) so I can't really produce a backtrace right now (had to back down to 1.0.13 again even though that one has another core-dump-generating problem - but that is at least limited to only one specific user so far). Going to set up a separate test server to test things more without disturbing our normal operations...
Some kind of a way to reproduce this would be helpful. Or I guess in your case even a backtrace, since all the previous ones have been with COPY command and mbox.
I'll send a backtrace in a little while.
mmap_disable: yes mail_nfs_index: yes
BTW. these can be "no" if you're storing indexes locally.
Yeah, I know. It was just a leftover from the original server. I originally stored the indexes in the Maildir folder, but decided to store them locally while testing 1.1.3 so I wouldn't disturb the ones generated by 1.0.13 in case I had to go back (which I did).
- Peter
Timo Sirainen wrote:
On Tue, 2008-09-09 at 13:23 +0200, Peter Eriksson wrote:
Maildir on NFS
This is the first time I've heard this happening with maildir. It's always been with mboxes before.
IMAP process crashes for certain (many, but not all) users when accessing certain folders (in the example below, it crashes when accessing my INBOX, about 1700 mails). I could access other mailboxes without problems. And a simple telnet to the imap port followed by a login works fine.
Any idea what the users were doing when it crashed? Was it just opening the mailbox or opening some mail or deleting/copying mails?
I can trigger this issue by the following sequence of IMAP commands:
telnet dovecot imap
...
0 login peter <super-duper-secret-password-goes-here>
1 select inbox
2 fetch 1 body.peek[]
After the fetch command I will get the message sent to me and then the connection goes away. ("2 OK Fetch completed." is _not_ printed)
A debugger attached to the imap process after the initial login has been handled gives the following traceback:
(dbx) where
[1] __lwp_kill(0x0, 0x6, 0x1000e7934, 0x1a1a80, 0x0, 0x0), at 0xffffffff7ebd40a4
[2] raise(0x6, 0x0, 0x1000e8338, 0xffffffffffffffff, 0xffffffff7ecec000, 0x0), at 0xffffffff7eb71110
[3] abort(0x1, 0x1b8, 0x1000e7934, 0x1a1a80, 0x0, 0x0), at 0xffffffff7eb4a68c
=>[4] i_internal_fatal_handler(type = ???, status = ???, fmt = ???, args = ???) (optimized), at 0x1000e8338 (line ~150) in "failures.c"
[5] i_panic(format = ???, ... = ???, ...) (optimized), at 0x1000e7934 (line ~197) in "failures.c"
[6] index_mail_close(_mail = ???) (optimized), at 0x10008180c (line ~1091) in "index-mail.c"
[7] index_mail_free(_mail = ???) (optimized), at 0x100081d78 (line ~1279) in "index-mail.c"
[8] mail_free(mail = ???) (optimized), at 0x100091214 (line ~18) in "mail.c"
[9] imap_fetch_deinit(ctx = ???) (optimized), at 0x10002cc14 (line ~392) in "imap-fetch.c"
[10] cmd_fetch(cmd = ???) (optimized), at 0x10001c5e0 (line ~74) in "cmd-fetch.c"
[11] client_command_input(cmd = ???) (optimized), at 0x100028890 (line ~580) in "client.c"
[12] client_handle_input(client = ???) (optimized), at 0x100028fe0 (line ~670) in "client.c"
[13] client_input(client = ???) (optimized), at 0x1000291cc (line ~725) in "client.c"
[14] io_loop_handler_run(ioloop = ???) (optimized), at 0x1000f2614 (line ~204) in "ioloop-poll.c"
[15] io_loop_run(ioloop = ???) (optimized), at 0x1000f1bb8 (line ~320) in "ioloop.c"
[16] main(argc = ???, argv = ???, envp = ???) (optimized), at 0x1000391b8 (line ~293) in "main.c"
(dbx) frame 6
Current function is index_mail_close (optimized)
 1091               i_assert(!mail->data.destroying_stream);
(dbx) list 1085
 1085       }
(dbx) list
 1086       if (mail->data.filter_stream != NULL)
 1087               i_stream_unref(&mail->data.filter_stream);
 1088       if (mail->data.stream != NULL) {
 1089               mail->data.destroying_stream = TRUE;
 1090               i_stream_unref(&mail->data.stream);
 1091               i_assert(!mail->data.destroying_stream);
 1092       }
 1093   }
 1094
 1095   static void index_mail_reset(struct index_mail *mail)
Some kind of a way to reproduce this would be helpful. Or I guess in your case even a backtrace, since all the previous ones have been with COPY command and mbox.
I've tried a couple of different compilers and optimization settings now and the results are as follows:
Sun Studio 12:
  -g -fast -m64    FAILS
  -g -O -m64       FAILS
  -g -m64          WORKS
  -fast -m64       FAILS
  -g -O -m32       WORKS
GCC 4.3.0:
  -g -O -m64       FAILS
  -g -m64          WORKS
  -g -O -m32       WORKS
I.e., optimized 64-bit code fails; the other variants work.
Hmm.. Perhaps some 64 vs 32 bits issue somewhere? Pointers passed as 32 bit ints due to missing prototypes somewhere?
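To illustrate the kind of failure suspected here: in a 64-bit build pointers are 64 bits wide while int stays 32 bits, so a pointer that passes through an int (for example via a function called without a prototype) loses its upper half. The sketch below only simulates that narrowing with fixed-width integers; the helper names are hypothetical and none of this is Dovecot code. The sample values are addresses that appear in the debugger output in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Simulates a 64-bit pointer value being squeezed through a 32-bit int.
 * Hypothetical helper for illustration only; not Dovecot code. */
static uint64_t through_32bit_int(uint64_t ptr_value)
{
	uint32_t narrowed = (uint32_t)ptr_value; /* upper 32 bits are lost */
	return (uint64_t)narrowed;
}

/* Returns 1 if the value survives the round trip, 0 if it is corrupted. */
static int survives_32bit_round_trip(uint64_t ptr_value)
{
	return through_32bit_int(ptr_value) == ptr_value;
}

/* Small demonstration using addresses seen in the dbx sessions. */
void demo_pointer_truncation(void)
{
	/* A heap address from the 64-bit process lies above 4 GB. */
	assert(!survives_32bit_round_trip(0x100253a30ULL));
	/* A stack address from the 32-bit process fits in 32 bits. */
	assert(survives_32bit_round_trip(0xffbff288ULL));
}
```

This would explain a crash that only shows up in 64-bit builds, but it would not by itself explain why unoptimized 64-bit builds work.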
I saw the other mail regarding imap problems (subject: imap crashes with SIGSEGV). The machine I'm running this on is a Sun Fire T1000 that has a CPU with 6 cores and 4 threads per core, i.e. it looks like a 24-CPU multicore machine. Perhaps a related issue?
- Peter
Some more debugging info that might be useful (or might not, but I figure I'd include it here anyway):
I started the 'imap' process manually from within Sun's 'dbx' debugger and enabled memory checking:
setenv MAIL maildir:/home/peter/Maildir
dbx /ifm/pkg/dovecot/1.1.3-cc-32-debug-opt/libexec/dovecot/imap
...
(dbx) check -all
access checking - ON
memuse checking - ON
(dbx) run
Running: imap
(process id 8574)
...
RTC: Enabling Error Checking...
RTC: Running program...
* PREAUTH [CAPABILITY IMAP4rev1 SASL-IR SORT THREAD=REFERENCES MULTIAPPEND UNSELECT LITERAL+ IDLE CHILDREN NAMESPACE LOGIN-REFERRALS UIDPLUS LIST-EXTENDED I18NLEVEL=1] Logged in as peter
1 select inbox
Read from uninitialized (rui):
Attempting to read 4 bytes at address 0xffbff288
    which is 216 bytes above the current stack pointer
stopped in maildir_open (optimized) at line 429 in file "maildir-storage.c"
  429   if (stat(t_strconcat(path, "/dovecot-shared", NULL), &st) == 0) {
Might be a false alarm, but might be worth checking out anyway.
(This is when running a 32 bit version which seems to work just fine normally).
I'm having some problems starting the 64 bit versions under the debugger currently (go figure)...
- Peter
Another thing I just noticed (but you are probably already aware of it):
There seem to be a number of places in 'index-mail.c' that store 'time_t' values in 'uint32_t' variables.
This might cause problems since 'time_t' is 64 bits on 64-bit Solaris systems... (It will definitely cause some funny behaviour in the future when time_t values no longer fit inside 32-bit ints :-)
Btw - If I compile just index-mail.c without optimizations then things seem to work fine.
- Peter
Peter Eriksson wrote:
Another thing I just noticed (but you are probably already aware of it):
A last thing... I did some debugger tracing of the calls to i_stream_unref and printed the arguments (see the attached file)
It seems the *stream in these two calls to i_stream_unref references the same 'stream'.
The second call to i_stream_unref is the one wrapped with the 'data.destroying_stream' check that triggers the assert.
I inserted a breakpoint at the index_mail_stream_destroy_callback function and that one actually gets called correctly with 'mail->data.destroying_stream' set to '1'. If I then check the '_mail' structure contents back in the i_stream_unref function it is correctly set to '0'.
But the i_assert() call on line 1091 still triggers for some unknown reason...
My guess:
Optimizer incorrectly assuming that it doesn't need to refetch the variable value from the structure since it doesn't understand that the i_stream_unref(&mail->data.stream) call actually modifies the whole mail->data structure...
Funny that both GCC and Sun Studio seem to make the same assumption in that case :-)
mail->data.destroying_stream = TRUE;
i_stream_unref(&mail->data.stream);
i_assert(!mail->data.destroying_stream);
- Peter
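The pattern described in this message can be sketched in isolation. The code below is a simplified stand-in, not the Dovecot source: the type names, field layout and helper functions are assumptions made for illustration. It shows why the flag has to be re-read after the unref call: the callee only receives a pointer to the stream member, but the destroy callback reaches the flag through a separately stored context pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the Dovecot types. */
struct stream {
	int refcount;
	void (*destroy_callback)(void *context);
	void *destroy_context;
};

struct mail_data {
	struct stream *stream;
	unsigned int destroying_stream:1;
};

/* Clears the flag through the context pointer, not through the pointer
 * that was passed to stream_unref(). */
static void stream_destroy_callback(void *context)
{
	struct mail_data *data = context;

	data->destroying_stream = 0;
}

/* Drops one reference and runs the destroy callback on the last one. */
static void stream_unref(struct stream **stream)
{
	struct stream *s = *stream;

	assert(s != NULL && s->refcount > 0);
	*stream = NULL;
	if (--s->refcount == 0 && s->destroy_callback != NULL)
		s->destroy_callback(s->destroy_context);
}

/* Mirrors the three lines quoted from index_mail_close(). Returns the flag
 * as seen after the unref; a conforming compiler must reload it. */
static int close_stream(struct mail_data *data)
{
	data->destroying_stream = 1;
	stream_unref(&data->stream);
	return data->destroying_stream;
}

/* Builds one stream with a single reference and closes it. */
static int run_example(void)
{
	struct stream s = { 1, stream_destroy_callback, NULL };
	struct mail_data data = { &s, 0 };

	s.destroy_context = &data;
	return close_stream(&data);
}
```

If run_example() returns 0, the compiler reloaded the flag after the call. A nonzero result would correspond to the stale value that trips the assertion in this thread.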
Forgot to attach the dbx output... *Sigh*
Anyway, here it is.
- Peter
(dbx) cont
stopped in i_stream_unref (optimized) at line 20 in file "istream.c"
20 {
(dbx) print stream
stream = 0x100253a30
(dbx) print *stream
*stream = 0x1002b4290
(dbx) print **stream
**stream = {
v_offset = 3961U
stream_errno = 0
mmaped = 0
blocking = 1U
closed = 0
seekable = 1U
eof = 0
real_stream = 0x1002b4240
}
(dbx) cont
stopped in i_stream_unref (optimized) at line 20 in file "istream.c"
20 {
(dbx) print stream
stream = 0x1002692b0
(dbx) print *stream
*stream = 0x1002b4290
(dbx) print **stream
**stream = {
v_offset = 3961U
stream_errno = 0
mmaped = 0
blocking = 1U
closed = 0
seekable = 1U
eof = 0
real_stream = 0x1002b4240
}
(dbx) print *((**stream).real_stream)
*(**stream).real_stream = {
iostream = {
refcount = 1
close = 0x1000eff48 = &`imap`istream-file.c`i_stream_file_close(struct iostream_private *stream)
destroy = 0x1000effa4 = &`imap`istream-file.c`i_stream_file_destroy(struct iostream_private *stream)
set_max_buffer_size = 0x1000ee4f8 = &`imap`istream.c`i_stream_default_set_max_buffer_size(struct iostream_private *stream, register size_t max_size)
destroy_callback = 0x100080e00 = &`imap`index-mail.c`index_mail_stream_destroy_callback(struct index_mail *mail)
destroy_context = 0x100269128
}
read = 0x1000effd4 = &`imap`istream-file.c`i_stream_file_read(struct istream_private *stream)
seek = 0x1000f0270 = &`imap`istream-file.c`i_stream_file_seek(struct istream_private *stream, register uoff_t v_offset, bool mark)
sync = 0x1000f02e8 = &`imap`istream-file.c`i_stream_file_sync(struct istream_private *stream)
stat = 0x1000f0328 = &`imap`istream-file.c`i_stream_file_stat(struct istream_private *stream, bool exact)
istream = {
v_offset = 3961U
stream_errno = 0
mmaped = 0
blocking = 1U
closed = 0
seekable = 1U
eof = 0
real_stream = 0x1002b4240
}
fd = 9
abs_start_offset = 0
statbuf = {
st_dev = 0
st_ino = 0
st_mode = 0
st_nlink = 0
st_uid = 0
st_gid = 0
st_rdev = 0
st_size = -1
st_atim = {
tv_sec = 1221059662
tv_nsec = 0
}
st_mtim = {
tv_sec = 1221059662
tv_nsec = 0
}
st_ctim = {
tv_sec = 1221059662
tv_nsec = 0
}
st_blksize = 0
st_blocks = 0
st_fstype = ""
}
buffer = 0x1002b43a0 "Return-Path: <ug-swosug-bounces@opensolaris.org>\nReceived: from mail.opensolaris.org (oss-mail1.opensolaris.org [72.5.123.71])\n^Iby mailgw.ifm.liu.se (8.13.6/8.13.6) with ESMTP id l2N8cXJg022066\n^Ifor <peter@ifm.liu.se>; Fri, 23 Mar 2007 09:38:35 +0100 (MET)\nReceived: from oss-mail1.opensolaris.org (localhost [127.0.0.1])\n^Iby mail.opensolaris.org (Postfix) with ESMTP id 476273EE67;\n^IFri, 23 Mar 2007 00:38:25 -0800 (PST)\nX-Original-To: ug-swosug@opensolaris.org\nDelivered-To: ug-swosug@opensolaris.org\nReceived:" ...
w_buffer = 0x1002b43a0 "Return-Path: <ug-swosug-bounces@opensolaris.org>\nReceived: from mail.opensolaris.org (oss-mail1.opensolaris.org [72.5.123.71])\n^Iby mailgw.ifm.liu.se (8.13.6/8.13.6) with ESMTP id l2N8cXJg022066\n^Ifor <peter@ifm.liu.se>; Fri, 23 Mar 2007 09:38:35 +0100 (MET)\nReceived: from oss-mail1.opensolaris.org (localhost [127.0.0.1])\n^Iby mail.opensolaris.org (Postfix) with ESMTP id 476273EE67;\n^IFri, 23 Mar 2007 00:38:25 -0800 (PST)\nX-Original-To: ug-swosug@opensolaris.org\nDelivered-To: ug-swosug@opensolaris.org\nReceived:" ...
buffer_size = 4096U
max_buffer_size = 8192U
skip = 3961U
pos = 3961U
parent = (nil)
parent_start_offset = 0
line_str = (nil)
}
...
(dbx) print (*(struct index_mail *)_mail)->data.destroying_stream
(*((struct index_mail *) _mail)).data.destroying_stream = 0
(dbx) where
[1] __lwp_kill(0x0, 0x6, 0x1000e7934, 0x1a1a80, 0x0, 0x0), at 0xffffffff7ebd40a4
[2] raise(0x6, 0x0, 0x1000e8338, 0xffffffffffffffff, 0xffffffff7ecec000, 0x0), at 0xffffffff7eb71110
[3] abort(0x1, 0x1b8, 0x1000e7934, 0x1a1a80, 0x0, 0x0), at 0xffffffff7eb4a68c
[4] i_internal_fatal_handler(type = ???, status = ???, fmt = ???, args = ???) (optimized), at 0x1000e8338 (line ~150) in "failures.c"
[5] i_panic(format = ???, ... = ???, ...) (optimized), at 0x1000e7934 (line ~197) in "failures.c"
=>[6] index_mail_close(_mail = ???) (optimized), at 0x10008180c (line ~1091) in "index-mail.c"
[7] index_mail_free(_mail = ???) (optimized), at 0x100081d78 (line ~1279) in "index-mail.c"
[8] mail_free(mail = ???) (optimized), at 0x100091214 (line ~18) in "mail.c"
[9] imap_fetch_deinit(ctx = ???) (optimized), at 0x10002cc14 (line ~392) in "imap-fetch.c"
[10] cmd_fetch(cmd = ???) (optimized), at 0x10001c5e0 (line ~74) in "cmd-fetch.c"
[11] client_command_input(cmd = ???) (optimized), at 0x100028890 (line ~580) in "client.c"
[12] client_handle_input(client = ???) (optimized), at 0x100028fe0 (line ~670) in "client.c"
[13] client_input(client = ???) (optimized), at 0x1000291cc (line ~725) in "client.c"
[14] io_loop_handler_run(ioloop = ???) (optimized), at 0x1000f2614 (line ~204) in "ioloop-poll.c"
[15] io_loop_run(ioloop = ???) (optimized), at 0x1000f1bb8 (line ~320) in "ioloop.c"
[16] main(argc = ???, argv = ???, envp = ???) (optimized), at 0x1000391b8 (line ~293) in "main.c"
On Wed, 2008-09-10 at 15:46 +0200, Peter Eriksson wrote:
Attempting to read 4 bytes at address 0xffbff288 which is 216 bytes above the current stack pointer stopped in maildir_open (optimized) at line 429 in file "maildir-storage.c" 429 if (stat(t_strconcat(path, "/dovecot-shared", NULL), &st) == 0) {
Might be a false alarm, but might be worth checking out anyway.
I don't see how that code could be broken. It also goes fine through valgrind.
There seems to be a number of places in 'index-mail.c' that stores 'time_t' values in 'uint32_t' variables.
This might cause problems since 'time_t' is 64 bit on 64 bit Solaris systems... (Definitely will cause some funny behaviour in the future when time_t values won't fit inside 32 bits ints :-)
It'll fit for the next 97 years. And I doubt it'll be a problem then anymore.
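The arithmetic behind that estimate can be checked with a short sketch. It assumes the 32-bit field is unsigned and counts seconds since 1970; the helper names are hypothetical and not taken from Dovecot. Fixed-width integers are used instead of time_t so the result does not depend on the platform.

```c
#include <assert.h>
#include <stdint.h>

/* Average Gregorian year in seconds (365.2425 days). */
#define SECONDS_PER_YEAR 31556952LL

/* Stores a 64-bit timestamp in a 32-bit field. Hypothetical helper,
 * not Dovecot code. */
static uint32_t store_time32(int64_t t)
{
	return (uint32_t)t; /* keeps only the low 32 bits */
}

/* Returns 1 if the timestamp survives being stored in 32 bits. */
static int fits_in_uint32(int64_t t)
{
	return (int64_t)store_time32(t) == t;
}

/* Approximate calendar year in which an unsigned 32-bit count of seconds
 * since 1970 runs out. */
static int last_uint32_year(void)
{
	return 1970 + (int)(UINT32_MAX / SECONDS_PER_YEAR);
}

/* Small demonstration with a timestamp from the dbx output in this thread. */
void demo_time32(void)
{
	assert(fits_in_uint32(1221059662LL));  /* September 2008 */
	assert(!fits_in_uint32(4294967296LL)); /* one past UINT32_MAX */
	assert(last_uint32_year() == 2106);
}
```

Counted from 2008, the year 2106 is roughly the 97 years mentioned above.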
Optimizer incorrectly assuming that it doesn't need to refetch the variable value from the structure since it doesn't understand that the i_stream_unref(&mail->data.stream) call actually modifies the whole mail->data structure...
Funny that both Gcc and Sun Studio seems to make the same assumptions in that case :-)
That is interesting.. I don't think I'm doing anything wrong there though. I'll see if I can reproduce this and then try to reduce the test case needed to catch this and then ask gcc people if it's really a bug.
Timo Sirainen wrote:
On Wed, 2008-09-10 at 15:46 +0200, Peter Eriksson wrote:
There seems to be a number of places in 'index-mail.c' that stores 'time_t' values in 'uint32_t' variables.
This might cause problems since 'time_t' is 64 bit on 64 bit Solaris systems... (Definitely will cause some funny behaviour in the future when time_t values won't fit inside 32 bits ints :-)
It'll fit for the next 97 years. And I doubt it'll be a problem then anymore.
That's how the Y2K bug started.:-)
-- Eduardo M Kalinowski eduardo@kalinowski.com.br
Optimizer incorrectly assuming that it doesn't need to refetch the variable value from the structure since it doesn't understand that the i_stream_unref(&mail->data.stream) call actually modifies the whole mail->data structure...
Funny that both Gcc and Sun Studio seems to make the same assumptions in that case :-)
That is interesting.. I don't think I'm doing anything wrong there though. I'll see if I can reproduce this and then try to reduce the test case needed to catch this and then ask gcc people if it's really a bug.
I'm getting more and more sure that this is a C pointer aliasing optimization problem that occurs since the compiler thinks that the i_stream_unref() function doesn't modify the mail->data.destroying_stream flag (not surprising since i_stream_unref is passed a pointer to &mail->data.stream so it looks like it only modifies other parts of the mail->data structure).
Today I can't seem to reproduce it with the GCC compiler (only with the Sun Studio compiler). *Mind boggles*. What the...
Anyway, I've set up a small test case that tries to mimic the code in question (attached). Build with either of the following commands:
make gcc-64-alias
make gcc-64-no-alias
make gcc-32-alias
make gcc-32-no-alias
make cc-64-alias
make cc-64-no-alias
make cc-32-alias
make cc-32-no-alias
Each target generates two binaries, t.all and t.sep. The difference is that t.sep is built from separately compiled source files, whereas t.all is a single compile of all the .c files concatenated (to allow the compiler to do better aliasing analysis).
For me, t.sep built by cc-64-alias and cc-32-alias fails; all the others work.
Btw, if I rebuild Dovecot 1.1.3 with Sun Studio 12 and then rebuild just src/lib-storage/index/index-mail.c with "-xalias_level=any", so the compiler can't assume that the unref call leaves the destroying_stream flag untouched, then things work (no assert triggered).
What worries me a bit is whether other parts of Dovecot use similar code that might trigger the same problem again in the future. Perhaps one should compile all the files in Dovecot with "-xalias_level=any" when using the Studio compilers, just in case. Hmmm...
- Peter
On Thu, 2008-09-11 at 15:13 +0200, Peter Eriksson wrote:
Anyway, I've set up a small test case that tries to mimic the code in question (attached). Build with either of the following commands:
I simplified it further, attached. The interesting thing is that this bug shows up only when destroyed is a bitfield. For example:
unsigned char destroyed;       -> ok
unsigned int destroyed;        -> ok
unsigned int destroyed:1;      -> fail
unsigned int destroyed:8;      -> fail
unsigned int destroyed:31;     -> fail
unsigned int destroyed:32;     -> ok
This feels like a compiler bug to me. Any ideas where this could be reported to? :)
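The simplified test case attached to this message is not reproduced in the archive. The sketch below is a hypothetical reconstruction of what such a reduction might look like, not the attached file. The FLAG_WIDTH macro and every name in it are assumptions: setting FLAG_WIDTH to 0 selects a plain unsigned int, and any other value selects a bitfield of that width, which covers most of the declarations in the table above.

```c
#include <assert.h>
#include <stddef.h>

/* FLAG_WIDTH selects the declaration under test (hypothetical macro):
 * 0 = plain `unsigned int`, otherwise a bitfield of that many bits. */
#ifndef FLAG_WIDTH
#define FLAG_WIDTH 1
#endif

struct obj {
	void *stream;
#if FLAG_WIDTH == 0
	unsigned int destroyed;
#else
	unsigned int destroyed:FLAG_WIDTH;
#endif
};

/* The object whose flag the callback clears; set before calling unref(). */
static struct obj *destroy_context;

static void clear_flag(struct obj *o)
{
	o->destroyed = 0;
}

/* Volatile function pointer, so the call below stays indirect and the
 * compiler cannot see what the callee modifies. */
static void (*volatile destroy_cb)(struct obj *) = clear_flag;

/* Receives only a pointer to the stream member, like i_stream_unref(). */
static void unref(void **stream)
{
	assert(stream != NULL);
	*stream = NULL;
	destroy_cb(destroy_context);
}

/* Returns the flag as read after unref(); 0 means it was reloaded. */
static int flag_after_unref(void)
{
	static int dummy_stream;
	struct obj o = { &dummy_stream, 0 };

	destroy_context = &o;
	o.destroyed = 1;
	unref(&o.stream);
	return o.destroyed;
}
```

Building this with, say, -DFLAG_WIDTH=0, 1, 8, 31 and 32 at different optimization levels and checking that flag_after_unref() returns 0 would be one way to compare the variants. Whether this particular reduction triggers the miscompilation on the affected compilers has not been verified.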
What worries me a bit is whether other parts of Dovecot use similar code that might trigger the same problem again in the future.
Possibly.
Participants (3): Eduardo M Kalinowski, Peter Eriksson, Timo Sirainen