[Dovecot] imap process consuming 100% CPU (Dovecot 1.0.3)
Hi,
I have yet another problem with Dovecot: sometimes (rarely, maybe once every few days) one of the imap processes will 'hang', consuming all available CPU time. It does not seem to 'finish' in any reasonable amount of time (in one instance I waited a few days). The process will not even exit gracefully; it has to be killed with 'kill -9 <PID>'.
It has just happened again. The output from gdb is:
# gdb attach 29401
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i586-suse-linux"...attach: No such file or directory.
Attaching to process 29401
Reading symbols from /usr/local/libexec/dovecot/imap...done.
Using host libthread_db library "/lib/tls/libthread_db.so.1".
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
0x0806049d in imap_sync_more (ctx=0x80d9770) at imap-sync.c:104
104             if (ctx->seq == 0) {
(gdb) bt full
#0  0x0806049d in imap_sync_more (ctx=0x80d9770) at imap-sync.c:104
        flags = 1075042320
        keywords = <value optimized out>
        str = (string_t *) 0x80c31e0
        ret = 1
#1  0x08060797 in cmd_sync_continue (cmd=0x80ccdbc) at imap-sync.c:207
        ctx = (struct cmd_sync_context *) 0x80cf000
        ret = <value optimized out>
#2  0x0805ad5a in _client_output (context=0x80ccd78) at client.c:465
        cmd = (struct client_command_context *) 0x80ccdbc
        ret = 0
#3  0x080af0bd in stream_send_io (context=0x80cceb0) at ostream-file.c:356
        ostream = (struct ostream *) 0x80ccee0
        ret = <value optimized out>
#4  0x080aa441 in io_loop_handler_run (ioloop=0x80cb9b0) at ioloop-epoll.c:203
        ctx = (struct ioloop_handler_context *) 0x80cb9d8
        list = (struct io_list *) 0x80d7780
        io = (struct io *) 0x80dc430
        tv = {tv_sec = 2, tv_usec = 540048}
        events_count = 4
        t_id = 2
        msecs = <value optimized out>
        ret = 1
        i = 0
        j = 1
        call = <value optimized out>
#5  0x080a9889 in io_loop_run (ioloop=0x80cb9b0) at ioloop.c:329
No locals.
#6  0x08062087 in main (argc=1, argv=0xbffffaf4, envp=0xbffffafc) at main.c:290
        plugin_dir = <value optimized out>
The mail log does not reveal anything immediately interesting.
Thoughts?
Server config ('dovecot -n -c /etc/dovecot/dovecot.conf'):
# 1.0.3: /etc/dovecot/dovecot.conf
base_dir: /var/run/dovecot/
protocols: imap imaps pop3 pop3s
ssl_ca_file: /etc/ssl/current/cacert.pem
ssl_cert_file: /etc/ssl/current/server.pem
ssl_key_file: /etc/ssl/current/private.pem
disable_plaintext_auth: no
login_dir: /var/run/dovecot//login
login_executable(default): /usr/local/libexec/dovecot/imap-login
login_executable(imap): /usr/local/libexec/dovecot/imap-login
login_executable(pop3): /usr/local/libexec/dovecot/pop3-login
first_valid_uid: 50
mail_debug: yes
mail_drop_priv_before_exec: yes
mail_executable(default): /usr/local/libexec/dovecot/imap
mail_executable(imap): /usr/local/libexec/dovecot/imap
mail_executable(pop3): /usr/local/libexec/dovecot/pop3
mail_plugin_dir(default): /usr/local/lib/dovecot/imap
mail_plugin_dir(imap): /usr/local/lib/dovecot/imap
mail_plugin_dir(pop3): /usr/local/lib/dovecot/pop3
mail_log_prefix: %Us(%u)(%p):
pop3_uidl_format(default):
pop3_uidl_format(imap):
pop3_uidl_format(pop3): %08Xu%08Xv
auth default:
  debug_passwords: yes
  passdb:
    driver: passwd-file
    args: /etc/dovecot/passwd
  userdb:
    driver: passwd-file
    args: /etc/dovecot/passwd
  socket:
    type: listen
    client:
      path: /var/spool/postfix/private/auth
      mode: 432
      user: postfix
      group: postfix
-- Best regards, Robert Tomanek mailto:rth23@poczta.onet.pl
Hi,
Sunday, September 2, 2007, 22:42:37, you wrote:
0x0806049d in imap_sync_more (ctx=0x80d9770) at imap-sync.c:104
104             if (ctx->seq == 0) {
A short follow-up on this, looks like an infinite loop to me, unless some threading magic is supposed to happen here:
(gdb) step
120                     if (ctx->sync_rec.seq1 > ctx->messages_count)
(gdb)
104             if (ctx->seq == 0) {
(gdb)
117             if (ctx->sync_rec.seq2 > ctx->messages_count) {
(gdb)
120                     if (ctx->sync_rec.seq1 > ctx->messages_count)
(gdb)
104             if (ctx->seq == 0) {
(gdb)
117             if (ctx->sync_rec.seq2 > ctx->messages_count) {
(gdb)
120                     if (ctx->sync_rec.seq1 > ctx->messages_count)
(gdb) print ctx->seq
$1 = 8317
(gdb) print ctx->sync_rec.seq1
$2 = 8318
(gdb) print ctx->sync_rec.seq2
$3 = 8320
(gdb) print ctx->messages_count
$4 = 8317
The corresponding code in imap-sync.c is:

103:    for (;;) {
104:            if (ctx->seq == 0) {
105:                    /* get next one */
106:                    ret = mailbox_sync_next(ctx->sync_ctx,
107:                                            &ctx->sync_rec);
108:                    if (ret <= 0) {
109:                            if (ret == 0) {
110:                                    /* all finished ok */
111:                                    ret = 1;
112:                            }
113:                            break;
114:                    }
115:            }
116:
117:            if (ctx->sync_rec.seq2 > ctx->messages_count) {
118:                    /* don't send change notifications of messages we
119:                       haven't even announced to client yet */
120:                    if (ctx->sync_rec.seq1 > ctx->messages_count)
121:                            continue;
122:                    ctx->sync_rec.seq2 = ctx->messages_count;
123:            }
...
185:    }
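To spell out why this looks like an infinite loop: ctx->seq is 8317, i.e. non-zero, so the mailbox_sync_next() call at line 106 is never reached again; and since sync_rec.seq1 (8318) is greater than messages_count (8317), the continue at line 121 is taken on every pass without modifying any of the values being tested. Here is a minimal standalone sketch of that control flow, using the values from the gdb session above (variable names simplified; this is an illustration, not the actual Dovecot code):

/* sketch.c -- reproduces the suspected non-terminating loop shape */
#include <stdio.h>

int main(void)
{
        unsigned int seq = 8317;            /* ctx->seq */
        unsigned int seq1 = 8318;           /* ctx->sync_rec.seq1 */
        unsigned int seq2 = 8320;           /* ctx->sync_rec.seq2 */
        unsigned int messages_count = 8317; /* ctx->messages_count */
        unsigned long spins = 0;

        for (;;) {
                if (++spins > 1000000) {    /* cap added for the demo only */
                        printf("still spinning after %lu iterations\n",
                               spins - 1);
                        break;
                }
                if (seq == 0) {
                        /* only here would the next sync record be fetched */
                }
                if (seq2 > messages_count) {
                        if (seq1 > messages_count)
                                continue;   /* nothing this tests ever changes */
                        seq2 = messages_count;
                }
                break;                      /* normal processing would follow */
        }
        return 0;
}

Unless something outside the loop is supposed to update messages_count or reset ctx->seq asynchronously, I do not see how this loop can ever exit.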
-- Best regards, Robert Tomanek mailto:rth23@poczta.onet.pl
On Mon, 2007-09-03 at 12:37 +0200, Robert Tomanek wrote:
Hi,
Sunday, September 2, 2007, 22:42:37, you wrote:
0x0806049d in imap_sync_more (ctx=0x80d9770) at imap-sync.c:104
104             if (ctx->seq == 0) {
A short follow-up on this, looks like an infinite loop to me, unless some threading magic is supposed to happen here:
Hi,
Saturday, September 8, 2007, 2:29:45, you wrote:
0x0806049d in imap_sync_more (ctx=0x80d9770) at imap-sync.c:104
104             if (ctx->seq == 0) {
A short follow-up on this, looks like an infinite loop to me, unless some threading magic is supposed to happen here:
Compiled the new version, so far so good...
Thanks!
-- Best regards, Robert Tomanek mailto:rth23@poczta.onet.pl
On Mon, 2007-09-03 at 12:37 +0200, Robert Tomanek wrote:
Hi,
Sunday, September 2, 2007, 22:42:37, you wrote:
0x0806049d in imap_sync_more (ctx=0x80d9770) at imap-sync.c:104
104             if (ctx->seq == 0) {
A short follow-up on this, looks like an infinite loop to me, unless some threading magic is supposed to happen here:
We are also seeing imap processes consuming 100% CPU. We have installed Dovecot v1.0.5 and still see the problem. I have a user who can pretty reliably reproduce it: it occurs each morning after the imap connection has been idling all night. The user is running Thunderbird 2.0.0.6 (I think) on Solaris. On the client side, he sees an hourglass that never goes away and has to shut down Thunderbird.
Below is a log showing how the CPU is being used (the ps commands were run one right after another) and a dbx trace of the state of the imap process. I'm hoping that's enough info to figure out what's happening.
Thanks much, Jackie
Jackie Hunt
ACNS Voice: (970) 663-3789
Colorado State University FAX: (970) 491-1958
Fort Collins, CO 80523 Email: jackie.hunt@colostate.edu
$ ps -ef | grep -i xxxxxx | grep imap
  xxxxxx 43004 92942  74 07:15:33      -  38:54 imap
$ ps -ef | grep -i xxxxxx | grep imap
  xxxxxx 43004 92942  81 07:15:33      -  38:58 imap
$ ps -ef | grep -i xxxxxx | grep imap
  xxxxxx 43004 92942 120 07:15:33      -  39:01 imap
$ dbx -a 43004
Waiting to attach to process 43004 ...
Successfully attached to imap.
warning: Directory containing imap could not be determined.
Apply 'use' command to initialize source path.
Type 'help' for help.
reading symbolic information ...
stopped in pread at 0xd01f06cc
0xd01f06cc (pread+0x30) 80410014 lwz r2,0x14(r1)
(dbx) step
stopped in istream-file._read at line 120 in file ""
could not read "istream-file.c"
(dbx) use src lib src/lib src/lib-storage/index/mbox src/lib-mail src/lib-storage/index src/lib-storage src/imap src/imap-login
(dbx) use
src lib src/lib src/lib-storage/index/mbox src/lib-mail src/lib-storage/index src/lib-storage src/imap src/imap-login
(dbx) where
istream-file._read(stream = 0x200274a8), line 120 in "istream-file.c"
i_stream_read(stream = 0x200274cc), line 58 in "istream.c"
istream-raw-mbox._read(stream = 0x20028ab8), line 160 in "istream-raw-mbox.c"
i_stream_read(stream = 0x20028adc), line 58 in "istream.c"
istream-limit._read(stream = 0x20031458), line 56 in "istream-limit.c"
i_stream_read(stream = 0x2003147c), line 58 in "istream.c"
istream-header-filter._read(stream = 0x2004aa98), line 234 in "istream-header-filter.c"
i_stream_read(stream = 0x2004aabc), line 58 in "istream.c"
i_stream_read_data(stream = 0x2004aabc, data = 0x2ff22630, size = 0x2ff22638, threshold = 1), line 250 in "istream.c"
message_get_body_size(input = 0x2004aabc, body = 0x2002827c, has_nuls = (nil)), line 107 in "message-size.c"
index_mail_init_stream(_mail = 0x200281a8, hdr_size = 0x2ff227a8, body_size = 0x2ff227b8), line 502 in "index-mail.c"
mbox_mail_get_stream(_mail = 0x200281a8, hdr_size = 0x2ff227a8, body_size = 0x2ff227b8), line 206 in "mbox-mail.c"
mail_get_stream(mail = 0x200281a8, hdr_size = 0x2ff227a8, body_size = 0x2ff227b8), line 107 in "mail.c"
imap-fetch-body.fetch_body(ctx = 0x200205e0, mail = 0x200281a8, context = 0x20020858), line 331 in "imap-fetch-body.c"
imap_fetch(ctx = 0x200205e0), line 291 in "imap-fetch.c"
cmd_fetch(cmd = 0x2001e35c), line 163 in "cmd-fetch.c"
cmd_uid(cmd = 0x2001e35c), line 19 in "cmd-uid.c"
client_handle_input(cmd = 0x2001e35c), line 344 in "client.c"
client_handle_input(cmd = 0x2001e35c), line 398 in "client.c"
_client_input(context = 0x2001e318), line 441 in "client.c"
io_loop_handler_run(ioloop = 0x2001d0e8), line 199 in "ioloop-poll.c"
io_loop_run(ioloop = 0x2001d0e8), line 329 in "ioloop.c"
main(argc = 1, argv = 0x2ff22b50, envp = 0x2ff22b58), line 290 in "main.c"
(dbx) list
58 return _stream->read(_stream);
59 }
60
61 void i_stream_skip(struct istream *stream, uoff_t count)
62 {
63 struct _istream *_stream = stream->real_stream;
64 size_t data_size;
65
66 data_size = _stream->pos - _stream->skip;
67 if (count <= data_size) {
(dbx) dump
i_stream_read(stream = 0x2003147c), line 58 in "istream.c"
_stream = 0x20031458
(dbx) up
istream-header-filter._read(stream = 0x2004aa98), line 234 in "istream-header-filter.c"
(dbx) list
234 if ((ret = i_stream_read(mstream->input)) == -2) {
235 if (stream->skip == 0)
236 return -2;
237 }
238 stream->istream.stream_errno = mstream->input->stream_errno;
239 stream->istream.eof = mstream->input->eof;
240 stream->buffer = i_stream_get_data(mstream->input, &pos);
241 } else {
242 ret = 0;
243 }
(dbx) dump
istream-header-filter._read(stream = 0x2004aa98), line 234 in "istream-header-filter.c"
mstream = 0x2004aa98
ret = 4094
pos = 1
(dbx) up
i_stream_read(stream = 0x2004aabc), line 58 in "istream.c"
(dbx) list
58 return _stream->read(_stream);
59 }
60
61 void i_stream_skip(struct istream *stream, uoff_t count)
62 {
63 struct _istream *_stream = stream->real_stream;
64 size_t data_size;
65
66 data_size = _stream->pos - _stream->skip;
67 if (count <= data_size) {
(dbx) dump
i_stream_read(stream = 0x2004aabc), line 58 in "istream.c"
_stream = 0x2004aa98
(dbx) up
i_stream_read_data(stream = 0x2004aabc, data = 0x2ff22630, size = 0x2ff22638, threshold = 1), line 250 in "istream.c"
(dbx) list
250 ret = i_stream_read(stream);
251 if (ret > 0)
252 read_more = TRUE;
253 } while (ret > 0);
254
255 *data = i_stream_get_data(stream, size);
256 return ret == -2 ? -2 :
257 (read_more || ret == 0 ? 0 : -1);
258 }
259
(dbx) dump
i_stream_read_data(stream = 0x2004aabc, data = 0x2ff22630, size = 0x2ff22638, threshold = 1), line 250 in "istream.c"
read_more = 0
ret = 0
(dbx) up
message_get_body_size(input = 0x2004aabc, body = 0x2002827c, has_nuls = (nil)), line 107 in "message-size.c"
(dbx) list
107 } while (i_stream_read_data(input, &msg, &size, 1) > 0);
108
109 i_stream_skip(input, 1);
110 body->physical_size++;
111
112 body->virtual_size = body->physical_size + missing_cr_count;
113 i_assert(body->virtual_size >= body->physical_size);
114 }
115
116 void message_size_add(struct message_size *dest,
(dbx) dump
message_get_body_size(input = 0x2004aabc, body = 0x2002827c, has_nuls = (nil)), line 107 in "message-size.c"
last_cr = 0
missing_cr_count = 83389
msg = "dd"
i = 4095
size = 1
(dbx) up
index_mail_init_stream(_mail = 0x200281a8, hdr_size = 0x2ff227a8, body_size = 0x2ff227b8), line 502 in "index-mail.c"
(dbx) list
502 message_get_body_size(data->stream,
503 &data->body_size, NULL);
504 data->body_size_set = TRUE;
505 }
506 }
507
508 *body_size = data->body_size;
509 }
510
511 if (data->hdr_size_set && data->body_size_set) {
(dbx) dump
index_mail_init_stream(_mail = 0x200281a8, hdr_size = 0x2ff227a8, body_size = 0x2ff227b8), line 502 in "index-mail.c"
data = 0x2002820c
mail = 0x200281a8
(dbx) up
mbox_mail_get_stream(_mail = 0x200281a8, hdr_size = 0x2ff227a8, body_size = 0x2ff227b8), line 206 in "mbox-mail.c"
(dbx) list
206 return index_mail_init_stream(mail, hdr_size, body_size);
207 }
208
209 struct mail_vfuncs mbox_mail_vfuncs = {
210 index_mail_free,
211 index_mail_set_seq,
212
213 index_mail_get_flags,
214 index_mail_get_keywords,
215 index_mail_get_parts,
(dbx) dump
mbox_mail_get_stream(_mail = 0x200281a8, hdr_size = 0x2ff227a8, body_size = 0x2ff227b8), line 206 in "mbox-mail.c"
raw_stream = 0x2ff22780
mbox = 0x20024a98
data = 0x2002820c
mail = 0x200281a8
offset = 804398928
(dbx) up
mail_get_stream(mail = 0x200281a8, hdr_size = 0x2ff227a8, body_size = 0x2ff227b8), line 107 in "mail.c"
(dbx) list
107 return p->v.get_stream(mail, hdr_size, body_size);
108 }
109
110 const char *mail_get_special(struct mail *mail, enum mail_fetch_field field)
111 {
112 struct mail_private *p = (struct mail_private *)mail;
113
114 return p->v.get_special(mail, field);
115 }
116
(dbx) dump
mail_get_stream(mail = 0x200281a8, hdr_size = 0x2ff227a8, body_size = 0x2ff227b8), line 107 in "mail.c"
p = 0x200281a8
(dbx) up
imap-fetch-body.fetch_body(ctx = 0x200205e0, mail = 0x200281a8, context = 0x20020858), line 331 in "imap-fetch-body.c"
(dbx) list
331 body->section[0] == 'H' ? NULL : &body_size);
332 if (ctx->cur_input == NULL)
333 return -1;
334
335 i_stream_ref(ctx->cur_input);
336 ctx->update_partial = TRUE;
337
338 switch (body->section[0]) {
339 case '\0':
340 /* BODY[] - fetch everything */
(dbx) dump
imap-fetch-body.fetch_body(ctx = 0x200205e0, mail = 0x200281a8, context = 0x20020858), line 331 in "imap-fetch-body.c"
fetch_size = 0x20011dd8
hdr_size = (physical_size = 843, virtual_size = 843, lines = 15)
body = 0x20020858
body_size = (physical_size = 804398512, virtual_size = 12, lines = 0)
(dbx) up
imap_fetch(ctx = 0x200205e0), line 291 in "imap-fetch.c"
(dbx) list
291 ret = handlers[ctx->cur_handler].
292 handler(ctx, ctx->cur_mail,
293 handlers[ctx->cur_handler].context);
294 t_pop();
295
296 if (ret == 0)
297 return 0;
298
299 if (ret < 0) {
300 if (ctx->cur_mail->expunged) {
(dbx) dump
imap_fetch(ctx = 0x200205e0), line 291 in "imap-fetch.c"
handlers = 0x200206f8
count = 3
ret = 1
(dbx) up
cmd_fetch(cmd = 0x2001e35c), line 163 in "cmd-fetch.c"
(dbx) list
163 if ((ret = imap_fetch(ctx)) == 0) {
164 /* unfinished */
165 client->command_pending = TRUE;
166 cmd->func = cmd_fetch_continue;
167 cmd->context = ctx;
168 return FALSE;
169 }
170 if (ret < 0)
171 ctx->failed = TRUE;
172
(dbx) dump
cmd_fetch(cmd = 0x2001e35c), line 163 in "cmd-fetch.c"
messageset = "11988"
search_arg = 0x200205a8
ctx = 0x200205e0
args = 0x2001e590
ret = 536954712
client = 0x2001e318
(dbx) up
cmd_uid(cmd = 0x2001e35c), line 19 in "cmd-uid.c"
(dbx) list
19 return cmd->func(cmd);
20 } else {
21 client_send_tagline(cmd, t_strconcat(
22 "BAD Unknown UID command ", cmd_name, NULL));
23 return TRUE;
24 }
25 }
(dbx) dump
cmd_uid(cmd = 0x2001e35c), line 19 in "cmd-uid.c"
cmd_name = "fetch"
(dbx) up
client_handle_input(cmd = 0x2001e35c), line 344 in "client.c"
(dbx) list
344 if (cmd->func(cmd) || cmd->param_error) {
345 /* command execution was finished */
346 if (!cmd->param_error)
347 client->bad_counter = 0;
348 _client_reset_command(client);
349 return TRUE;
350 }
351
352 /* unfinished */
353 if (client->command_pending)
(dbx) dump
client_handle_input(cmd = 0x2001e35c), line 344 in "client.c"
client = 0x2001e318
(dbx) up
client_handle_input(cmd = 0x2001e35c), line 398 in "client.c"
(dbx) list
398 client_handle_input(cmd);
399 }
400
401 return TRUE;
402 }
403
404 void _client_input(void *context)
405 {
406 struct client *client = context;
407 struct client_command_context *cmd = &client->cmd;
(dbx) dump
client_handle_input(cmd = 0x2001e35c), line 398 in "client.c"
client = 0x2001e318
(dbx) up
_client_input(context = 0x2001e318), line 441 in "client.c"
(dbx) list
441 ret = client_handle_input(cmd);
442 t_pop();
443 } while (ret && !client->disconnected);
444 o_stream_uncork(client->output);
445 client->handling_input = FALSE;
446
447 if (client->command_pending)
448 client->input_pending = TRUE;
449
450 if (client->output->closed)
(dbx) dump
_client_input(context = 0x2001e318), line 441 in "client.c"
ret = 2
cmd = 0x2001e35c
client = 0x2001e318
(dbx) up
io_loop_handler_run(ioloop = 0x2001d0e8), line 199 in "ioloop-poll.c"
(dbx) list
199 io->callback(io->context);
200 if (t_pop() != t_id) {
201 i_panic("Leaked a t_pop() call in "
202 "I/O handler %p",
203 (void *)io->callback);
204 }
205 }
206 }
207 }
208 }
(dbx) dump
io_loop_handler_run(ioloop = 0x2001d0e8), line 199 in "ioloop-poll.c"
t_id = 2
tv = (tv_sec = 1, tv_usec = 54083)
msecs = 1055
pollfd = 0x2001d148
ctx = 0x2001d118
ret = 0
io = 0x2001e4e8
call = 1
(dbx) up
io_loop_run(ioloop = 0x2001d0e8), line 329 in "ioloop.c"
(dbx) list
329 io_loop_handler_run(ioloop);
330 }
331
332 void io_loop_stop(struct ioloop *ioloop)
333 {
334 ioloop->running = FALSE;
335 }
336
337 void io_loop_set_running(struct ioloop *ioloop)
338 {
(dbx) dump
io_loop_run(ioloop = 0x2001d0e8), line 329 in "ioloop.c"
(dbx) up
main(argc = 1, argv = 0x2ff22b50, envp = 0x2ff22b58), line 290 in "main.c"
(dbx) list
290 io_loop_run(ioloop);
291 main_deinit();
292
293 io_loop_destroy(&ioloop);
294 lib_deinit();
295
296 return 0;
297 }
(dbx) dump
main(argc = 1, argv = 0x2ff22b50, envp = 0x2ff22b58), line 290 in "main.c"
(dbx) up
__start() at 0x100001d8
(dbx) list
no source file
(dbx) dump
__start() at 0x100001d8
(dbx) cont
execution completed
(dbx) quit
$
On Wed, 2007-09-26 at 09:12 -0600, Jackie Hunt wrote:
istream-file._read(stream = 0x200274a8), line 120 in "istream-file.c"
i_stream_read(stream = 0x200274cc), line 58 in "istream.c"
istream-raw-mbox._read(stream = 0x20028ab8), line 160 in "istream-raw-mbox.c"
i_stream_read(stream = 0x20028adc), line 58 in "istream.c"
istream-limit._read(stream = 0x20031458), line 56 in "istream-limit.c"
i_stream_read(stream = 0x2003147c), line 58 in "istream.c"
istream-header-filter._read(stream = 0x2004aa98), line 234 in "istream-header-filter.c"
i_stream_read(stream = 0x2004aabc), line 58 in "istream.c"
i_stream_read_data(stream = 0x2004aabc, data = 0x2ff22630, size = 0x2ff22638, threshold = 1), line 250 in "istream.c"
message_get_body_size(input = 0x2004aabc, body = 0x2002827c, has_nuls = (nil)), line 107 in "message-size.c"
index_mail_init_stream(_mail = 0x200281a8, hdr_size = 0x2ff227a8, body_size = 0x2ff227b8), line 502 in "index-mail.c"
mbox_mail_get_stream(_mail = 0x200281a8, hdr_size = 0x2ff227a8, body_size = 0x2ff227b8), line 206 in "mbox-mail.c"
mail_get_stream(mail = 0x200281a8, hdr_size = 0x2ff227a8, body_size = 0x2ff227b8), line 107 in "mail.c"
imap-fetch-body.fetch_body(ctx = 0x200205e0, mail = 0x200281a8, context = 0x20020858), line 331 in "imap-fetch-body.c"
imap_fetch(ctx = 0x200205e0), line 291 in "imap-fetch.c"
cmd_fetch(cmd = 0x2001e35c), line 163 in "cmd-fetch.c"
cmd_uid(cmd = 0x2001e35c), line 19 in "cmd-uid.c"
client_handle_input(cmd = 0x2001e35c), line 344 in "client.c"
client_handle_input(cmd = 0x2001e35c), line 398 in "client.c"
_client_input(context = 0x2001e318), line 441 in "client.c"
io_loop_handler_run(ioloop = 0x2001d0e8), line 199 in "ioloop-poll.c"
io_loop_run(ioloop = 0x2001d0e8), line 329 in "ioloop.c"
This looks like a normal UID FETCH command. Are you sure Dovecot is looping infinitely here, or is it just that Thunderbird is sending the same UID FETCH command over and over?
If it's Dovecot doing the infinite loop, could you truss the process for a while and show me the output?
Or it would be good to know which one of these functions is looping. Can you set breakpoints and continue the execution? With gdb I'd do:
b imap_fetch
continue
// does it stop? if not, ^C and try the next one:
b fetch_body
continue
b mail_get_stream
continue
..etc..
The last breakpoint where it doesn't stop marks the function that's looping.
Attached is the truss output for the imap process we are seeing chewing up CPU. We've seen this issue with several different clients, usually first thing in the morning. Shutting down the client and restarting always seems to get Dovecot back in sync. Hope we can get some help with this. We're running 1.0.5. Thanks!
Jackie Hunt
ACNS Voice: (970) 663-3789
Colorado State University FAX: (970) 491-1958
Fort Collins, CO 80523 Email: jackie.hunt@colostate.edu
On Sun, 30 Sep 2007, Timo Sirainen wrote:
On Wed, 2007-09-26 at 09:12 -0600, Jackie Hunt wrote:
....
This looks like a normal UID FETCH command. Are you sure Dovecot is looping infinitely here, or is it just that Thunderbird is sending the same UID FETCH command over and over?
If it's Dovecot doing the infinite loop, could you truss the process for a while and show me the output?
Or it would be good to know which one of these functions is looping. Can you set breakpoints and continue the execution? With gdb I'd do:
b imap_fetch
continue
// does it stop? if not, ^C and try the next one:
b fetch_body
continue
b mail_get_stream
continue
..etc..
The last breakpoint where it doesn't stop marks the function that's looping.
On Wed, 2007-10-10 at 10:50 -0600, Jackie Hunt wrote:
Attached is the truss output for the imap process we are seeing chewing up CPU. We've seen this issue with several different clients, usually first thing in the morning. Shutting down the client and restarting always seems to get Dovecot back in sync. Hope we can get some help with this. We're running 1.0.5. Thanks!
The truss output looks like Dovecot re-reads the entire mbox file for each FETCH command. Are you using NFS? Do you have mbox_dirty_syncs=yes (the default)?
Anyway, this isn't an infinite loop in Dovecot; it's processing multiple FETCH commands. Maybe the client got stuck somehow and just keeps sending the same command over and over again? It would have been nice to see the entire FETCH command and its output. "truss -r0 -w1" does that, I think.
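(For reference: mbox_dirty_syncs is an ordinary dovecot.conf setting. Assuming a stock 1.0 configuration, the effective line is

    mbox_dirty_syncs = yes

and since 'dovecot -n' prints only non-default settings, it will show the line only if the value has been changed.)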
On Sun, 2007-10-21 at 00:47 +0300, Timo Sirainen wrote:
On Wed, 2007-10-10 at 10:50 -0600, Jackie Hunt wrote:
Attached is the truss output for the imap process we are seeing chewing up CPU. We've seen this issue with several different clients, usually first thing in the morning. Shutting down the client and restarting always seems to get Dovecot back in sync. Hope we can get some help with this. We're running 1.0.5. Thanks!
The truss output looks like Dovecot re-reads the entire mbox file for each FETCH command. Are you using NFS? Do you have mbox_dirty_syncs=yes (the default)?
Anyway, this isn't an infinite loop in Dovecot; it's processing multiple FETCH commands. Maybe the client got stuck somehow and just keeps sending the same command over and over again? It would have been nice to see the entire FETCH command and its output. "truss -r0 -w1" does that, I think.
"truss -d -r0 -w1" for future reference. That includes nice timestamps and also shows the IMAP traffic.
Hi,
I have yet another problem with Dovecot: sometimes (rarely, maybe once every few days) one of the imap processes will 'hang', consuming all available CPU time. It does not seem to 'finish' in any reasonable amount of time (in one instance I waited a few days).
....
Robert Tomanek mailto:rth23@poczta.onet.pl
We have also seen this behavior, running Dovecot 1.0 on AIX.
Thanks for your debug work on it, Robert.
Jackie
Jackie Hunt
ACNS Voice: (970) 663-3789
Colorado State University FAX: (970) 491-1958
Fort Collins, CO 80523 Email: jackie.hunt@colostate.edu
Hello Jackie,
Thursday, September 6, 2007, 21:09:48, you wrote:
I have yet another problem with Dovecot: sometimes (rarely, maybe once every few days) one of the imap processes will 'hang', consuming all available CPU time. It does not seem to 'finish' in any reasonable amount of time (in one instance I waited a few days).

We have also seen this behavior, running Dovecot 1.0 on AIX.
Thanks for your debug work on it, Robert.
Unfortunately, I am not really moving forward with that as I do not have much insight into the context structure in question. I guess we need to wait until Timo answers my e-mail.
On the other hand, it is nice to hear I am not the only person with this problem :-|
Jackie, is there a chance you could confirm my suspicions by doing what I did in mid:1799018363.20070903123736@poczta.onet.pl and the original message? This way we would be sure we are talking about the same cause here, since 'Dovecot hangs' sounds a bit generic. I think the stack trace and the variable values should be enough to confirm it.
-- Best regards, Robert Tomanek mailto:rth23@poczta.onet.pl
Hello Jackie,
Unfortunately, I am not really moving forward with that as I do not have much insight into the context structure in question. I guess we need to wait until Timo answers my e-mail.
Yes, haven't seen much from him lately. Must be busy or on vacation...
On the other hand, it is nice to hear I am not the only person with this problem :-|
I think I've also seen these reports posted by other users in the past. We hadn't experienced it until recently; we just went into "production" this summer.
Jackie, is there a chance you could confirm my suspicions by doing what I did in mid:1799018363.20070903123736@poczta.onet.pl and the original message? This way we would be sure we are talking about the same cause here, since 'Dovecot hangs' sounds a bit generic. I think the stack trace and the variable values should be enough to confirm it.
Sorry, I don't know about the mid: protocol, and I'm not sure how to get to the reference you sent. From your email it looks like gdb has the ability to attach to a running process. We have dbx, which I don't believe can do that. It would be nice to confirm our issues are the same, though. What I'd really like is to find a way to recreate the problem.
Jackie
Jackie Hunt
ACNS Voice: (970) 663-3789
Colorado State University FAX: (970) 491-1958
Fort Collins, CO 80523 Email: jackie.hunt@colostate.edu
On Sep 7 2007, Jackie Hunt wrote:
Sorry, I don't know about the mid: protocol, and I'm not sure how to get to the reference you sent. From your email it looks like gdb has the ability to attach to a running process. We have dbx, which I don't believe can do that. It would be nice to confirm our issues are the same, though. What I'd really like is to find a way to recreate the problem.
Start dbx, and type 'attach <pid>' at its prompt.
-Brian Hayden OIT Internet Services, University of Minnesota
Start dbx, and type 'attach <pid>' at its prompt.
-Brian Hayden OIT Internet Services, University of Minnesota
Thanks, Brian. That got me looking again; on AIX I can use:
dbx -a <pid>
I'll see what I can learn...
Jackie
Jackie Hunt
ACNS Voice: (970) 663-3789
Colorado State University FAX: (970) 491-1958
Fort Collins, CO 80523 Email: jackie.hunt@colostate.edu
Hello Jackie,
Friday, September 7, 2007, 7:47:59 PM, you wrote:
Jackie, is there a chance you could confirm my suspicions by doing what I did in mid:1799018363.20070903123736@poczta.onet.pl and the original message? This way we would be sure we are talking about the same cause here, since 'Dovecot hangs' sounds a bit generic. I think the stack trace and the variable values should be enough to confirm it.

Sorry, I don't know about the mid: protocol, and I'm not sure how to get to the reference you sent. From your email it looks like gdb has the ability to attach to a running process. We have dbx, which I don't believe can do that. It would be nice to confirm our issues are the same, though. What I'd really like is to find a way to recreate the problem.
Sorry for being unclear -- 'mid' is simply a Message-ID (my e-mail client supports directly referencing messages this way; in any case, it is part of the message headers, so it uniquely identifies a message). In this case, I meant my message sent on 2007-09-03 12:37, where I pasted the debug session and my suspicions about the infinite loop.
Recreating the problem would be great; I hope you will be more successful than I have been -- no ideas so far. It should be easier for you, since you probably have more users with real-life e-mail usage scenarios. Let us know if you make any progress here.
As for debugging, Brian (see the previous e-mail in the thread) mentioned that attaching to a process works similarly with dbx. I've never used dbx, but a quick Google search suggests it is pretty easy to use and similar to gdb. Your mileage may vary, though :)
-- Best regards, Robert Tomanek mailto:rth23@poczta.onet.pl