On 07/22/2011 01:02 PM, Kostas Zorbadelos wrote:
Hello,
Since I saw no response to this, here is an update on something we discovered today.
After setting pop3_lock_session = no, the crashes went away. We will leave it like that and watch it for the next few days. If we set pop3_lock_session = yes again, the problem comes back.
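For reference, the workaround amounts to the following change in the protocol pop3 section of dovecot.conf. This is only a sketch: the other settings in that section are assumed to stay exactly as in the dovecot -n output from the original message.

```
protocol pop3 {
  # was "yes" in the configuration that crashes
  pop3_lock_session = no
}
```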
If I can do anything else to help debug the problem, please let me know.
Regards,
Kostas
Greetings to all.
It's my first post to the list. We just completed a migration from qpopper to dovecot for our IMAP and POP3 services. We have a rather large mail environment (we are the biggest provider in Greece).
So, here are the details:
- We keep getting errors like these in our production environment:
Jul 22 00:18:21 pop01 dovecot: master: Error: service(pop3): child 4078 killed with signal 11 (core dumps disabled)
Jul 22 00:19:31 pop03 dovecot: master: Error: service(pop3): child 18849 killed with signal 11 (core dumps disabled)
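To see how the crashes are spread across the backends, a small shell sketch like the one below could tally them per host. It assumes the syslog lines look like the ones above (host name in the fourth whitespace-separated field) and reads them from standard input. The function name is made up for illustration.

```shell
# Count pop3 children killed with signal 11, grouped by host.
# Assumption: syslog lines arrive on stdin and the host name is field 4,
# as in "Jul 22 00:18:21 pop01 dovecot: master: Error: ...".
count_pop3_segfaults() {
    grep 'service(pop3): child [0-9]* killed with signal 11' \
        | awk '{ count[$4]++ } END { for (h in count) print h, count[h] }' \
        | sort
}
```

It would be used as count_pop3_segfaults < /var/log/maillog, where the log path is an assumption and depends on the local syslog setup.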
dovecot -n output
/opt/dovecot/sbin/dovecot -n
# 2.0.13: /opt/dovecot/etc/dovecot/dovecot.conf
# OS: Linux 2.6.18-92.1.22.el5 x86_64 CentOS release 5.5 (Final)
auth_cache_negative_ttl = 10 mins
auth_cache_size = 5 M
auth_cache_ttl = 10 mins
auth_verbose = yes
default_client_limit = 5000
default_process_limit = 500
disable_plaintext_auth = no
first_valid_uid = 200
listen = *
log_timestamp = "%Y-%m-%d %H:%M:%S "
login_greeting = <COMPANY> ready
mail_access_groups = mail otemail disk root
mail_fsync = always
mail_location = mbox:INDEX=/var/index/dovecot/%2.16Hn/%2.254Hn/%u
mail_nfs_storage = yes
mbox_lock_timeout = 2 mins
mbox_min_index_size = 200 k
mbox_read_locks = dotlock_try fcntl
mbox_write_locks = dotlock_try fcntl
passdb {
  args = /opt/dovecot/etc/dovecot/dovecot-ldap.conf.ext
  driver = ldap
}
protocols = imap pop3
service auth-worker {
  user = dovenull
}
service imap-login {
  inet_listener imap {
    port = 143
  }
  inet_listener imaps {
    port = 993
    ssl = yes
  }
}
service pop3-login {
  inet_listener pop3 {
    port = 110
  }
  inet_listener pop3s {
    port = 995
    ssl = yes
  }
}
ssl = no
userdb {
  args = /opt/dovecot/etc/dovecot/dovecot-ldap.conf.ext
  driver = ldap
}
verbose_proctitle = yes
protocol imap {
  imap_client_workarounds = delay-newmail tb-extra-mailbox-sep
  mail_max_userip_connections = 100
}
protocol pop3 {
  mail_max_userip_connections = 100
  pop3_client_workarounds = outlook-no-nuls oe-ns-eoh
  pop3_fast_size_lookups = yes
  pop3_lock_session = yes
  pop3_reuse_xuidl = yes
  pop3_uidl_format = %08Xu%08Xv
}
I enabled core dumps in one of our backend servers and here is the relevant gdb trace:
[root@pop08 ~]# gdb /opt/dovecot/libexec/dovecot/pop3 /core.9273
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-32.el5_6.2)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /opt/dovecot/libexec/dovecot/pop3...(no debugging symbols found)...done.
Reading symbols from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0...(no debugging symbols found)...done.
Loaded symbols for /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
Reading symbols from /opt/dovecot/lib/dovecot/libdovecot.so.0...(no debugging symbols found)...done.
Loaded symbols for /opt/dovecot/lib/dovecot/libdovecot.so.0
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib64/libpthread.so.0
Core was generated by `dovecot/pop3'.
Program terminated with signal 11, Segmentation fault.
#0  0x00002b52e1027e54 in istream_raw_mbox_get_start_offset () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
(gdb) bt full
#0  0x00002b52e1027e54 in istream_raw_mbox_get_start_offset () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
No symbol table info available.
#1  0x00002b52e102b759 in ?? () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
No symbol table info available.
#2  0x00002b52e100a2c0 in index_mail_expunge () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
No symbol table info available.
#3  0x0000000000405e9c in client_update_mails ()
No symbol table info available.
#4  0x00000000004061c1 in client_command_execute ()
No symbol table info available.
#5  0x00000000004045b9 in client_handle_input ()
No symbol table info available.
#6  0x00002b52e12df698 in io_loop_call_io () from /opt/dovecot/lib/dovecot/libdovecot.so.0
No symbol table info available.
#7  0x00002b52e12e09d5 in io_loop_handler_run () from /opt/dovecot/lib/dovecot/libdovecot.so.0
No symbol table info available.
#8  0x00002b52e12df62d in io_loop_run () from /opt/dovecot/lib/dovecot/libdovecot.so.0
No symbol table info available.
#9  0x00002b52e12cdf13 in master_service_run () from /opt/dovecot/lib/dovecot/libdovecot.so.0
No symbol table info available.
#10 0x0000000000403994 in main ()
No symbol table info available.
(gdb)

All traces of the crashes are identical, that is:

#0  0x00002b52e1027e54 in istream_raw_mbox_get_start_offset () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
#1  0x00002b52e102b759 in ?? () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
#2  0x00002b52e100a2c0 in index_mail_expunge () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
#3  0x0000000000405e9c in client_update_mails ()
#4  0x00000000004061c1 in client_command_execute ()
#5  0x00000000004045b9 in client_handle_input ()
#6  0x00002b52e12df698 in io_loop_call_io () from /opt/dovecot/lib/dovecot/libdovecot.so.0
#7  0x00002b52e12e09d5 in io_loop_handler_run () from /opt/dovecot/lib/dovecot/libdovecot.so.0
#8  0x00002b52e12df62d in io_loop_run () from /opt/dovecot/lib/dovecot/libdovecot.so.0
#9  0x00002b52e12cdf13 in master_service_run () from /opt/dovecot/lib/dovecot/libdovecot.so.0
#10 0x0000000000403994 in main ()
We have mboxes over NFS and we also have an LDAP user backend. For now, I do not have a scenario that reproduces the problem. Any ideas or input are highly appreciated. Of course, I can provide any information requested (without exposing restricted company or client data) to help trace the problem and find a solution.
Thanks and keep up the good work!
Regards,
Kostas Zorbadelos