[Dovecot] Signal 11 - can't get core dump?
Hi all, We have been running Dovecot for over a year now in our moderate volume mail environment (>4000 mailboxes) and it's been running great.
However, over the past month or so we've begun to experience issues where IMAP clients will appear to "hang" for a period of time, we then see a sig 11, and afterwards mailflow returns to normal. This usually happens a few times a day.
Naturally, our first instinct was to try and get a core dump, but after following the instructions at http://www.dovecot.org/bugreport.html we now receive this warning:
2010-08-24T13:33:56-07:00 mail dovecot: child 9728 (login) killed with signal 11 (core not dumped - add -D to login_executable) (ip=208.98.210.13)
Adding -D to the login_executable lines didn't seem to have any effect, we are still receiving the same message. Did we perhaps misinterpret the hint?
As we've been trying for over a week now to find a way to get a dump without any success, we figured posing this to the list might help us shed some light on the issue. Any suggestions would be appreciated!
dovecot -n
# 1.1.20: /etc/dovecot.conf
# OS: Linux 2.6.18-164.15.1.el5xen i686 CentOS release 5.5 (Final) nfs
base_dir: /var/run/dovecot/
listen: 208.98.210.10
ssl_ca_file: /etc/ssl/gd_bundle.crt
ssl_cert_file: /etc/ssl/sunwave.net.crt
ssl_key_file: /etc/ssl/sunwave.net.key
ssl_parameters_regenerate: 1
ssl_cipher_list: ALL:!LOW
verbose_ssl: yes
login_dir: /var/run/dovecot//login
login_executable(default): /usr/libexec/dovecot/imap-login -D
login_executable(imap): /usr/libexec/dovecot/imap-login -D
login_executable(pop3): /usr/libexec/dovecot/pop3-login -D
mail_max_userip_connections(default): 50
mail_max_userip_connections(imap): 50
mail_max_userip_connections(pop3): 10
first_valid_uid: 200
last_valid_uid: 200
first_valid_gid: 200
last_valid_gid: 200
mail_location: maildir:/var/spool/mail/%Ld/%1Lu/%Ln/Maildir
mail_debug: yes
mmap_disable: yes
mail_nfs_storage: yes
mail_nfs_index: yes
mail_executable(default): /usr/local/bin/mailtools/mail_imap.sh
mail_executable(imap): /usr/local/bin/mailtools/mail_imap.sh
mail_executable(pop3): /usr/local/bin/mailtools/mail_pop3.sh
mail_plugins(default): quota imap_quota trash
mail_plugins(imap): quota imap_quota trash
mail_plugins(pop3): quota
mail_plugin_dir(default): /usr/lib/dovecot/imap
mail_plugin_dir(imap): /usr/lib/dovecot/imap
mail_plugin_dir(pop3): /usr/lib/dovecot/pop3
namespace:
  type: private
  prefix: INBOX.
  inbox: yes
  list: yes
  subscriptions: yes
lda:
  postmaster_address: do_not_reply@sunwave.net
  mail_plugins: quota cmusieve
auth default:
  mechanisms: plain login
  default_realm: sunwave.net
  verbose: yes
  passdb:
    driver: sql
    args: /etc/dovecot-mysql.conf
  userdb:
    driver: sql
    args: /etc/dovecot-mysql.conf
  socket:
    type: listen
    client:
      path: /var/spool/postfix/private/auth
      mode: 432
      user: postfix
      group: postfix
    master:
      path: /var/run/dovecot/auth-master
      mode: 511
      user: vmail
plugin:
  sieve: /var/spool/mail/%Ld/%1Lu/%Ln/Maildir/.sieve
  quota: maildir:Quota
  quota_rule2: Trash:storage=10%%
  quota_warning: storage=70%% /usr/local/bin/mailtools/quota-warning.sh 70
  quota_warning2: storage=80%% /usr/local/bin/mailtools/quota-warning.sh 80
  quota_warning3: storage=90%% /usr/local/bin/mailtools/quota-warning.sh 90
  trash: /etc/dovecot-trash.conf
Thanks, Marty
On 25.8.2010, at 0.09, Marty Anstey wrote:
> However, over the past month or so we've begun to experience issues where IMAP clients will appear to "hang" for a period of time, we then see a sig 11, and afterwards mailflow returns to normal. This usually happens a few times a day.
> Naturally, our first instinct was to try and get a core dump, but after following the instructions at http://www.dovecot.org/bugreport.html we now receive this warning:
> 2010-08-24T13:33:56-07:00 mail dovecot: child 9728 (login) killed with signal 11 (core not dumped - add -D to login_executable) (ip=208.98.210.13)
This is a pre-login crash. Those can be difficult.
> Adding -D to the login_executable lines didn't seem to have any effect, we are still receiving the same message. Did we perhaps misinterpret the hint?
That should work with Linux. But maybe it doesn't work with all kernels. A few things you could anyway try:
- check that /var/run/dovecot/login directory is writable by the login process user (dovecot I guess)
- try login_chroot=no
- echo 2 > /proc/sys/fs/suid_dumpable
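The checks in the list above could be scripted roughly as follows. This is only a sketch: the default directory is the one from the config posted in this thread, the function names are made up for illustration, and the meanings attached to the suid_dumpable values are those described in proc(5).

```shell
#!/bin/sh
# Sketch only: reports the state of the items listed above.
# Function names are hypothetical; default paths are assumptions
# based on the config in this thread. Adjust them for your system.

# Map a suid_dumpable value to its meaning (values as documented in proc(5)).
describe_suid_dumpable() {
    case "$1" in
        0) echo "disabled for processes that changed credentials" ;;
        1) echo "enabled (debug mode)" ;;
        2) echo "enabled (suidsafe: core readable by root only)" ;;
        *) echo "unknown value" ;;
    esac
}

# Print owner, group and mode of the login directory so you can judge
# whether the login process user is able to write a core file there.
show_login_dir() {
    dir="${1:-/var/run/dovecot/login}"
    if [ -d "$dir" ]; then
        ls -ld "$dir"
    else
        echo "missing: $dir"
    fi
}

# Print the current suid_dumpable setting together with its meaning.
show_suid_dumpable() {
    file="${1:-/proc/sys/fs/suid_dumpable}"
    if [ -r "$file" ]; then
        value=$(cat "$file")
        echo "suid_dumpable=$value: $(describe_suid_dumpable "$value")"
    else
        echo "cannot read $file"
    fi
}
```

Calling show_login_dir and show_suid_dumpable without arguments on the mail server reports the live values. Note that a value written to /proc/sys/fs/suid_dumpable with echo does not survive a reboot.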
> # 1.1.20: /etc/dovecot.conf
Unless the crash is something simple (which I doubt it is), I don't really want to spend time trying to debug and fix it for v1.1 anymore. There's a good chance it's already fixed in v1.2 (and a very good chance it's fixed in v2.0).
You could also try setting login_process_per_connection=no. That makes login processes behave quite differently and could also happen to fix this.
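In dovecot.conf these two settings would be written as "login_chroot = no" and "login_process_per_connection = no". After restarting, one way to confirm which login-related settings are actually in effect is to filter the "dovecot -n" output, which uses the "name: value" form shown in the original post. A minimal sketch, where show_login_settings is a made-up helper name:

```shell
#!/bin/sh
# Sketch only: show_login_settings is a hypothetical helper.
# It reads "dovecot -n" style output on stdin and keeps only the
# login-related settings discussed in this thread.
show_login_settings() {
    grep -E '^(login_chroot|login_process_per_connection|login_executable)'
}

# Usage on the mail server:
#   dovecot -n | show_login_settings
```

Because "dovecot -n" prints only non-default settings, a setting that is missing from the filtered output is still at its default.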
On 24/08/2010 4:48 PM, Timo Sirainen wrote:
> That should work with Linux. But maybe it doesn't work with all kernels. A few things you could anyway try:
> - check that /var/run/dovecot/login directory is writable by the login process user (dovecot I guess)
> - try login_chroot=no
> - echo 2 > /proc/sys/fs/suid_dumpable
>> # 1.1.20: /etc/dovecot.conf
> Unless the crash is something simple (which I doubt it is), I don't really want to spend time trying to debug and fix it for v1.1 anymore. There's a good chance it's already fixed in v1.2 (and a very good chance it's fixed in v2.0).
> You could also try setting login_process_per_connection=no. That makes login processes behave quite differently and could also happen to fix this.
We will give these suggestions a try. If we succeed in getting a core dump, I'll report back; otherwise, it looks like our best option will be to upgrade.
Thanks!
Marty
On 25.08.2010 at 18:26, Marty Anstey wrote:
> We will give these suggestions a try. If we succeed in getting a core dump, I'll report back; otherwise, it looks like our best option will be to upgrade.
It's not that hard to find guidance on how to enable core dumps on your distro. Actually it comes up on the first page of Google results. The only thing you have to do is apply the logic in reverse in order to enable them:
http://www.cyberciti.biz/faq/linux-disable-core-dumps/
An educated guess would be to add the correct parameters to '/etc/security/limits.conf' or, as they are likely to be overridden by start scripts or service 'defaults', to wherever RH flavors do that…
# grep -i core /etc/security/*
# grep -i core /etc/default/*
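Before hunting through config files it may help to see what the core size limit currently is. The following is only a sketch: core_limit_status is a made-up helper name, and the limit a login shell reports is not necessarily the one the Dovecot master process was started with.

```shell
#!/bin/sh
# Sketch only: core_limit_status is a hypothetical helper that turns
# the value printed by "ulimit -c" into a readable statement.
core_limit_status() {
    case "$1" in
        0)         echo "core dumps disabled (limit is 0)" ;;
        unlimited) echo "core dumps allowed (no size limit)" ;;
        *)         echo "core dumps allowed up to size $1 (units depend on the shell)" ;;
    esac
}

# Report the core size limit of the current shell.
core_limit_status "$(ulimit -c)"

# To lift the limit for processes started from this shell:
#   ulimit -c unlimited
```

An entry in /etc/security/limits.conf has the form "<domain> <type> <item> <value>", for example "* soft core unlimited". Whether that reaches a daemon started from an init script is a separate question, which is why checking the start scripts as well is worthwhile.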
Updating is advised in any case but will spoil the fun of searching for the tweaks, and at some point you may require the same thing with the current releases and then you're back to the start.
Regards Thomas
participants (3)
- Marty Anstey
- Thomas Leuxner
- Timo Sirainen