[Dovecot] Dovecot pop3 segfault problems
Hi to all,
I have a Debian 4.0 x86_64 server and installed Dovecot from the repositories
ii  dovecot-common  1.0.rc15-2etch5
ii  dovecot-imapd   1.0.rc15-2etch5
ii  dovecot-pop3d   1.0.rc15-2etch5
Recently I'm getting a lot of these messages
pop3[24594]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fffc5d0c1e0 error 4
in syslog and also dmesg.
The server load also keeps climbing (above 100), and it drops again when Dovecot is stopped.
What problem could this be and how can I resolve it?
Thanks, Enid
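The kernel line quoted above already carries some information before any backtrace is available. The following is a minimal sketch, not part of the original thread, that parses a line in that format and decodes the x86 page-fault error code bits (bit 0: page present, bit 1: write access, bit 2: user mode). The function name `decode_segfault` and the 4096-byte "near null" threshold are assumptions made for illustration.

```python
import re

# Matches the 2.6-era x86_64 kernel segfault line format quoted in this thread.
SEGFAULT_RE = re.compile(
    r"(?P<prog>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)    # the kernel prints the error code in hex
    addr = int(m.group("addr"), 16)  # faulting address
    return {
        "prog": m.group("prog"),
        "pid": int(m.group("pid")),
        "addr": addr,
        "rip": int(m.group("rip"), 16),
        "page_present": bool(err & 1),             # 0 means the page was not mapped
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
        "near_null": addr < 4096,                  # illustrative threshold only
    }


info = decode_segfault(
    "pop3[24594]: segfault at 0000000000000004 rip 000000000044333a "
    "rsp 00007fffc5d0c1e0 error 4"
)
print(info)
```

Read this way, "error 4" at address 4 is a user-mode read of an unmapped page very close to address zero, which is what a read through a NULL pointer at a small offset typically looks like. That narrows things down only slightly; the actual cause still needs a backtrace.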
On 2010-09-21 5:41 AM, enid vx <enidv11@gmail.com> wrote:
I have a Debian 4.0 x86_64 server and installed Dovecot from the repositories
ii dovecot-common 1.0.rc15-2etch5
<snip>
What problem could this be and how can I resolve it?
Your first task is to upgrade. 1.0rc15 is way too old and just plain unsupported.
If you must upgrade the entire OS, it is now time to do so.
--
Best regards,
Charles
Hi Charles,
I did upgrade (with some difficulty, because the server is in production) to version 2.0.3 from the tar.gz, but these messages keep repeating. Is this problem related to the Dovecot software, or to the OS or server hardware?
Thanks, Enid
On Tue, 2010-09-21 at 16:13 +0200, enid vx wrote:
Hi Charles,
I did upgrade (with some difficulty, because the server is in production) to version 2.0.3 from the tar.gz, but these messages keep repeating. Is this problem related to the Dovecot software, or to the OS or server hardware?
Are these coming only from pop3 processes? Are you running imap? In any case, it's difficult to say anything about segfaults without a gdb backtrace. See http://dovecot.org/bugreport.html for how to get one.
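As a rough illustration of the backtrace request above, the sketch below builds a non-interactive gdb invocation that prints a full backtrace from a core file. It is not taken from the thread: the binary and core file paths are hypothetical placeholders, and core dumps usually have to be enabled first, which the page linked above covers.

```python
import shlex
import subprocess


def gdb_backtrace_cmd(binary, core):
    """Build a gdb command line that prints a full backtrace and exits."""
    # --batch makes gdb exit after running its commands; -ex runs one gdb command.
    return ["gdb", "--batch", "-ex", "bt full", binary, core]


def run_backtrace(binary, core):
    """Run gdb and return its combined output (needs gdb and a real core file)."""
    result = subprocess.run(
        gdb_backtrace_cmd(binary, core),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


# Hypothetical paths: adjust to where the pop3 binary and the core file really are.
cmd = gdb_backtrace_cmd("/usr/local/libexec/dovecot/pop3", "/var/core/core.pop3")
print(" ".join(shlex.quote(part) for part in cmd))
```

Calling `run_backtrace` with the real paths would produce the text to paste into a reply. A binary built with debugging symbols gives a far more useful trace.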
Hi Enid,
Please don't top-post...
On 2010-09-21 10:13 AM, enid vx <enidv11@gmail.com> wrote:
On Tue, Sep 21, 2010 at 2:26 PM, Charles Marcus wrote:
Your first task is to upgrade. 1.0rc15 is way too old and just plain unsupported.
If you must upgrade the entire OS, it is now time to do so.
I did upgrade (with some difficulty, because the server is in production) to version 2.0.3 from the tar.gz, but these messages keep repeating. Is this problem related to the Dovecot software, or to the OS or server hardware?
Now it is time to provide actual log entries exhibiting the problem, as well as config details (output of doveconf -n for starters)... :)
--
Best regards,
Charles
On Tue, Sep 21, 2010 at 4:19 PM, Charles Marcus <CMarcus@media-brokers.com>wrote:
Now it is time to provide actual log entries exhibiting the problem, as well as config details (output of doveconf -n for starters)... :)
The problem is also happening with imap, but at a lower rate. The output of doveconf -n:
# 2.0.3: /usr/local/etc/dovecot/dovecot.conf
# OS: Linux 2.6.18-6-amd64 x86_64 Debian 4.0
auth_mechanisms = plain login
default_login_user = dovecot
disable_plaintext_auth = no
mail_location = mbox:~:INBOX=/var/mail/%u
passdb {
  driver = pam
}
ssl_cert = </etc/ssl/certs/dovecot.pem
ssl_key = </etc/ssl/private/dovecot.pem
userdb {
  driver = passwd
}
Thanks,
Enid
On Tue, Sep 21, 2010 at 5:14 PM, Charles Marcus <CMarcus@media-brokers.com>wrote:
On 2010-09-21 11:04 AM, enid vx <enidv11@gmail.com> wrote:
The output of doveconf -n:
Log entries exhibiting the problem?
--
Best regards,
Charles
The logs showing this problem are:
Sep 22 08:21:48 domainname kernel: pop3[21085]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fff3eb5d6a0 error 4
Sep 22 08:23:49 domainname kernel: pop3[22195]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fff471b2fa0 error 4
Sep 22 08:25:49 domainname kernel: pop3[23323]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fff643022e0 error 4
Sep 22 08:27:49 domainname kernel: pop3[24336]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fff306fd090 error 4
Sep 22 08:29:49 domainname kernel: pop3[25450]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fff03a128c0 error 4
Sep 22 08:31:49 domainname kernel: pop3[26374]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fffefb431a0 error 4
Sep 22 08:33:49 domainname kernel: pop3[27445]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fffa1a08510 error 4
Sep 22 08:35:49 domainname kernel: pop3[29432]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fffec6d2e40 error 4
Sep 22 08:37:49 domainname kernel: pop3[30872]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fff11964130 error 4
Each of them followed by:
Sep 22 08:37:49 domainname dovecot: child 30872 (pop3) killed with signal 11
Thanks,
Enid
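The log excerpt above has a visible pattern that a few lines of code can confirm. The sketch below, added here as an illustration, extracts the timestamp and `rip` value from each kernel line and reports the gaps between crashes and the set of distinct instruction pointers. The year 2010 is supplied by hand because syslog timestamps omit it, the sample holds only the first four lines from the excerpt, and `summarize` is a name invented for this example.

```python
import re
from datetime import datetime

LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: "
    r"(?P<prog>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at [0-9a-f]+ "
    r"rip (?P<rip>[0-9a-f]+)"
)


def summarize(lines, year=2010):
    """Return (gaps in seconds between crashes, set of distinct rip values)."""
    times, rips = [], set()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that are not kernel segfault reports
        stamp = " ".join(m.group("ts").split())  # collapse padding before the day
        times.append(datetime.strptime("%d %s" % (year, stamp), "%Y %b %d %H:%M:%S"))
        rips.add(m.group("rip"))
    gaps = [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]
    return gaps, rips


sample = [
    "Sep 22 08:21:48 domainname kernel: pop3[21085]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fff3eb5d6a0 error 4",
    "Sep 22 08:23:49 domainname kernel: pop3[22195]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fff471b2fa0 error 4",
    "Sep 22 08:25:49 domainname kernel: pop3[23323]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fff643022e0 error 4",
    "Sep 22 08:27:49 domainname kernel: pop3[24336]: segfault at 0000000000000004 rip 000000000044333a rsp 00007fff306fd090 error 4",
]
gaps, rips = summarize(sample)
print(gaps, rips)
```

Every crash in the excerpt hits the same instruction pointer, and the crashes are about two minutes apart. One possible reading is that a single client polling on a fixed interval triggers the same bug each time, but that is a guess until a backtrace or the Dovecot login log confirms it.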
On 2010-09-22 5:14 AM, enid vx <enidv11@gmail.com> wrote:
<snip>
And now, can you respond to Timo's request for a backtrace:
On 2010-09-21 10:16 AM, Timo Sirainen <tss@iki.fi> wrote:
Are these coming only from pop3 processes? Are you running imap? In any case, it's difficult to say anything about segfaults without a gdb backtrace. See http://dovecot.org/bugreport.html for how to get one.
--
Best regards,
Charles
Hi all,
After the update I noticed that the error messages did not go away for a little while. Seeing the high load and high kernel CPU usage, I also made some changes to /etc/fstab (adding noatime,nodiratime to the /var and /home partitions). The dovecot -n output now looks like this:
auth_mechanisms = plain login
default_login_user = dovecot
disable_plaintext_auth = no
dotlock_use_excl = yes
mail_access_groups = mail
mail_fsync = never
mail_location = mbox:~/.:INBOX=/var/mail/%u
mmap_disable = yes
passdb {
  driver = pam
}
ssl_cert = </etc/ssl/certs/dovecot.pem
ssl_key = </etc/ssl/private/dovecot.pem
userdb {
  driver = passwd
}
For about 2-3 days now the error has been gone and Dovecot is running fine, but the high load persists, and I suspect disk I/O latency.
Again thank you for your support. Enid
On 2010-09-24 6:27 AM, enid vx wrote:
For about 2-3 days now the error has been gone and Dovecot is running fine, but the high load persists, and I suspect disk I/O latency.
So, any stats on how busy your server is? Dovecot is very efficient - much more so at least than Courier-imap. Many people report both a huge increase in speed and huge decrease in server load after upgrading from it. My servers aren't very busy so I only noticed the huge increase in performance...
--
Best regards,
Charles
On 9/24/2010 5:27 AM, enid vx wrote:
For about 2-3 days now the error has been gone and Dovecot is running fine, but the high load persists, and I suspect disk I/O latency.
You might try running 'vmstat 2', 'iostat -x', etc. It sounds to me like a disk I/O problem, but we are just guessing unless we know more about the hardware and utilization. A simple fix might be to add a fast local drive and move /var/mail to its own spindle, assuming you don't have this configuration already.
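For readers who want the same kind of number that the `wa` column of vmstat shows, here is a minimal sketch that computes the share of CPU time spent waiting on I/O between two snapshots of /proc/stat on Linux. It is an illustration rather than a replacement for vmstat or iostat. The helper names are made up, and the field order (user, nice, system, idle, iowait, irq, softirq, steal) is the one documented in proc(5).

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return [int(x) for x in parts[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two /proc/stat snapshots."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    # Use only user..steal (first 8 fields); guest time is already part of user.
    deltas = [y - x for x, y in zip(a[:8], b[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def sample_iowait(interval=2.0, path="/proc/stat"):
    """Take two live snapshots 'interval' seconds apart (Linux only)."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval)
    with open(path) as f:
        after = f.read()
    return iowait_percent(before, after)


# Synthetic snapshots so the example runs anywhere; the values are made up.
before = "cpu  100 0 50 800 50 0 0 0\ncpu0 100 0 50 800 50 0 0 0\n"
after = "cpu  120 0 60 920 100 0 0 0\ncpu0 120 0 60 920 100 0 0 0\n"
pct = iowait_percent(before, after)
print("iowait: %.1f%%" % pct)
```

A persistently high value from `sample_iowait()` on the mail server would support the disk I/O theory, while a low value alongside a high load average would point elsewhere.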
Ken
-- Ken Anderson Pacific Internet - http://www.pacific.net
participants (4)
- Charles Marcus
- enid vx
- Ken A
- Timo Sirainen