[Dovecot] Strange dmesg messages
Guys,
I'm getting strange messages on my new server with dovecot-1.1rc4 + lda + sieve + ldap + postfix + suse10. This server has been in production since Monday and worked fine until today. The only thing I changed was Dovecot, from 1.1rc3 to 1.1rc4. I have already enabled mail_debug, but I didn't get any errors after that.
Does anyone have any idea what is happening?
Unable to handle kernel NULL pointer dereference at 0000000000000020 RIP:
<ffffffff8019df99>{sys_inotify_rm_watch+280}
PGD 1107f0067 PUD 1154c8067 PMD 0
Oops: 0002 [1] SMP
last sysfs file: /block/sda/size
CPU 2
Modules linked in: iptable_filter ip_tables x_tables nls_utf8 joydev st sr_mod ipv6 bonding button battery ac raid0 xfs_quota ext3 jbd loop sddlmfdrv ehci_hcd uhci_hcd hw_random usbcore shpchp ide_cd cdrom pci_hotplug bnx2 sddlmadrv xfs exportfs dm_snapshot edd dm_mod fan thermal processor lpfc scsi_transport_fc sg megaraid_sas piix sd_mod scsi_mod ide_disk ide_core
Pid: 24319, comm: imap Tainted: P U 2.6.16.21-0.8-smp #1
RIP: 0010:[<ffffffff8019df99>] <ffffffff8019df99>{sys_inotify_rm_watch+280}
RSP: 0018:ffff810112abbf38 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff81008bd57cf8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8100360627b8
RBP: 0000000000000000 R08: 000000000000db6a R09: 00000000051ae25c
R10: 0000000047f3f600 R11: 0000000000000213 R12: ffff81008bd57cc0
R13: ffff8100360627b8 R14: ffff8100360625b0 R15: ffff810121f7c580
FS: 00002b90b22a5ae0(0000) GS:ffff81012bd6b340(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 000000010adca000 CR4: 00000000000006e0
Process imap (pid: 24319, threadinfo ffff810112aba000, task ffff810026b107d0)
Stack: 00000001f8c42d10 ffffffff8017b3d3 0000000000000001 00000000005ff680
 00000000006027a0 0000000000000000 00000000ffffffff 00000000005e0420
 00007ffff8c4af20 ffffffff8010a7be
Call Trace: <ffffffff8017b3d3>{sys_read+69} <ffffffff8010a7be>{system_call+126}
And dovecot's log only shows:
Apr 2 18:09:16 mailserver02 dovecot: POP3(paulo.faria@): Disconnected: Logged out top=0/0, retr=1/2248, del=0/495, size=24067632
Apr 2 18:09:20 mailserver02 deliver(everson.todoroki@): msgid=20080402210925.B94E2400A735@relay01.com.br: saved mail to INBOX
Apr 2 18:09:20 mailserver02 dovecot: child 24319 (imap) killed with signal 9
Apr 2 18:09:21 mailserver02 dovecot: pop3-login: Login: user=
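Note that the pid in the oops (24319, comm: imap) is the same pid that the Dovecot log reports as killed with signal 9. A minimal Python sketch of that cross-check follows; it is only an illustration, and the regular expressions and function names are assumptions based on the two log formats quoted above, not on any Dovecot or kernel tool.

```python
import re

# Patterns are assumptions derived from the log lines quoted in this thread.
OOPS_PID_RE = re.compile(r"Pid:\s*(\d+),\s*comm:\s*(\S+)")
KILLED_RE = re.compile(r"child (\d+) \((\w+)\) killed with signal (\d+)")


def oops_process(oops_text):
    """Return (pid, command) from the 'Pid: N, comm: NAME' field, or None."""
    m = OOPS_PID_RE.search(oops_text)
    return (int(m.group(1)), m.group(2)) if m else None


def killed_children(log_text):
    """Map pid -> (process name, signal) for each 'killed with signal' entry."""
    return {
        int(pid): (name, int(sig))
        for pid, name, sig in KILLED_RE.findall(log_text)
    }


oops = "Pid: 24319, comm: imap Tainted: P U 2.6.16.21-0.8-smp #1"
log = "Apr 2 18:09:20 mailserver02 dovecot: child 24319 (imap) killed with signal 9"

pid, comm = oops_process(oops)
# Look up the crashed pid among the children Dovecot reported as killed.
print(pid, comm, killed_children(log).get(pid))
```

If the lookup returns an entry, the killed imap process and the process named in the oops are the same one.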
mailserver02:~ # dovecot -n
# 1.1.rc4: /etc/dovecot//dovecot.conf
syslog_facility: local1
protocols: imap pop3
ssl_disable: yes
disable_plaintext_auth: no
shutdown_clients: no
login_dir: /usr//var/run/dovecot/login
login_executable(default): /usr//libexec/dovecot/imap-login
login_executable(imap): /usr//libexec/dovecot/imap-login
login_executable(pop3): /usr//libexec/dovecot/pop3-login
login_process_per_connection: no
login_greeting_capability(default): yes
login_greeting_capability(imap): yes
login_greeting_capability(pop3): no
login_process_size: 128
login_processes_count: 30
login_max_processes_count: 1024
login_max_connections: 512
max_mail_processes: 10240
mail_uid: 1033
mail_gid: 1033
mail_location: maildir:%h/Maildir
mail_debug: yes
mail_executable(default): /usr//libexec/dovecot/imap
mail_executable(imap): /usr//libexec/dovecot/imap
mail_executable(pop3): /usr//libexec/dovecot/pop3
mail_plugins(default): quota imap_quota autocreate
mail_plugins(imap): quota imap_quota autocreate
mail_plugins(pop3): quota
mail_plugin_dir(default): /usr//lib/dovecot/imap
mail_plugin_dir(imap): /usr//lib/dovecot/imap
mail_plugin_dir(pop3): /usr//lib/dovecot/pop3
maildir_copy_with_hardlinks = yes
pop3_uidl_format(default): %08Xu%08Xv
pop3_uidl_format(imap): %08Xu%08Xv
pop3_uidl_format(pop3): %f
pop3_client_workarounds(default):
pop3_client_workarounds(imap):
pop3_client_workarounds(pop3): outlook-no-nuls
namespace:
  type: private
  prefix: INBOX.
  inbox: yes
  list: yes
  subscriptions: yes
auth default:
  mechanisms: plain login
  cache_size: 20480
  cache_ttl: 300
  cache_negative_ttl: 0
  passdb:
    driver: ldap
    args: /etc/dovecot/dovecot-ldap.conf
  userdb:
    driver: prefetch
  userdb:
    driver: ldap
    args: /etc/dovecot/dovecot-ldap.conf
  socket:
    type: listen
    master:
      path: /var/run/dovecot/auth-master
      mode: 438
plugin:
  quota: maildir
  quota_rule: *:bytes=10240
  quota_rule2: *:messages=30000
  autocreate: SPAM.E Spam
  autocreate2: SPAM.Marcar Spam
  autocreate3: SPAM
  autocreate4: SPAM.Desmarcar Spam
Regards,
Raphael Costa
Raphael Bittencourt S. Costa wrote:
Unable to handle kernel NULL pointer dereference at 0000000000000020 RIP: <ffffffff8019df99>{sys_inotify_rm_watch+280}
It is a kernel-level crash in the function sys_inotify_rm_watch:
fs/inotify_user.c - inotify support for userspace
I don't think it's Dovecot's fault. Have you installed any vendor-specific/proprietary storage drivers?
Uldis
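The function name above comes straight from the RIP field of the oops. As a hedged illustration, the short Python sketch below extracts the address, symbol and offset from an oops in the format posted in this thread. The regular expression and the function name `faulting_frame` are assumptions made for this example, and the offset is parsed exactly as printed (assumed decimal here).

```python
import re

# Matches frames written as <address>{symbol+offset}, tolerating the stray
# space before '+' that appears in one of the wrapped lines of the report.
FRAME_RE = re.compile(r"<([0-9a-f]+)>\{(\w+)\s*\+(\d+)\}")


def faulting_frame(oops_text):
    """Return (address, symbol, offset) of the first frame after 'RIP', or None."""
    rip = oops_text.find("RIP")
    if rip == -1:
        return None
    m = FRAME_RE.search(oops_text, rip)
    if m is None:
        return None
    return int(m.group(1), 16), m.group(2), int(m.group(3))


line = (
    "Unable to handle kernel NULL pointer dereference at 0000000000000020 "
    "RIP: <ffffffff8019df99>{sys_inotify_rm_watch+280}"
)
address, symbol, offset = faulting_frame(line)
print(hex(address), symbol, offset)
```

The same function also handles the `RIP: 0010:[<...>] <...>{...}` form, because it skips the bracketed address and matches the first frame that carries a symbol.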
On Thu, 2008-04-03 at 09:25 +0300, Uldis Pakuls wrote:
I don't think it's Dovecot's fault. Have you installed any vendor-specific/proprietary storage drivers?
Just using Hitachi's HDLM for multipath and failover.
Raphael Bittencourt S. Costa wrote:
Just using Hitachi's HDLM for multipath and failover.
It looks like a problem with the storage drivers (kernel modules): the crash happens after an attempt to free inotify data. As a workaround, I suggest using a different notify method (configure option "--with-notify=").
Uldis
P.S. I have 9 production servers running different versions of SuSE Linux and have never had such a problem. I got something like this only once: after adding an LSI pseudo-RAID adapter using LSI proprietary drivers.
I moved the maildir location from Hitachi's storage to a local SAS disk and got the same error using imaptest. Debug doesn't show any usable information. :-(
Any ideas?
The error stops when I uncomment
mail_max_userip_connections = 10
Does that make any sense?
Best regards,
Raphael Bittencourt S. Costa
Engineering
ALOG Data Centers do Brasil
Excellence in Hosting Projects
R Voluntários da Pátria 360 - RJ - CEP 22270-010
Phone: 21 3083-3364 - Fax: 21 3083-3300
http://www.alog.com.br
That is true, it was a coincidence. I got an error after that. Now I'm lost. :-| What other tests can I run to find out where the problem is? I want to do more tests before recompiling Dovecot with --with-notify.
On Thu, 2008-04-03 at 23:41 +0300, Timo Sirainen wrote:
On Thu, 2008-04-03 at 16:55 -0300, Raphael Bittencourt S. Costa wrote:
The error stops when I uncomment
mail_max_userip_connections = 10
Does that make any sense?
The default is 10, so commenting or uncommenting it should make no difference at all.
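Timo's point can be illustrated with a small Python sketch: when a setting's built-in default equals the value in the config file, the effective configuration is identical whether the line is commented out or not. The parser below is a deliberately simplified assumption for illustration, not Dovecot's real config parser, and the default of 10 is taken only from the statement above.

```python
def effective_settings(conf_text, defaults):
    """Overlay 'key = value' lines from conf_text on top of the defaults."""
    settings = dict(defaults)
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        settings[key] = value
    return settings


# Default value as stated in the thread (assumed for this sketch).
DEFAULTS = {"mail_max_userip_connections": "10"}

commented = "#mail_max_userip_connections = 10\n"
uncommented = "mail_max_userip_connections = 10\n"

same = effective_settings(commented, DEFAULTS) == effective_settings(uncommented, DEFAULTS)
print(same)
```

Both variants resolve to the same settings, which is consistent with the change in behaviour being a coincidence.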
Raphael Bittencourt S. Costa wrote:
I moved the maildir location from Hitachi's storage to a local SAS disk and got the same error using imaptest. Debug doesn't show any usable information. :-(
Any ideas?
To trace your problem you need to debug the kernel, as it is a kernel-level crash. Try asking for help in the kernel.org newsgroups.
From your logs: Pid: 24319, comm: imap Tainted: P U 2.6.16.21-0.8-smp #1
This line means the kernel is "tainted":
P - A module with a proprietary license has been loaded
U - An unsupported module has been loaded, i.e. a module which is not supported by Novell (SuSE-specific flag)
[ see: https://secure-support.novell.com/KanisaPlatform/Publishing/250/3582750_f.SA... ]
Try unloading the proprietary and unsupported modules, then run imaptest again.
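As a rough illustration of reading the taint field, the Python sketch below pulls the flag letters out of the "Pid:" line of an oops and describes them. It only knows the two flags explained in this message (P and U); any other letter is reported as unknown. The regular expression and the names `TAINT_FLAGS` and `taint_flags` are assumptions made for this example.

```python
import re

# Only the two flags described in this thread; this is not a complete list.
TAINT_FLAGS = {
    "P": "a module with a proprietary license has been loaded",
    "U": "an unsupported module has been loaded (SuSE-specific flag)",
}

# Flag letters follow 'Tainted:' and are separated by whitespace.
TAINTED_RE = re.compile(r"Tainted:\s*((?:[A-Z]\s+)+)")


def taint_flags(oops_text):
    """Return [(flag, description)] for the taint field; [] if not tainted."""
    m = TAINTED_RE.search(oops_text)
    if m is None:
        return []
    return [
        (flag, TAINT_FLAGS.get(flag, "unknown to this sketch"))
        for flag in m.group(1).split()
    ]


first = "Pid: 24319, comm: imap Tainted: P U 2.6.16.21-0.8-smp #1"
for flag, meaning in taint_flags(first):
    print(flag, "-", meaning)
```

A line that says "Not tainted" yields an empty list, so the same function can be used to confirm that a test run was done without the flagged modules loaded.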
Uldis,
I'll do it right now and run more tests.
Thanks,
Raphael Costa
The problem still remains after I removed the HDLM driver.
I've tested on SuSE 10 with kernels 2.6.16.21-0.8-smp x86_64 and 2.6.16.27-0.9-smp x86_64, with and without the HDLM driver. Which kernel versions do you use on your SuSE servers? What is the performance impact if I compile with --with-notify=none? If I can't solve this, I'll probably try Debian.
Code: f0 ff 4d 20 0f 94 c0 31 db 84 c0 74 54 48 8b 5d 28 f0 ff 4b
RIP <ffffffff8019e43d>{sys_inotify_rm_watch+280} RSP <ffff810119a13f38>
CR2: 0000000000000020
<1>Unable to handle kernel NULL pointer dereference at 0000000000000020 RIP:
<ffffffff8019e43d>{sys_inotify_rm_watch+280}
PGD 11a947067 PUD 11716f067 PMD 0
Oops: 0002 [127] SMP
last sysfs file: /devices/pci0000:00/0000:00:1c.0/0000:04:00.0/0000:05:00.0/power/state
CPU 3
Modules linked in: ipv6 bonding button battery ac apparmor aamatch_pcre ext3 jbd loop usbhid shpchp ide_cd cdrom hw_random pci_hotplug ehci_hcd uhci_hcd bnx2 usbcore reiserfs dm_snapshot edd dm_mod fan thermal processor lpfc scsi_transport_fc sg megaraid_sas piix sd_mod scsi_mod ide_disk ide_core
Pid: 6753, comm: imap Not tainted 2.6.16.27-0.9-smp #1
RIP: 0010:[<ffffffff8019e43d>] <ffffffff8019e43d>{sys_inotify_rm_watch+280}
RSP: 0018:ffff810119a13f38 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8101189a4af8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff810117d6f958
RBP: 0000000000000000 R08: 00000000000500fe R09: 0000000000027f70
R10: 0000000000000000 R11: 0000000000000206 R12: ffff8101189a4ac0
R13: ffff810117d6f958 R14: ffff810117d6f750 R15: ffff810119fd5c80
FS: 00002ad47d849ae0(0000) GS:ffff81012b00aa40(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000119d2d000 CR4: 00000000000006e0
Process imap (pid: 6753, threadinfo ffff810119a12000, task ffff810129d8c0c0)
Stack: 000000012d6a86c0 ffffffff8017b6f3 00000000005e4ee0 00000000005db860
 00000000005e4e40 0000000000000000 00000000ffffffff 00000000005e4c90
 00007fff2d6b08d0 ffffffff8010a7be
Call Trace: <ffffffff8017b6f3>{sys_read+69} <ffffffff8010a7be>{system_call+126}

Code: f0 ff 4d 20 0f 94 c0 31 db 84 c0 74 54 48 8b 5d 28 f0 ff 4b
RIP <ffffffff8019e43d>{sys_inotify_rm_watch+280} RSP <ffff810119a13f38>
CR2: 0000000000000020
Raphael Bittencourt S. Costa wrote:
The problem still remains after I removed the HDLM driver.
I've tested on SuSE 10 with kernels 2.6.16.21-0.8-smp x86_64 and 2.6.16.27-0.9-smp x86_64, with and without the HDLM driver. Which kernel versions do you use on your SuSE servers?
2.6.16.13-0.4 (ppc)
2.6.16.21-0.8 (x86_64)
2.6.16.27-0.4-smp (x86)
2.6.16.27-0.8-smp (x86_64)
2.6.16.53-0.16-smp (x86)
2.6.18.8-0.7-bigsmp (x86)
2.6.21-0.9-smp (x86)
2.6.22.17-0.1-default (x86-64)
All SuSE: SLES9, SLES10 or OpenSUSE 10.x.
My hardware is mostly IBM i-series and x-series servers (x86), plus an IBM RS6000 PPC box. Some of them came with SuSE Linux preinstalled, some are my own setup. In fact, I have tested all Dovecot versions since v0.9.x and have never had any problems related to the inotify kernel API.
About inotify: the inotify kernel API has been included in the "vanilla" kernel since 2.6.17-rc1-mm1; before that there were only patches (one of them a SuSE patch).
Since 2.6.19 it is intended to replace the older dnotify API. 2.6.18 includes a lot of patches/bugfixes against the 2.6.17 code.
Linux kernel developers recommend using dnotify if the kernel is older than 2.6.17. As your case shows, there are problems with the patched 2.6.16 code... try using the recommended dnotify.
BTW, I only just noticed: my SLES9 and SuSE 9.3 use dnotify... :)
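The rule of thumb above (dnotify for kernels older than 2.6.17) can be written down as a short Python sketch. It encodes only the advice as stated in this message, not a verified recommendation; the function names and the default threshold parameter are assumptions for illustration.

```python
import re


def kernel_version(release):
    """Parse the leading 'major.minor.patch' of a release string into a tuple."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) for part in m.groups())


def suggested_notify(release, threshold=(2, 6, 17)):
    """Apply the thread's rule of thumb: dnotify below the threshold, else inotify."""
    return "dnotify" if kernel_version(release) < threshold else "inotify"


# Release strings taken from this thread.
for release in (
    "2.6.16.21-0.8-smp",
    "2.6.16.27-0.9-smp",
    "2.6.18.8-0.7-bigsmp",
    "2.6.22.17-0.1-default",
):
    print(release, "->", suggested_notify(release))
```

Distribution kernels such as SuSE's carry backported patches, so the upstream version number alone is only a first approximation of which inotify code is actually running.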
What is the performance impact if I compile with --with-notify=none? If I can't solve this, I'll probably try Debian.
dnotify is more resource-expensive: inotify allows monitoring of both files and directories via a single open fd. So before you try --with-notify=none, try --with-notify=dnotify.
Uldis
On Sat, 2008-04-05 at 14:22 +0300, Uldis Pakuls wrote:
dnotify is more resource-expensive: inotify allows monitoring of both files and directories via a single open fd. So before you try --with-notify=none, try --with-notify=dnotify.
Using dnotify, Dovecot's performance was very bad. The system consumed 90% of the CPU with just a few imap processes running (about 30 processes). The performance using --with-notify=none was the same as using inotify, so it could be the solution for me.
Raphael Costa
On 4/7/2008, Raphael Bittencourt S. Costa (raphaelbscosta@gmail.com) wrote:
Using dnotify, Dovecot's performance was very bad. The system consumed 90% of the CPU with just a few imap processes running (about 30 processes). The performance using --with-notify=none was the same as using inotify, so it could be the solution for me.
dnotify has been deprecated for a long time, and inotify has been recommended to be used instead for just as long...
--
Best regards,
Charles
Charles Marcus wrote:
dnotify has been deprecated for a long time, and inotify has been recommended to be used instead for just as long...
You are using SLES10, aren't you? Have you contacted the SuSE developers about this inotify problem? Even if there is something wrong with Dovecot, it MUST NOT cause a kernel-level crash; the kernel should just return an error to Dovecot...
On Tue, 2008-04-08 at 00:16 +0300, Uldis Pakuls wrote:
You are using SLES10, aren't you?
Yes.
Have you contacted the SuSE developers about this inotify problem?
Not yet.
Even if there is something wrong with Dovecot, it MUST NOT cause a kernel-level crash; the kernel should just return an error to Dovecot...
I will try RedHat ES5. Unfortunately, I have to use SuSE or RedHat to get better support for problems with the Hitachi storage.