[Dovecot] Panic: Trying to allocate 2147483648 bytes
Hi all,
I recently upgraded from courier to dovecot 1.1.12 on a Solaris 9 system with about 100 users. We have been testing dovecot for sometime in a mixed Linux/Solaris environment and are aware of the index endianess issue with multiple archs. To solve this, we run with INDEX=MEMORY (as seen in the docs) so that only Linux clients have anything to do with the indexes. We generally have much more of them anyways. So far everything is working smoothly, but when someone does a search through directory with a large number of emails, dovecot dies and prints the following message:
[ID 107833 mail.crit] Panic: Trying to allocate 2147483648 bytes
I googled around and saw this same issue elsewhere, with NFS indexes usually being the problem. However, we have indexes turned off, so I don't understand why the same problem is still happening.
Just to be sure, I ran with:
mmap_disable = yes dotlock_use_excl = no mail_nfs_storage = yes
But there has been no change, I can still reproduce it at will.
Setting mail_nfs_index = yes caused my connections to stop working, so that didn't help either. Besides, I wouldn't think that would be the answer considering we set INDEX=MEMORY.
This thread also seems to be related, but we're definitely running with this code in 1.1.12, yet it still happens:
http://dovecot.org/list/dovecot/2008-September/033615.html
I grabbed a backtrace if it is helpful:
Also, here is dovecot -n:
# 1.1.12: /usr/local/etc/dovecot.conf # OS: SunOS 5.9 sun4u protocols: imap imaps pop3 pop3s ssl_cert_file: /usr/local/ssl/certs/nbcs.key+crt.pem ssl_key_file: /usr/local/ssl/certs/nbcs.key+crt.pem login_dir: /usr/local/var/run/dovecot/login login_executable(default): /usr/local/libexec/dovecot/imap-login login_executable(imap): /usr/local/libexec/dovecot/imap-login login_executable(pop3): /usr/local/libexec/dovecot/pop3-login login_greeting_capability(default): yes login_greeting_capability(imap): yes login_greeting_capability(pop3): no mail_location: maildir:~/Maildir:INDEX=MEMORY mmap_disable: yes mail_nfs_storage: yes mail_executable(default): /usr/local/libexec/dovecot/imap mail_executable(imap): /usr/local/libexec/dovecot/imap mail_executable(pop3): /usr/local/libexec/dovecot/pop3 mail_plugins(default): quota imap_quota mail_plugins(imap): quota imap_quota mail_plugins(pop3): quota pop3_quota mail_plugin_dir(default): /usr/local/lib/dovecot/imap mail_plugin_dir(imap): /usr/local/lib/dovecot/imap mail_plugin_dir(pop3): /usr/local/lib/dovecot/pop3 pop3_uidl_format(default): %08Xu%08Xv pop3_uidl_format(imap): %08Xu%08Xv pop3_uidl_format(pop3): UID%u-%v namespace: type: private separator: . prefix: INBOX. inbox: yes list: yes subscriptions: yes auth default: passdb: driver: pam args: * userdb: driver: passwd plugin: quota: fs
Any help would be appreciated!
Thanks, -Dave
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
On Thu, 2009-03-19 at 15:25 -0400, David Halik wrote:
So far everything is working smoothly, but when someone does a search through directory with a large number of emails, dovecot dies and prints the following message:
[ID 107833 mail.crit] Panic: Trying to allocate 2147483648 bytes
So you can easily reproduce this?
I grabbed a backtrace if it is helpful:
Yes, very helpful! Is it possible to also examine variable values?
Could you try applying this patch: http://hg.dovecot.org/dovecot-1.1/rev/98a59fd61d7f
and also the attached patch. What error does it fail in?
Timo Sirainen wrote:
On Thu, 2009-03-19 at 15:25 -0400, David Halik wrote:
So far everything is working smoothly, but when someone does a search through directory with a large number of emails, dovecot dies and prints the following message:
[ID 107833 mail.crit] Panic: Trying to allocate 2147483648 bytes
So you can easily reproduce this?
Everytime and on different systems with a similar setup. Since I can't do debugging on the production server, I'm using a server with a similar setup. The only difference is probably physical memory and swap size. I'm now getting a different size in the panic error, but you get the idea:
Mar 20 13:32:10 er0.rutgers.edu IMAP(dhalik): : [ID 107833 mail.crit] Panic: Trying to allocate 2147483648 bytes
Also, I've been keeping an eye on top while this was running and it seems that Dovecot is growing at an amazing rate, which is why it runs out of room I guess:
8081 dhalik 1 58 0 1030M 871M run 0:33 53.29% imap
This was while doing a full Pine text search for a word in a folder with about 1300 emails. Shortly after it died. Keep in mind that I am only able to reproduce this problem with the homedir on NFS and INDEX=MEMORY on. The same setup on local disk had no issues.
I grabbed a backtrace if it is helpful:
Yes, very helpful! Is it possible to also examine variable values?
I'd love to, but it seems that is all the I can get out of Sun Studio unless you know some tricks. I verified that I am building with full debug symbols (-g -xs) and that there is no optimization.
Could you try applying this patch: http://hg.dovecot.org/dovecot-1.1/rev/98a59fd61d7f
and also the attached patch. What error does it fail in?
I applied both patches, rebuilt dovecot 1.1.12 with full debug, and I'm not seeing anything different. Was there supposed to be a new error? It still exhibits the same behavior and eventually core dumps when it runs out of room, same as before.
Here's the most recent traceback:
I appreciate the help Timo.
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
On Fri, 2009-03-20 at 13:57 -0400, David Halik wrote:
Also, I've been keeping an eye on top while this was running and it seems that Dovecot is growing at an amazing rate, which is why it runs out of room I guess:
8081 dhalik 1 58 0 1030M 871M run 0:33 53.29% imap
OK, so it isn't really the kind of bug I first thought it would be. Instead it's allocating/leaking way too much memory.
This was while doing a full Pine text search for a word in a folder with about 1300 emails. Shortly after it died. Keep in mind that I am only able to reproduce this problem with the homedir on NFS and INDEX=MEMORY on. The same setup on local disk had no issues.
Can you test what exactly is the difference? INDEX=MEMORY or being in NFS? I don't see how either of those would make a difference here.
To me it's currently looking like Solaris iconv() is returning some weird results in some condition.
OK, so it isn't really the kind of bug I first thought it would be. Instead it's allocating/leaking way too much memory.
I agree. I watched it closer again this time and it filled up all available memory within 30 seconds.
Can you test what exactly is the difference? INDEX=MEMORY or being in NFS? I don't see how either of those would make a difference here.
To me it's currently looking like Solaris iconv() is returning some weird results in some condition.
You're right, it doesn't look like an INDEX=MEMORY issue. Here's the breakdown:
Server1
mmap_disable = no mail_nfs_storage = no mail_nfs_index=no
Local Disk + Normal Index = No Panic Local Disk + Memory Index = No Panic
Server2
mmap_disable = yes mail_nfs_storage = yes mail_nfs_index=yes
NFS + Normal Index = Panic NFS + Memory Index = Panic
So at first I thought NFS is definitely effecting it. Here is a top from the server1 without NFS three minutes into a search through 6000 emails:
21014 dhalik 1 10 0 7920K 6352K cpu/0 2:46 46.21% imap
That looks much better.
So I tried the same thing on server2 with NFS and it filled up then panicked.
But just to be sure I moved my homedir to local disk temporarily, and you are totally right. The problem happened again! So now I'm stumped. It looks like it is neither NFS or INDEX, but an issue with Solaris as you suggested. It just seems odd that I can't reproduce it on one of the servers, yet the other two do it every single time. As far as I know the servers have the same libraries as well:
Panic Server:
bash-2.05$ uname -a SunOS 5.9 Generic_118558-39 sun4u sparc SUNW,Ultra-80 bash-2.05$ ldd /usr/local/libexec/dovecot/imap libiconv.so.2 => /usr/local/lib/libiconv.so.2 libdl.so.1 => /usr/lib/libdl.so.1 librt.so.1 => /usr/lib/librt.so.1 libnsl.so.1 => /usr/lib/libnsl.so.1 libsocket.so.1 => /usr/lib/libsocket.so.1 libsendfile.so.1 => /usr/lib/libsendfile.so.1 libc.so.1 => /usr/lib/libc.so.1 libaio.so.1 => /usr/lib/libaio.so.1 libmd5.so.1 => /usr/lib/libmd5.so.1 libmp.so.2 => /usr/lib/libmp.so.2 /usr/platform/SUNW,Ultra-80/lib/libc_psr.so.1 /usr/platform/SUNW,Ultra-80/lib/libmd5_psr.so.1
Non Panic Server:
bash-2.05$ uname -a SunOS 5.9 Generic_118558-39 sun4u sparc SUNW,Ultra-250 bash-2.05$ ldd /usr/local/libexec/dovecot/imap libiconv.so.2 => /usr/local/lib/libiconv.so.2 libdl.so.1 => /usr/lib/libdl.so.1 librt.so.1 => /usr/lib/librt.so.1 libnsl.so.1 => /usr/lib/libnsl.so.1 libsocket.so.1 => /usr/lib/libsocket.so.1 libsendfile.so.1 => /usr/lib/libsendfile.so.1 libc.so.1 => /usr/lib/libc.so.1 libaio.so.1 => /usr/lib/libaio.so.1 libmd5.so.1 => /usr/lib/libmd5.so.1 libmp.so.2 => /usr/lib/libmp.so.2 /usr/platform/SUNW,Ultra-250/lib/libc_psr.so.1 /usr/platform/SUNW,Ultra-250/lib/libmd5_psr.so.1
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
On Fri, 2009-03-20 at 15:04 -0400, David Halik wrote:
But just to be sure I moved my homedir to local disk temporarily, and you are totally right. The problem happened again! So now I'm stumped. It looks like it is neither NFS or INDEX, but an issue with Solaris as you suggested. It just seems odd that I can't reproduce it on one of the servers, yet the other two do it every single time. As far as I know the servers have the same libraries as well:
Maybe you could try copying the libraries and using LD_LIBRARY_PATH (or whatever it is in Solaris) to force using those libraries instead of the default ones? You could also try what it logs with this patch when crashing: diff -r 5deb3ee1d655 src/lib-charset/charset-iconv.c --- a/src/lib-charset/charset-iconv.c Fri Mar 20 18:23:12 2009 -0400 +++ b/src/lib-charset/charset-iconv.c Fri Mar 20 18:25:37 2009 -0400 @@ -145,6 +145,7 @@ /* force buffer to grow */ used = dest->used; size = buffer_get_size(dest) - used + 1; + i_warning("charset grows: used=%u, add=%u", (int)dest->used, (int)size); (void)buffer_append_space_unsafe(dest, size); buffer_set_used_size(dest, used); }
On 3/20/2009 6:26 PM, Timo Sirainen wrote:
Maybe you could try copying the libraries and using LD_LIBRARY_PATH (or whatever it is in Solaris) to force using those libraries instead of the default ones?
I could try that as a last resort, but I don't think it will help. We build our own packages here which you can find at rpm.rutgers.edu, so I checked the version and its the same libiconv package on both machines. I even reverted to an earlier libiconv 1.11, but the problem still persisted.
You could also try what it logs with this patch when crashing:
Here's the logs it generated before dying:
Mar 21 18:42:37 er0.rutgers.edu IMAP(dhalik): : [ID 183078 mail.warning] Growing pool 'Cache fields' with: 2048 Mar 21 18:42:50 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8130, add=63 Mar 21 18:42:50 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8157, add=8228 Mar 21 18:42:50 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8161, add=24608 Mar 21 18:42:50 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8141, add=57396 Mar 21 18:42:50 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8152, add=122921 Mar 21 18:42:54 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8131, add=254014 Mar 21 18:42:54 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8158, add=516131 Mar 21 18:42:54 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8159, add=1040418 Mar 21 18:42:54 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8139, add=2089014 Mar 21 18:42:54 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8153, add=4186152 Mar 21 18:43:04 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8132, add=8380477 Mar 21 18:43:04 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8158, add=16769059 Mar 21 18:43:04 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8161, add=33546272 Mar 21 18:43:05 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8141, add=67100724 Mar 21 18:43:05 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8153, add=134209576 Mar 21 18:43:08 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8130, add=268427327 Mar 21 18:43:10 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8159, add=536862754 Mar 21 18:43:32 er0.rutgers.edu IMAP(dhalik): : [ID 512293 mail.warning] charset grows: used=8159, add=1073733666 Mar 21 18:43:57 er0.rutgers.edu IMAP(dhalik): : [ID 107833 mail.crit] Panic: Trying to allocate 2147483648 bytes
Great, thanks! I'll try it now and see how it goes. We've actually been debugging this the last couple of days and it does seem to be related to how E2BIG from iconv is handled. We set a watch on charset_to_utf8_try and while most of the time it returned probably, every once in awhile it would get an E2BIG and bump the memory over and over.
Timo Sirainen wrote:
On Sat, 2009-03-21 at 18:58 -0400, David Halik wrote:
Mar 21 18:43:57 er0.rutgers.edu IMAP(dhalik): : [ID 107833 mail.crit] Panic: Trying to allocate 2147483648 bytes
Attached patch probably helps?
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
Unfortunately, the patch didn't help, BUT I've discovered some very interesting things along the way that I think you'd like to hear:
- The problem stems from certain emails with odd or badly formed characters.
The reason I wasn't seeing it on one machine was because I was using a different folder to test with. Once I moved over that folder the same issue occurred. This got me thinking...
- This is *not* a localized Solaris issue, I can replicate the problem on my Fedora 10 i386 workstation with the stock F10 1.1.10 package.
On Linux at least for me), however, it doesn't grow and grow, it just dies and core dumps immediately once it hits an email it doesn't like:
Mar 24 17:03:11 gunslinger IMAP(dhalik): : Panic: Trying to allocate 2147483648 bytes
Here is the backtrace from Fedora 10:
To help you out I'm going to give you the email folder that I have been using to replicate this. It's just a maildir format spam folder, so no sensitive emails... just lots and lots of junk. =) You should be able to replicate it without a problem.
http://jla.rutgers.edu/~dhalik/spamfolder.tar.gz
Steps to reproduce:
$ /usr/sbin/dovecot --exec-mail imap
- PREAUTH [CAPABILITY IMAP4rev1 SASL-IR SORT THREAD=REFERENCES MULTIAPPEND UNSELECT LITERAL+ IDLE CHILDREN NAMESPACE LOGIN-REFERRALS UIDPLUS LIST-EXTENDED I18NLEVEL=1 QUOTA] Logged in as dhalik 1 SELECT INBOX.AUTO-DELETED-SPAM
- FLAGS (\Answered \Flagged \Deleted \Seen \Draft $Forwarded)
- OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft $Forwarded)] Flags permitted.
- 1468 EXISTS
- 2 RECENT
- OK [UNSEEN 1] First unseen.
- OK [UIDVALIDITY 1214858211] UIDs valid
- OK [UIDNEXT 33722] Predicted next UID 1 OK [READ-WRITE] Select completed. 2 SEARCH BODY "berry" Aborted (core dumped)
Thanks again Timo, -Dave
Timo Sirainen wrote:
On Sat, 2009-03-21 at 18:58 -0400, David Halik wrote:
Mar 21 18:43:57 er0.rutgers.edu IMAP(dhalik): : [ID 107833 mail.crit] Panic: Trying to allocate 2147483648 bytes
Attached patch probably helps?
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
Any thoughts on this Timo? Just wondering if you were able to reproduce the problem and saw the same behavior. We can look into it more if need be, but since I can replicate it on both Fedora and Solaris, I'm guessing it's a fundamental bug with how certain malformed emails are searched.
Thanks, -Dave
David Halik wrote:
Unfortunately, the patch didn't help, BUT I've discovered some very interesting things along the way that I think you'd like to hear:
- The problem stems from certain emails with odd or badly formed characters.
The reason I wasn't seeing it on one machine was because I was using a different folder to test with. Once I moved over that folder the same issue occurred. This got me thinking...
- This is *not* a localized Solaris issue, I can replicate the problem on my Fedora 10 i386 workstation with the stock F10 1.1.10 package.
On Linux at least for me), however, it doesn't grow and grow, it just dies and core dumps immediately once it hits an email it doesn't like:
Mar 24 17:03:11 gunslinger IMAP(dhalik): : Panic: Trying to allocate 2147483648 bytes
Here is the backtrace from Fedora 10:
To help you out I'm going to give you the email folder that I have been using to replicate this. It's just a maildir format spam folder, so no sensitive emails... just lots and lots of junk. =) You should be able to replicate it without a problem.
http://jla.rutgers.edu/~dhalik/spamfolder.tar.gz
Steps to reproduce:
$ /usr/sbin/dovecot --exec-mail imap
- PREAUTH [CAPABILITY IMAP4rev1 SASL-IR SORT THREAD=REFERENCES MULTIAPPEND UNSELECT LITERAL+ IDLE CHILDREN NAMESPACE LOGIN-REFERRALS UIDPLUS LIST-EXTENDED I18NLEVEL=1 QUOTA] Logged in as dhalik 1 SELECT INBOX.AUTO-DELETED-SPAM
- FLAGS (\Answered \Flagged \Deleted \Seen \Draft $Forwarded)
- OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft $Forwarded)] Flags permitted.
- 1468 EXISTS
- 2 RECENT
- OK [UNSEEN 1] First unseen.
- OK [UIDVALIDITY 1214858211] UIDs valid
- OK [UIDNEXT 33722] Predicted next UID 1 OK [READ-WRITE] Select completed. 2 SEARCH BODY "berry" Aborted (core dumped)
Thanks again Timo, -Dave
Timo Sirainen wrote:
On Sat, 2009-03-21 at 18:58 -0400, David Halik wrote:
Mar 21 18:43:57 er0.rutgers.edu IMAP(dhalik): : [ID 107833 mail.crit] Panic: Trying to allocate 2147483648 bytes
Attached patch probably helps?
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
On Mar 26, 2009, at 1:00 PM, David Halik wrote:
Any thoughts on this Timo? Just wondering if you were able to
reproduce the problem and saw the same behavior. We can look into it
more if need be, but since I can replicate it on both Fedora and
Solaris, I'm guessing it's a fundamental bug with how certain
malformed emails are searched.
Interesting that the only two people who reported this were running on
Solaris. Anyway, fixed:
http://hg.dovecot.org/dovecot-1.1/rev/a2013dedbad0
I suppose that could be thought of as a DoS attack, but since it
requires user to search messages and have a very special kind of
email, and by default imap process size is limited to 256 MB, I don't
think I'll bother releasing v1.1.14 because of it.
That did the trick! No more memory allocation issues. That you very much for the patch and your help Timo. Should we expect the patch to be in the final 1.2 release, or in a future version?
-Dave
Timo Sirainen wrote:
On Mar 26, 2009, at 1:00 PM, David Halik wrote:
Any thoughts on this Timo? Just wondering if you were able to reproduce the problem and saw the same behavior. We can look into it more if need be, but since I can replicate it on both Fedora and Solaris, I'm guessing it's a fundamental bug with how certain malformed emails are searched.
Interesting that the only two people who reported this were running on Solaris. Anyway, fixed: http://hg.dovecot.org/dovecot-1.1/rev/a2013dedbad0
I suppose that could be thought of as a DoS attack, but since it requires user to search messages and have a very special kind of email, and by default imap process size is limited to 256 MB, I don't think I'll bother releasing v1.1.14 because of it.
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
Yes, it'll be in next 1.1 and 1.2 releases.
On Mar 27, 2009, at 9:11 AM, David Halik wrote:
That did the trick! No more memory allocation issues. That you very
much for the patch and your help Timo. Should we expect the patch to
be in the final 1.2 release, or in a future version?-Dave
Timo Sirainen wrote:
On Mar 26, 2009, at 1:00 PM, David Halik wrote:
Any thoughts on this Timo? Just wondering if you were able to
reproduce the problem and saw the same behavior. We can look into
it more if need be, but since I can replicate it on both Fedora
and Solaris, I'm guessing it's a fundamental bug with how certain
malformed emails are searched.Interesting that the only two people who reported this were running
on Solaris. Anyway, fixed: http://hg.dovecot.org/dovecot-1.1/rev/a2013dedbad0I suppose that could be thought of as a DoS attack, but since it
requires user to search messages and have a very special kind of
email, and by default imap process size is limited to 256 MB, I
don't think I'll bother releasing v1.1.14 because of it.--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
Timo Sirainen wrote:
On Mar 26, 2009, at 1:00 PM, David Halik wrote:
Any thoughts on this Timo? Just wondering if you were able to reproduce the problem and saw the same behavior. We can look into it more if need be, but since I can replicate it on both Fedora and Solaris, I'm guessing it's a fundamental bug with how certain malformed emails are searched.
Interesting that the only two people who reported this were running on Solaris. Anyway, fixed: http://hg.dovecot.org/dovecot-1.1/rev/a2013dedbad0
In my case the error occured only sometimes. I made several tests with the patch applied and the errror seems to be gone.
Thanks. Martin
I suppose that could be thought of as a DoS attack, but since it requires user to search messages and have a very special kind of email, and by default imap process size is limited to 256 MB, I don't think I'll bother releasing v1.1.14 because of it.
Martin Preen, Universität Freiburg, Institut für Informatik Georges-Koehler-Allee 52, Raum 00-006, 79110 Freiburg, Germany
phone: ++49 761 203-8250 preen@informatik.uni-freiburg.de fax: ++49 761 203-8242 www.informatik.uni-freiburg.de
participants (3)
-
David Halik
-
Martin Preen
-
Timo Sirainen