[Dovecot] Unexpected behaviour when deleteing a big mailbox
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Hello,
my last tests left me with a mailbox with > 80'000 messages in Maildir. Dovecot served them fine. No problem. I usually use IMAP's "Mark messages a deleted" function and expunge later.
Because the "Move to Trash" is default in many MUAs, I decided to try it on this mailbox.
I do think that it is hard to detect and prevent such problem, but could it be implemented that when the client connection times out, the COPY process is terminated and rolled back? I mean, as long as the messages are copied/linked to */tmp nothing is done, yet, hence, a simple cleanup of */tmp would result in a rollback. Due to the timeout, the EXPUNGE wasn't issued so far and the source mailbox is unchanged. Or, is it possible to prevent two COPYs of an overlapping message range into the same destination folder simultaneously? Or, can I prevent to COPY such a huge range of messages at once?
I know 80'000 messages are much, but I have users with an INBOX of at least 40'000, eventually moving all files into an archive mailbox.
I'm also surprised to find that I didn't get two full copies of the source mailbox.
==============
Start situation: 4.8G .box1 (I didn't recorded the individual subdirs, unfortunately.)
find .box1 -type f | wc -l
84273
Newly created .Trash.
===
I started Thunderbird, let sync the mailbox, selected them all, and pressed "Delete". Some time later: Connection timed out. I, wondered, and hit "Delete" a second time. Again, connection timed out.
===
I was surprised, however, to see that the Dovecot log with still active telling me that mails are copied to Trash.
I ended up with two processes copying all messages to .Trash/tmp on the server. The process I traced a bit later linked any file after the other. What I wondered right from the beginning was, that
du -hs .??*
displayed about 3.5GB in .Trash, but still 4.8GB in .box1. So, why are there 3.5GB additional files in .Trash, when I configured Dovecot to use hardlinks?
The end situation is that .box1 contains all messages still, but any Unseen messages are moved to .box1/cur, and .Trash contains:
7.4G .Trash/cur 791M .Trash/new 76K .Trash/tmp
find .Trash/ -type f | wc -l
121713
The size nearly doubled.
du -hs .box1 ; du -sh .Trash/
4.8G .box1 8.2G .Trash/
====
Last lines in Dovecot log: dovecot: Mar 21 11:46:15 Info: IMAP(dvtest) [11535]: copy -> Trash: uid=83964, msgid=<xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx@xx-xxxx-xxxxx-xxxx.xx>, box=box1 dovecot: Mar 21 11:46:15 Info: IMAP(dvtest) [11535]: copy -> Trash: uid=83965, msgid=<xxxxxxxx.xxxxxxx@xx-xxxx-xxxxx-xxxx.xx>, box=box1 dovecot: Mar 21 11:46:29 Info: IMAP(dvtest) [11535]: Disconnected: Mailbox is in inconsistent state, please relogin.
Last lines of strace: fstat64(13, {st_mode=S_IFREG|0600, st_size=1341104, ...}) = 0 stat64("/var/cache/dovecot/31045/index/.box1/dovecot.index", {st_mode=S_IFREG|0600, st_size=1341104, ...}) = 0 alarm(120) = 0 fcntl64(13, F_SETLKW64, {type=F_RDLCK, whence=SEEK_SET, start=0, len=0}, 0xbfe562d4) = 0 alarm(0) = 120 fstat64(13, {st_mode=S_IFREG|0600, st_size=1341104, ...}) = 0 mmap2(NULL, 1341104, PROT_READ|PROT_WRITE, MAP_SHARED, 13, 0) = 0xb3ffc000 munmap(0xb4285000, 1341104) = 0 write(1, "* 7998 FETCH (FLAGS ($Label2 $La"..., 2067) = 2067 write(1, "* 8091 FETCH (FLAGS (\\Answered \\"..., 2056) = -1 EPIPE (Broken pipe)
- --- SIGPIPE (Broken pipe) @ 0 (0) --- munmap(0xb7cbc000, 1341104) = 0 fcntl64(13, F_SETLK64, {type=F_UNLCK, whence=SEEK_SET, start=0, len=0}, 0xbfe56394) = 0 write(2, "\1IDisconnected: Mailbox is in in"..., 66) = 66
/usr/local/dovecot.rc27/sbin/dovecot -n
/usr/local/dovecot.rc27/etc/dovecot.conf
base_dir: /var/run/dovecot/ log_path: /var/log/dovecot/dovecot.log protocols: imap imaps pop3 pop3s ssl_ca_file: /etc/ssl/certs/ca.crt ssl_cert_file(default): /etc/ssl/certs/imap.pem ssl_cert_file(imap): /etc/ssl/certs/imap.pem ssl_cert_file(pop3): /etc/ssl/certs/pop3.pem ssl_key_file(default): /etc/ssl/private/imap.key ssl_key_file(imap): /etc/ssl/private/imap.key ssl_key_file(pop3): /etc/ssl/private/pop3.key disable_plaintext_auth: no verbose_ssl: yes login_dir: /var/run/dovecot//login login_executable(default): /usr/local/dovecot.rc27/libexec/dovecot/imap-login login_executable(imap): /usr/local/dovecot.rc27/libexec/dovecot/imap-login login_executable(pop3): /usr/local/dovecot.rc27/libexec/dovecot/pop3-login login_log_format_elements: %p: user=<%u> method=%m rip=%r lip=%l %c verbose_proctitle: yes first_valid_uid: 1000 mail_location: maildir:%h/MailDir:CONTROL=/var/cache/dovecot/%i/control:INDEX=/var/cache/dovecot/%i/index mail_debug: yes dotlock_use_excl: yes maildir_copy_with_hardlinks: yes maildir_copy_preserve_filename: yes mail_drop_priv_before_exec: yes mail_executable(default): /usr/local/dovecot.rc27/libexec/dovecot/imap mail_executable(imap): /usr/local/dovecot.rc27/libexec/dovecot/imap mail_executable(pop3): /usr/local/dovecot.rc27/libexec/dovecot/pop3 mail_plugins(default): quota imap_quota mail_log zlib mail_plugins(imap): quota imap_quota mail_log zlib mail_plugins(pop3): quota mail_log mail_plugin_dir(default): /usr/local/dovecot.rc27/lib/dovecot/imap mail_plugin_dir(imap): /usr/local/dovecot.rc27/lib/dovecot/imap mail_plugin_dir(pop3): /usr/local/dovecot.rc27/lib/dovecot/pop3 mail_log_prefix: %Us(%u) [%p]: mail_log_max_lines_per_sec: 30 pop3_uidl_format(default): pop3_uidl_format(imap): pop3_uidl_format(pop3): %u pop3_client_workarounds(default): pop3_client_workarounds(imap): pop3_client_workarounds(pop3): oe-ns-eoh namespace: type: private separator: . inbox: yes hidden: yes auth default: mechanisms: plain login cache_size: 10 username_chars: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890 verbose: yes debug: yes passdb: driver: ldap args: /usr/local/dovecot.rc26/etc/dovecot-ldap.conf passdb: driver: shadow userdb: driver: prefetch userdb: driver: ldap args: /usr/local/dovecot.rc26/etc/dovecot-ldap.conf userdb: driver: passwd-file args: /etc/passwd socket: type: listen client: path: /var/run/dovecot/auth-client mode: 432 master: path: /var/run/dovecot/auth-master mode: 432 group: mail plugin: quota: fs
Bye,
Steffen Kaiser -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux)
iQEVAwUBRgFBGy9SORjhbDpvAQLBcQf+OAqiufFlyPqJTyM3Ylkmiae2afXPe2wl bNWg6AyDrdvFw+rtdc5vnYukQUSk2a8zQBggt3Rfb1zZQb0pszrzLdR+fOwnW2uE kDyHRhyy88aVpM848oQKYnlWdNHFrdwbyp2myU4JxJhu0bXsAhyMgf9txshwe5cd 23O1x+gkWUp+1wuz84uTep3Kcs1wVXWANEN5RT4YMI3MOqMrcyB+P0lj0luy0HTH HE9T6mLqzGt0Cw6yfm+1W8bU3Wnt3C5je15gT++OPd5YmDvpQs+fOjavDIITsuOf jBck4N+KjwOM+FzP8Ea1WNWCvObJKdeiISkoFWFO9bafw5Yue9hFNA== =yyP5 -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On Wed, 21 Mar 2007, Steffen Kaiser wrote:
Just to add:
Without the "mail_log" plugin, the 80'000 message are copied almost in time to Trash; the timeout occured while renaming the files from tmp to cur or new.
==
I noticed with EXPUNGE that the client feedback is sent after all messages are deleted, when this is the same with COPY, could one sent some feedback to the client in order to keep the connection open?
Bye,
Steffen Kaiser -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux)
iQEVAwUBRgFRiC9SORjhbDpvAQLKrgf9GBOjl3zK/28wLpRDIKm4On/jxqcKxNKO Cm8yG02uFaS1164rvTZwmEY3wPU1dolJJ9U/49P25NIlRIdXyKhPh4FJh82Ujwtr peMa1g4AM+yn9WLjrPpi5L2RCUUlSYUaoIHHPCJEQipwTjypp8sVPj0RKxBynbrZ J7N5aWwzEhXc4DwT4z5ejqQT21jmiV3VJ2iE48d/JkDSHptJjYslH0VlDEMSxuAF n3wldvB7bfkTDJLQHYP1Jxd98+v9wHMnFsZYyA6hTdUgWY6qm9SV466AflXOpQM4 7uGASJR974ttY6M998ZV8pdY146y7BbmTEEeOfOQg6Q8HGBIYPwGdQ== =S6C1 -----END PGP SIGNATURE-----
On Wed, 2007-03-21 at 16:38 +0100, Steffen Kaiser wrote:
I noticed with EXPUNGE that the client feedback is sent after all messages are deleted, when this is the same with COPY, could one sent some feedback to the client in order to keep the connection open?
Does the attached patch help?
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On Wed, 21 Mar 2007, Timo Sirainen wrote:
On Wed, 2007-03-21 at 16:38 +0100, Steffen Kaiser wrote:
I noticed with EXPUNGE that the client feedback is sent after all messages are deleted, when this is the same with COPY, could one sent some feedback to the client in order to keep the connection open?
Does the attached patch help?
Yes!
The connection stays alive the whole time.
I did two tests:
- I closed Thunderbird after 15min. The server performed the COPY command. They stayed in box1 and appeared in Trash. (Due to the COPY/Mark delete/Expunge sequence got broken). 
- I moved all messages again into Trash, this time waited. 
During expunging there is this problem:
dovecot: Mar 22 13:35:20 Info: IMAP(dvtest) [18381]: expunged: uid=83803, msgid=<xxxxxxxx.xxxxxxx@xx-xxxx-xxxxx-xxxx.xx>, box=box1 dovecot: Mar 22 13:35:20 Info: IMAP(dvtest) [18381]: expunged: uid=83804, msgid=<xxxxxxxx.xxxxx@xx-xxxx-xxxxx-xxxx.xx>, box=box1 dovecot: Mar 22 13:35:39 Error: IMAP(dvtest) [18381]: Maildir /mnt/mailcache/dvtest/.box1 sync: UID inserted in the middle of mailbox (16251 > 6588, file = 1174389173.P6520Q1M362610.ux-2s11-9:2,Fdc) dovecot: Mar 22 13:35:39 Info: IMAP(dvtest) [18381]: Disconnected: Mailbox is in inconsistent state, please relogin.
[[snip log of a login as dvtest]] dovecot: Mar 22 13:35:39 Error: IMAP(dvtest) [18381]: Maildir /mnt/mailcache/dvtest/.box1 sync: UID inserted in the middle of mailbox (16251 > 6588, file = 1174389173.P6520Q1M362610.ux-2s11-9:2,Fdc) dovecot: Mar 22 13:35:40 Error: IMAP(dvtest) [18384]: Corrupted index cache file /var/cache/dovecot/31045/index/.box1/dovecot.index.cache: indexid changed
- -> These are the only "Error:" lines in the server log.
102 message were still in box1. (Well, of more than 80'000!)
====
Then I disabled log throddling, pointed pine to the Trash folder with 167'606 messages, marked them as deleted and expunged them. And they were gone.
====
Then I started up Thunderbird again, in order to check Trash etc. and, eventually, deleted the file 102 messages from box1.:
dovecot: Mar 22 14:05:09 Error: IMAP(dvtest) [18444]: Maildir /mnt/mailcache/dvtest/.box1 sync: UID < next_uid (16251 < 83805, file = 1174390126.P30568Q0M945620.ux-2s11-9:2,RSTbde) dovecot: Mar 22 14:05:09 Info: IMAP(dvtest) [18444]: Disconnected: Mailbox is in inconsistent state, please relogin. dovecot: Mar 22 14:05:09 Error: IMAP(dvtest) [18444]: file client.c: line 404 (_client_input): assertion failed: (!client->handling_input) dovecot: Mar 22 14:05:09 Error: IMAP(dvtest) [18444]: Raw backtrace: imap [0x80b58e1] -> imap [0x80b57fc] -> imap [0x805bdf7] -> imap [0x8058406] -> imap [0x8058669] -> imap [0x805870c] -> imap(cmd_idle+0x144) [0x8058894]
- -> imap [0x805b628] -> imap [0x805b6b7] -> imap(_client_input+0x6c) [0x805bd3c] -> imap(io_loop_handler_run+0x152) [0x80bb8b2] -> imap(io_loop_run+0x28) [0x80bab78] -> imap(main+0x4b0) [0x8063d50] -> /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xc8) [0xb7e0eea8] -> imap [0x8056871] dovecot: Mar 22 14:05:09 Error: child 18444 (imap) killed with signal 6
gdb tells: Core was generated by `imap [dvtest 10.20.10.63]'. Program terminated with signal 6, Aborted. #0 0xffffe410 in __kernel_vsyscall () (gdb) bt full #0 0xffffe410 in __kernel_vsyscall () No symbol table info available. #1 0xb7e22811 in raise () from /lib/tls/i686/cmov/libc.so.6 No symbol table info available. #2 0xb7e23fb9 in abort () from /lib/tls/i686/cmov/libc.so.6 No symbol table info available. #3 0x080b58ea in i_internal_panic_handler (fmt=0x80c5a08 "file %s: line %d (%s): assertion failed: (%s)", args=0xbfbab1f4 "\003\026\r\b\224\001") at failures.c:403 backtrace = 0x80d6448 "imap [0x80b58e1] -> imap [0x80b57fc] -> imap [0x805bdf7] -> imap [0x8058406] -> imap [0x8058669] -> imap [0x805870c] -> imap(cmd_idle+0x144) [0x8058894] -> imap [0x805b628] -> imap [0x805b6b7] -> imap"... #4 0x080b57fc in i_panic (format=0x80c5a08 "file %s: line %d (%s): assertion failed: (%s)") at failures.c:183 args = 0xbfbab1f4 "\003\026\r\b\224\001" #5 0x0805bdf7 in _client_input (context=0x80e10c8) at client.c:404 client = (struct client *) 0x80e10c8 cmd = <value optimized out> __PRETTY_FUNCTION__ = "_client_input" #6 0x08058406 in idle_finish (ctx=0x80e4350, done_ok=false) at cmd-idle.c:71 client = (struct client *) 0x80e10c8 #7 0x08058669 in cmd_idle_continue (cmd=0x80e110c) at cmd-idle.c:220 client = (struct client *) 0x80e10c8 ctx = (struct cmd_idle_context *) 0x80e4350 #8 0x0805870c in idle_sync_now (box=<value optimized out>, ctx=0x80e4350) at cmd-idle.c:157 __PRETTY_FUNCTION__ = "idle_sync_now" #9 0x08058894 in cmd_idle (cmd=0x80e110c) at cmd-idle.c:273 client = (struct client *) 0x80e10c8 ctx = (struct cmd_idle_context *) 0x80e4350 interval = <value optimized out> #10 0x0805b628 in client_handle_input (cmd=0x80e110c) at client.c:335 client = (struct client *) 0x80e10c8 __PRETTY_FUNCTION__ = "client_handle_input" #11 0x0805b6b7 in client_handle_input (cmd=0x80e110c) at client.c:392 client = (struct client *) 0x80e10c8 __PRETTY_FUNCTION__ = "client_handle_input" #12 0x0805bd3c in _client_input (context=0x80e10c8) at client.c:435 client = (struct client *) 0x80e10c8 cmd = (struct client_command_context *) 0x80e110c __PRETTY_FUNCTION__ = "_client_input" #13 0x080bb8b2 in io_loop_handler_run (ioloop=0x80de9b0) at ioloop-poll.c:199 ctx = (struct ioloop_handler_context *) 0x80de9d8 pollfd = <value optimized out> tv = {tv_sec = 0, tv_usec = 620373} io = (struct io *) 0x81160e8 t_id = 2 msecs = <value optimized out> ret = 0 call = <value optimized out> #14 0x080bab78 in io_loop_run (ioloop=0x80de9b0) at ioloop.c:323 No locals. #15 0x08063d50 in main (argc=Cannot access memory at address 0x480c ) at main.c:287 home = 0x0
Bye,
Steffen Kaiser -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux)
iQEVAwUBRgKDDS9SORjhbDpvAQJa3wf+PDXIniMhbRvxXEuKSGiy1mibA0TjVgEs 3VdlcmMaKrDa0yDNUI69y01sODbkfkeLznsu/UBH5p/qroTo5tWOsPz01BJjHHJn p1tXppkKxT9UdI7FjGTIu2yyc/SLEjHRrM+NRg8vQgPoSIVFbYi3uaDRDO/9WShr p4Tt1nJ8zyvm+fTL5VzdNLOd4iHjBIvUOLPfIxgyyE10nkwEgcaqwPiU4SOCuk4l yqVedrDP1xGmH0JyoMkehrGP3w+WITMK6PuwtpeWX+YXE4USjuijewqILHQ1gIuy z72spc/SQKBrZTaI3PTRGtKhdYeQ37btKlDH9r4oUNcwPjGZ//BYfg== =dny2 -----END PGP SIGNATURE-----
On Thu, 2007-03-22 at 14:22 +0100, Steffen Kaiser wrote:
dovecot: Mar 22 13:35:39 Error: IMAP(dvtest) [18381]: Maildir /mnt/mailcache/dvtest/.box1 sync: UID inserted in the middle of mailbox (16251 > 6588, file = 1174389173.P6520Q1M362610.ux-2s11-9:2,Fdc)
Are you using ext3? I just found a bug in it that could cause this with large maildirs.
dovecot: Mar 22 14:05:09 Error: IMAP(dvtest) [18444]: file client.c: line 404 (_client_input): assertion failed: (!client->handling_input)
This crash has been fixed in CVS for a while now.
On Fri, Mar 23, 2007 at 07:27:40PM +0200, Timo Sirainen wrote:
On Thu, 2007-03-22 at 14:22 +0100, Steffen Kaiser wrote:
dovecot: Mar 22 13:35:39 Error: IMAP(dvtest) [18381]: Maildir /mnt/mailcache/dvtest/.box1 sync: UID inserted in the middle of mailbox (16251 > 6588, file = 1174389173.P6520Q1M362610.ux-2s11-9:2,Fdc)
Are you using ext3? I just found a bug in it that could cause this with large maildirs.
Timo, you (and others) may well consider this off-topic, but would you mind telling us more about this ext3 bug? I have some very large ext3 filesystems (not maildirs) and am concerned.
Cheers, Terry.
dovecot: Mar 22 14:05:09 Error: IMAP(dvtest) [18444]: file client.c: line 404 (_client_input): assertion failed: (!client->handling_input)
This crash has been fixed in CVS for a while now.
--
On Fri, 2007-03-23 at 17:37 +0000, T. Horsnell wrote:
On Fri, Mar 23, 2007 at 07:27:40PM +0200, Timo Sirainen wrote:
On Thu, 2007-03-22 at 14:22 +0100, Steffen Kaiser wrote:
dovecot: Mar 22 13:35:39 Error: IMAP(dvtest) [18381]: Maildir /mnt/mailcache/dvtest/.box1 sync: UID inserted in the middle of mailbox (16251 > 6588, file = 1174389173.P6520Q1M362610.ux-2s11-9:2,Fdc)
Are you using ext3? I just found a bug in it that could cause this with large maildirs.
Timo, you (and others) may well consider this off-topic, but would you mind telling us more about this ext3 bug? I have some very large ext3 filesystems (not maildirs) and am concerned.
On 23.3.2007, at 19.40, Timo Sirainen wrote:
dovecot: Mar 22 13:35:39 Error: IMAP(dvtest) [18381]: Maildir /mnt/mailcache/dvtest/.box1 sync: UID inserted in the middle of
mailbox (16251 > 6588, file = 1174389173.P6520Q1M362610.ux-2s11-9:2,Fdc)Are you using ext3? I just found a bug in it that could cause
this with large maildirs.Timo, you (and others) may well consider this off-topic, but would
you mind telling us more about this ext3 bug? I have some very large ext3 filesystems (not maildirs) and am concerned.
So not a bug after all, just something I had never heard of before.
This makes my duplicate fixing code pretty difficult to do correctly.
I guess I'll go remove it completely then, it's not like anyone else
does it either. :) Then again, if there do exist duplicates, it
starts giving annoying error messages.
Of course duplicates shouldn't normally happen either, but they do
with maildir_copy_preserve_filenames=yes and stress testing.. I guess
I'll leave this fix post-v1.0.
On Sat, Mar 24, 2007 at 01:30:07AM +0200, Timo Sirainen wrote:
On 23.3.2007, at 19.40, Timo Sirainen wrote:
dovecot: Mar 22 13:35:39 Error: IMAP(dvtest) [18381]: Maildir /mnt/mailcache/dvtest/.box1 sync: UID inserted in the middle of
mailbox (16251 > 6588, file = 1174389173.P6520Q1M362610.ux-2s11-9:2,Fdc)Are you using ext3? I just found a bug in it that could cause
this with large maildirs.Timo, you (and others) may well consider this off-topic, but would
you mind telling us more about this ext3 bug? I have some very large ext3 filesystems (not maildirs) and am concerned.So not a bug after all, just something I had never heard of before.
Wow, that is surprising to me too.
FWIW, I also tried this on a couple of FreeBSD systems, with the same result: in 'rename( from, to )', if to is already the same file as from (hard link), from is not removed. OTOH I tried it on an old BSD/OS (4.x) system, and from *is* removed in that case. The latter's man page claims to conform to a 1988 POSIX, whereas the others say 1996.
I wonder what the rationale for that newer behavior is? It just seems to complicate things, so I imagine there must be a reason for it.
mm
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On Fri, 23 Mar 2007, Timo Sirainen wrote:
Are you using ext3? I just found a bug in it that could cause this with large maildirs.
You are right, I had been using ext3 with this test. However, I'm thinking to switch over to XFS, because ext3 does not like directories with many files in it, most nagging is the fact that ext3 does not shrink directories.
Bye,
Steffen Kaiser -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux)
iQEVAwUBRgkTwC9SORjhbDpvAQLxfwf9Fhyp0scM7YgO7fafnwNSMZO9D//9nGbt z/AE3b9ldrElBXHSt4U3Gb7LUPG34+myogIQ+t/Y4TC7JL9K9DTeL/x9yFtG068e qLPG8Vipgu0W2LEiJa27wF8D5KawX1bD+42biNu4pHoG0xWVzwdsch8Jp2wpeygl Dd4+wGbNexhoFGihHRB+85DjPsmGl0n0BCKn/OlJYdNSi9SaSXnsVidGSuILxnIU bNanngV2Xe9ul22UQq0EnoxX2ud1RA7Y3V8iLZmPiA0hHU9KaMAh4g+cW46FhUni 672iY5Bbp1xXhx9hAhKfdc8PetxhOxresTlUxGFhMyl9Ydf5uEhnkw== =i39H -----END PGP SIGNATURE-----
On 27.3.2007, at 15.53, Steffen Kaiser wrote:
Are you using ext3? I just found a bug in it that could cause this
with large maildirs.You are right, I had been using ext3 with this test.
It wasn't ext3's fault actually after all what I noticed. In your
other test the problems could be explained by
maildir_copy_preserve_filenames=yes but I'm not actually sure about
this. Can you still reproduce the errors with rc28?
However, I'm thinking to switch over to XFS, because ext3 does not
like directories with many files in it, most nagging is the fact
that ext3 does not shrink directories.
xfs is a lot slower than ext3 with maildir at least in my tests.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On Tue, 27 Mar 2007, Timo Sirainen wrote:
It wasn't ext3's fault actually after all what I noticed. In your other test the problems could be explained by maildir_copy_preserve_filenames=yes but I'm not actually sure about this. Can you still reproduce the errors with rc28?
I currently run a test with RC28, but it is very slow, because of lots of " stalled for 1356 secs in APPEND".
I see that you haven't included the patch: http://dovecot.org/patches/debug/imap-append-hang-debug.diff in RC28, is this intentional?
xfs is a lot slower than ext3 with maildir at least in my tests.
Doh, hm, I'm back to ext3 then.
When Ext3 opens a directory with many many files, the whole system slows down, an user managed to issue a tremendous slow down via NFS on the server. :-(
Bye.
Steffen Kaiser -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux)
iQEVAwUBRgvLBS9SORjhbDpvAQIUKAf+OApephiHRSsYU9Etv+w22skQ8zxuRERp QbbKCy1akFX1Z2KDO8gMiv/jfhAm5lkw3KOFTQdyxBicA4BZoCTuLyyw7HspI7WF vfxsTxTgVqj9lVHDwPIOliGpDe/u0FGY0+hyNgyjEFDVDTc9l+H4wMiPndBOMHs2 uerVcnYjXVGDUzZnMCXOIIWGPk8chOQuMRavRfVs35bw62tKJiaIILBE73yZvTav 4jWCo/jLvUqIbg/7BrZ41X0FBwTF3JTRy46AAGZj3PQ3COAerHzWrkY/PSAJ3n+2 ljnk7gRSJnIsoZLsICYydgaEnuSj10IGBtwtSbFXbayQi7//dPpVAQ== =6Iac -----END PGP SIGNATURE-----
On 29/03/2007 15:19, Steffen Kaiser wrote:
On Tue, 27 Mar 2007, Timo Sirainen wrote:
xfs is a lot slower than ext3 with maildir at least in my tests.
Doh, hm, I'm back to ext3 then.
When Ext3 opens a directory with many many files, the whole system slows down, an user managed to issue a tremendous slow down via NFS on the server. :-(
That's probably NFS's fault, not ext3's.
Cheers,
John.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On Thu, 29 Mar 2007, Steffen Kaiser wrote:
I let run "imaptest blablah copybox=box1" half a day. It ran through, no cores, no crashes, it still ran when I came back this morning.
There tons of:
dovecot: Mar 30 08:49:27 Error: IMAP(dvtest2) [12578]: Transaction log file /var/cache/dovecot/31022/index/.box1/dovecot.index.log: marked corrupted dovecot: Mar 30 08:49:28 Error: IMAP(dvtest2) [12581]: Corrupted transaction log file /var/cache/dovecot/31022/index/.box1/dovecot.index.log: Append with UID 168554, but next_uid = 168555 dovecot: Mar 30 08:49:28 Error: IMAP(dvtest2) [12488]: Transaction log file /var/cache/dovecot/31022/index/.box1/dovecot.index.log: marked corrupted
All in the copybox, no error related to the work-in box.
=============
box1 (the copybox) contains 168778 messages. When I first selected the box1, there were some "Fixed a duplicate link" Warnings, then Thunderbird tried to download all the message headers. It never returned to something useful state. straceing the imap process I saw that Dovecot reads all messages fully, header & body, then sends the client:
- 49136 FETCH (UID 49235 RFC822.SIZE 7426 FLAGS (\Recent) BODY[HEADER.FIELDS (FROM TO CC SUBJECT DATE MESSAGE-ID PRIORITY X-PRIORITY REFERENCES NEWSGROUPS IN-REPLY-TO CONTENT-TYPE)] {668} And the headers.
I killed thunderbird, eventually, re-opend the mailbox and tried to delete the mails all at once.
=============
When I tried to delete all the messages through the "Trashbox", the copy-to-Trash process fails with (see end of mail). Then the process starts again with uid=1, again the same error is issued at the same uid. After the first "Out-of-mem" condition, I see "delete uid 1" in the log, but the messages are still there and the COPY is still performed again.
There is no Time-out during this situation ;-)
Dunno, I simply suggest that the admin may limit the number of items a COPY command may be performed for. I mean, one can move/copy/delete-to-trash mails in chunks of 50'000 messages - nobody sane has that many message in one mailbox, hasn't one?
============
ulimit -a
core file size (blocks, -c) unlimited data seg size (kbytes, -d) unlimited max nice (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 16383 max locked memory (kbytes, -l) 32 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 max rt priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 16383 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited
dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15131]: copy -> Trash: uid=159985, msgid=<xxxxxxxx.xxxxxxx@xx-xxxx-xxxxx-xxxx.xx>, box=box1 dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15131]: copy -> Trash: uid=159986, msgid=<xxxxxxxx.xxxxxxx@xx-xxxx-xxxxx-xxxx.xx>, box=box1 dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15131]: copy -> Trash: uid=159987, msgid=<xxxxxxxx.xxxxxxx@xx-xxxx-xxxxx-xxxx.xx>, box=box1 dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15131]: copy -> Trash: uid=159988, msgid=<xxxxxxxx.xxxxxxx@xx-xxxx-xxxxx-xxxx.xx>, box=box1 dovecot: Mar 30 15:01:45 Error: IMAP(dvtest2) [15131]: block_alloc(): Out of memory dovecot: Mar 30 15:01:45 Error: child 15131 (imap) returned error 83 (Out of memory) dovecot: Mar 30 15:01:45 Info: auth(default): client in: AUTH 1 PLAIN service=IMAP lip=10.20.10.166 rip=10.20.10.63 dovecot: Mar 30 15:01:45 Info: auth(default): client out: CONT 1 dovecot: Mar 30 15:01:45 Info: auth(default): client in: CONT<hidden> dovecot: Mar 30 15:01:45 Info: auth(default): cache(dvtest2,10.20.10.63): hit dovecot: Mar 30 15:01:45 Info: auth(default): cache(dvtest2,10.20.10.63): User unknown dovecot: Mar 30 15:01:45 Info: auth(default): cache(dvtest2,10.20.10.63): hit dovecot: Mar 30 15:01:45 Info: auth(default): client out: OK 1 user=dvtest2 dovecot: Mar 30 15:01:45 Info: auth(default): master in: REQUEST 34 15132 1 dovecot: Mar 30 15:01:45 Info: auth(default): prefetch(dvtest2,10.20.10.63): passdb didn't return userdb entries dovecot: Mar 30 15:01:45 Info: auth(default): ldap(dvtest2,10.20.10.63): user search: base=dc=fh-bonn-rhein-sieg,dc=de scope=subtree filte r=(&(objectClass=fhMailAlias)(uid=dvtest2)) fields=homeDirectory,uidNumber,gidNumber,mailLocationDovecot,uid dovecot: Mar 30 15:01:45 Error: auth(default): ldap(dvtest2,10.20.10.63): Authenticated user not found dovecot: Mar 30 15:01:45 Info: auth(default): master out: USER 34 dvtest2 uid=31022 gid=1003 home=/home/dvtest2 dovecot: Mar 30 15:01:45 Info: imap-login: Login: 15132:, user=<dvtest2>, method=PLAIN, rip=10.20.10.63, lip=10.20.10.166 dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15133]: Loading modules from directory: /usr/local/dovecot.rc28_patched/lib/dovecot/imap dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15133]: Module loaded: /usr/local/dovecot.rc28_patched/lib/dovecot/imap/lib10_quota_plugin.s o dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15133]: Module loaded: /usr/local/dovecot.rc28_patched/lib/dovecot/imap/lib11_imap_quota_plu gin.so dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15133]: Module loaded: /usr/local/dovecot.rc28_patched/lib/dovecot/imap/lib20_mail_log_plugi n.so dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15133]: Module loaded: /usr/local/dovecot.rc28_patched/lib/dovecot/imap/lib20_zlib_plugin.so dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15133]: Effective uid=31022, gid=1003, home=/home/dvtest2 dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15133]: Namespace: type=private, prefix=, sep=., inbox=yes, hidden=yes, subscriptions=no dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15133]: maildir: data=/home/dvtest2/MailDir:CONTROL=/var/cache/dovecot/31022/control:INDEX=/ var/cache/dovecot/31022/index dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15133]: maildir: root=/home/dvtest2/MailDir, index=/var/cache/dovecot/31022/index, control=/ var/cache/dovecot/31022/control, inbox= dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15133]: fs quota add storage dir = /home/dvtest2/MailDir dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15133]: fs quota block device = /dev/mapper/mailstud-mailstud dovecot: Mar 30 15:01:45 Info: IMAP(dvtest2) [15133]: fs quota mount point = /home dovecot: Mar 30 15:01:48 Info: IMAP(dvtest2) [15133]: copy -> Trash: uid=1, msgid=<xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx@xxx-xxx.xxx.xx-x xxx-xxxxx-xxxx.xx>, box=box1 dovecot: Mar 30 15:01:48 Warning: IMAP(dvtest2) [15133]: Fixed a duplicate link: /home/dvtest2/MailDir/.box1/cur/1175202599.P10595Q0M55318 7.ux-2s11-9:2, -> /home/dvtest2/MailDir/.box1/cur/1175202599.P10595Q0M553187.ux-2s11-9:2,a dovecot: Mar 30 15:01:48 Warning: IMAP(dvtest2) [15133]: Fixed a duplicate link: /home/dvtest2/MailDir/.box1/cur/1175234953.P18264Q0M95709 8.ux-2s11-9:2, -> /home/dvtest2/MailDir/.box1/cur/1175234953.P18264Q0M957098.ux-2s11-9:2,Fe dovecot: Mar 30 15:01:48 Warning: IMAP(dvtest2) [15133]: Fixed a duplicate link: /home/dvtest2/MailDir/.box1/cur/1175206465.P9628Q0M401201 .ux-2s11-9:2,FSe -> /home/dvtest2/MailDir/.box1/cur/1175206465.P9628Q0M401201.ux-2s11-9:2,DFSae dovecot: Mar 30 15:01:48 Warning: IMAP(dvtest2) [15133]: Fixed a duplicate link: /home/dvtest2/MailDir/.box1/cur/1175194240.P27391Q0M69691 6.ux-2s11-9:2,F -> /home/dvtest2/MailDir/.box1/cur/1175194240.P27391Q0M696916.ux-2s11-9:2,FRd
===
sbin/dovecot -n
/usr/local/dovecot.rc28_patched/etc/dovecot.conf
base_dir: /var/run/dovecot/ log_path: /var/log/dovecot/dovecot.log protocols: imap imaps pop3 pop3s ssl_ca_file: /etc/ssl/certs/ca.crt ssl_cert_file(default): /etc/ssl/certs/imap.pem ssl_cert_file(imap): /etc/ssl/certs/imap.pem ssl_cert_file(pop3): /etc/ssl/certs/pop3.pem ssl_key_file(default): /etc/ssl/private/imap.key ssl_key_file(imap): /etc/ssl/private/imap.key ssl_key_file(pop3): /etc/ssl/private/pop3.key disable_plaintext_auth: no verbose_ssl: yes login_dir: /var/run/dovecot//login login_executable(default): /usr/local/dovecot.rc28_patched/libexec/dovecot/imap-login login_executable(imap): /usr/local/dovecot.rc28_patched/libexec/dovecot/imap-login login_executable(pop3): /usr/local/dovecot.rc28_patched/libexec/dovecot/pop3-login login_log_format_elements: %p: user=<%u> method=%m rip=%r lip=%l %c verbose_proctitle: yes first_valid_uid: 1000 mail_location: maildir:%h/MailDir:CONTROL=/var/cache/dovecot/%i/control:INDEX=/var/cache/dovecot/%i/index mail_debug: yes dotlock_use_excl: yes maildir_copy_with_hardlinks: yes maildir_copy_preserve_filename: yes mail_drop_priv_before_exec: yes mail_executable(default): /usr/local/dovecot.rc28/libexec/dovecot/rawlog /usr/local/dovecot.rc28/libexec/dovecot/imap mail_executable(imap): /usr/local/dovecot.rc28/libexec/dovecot/rawlog /usr/local/dovecot.rc28/libexec/dovecot/imap mail_executable(pop3): /usr/local/dovecot.rc28_patched/libexec/dovecot/pop3 mail_plugins(default): quota imap_quota mail_log zlib mail_plugins(imap): quota imap_quota mail_log zlib mail_plugins(pop3): quota mail_log mail_plugin_dir(default): /usr/local/dovecot.rc28_patched/lib/dovecot/imap mail_plugin_dir(imap): /usr/local/dovecot.rc28_patched/lib/dovecot/imap mail_plugin_dir(pop3): /usr/local/dovecot.rc28_patched/lib/dovecot/pop3 mail_log_prefix: %Us(%u) [%p]: mail_log_max_lines_per_sec: 0 pop3_uidl_format(default): pop3_uidl_format(imap): pop3_uidl_format(pop3): %u pop3_client_workarounds(default): pop3_client_workarounds(imap): pop3_client_workarounds(pop3): oe-ns-eoh namespace: type: private separator: . inbox: yes hidden: yes auth default: mechanisms: plain login cache_size: 10 username_chars: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890 verbose: yes debug: yes passdb: driver: ldap args: /usr/local/dovecot.rc26/etc/dovecot-ldap.conf passdb: driver: shadow userdb: driver: prefetch userdb: driver: ldap args: /usr/local/dovecot.rc26/etc/dovecot-ldap.conf userdb: driver: passwd-file args: /etc/passwd socket: type: listen client: path: /var/run/dovecot/auth-client mode: 432 master: path: /var/run/dovecot/auth-master mode: 432 group: mail plugin: quota: fs
Steffen Kaiser -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux)
iQEVAwUBRg0TFi9SORjhbDpvAQIy4Qf+JZmYsgVmApeowTm9vz4WKYNgMiixR6FI 9PailHkZv/Nmxt12hxVXKAtV3iaNIY4E7bdy1CahvojuFav0GD66wjCHcHeyiUop G7wFWRPqd5AP9aEEFXF8h+FitXO4OV5CSSQl25Ssg/ZwtCf9JnYsXnCgQOTc3fot MOp6sczv4FPXotagNsmQHvy929WQqoS+Erv/vFfZiFLt/xN/uE3zorVNHftybo62 xW/7FqbPiUXfw7jhB3y7Xd7A3sxHa7sRMzoWv+KQtFpxZpecY/NkWMzeLB0t3z4l G5xQEVxB4GCbyOniXC/xZMWqAmi+HPz9I18aqbsI0b4SzTOnSKryiQ== =NlFU -----END PGP SIGNATURE-----
On Fri, 2007-03-30 at 15:39 +0200, Steffen Kaiser wrote:
dovecot: Mar 30 08:49:27 Error: IMAP(dvtest2) [12578]: Transaction log file /var/cache/dovecot/31022/index/.box1/dovecot.index.log: marked corrupted dovecot: Mar 30 08:49:28 Error: IMAP(dvtest2) [12581]: Corrupted transaction log file /var/cache/dovecot/31022/index/.box1/dovecot.index.log: Append with UID 168554, but next_uid = 168555 dovecot: Mar 30 08:49:28 Error: IMAP(dvtest2) [12488]: Transaction log file /var/cache/dovecot/31022/index/.box1/dovecot.index.log: marked corrupted
These and the duplicate link errors are because of maildir_copy_preserve_filename=yes. With =no it shouldn't log any errors.
box1 (the copybox) contains 168778 messages. When I first selected the box1, there were some "Fixed a duplicate link" Warnings, then Thunderbird tried to download all the message headers. It never returned to something useful state. straceing the imap process I saw that Dovecot reads all messages fully, header & body, then sends the client:
- 49136 FETCH (UID 49235 RFC822.SIZE 7426 FLAGS (\Recent) BODY[HEADER.FIELDS (FROM TO CC SUBJECT DATE MESSAGE-ID PRIORITY X-PRIORITY REFERENCES NEWSGROUPS IN-REPLY-TO CONTENT-TYPE)] {668} And the headers.
For all the messages I guess? So was there a Dovecot bug related to this, or was it just that Thunderbird doesn't like that many messages?
When I tried to delete all the messages through the "Trashbox", the copy-to-Trash process fails with (see end of mail). Then the process starts again with uid=1, again the same error is issued at the same uid. After the first "Out-of-mem" condition, I see "delete uid 1" in the log, but the messages are still there and the COPY is still performed again.
Hmm. So copying 180k messages eats 250MB of memory? That sounds a bit much, but I guess it's possible.. Maildir code takes around 80-100 bytes/message, index file code takes 8 bytes/message + more for keywords. But even assuming about the worst case scenarios, it shouldn't take more than 30MB. Wonder where the rest is going.
Dunno, I simply suggest that the admin may limit the number of items a COPY command may be performed for. I mean, one can move/copy/delete-to-trash mails in chunks of 50'000 messages - nobody sane has that many message in one mailbox, hasn't one?
Right, so this practically never happens with real users. I don't want to add any more settings that are mostly useless.
You can anyway make it work by growing mail_process_size.
On Thu, 2007-03-29 at 16:19 +0200, Steffen Kaiser wrote:
It wasn't ext3's fault actually after all what I noticed. In your other test the problems could be explained by maildir_copy_preserve_filenames=yes but I'm not actually sure about this. Can you still reproduce the errors with rc28?
I currently run a test with RC28, but it is very slow, because of lots of " stalled for 1356 secs in APPEND".
I see that you haven't included the patch: http://dovecot.org/patches/debug/imap-append-hang-debug.diff in RC28, is this intentional?
If that patch changed something, then there's probably a bug in imaptest.. I've not applied the patch because it doesn't really fix anything.
participants (6)
- 
                
                John Robinson
- 
                
                Mark E. Mallett
- 
                
                Steffen Kaiser
- 
                
                T. Horsnell
- 
                
                Timo Sirainen
- 
                
                Tomi Hakala