[Dovecot] [dovecot] Dovecot Migration so far
We've been doing our own beta testing of Dovecot 2.0-beta4 IMAP (both direct IMAP and webmail-fronted), and we've had nothing but glowing compliments from our beta users/victims. I've also been running a lone box with the latest mercurial build and that's been running just fine too. All mail data is on NFS, but I've put indexes on the local disk, which has worked just dandy so far.
Everything's been great and all I've seen amiss are a few random problems -- all of which are are barely impacting us and I mention only for the sake of feedback. All boxes are Debian Lenny i386, running 2.6.33.1.
One oddity is that I see lots of messages like this (I've obfuscated paths/IPs/usernames, so assume paths are all actually correct) for a variety of different mailboxes prior to their indexes directory getting initially created:
Apr 02 13:25:24 imap(mail@box): Error: creat(/var/tmp/Indexes/14/75/mail@box/dovecot.mailbox.log) failed: No such file or directory
Judging by the mtimes of the containing directories, the directory was created a couple of seconds after the error gets logged, so I'm guessing mailbox_log_alloc is just getting called too early. It obviously just happens that first time. Doesn't seem to actually be hurting anything, that I can tell.
One other kind of weird thing I saw was that originally I had my indexes on a dedicated partition with ext4 (Debian Lenny, i386, 2.6.33.1), I was seeing scattered messages like this:
Mar 29 18:02:38 imap(mail@box): Error: FETCH [] for mailbox INBOX UID 3244 got too little data: 1116 vs 1087383 Mar 29 18:02:38 imap(mail@box): Error: Corrupted index cache file /var/tmp/Indexes/12/dc/mail@box/.INBOX/dovecot.index.cache: Broken virtual size for mail UID 3244
The message corresponding with UID 3244 was around 1116 bytes (+32 bytes, IIRC), so no clue where the 1087383. In all the cases of that error I saw, the lefthand number of bytes was very closer to correct and the righthand was way way off, and the S= flag in the filename was correct too. I'm guessing that the random righthand number was just due to corrupted cache. When I reformatted that partition as ext3, the error went almost completely away (just a few total over the past few days, as opposed to a few per hour when it was on ext4). I don't even have an educated guess as to why that'd happen. A wild guess would be that ext4's delayed allocation was causing problems.
Even under ext4, it was still pretty rare and only, say, 1 out of 100 imap logins. And ext3 performance on the indexes partition so far has been just fine, so I don't plan on switching back to ext4 -- though if anyone has any experience with putting indexes on reiserfs and/or xfs versus ext3, I'd love to hear your thoughts.
I've also seen a couple of these in the past few days (on beta4, not repo version -- haven't been running repo version long enough to say if it'd do the same):
Mar 31 21:52:54 imap(mail@box): Error: fdatasync(/mail/username/mail@box/Maildir/dovecot-uidlist) failed: Input/output error Mar 31 21:52:54 imap(mail@box): Error: Raw backtrace: /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot.so.0 [0xb77c0b71] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot.so.0 [0xb77c0bef] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot.so.0(i_error+0) [0xb77c0ea8] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(mail_index_sync_begin_to+0x9a9) [0xb7837ca9] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(mail_index_sync_begin+0x56) [0xb7837d56] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(maildir_sync_index_begin+0x51) [0xb78734d1] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0 [0xb78716eb] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(maildir_storage_sync_force+0x4d) [0xb78719bd] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(maildir_uidlist_lookup+0x96) [0xb7877646] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0 [0xb787215a] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(maildir_sync_index+0x7f0) [0xb78729d0] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0 [0xb78715b6] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(maildir_storage_sync_init+0x162) [0xb7871b92] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(mailbox_sync_init+0x39) [0xb7815759] -> dovecot/imap mail@box 215.108.28.102 UID [0x805efa4] -> dovecot/imap mail@box 215.108.28.102 UID [0x805f2a3] -> dovecot/imap mail@box 215.108.28.102 UID [0x8056de6] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot.so.0 [0xb77d5d4a] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0xf5) [0xb77cd1d5] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot.so.0(io_loop_run+0x30) [0xb77cc220] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot.so.0(master_service_run+0x2a) [0xb77b9e1a] -> dovecot/imap mail@box 215.108.28.102 UID [0x805f8e5] -> /lib/libc.so.6(__libc_start_main+0xe5) [0xb764d455] -> dovecot/imap [mail@box 215.108.28.102 UID] [0x804e621] Mar 31 21:52:54 master: Error: service(imap): child 4500 killed with signal 6 (core dumps disabled)
In 4 days of beta testing with a fair number of users, I've seen maybe 4 of these backtraces total, so not exactly a big issue.
These are all super minor things. Everything's been fantastic so far.
On Fri, 2010-04-02 at 11:35 -0700, Mark Moseley wrote:
Apr 02 13:25:24 imap(mail@box): Error: creat(/var/tmp/Indexes/14/75/mail@box/dovecot.mailbox.log) failed: No such file or directory
Finally fixed: http://hg.dovecot.org/dovecot-2.0/rev/6571b3a9c44f
Mar 29 18:02:38 imap(mail@box): Error: FETCH [] for mailbox INBOX UID 3244 got too little data: 1116 vs 1087383
Are you getting these anymore?
Mar 31 21:52:54 imap(mail@box): Error: fdatasync(/mail/username/mail@box/Maildir/dovecot-uidlist) failed: Input/output error
This is caused by some kind of an NFS communication error or kernel bug or something. Really shouldn't be happening ever.
/opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(mail_index_sync_begin_to+0x9a9) [0xb7837ca9] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(mail_index_sync_begin+0x56) [0xb7837d56] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(maildir_sync_index_begin+0x51) [0xb78734d1] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0 [0xb78716eb] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(maildir_storage_sync_force+0x4d) [0xb78719bd] ->
I think I have fixed this.
On Wed, Jul 7, 2010 at 8:08 AM, Timo Sirainen <tss@iki.fi> wrote:
On Fri, 2010-04-02 at 11:35 -0700, Mark Moseley wrote:
Apr 02 13:25:24 imap(mail@box): Error: creat(/var/tmp/Indexes/14/75/mail@box/dovecot.mailbox.log) failed: No such file or directory
Finally fixed: http://hg.dovecot.org/dovecot-2.0/rev/6571b3a9c44f
Is that revision pre-rc1 or contained in it?
Mar 29 18:02:38 imap(mail@box): Error: FETCH [] for mailbox INBOX UID 3244 got too little data: 1116 vs 1087383
Are you getting these anymore?
I've been running beta5 for a while and not been seeing those. The closest thing I get now is "Error: FETCH for mailbox INBOX UID 4322 failed to read message input: Stale NFS file handle", but that's obviously not Dovecot's fault.
Mar 31 21:52:54 imap(mail@box): Error: fdatasync(/mail/username/mail@box/Maildir/dovecot-uidlist) failed: Input/output error
This is caused by some kind of an NFS communication error or kernel bug or something. Really shouldn't be happening ever.
NFS hiccup sounds pretty plausible :) They're pretty rare anyway.
/opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(mail_index_sync_begin_to+0x9a9) [0xb7837ca9] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(mail_index_sync_begin+0x56) [0xb7837d56] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(maildir_sync_index_begin+0x51) [0xb78734d1] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0 [0xb78716eb] -> /opt/dovecot-2.0.beta4/lib/dovecot/libdovecot-storage.so.0(maildir_storage_sync_force+0x4d) [0xb78719bd] ->
I think I have fixed this.
Going out of town tomorrow morning. I'll check when I get back. I'm eager to roll out 2.0rc1 (though beta5 has been completely stable for me) but didn't think it'd be a good idea right before I got on a plane :)
thx!
On 2010-07-07 1:19 PM, Mark Moseley wrote:
Going out of town tomorrow morning. I'll check when I get back. I'm eager to roll out 2.0rc1 (though beta5 has been completely stable for me) but didn't think it'd be a good idea right before I got on a plane :)
Aww, come on! Where's your sense of adventure?! ;)
On Wed, 2010-07-07 at 10:19 -0700, Mark Moseley wrote:
On Wed, Jul 7, 2010 at 8:08 AM, Timo Sirainen <tss@iki.fi> wrote:
On Fri, 2010-04-02 at 11:35 -0700, Mark Moseley wrote:
Apr 02 13:25:24 imap(mail@box): Error: creat(/var/tmp/Indexes/14/75/mail@box/dovecot.mailbox.log) failed: No such file or directory
Finally fixed: http://hg.dovecot.org/dovecot-2.0/rev/6571b3a9c44f
Is that revision pre-rc1 or contained in it?
It's post-rc1. I fixed it a few minutes before sending the mail.
I've been running beta5 for a while and not been seeing those. The closest thing I get now is "Error: FETCH for mailbox INBOX UID 4322 failed to read message input: Stale NFS file handle", but that's obviously not Dovecot's fault.
You should be running the new director I think. :)
On Wed, Jul 7, 2010 at 12:22 PM, Timo Sirainen <tss@iki.fi> wrote:
On Wed, 2010-07-07 at 10:19 -0700, Mark Moseley wrote:
On Wed, Jul 7, 2010 at 8:08 AM, Timo Sirainen <tss@iki.fi> wrote:
On Fri, 2010-04-02 at 11:35 -0700, Mark Moseley wrote:
Apr 02 13:25:24 imap(mail@box): Error: creat(/var/tmp/Indexes/14/75/mail@box/dovecot.mailbox.log) failed: No such file or directory
Finally fixed: http://hg.dovecot.org/dovecot-2.0/rev/6571b3a9c44f
Is that revision pre-rc1 or contained in it?
It's post-rc1. I fixed it a few minutes before sending the mail.
Looks good. I haven't seen any yet.
I've been running beta5 for a while and not been seeing those. The closest thing I get now is "Error: FETCH for mailbox INBOX UID 4322 failed to read message input: Stale NFS file handle", but that's obviously not Dovecot's fault.
You should be running the new director I think. :)
I've only got one instance of this error so far and none of the 'got too little data' errors. I've been overwhelmingly happy with just running things on NFS so far. Usually my mail users (and there's millions of them) go berserk over mail issues but there's been an almost disconcerting quiet regarding IMAP ever since we moved to Dovecot.
The only oddities I've found in my logs so far since upgrading to rc2 (or at least I didn't notice before rc2) are these. These are all just very minor (between them, maybe 10-15 mentions in 250k lines of logs).
Jul 20 12:17:30 imap-login: Info: Internal login failure (auth failed, 1 attempts): user=<user@mailbox.com>, method=PLAIN, rip=x.x.x.x, lip=192.168.152.37, mpid=0, TLS
Jul 20 12:17:41 auth: Error: Master requested auth for nonexisting client 16380
Just a single instance of: Jul 20 09:40:38 imap(username@mailbox.com): Warning: /path/to/Maildir/dovecot-uidlist: Duplicate file entry at line 23: 1279633198.H285683P32723.servername,S=13870:2,ad (uid 669 -> 670)
In both cases, there's only a few of each, in the space of 250k lines of logs, so very rare, but slightly puzzling. I'm guessing in #2, the process must've died/gotten killed. Looking at the source for #1, I have no clue (nor do we use SASL -- everything is using mysql for auth). #3 might be chalked up to just using NFS?
On Tue, 2010-07-20 at 10:55 -0700, Mark Moseley wrote:
- Jul 20 12:17:30 imap-login: Info: Internal login failure (auth failed, 1 attempts): user=<user@mailbox.com>, method=PLAIN, rip=x.x.x.x, lip=192.168.152.37, mpid=0, TLS
This is Info message, so it should be accompanied with Error/Fatal/Panic. Most likely this one:
- Jul 20 12:17:41 auth: Error: Master requested auth for nonexisting client 16380
What happens is:
imap/pop3-login process authenticates user via auth process successfully and forwards the connection to imap/pop3 process.
imap/pop3 process asks auth process to verify that the user is authenticated and asks for userdb info.
[Somewhere before this imap/pop3-login either died or for some reason disconnected from auth process.]
- Auth process notices that the login process is gone and fails with this error.
It shouldn't be happening, so it's a bug. What kind of service settings do you have? Could you post "doveconf -n service" output?
- Just a single instance of: Jul 20 09:40:38 imap(username@mailbox.com): Warning: /path/to/Maildir/dovecot-uidlist: Duplicate file entry at line 23: 1279633198.H285683P32723.servername,S=13870:2,ad (uid 669 -> 670)
Probably because of NFS caching.
participants (3)
-
Charles Marcus
-
Mark Moseley
-
Timo Sirainen