[Dovecot] Large & busy site, NFS with deliver only servers
Timo / Others,
I have been working on a new installation for a fairly busy site, and after many weeks of tribulation have come to an architecture I'm happy with:
2x Debian (2.6.18) - MXing machines running Postfix / MailScanner / Dovecot-LDA (a slightly patched RC17 for prettier quota bounces)
2x Debian (2.6.18) - Mail retrieval machines running Dovecot IMAP/POP3 (currently RC17)
3x Node Isilon NFS cluster (NetApp-type devices)
I chose Dovecot-LDA over Postfix-Virtual / Maildrop / Procmail due to its
hassle-free auto-creation of new user maildirs, and the funkiness of the
built-in Sieve functionality. The on-the-fly indexing sounded like a bonus
at the time. However, as I'm sure everyone who stress tests with indexes
stored over NFS finds, index locking makes for some 'spectacular crashes'
(see others' previous posts).
I have tried out the recommended mount options and locking methods to no
avail. Once under some stress, say ~60 messages a minute from the two
front-end machines, understandably the indexes get pickled instantly. So as
a hack around the index locking problems, I have configured the MXing
machines to index locally to nowhere (/tmp) and accept the very slightly
slower IMAP/POP3 performance on the retrieval side.
Is there a way to tell/patch Dovecot-LDA not to bother with indexing, so I'm not creating unused index files?
Cheers
Dean Manners
On Mon, 2007-01-22 at 22:08 +1100, Dean Manners wrote:
Is there a way to tell/patch Dovecot-LDA not to bother with indexing, so I'm not creating unused index files?
mail_location = ...:INDEX=MEMORY
If you want the same dovecot.conf to use indexes for pop3/imap, put the mail_location inside protocol lda {}
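A minimal sketch of that layout, assuming a maildir store (the path below is a placeholder, not taken from the poster's configuration):

```
# dovecot.conf (sketch; the maildir path is hypothetical)
# imap/pop3 keep their normal on-disk index files
mail_location = maildir:/var/mailstore/%d/%n

protocol lda {
  # deliver keeps its indexes in memory only; INDEX=MEMORY must be uppercase
  mail_location = maildir:/var/mailstore/%d/%n:INDEX=MEMORY
}
```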
Thanks Timo. It definitely seemed to make some improvement, but I still end up crashing out after a couple of minutes with:
Jan 22 23:30:48 mailfilter01 deliver(drawkcab@drawkcab.com): seq = 1161, rec->uid = 0, first_new_seq = 1161, records = 1160
Jan 22 23:30:48 mailfilter01 deliver(drawkcab@drawkcab.com): Raw backtrace: /usr/lib/dovecot/deliver(i_syslog_panic_handler+0x2b) [0x80a6d8b] -> /usr/lib/dovecot/deliver [0x80a6ba9] -> /usr/lib/dovecot/deliver [0x808a102] -> /usr/lib/dovecot/deliver [0x808a203] -> /usr/lib/dovecot/deliver [0x808a2f6] -> /usr/lib/dovecot/deliver(mail_index_transaction_commit+0x43) [0x808a663] -> /usr/lib/dovecot/deliver(index_transaction_commit+0x28) [0x8080808] -> /usr/lib/dovecot/deliver(maildir_transaction_commit+0x29) [0x805c7f9] -> /usr/lib/dovecot/modules/lda/lib10_quota_plugin.so [0xb7dc8dc5] -> /usr/lib/dovecot/deliver(mailbox_transaction_commit+0x20) [0x8099b70] -> /usr/lib/dovecot/deliver(deliver_save+0xac) [0x805628c] -> /usr/lib/dovecot/modules/lda/lib90_cmusieve_plugin.so [0xb7daba0e] -> /usr/lib/dovecot/modules/lda/lib90_cmusieve_plugin.so [0xb7dba48d] -> /usr/lib/dovecot/modules/lda/lib90_cmusieve_plugin.so(sieve_execute_bytecode+0x12e) [0xb7dba95e] -> /usr/lib/dovecot/modules/lda/lib90_cmusieve_plugin.so(cmu_sieve_run
I tried disabling the sieve and maildir plugins, but it still crashed out with the same message. The config used is:
http://www.drawkcab.com/dumpyard/deliverydovecot.conf.txt
http://www.drawkcab.com/dumpyard/deliverydovecot-sql.conf.txt
Dean Manners
On Tue, 2007-01-23 at 00:44 +1100, Dean Manners wrote:
Thanks Timo. It definitely seemed to make some improvement, but I still end up crashing out after a couple of minutes with:
Jan 22 23:30:48 mailfilter01 deliver(drawkcab@drawkcab.com): seq = 1161, rec->uid = 0, first_new_seq = 1161, records = 1160
Weird, I thought I had gotten rid of this problem already. Could you check whether these patches change this assert to something else:
http://dovecot.org/list/dovecot-cvs/2007-January/007487.html http://dovecot.org/list/dovecot-cvs/2007-January/007488.html
http://www.drawkcab.com/dumpyard/deliverydovecot-sql.conf.txt
:index=memory doesn't really do anything. It must be uppercased.
Timo/others,
Further to my index problems with dovecot-lda over NFS when busy: is it possible to also stop the writing of dovecot-uidlist? I took a shot in the dark with :INDEX=MEMORY:UIDLIST=MEMORY, but it doesn't seem to stop dovecot-uidlist being written upon delivery. When busy, this causes deliver processes to hang (and usually time out/defer to the MTA) with many log entries such as:
Feb 5 13:15:07 mailfilter01 deliver(stresstest@stresstest.com): rename(/var/mailstore/stresstest.com/1/stresstest@stresstest.com/.Postal/dovecot-uidlist.lock, /var/mailstore/stresstest.com/1/stresstest@stresstest.com/.Postal/dovecot-uidlist) failed: No such file or directory
I know taking out the uidlist and indexing takes away the key benefits of deliver. However, the only way I can see of delivering concurrently to an NFS mailstore is by blindly dropping in the messages?
Thanks
Dean Manners
On Mon, 2007-02-05 at 14:03 +1100, Dean Manners wrote:
Timo/others, Further to my index problems with dovecot-lda over NFS when busy: is it possible to also stop the writing of dovecot-uidlist? I took a shot in the dark with :INDEX=MEMORY:UIDLIST=MEMORY, but it doesn't seem to stop dovecot-uidlist being written upon delivery. When busy, this causes deliver processes to hang (and usually time out/defer to the MTA) with many log entries such as:
Feb 5 13:15:07 mailfilter01 deliver(stresstest@stresstest.com): rename(/var/mailstore/stresstest.com/1/stresstest@stresstest.com/.Postal/dovecot-uidlist.lock, /var/mailstore/stresstest.com/1/stresstest@stresstest.com/.Postal/dovecot-uidlist) failed: No such file or directory
So why exactly is it busy? Are there tons of deliveries to this address, or is the whole NFS system just hanging which is causing these?
Were there other errors before this one? I'd guess there was something about overriding the .lock file?
I know taking out the uidlist and indexing takes away the key benefits of deliver. However, the only way I can see of delivering concurrently to an NFS mailstore is by blindly dropping in the messages?
Why are you using deliver then? :)
Anyway, there isn't an option to not update the uidlist file.
-----Original Message----- From: dovecot-bounces@dovecot.org [mailto:dovecot-bounces@dovecot.org] On Behalf Of Timo Sirainen Sent: Monday, February 05, 2007 11:42 PM To: Dean Manners Cc: 'Dovecot Mailing List' Subject: Re: [Dovecot] Large & busy site, NFS with deliver only servers
So why exactly is it busy? Are there tons of deliveries to this address, or is the whole NFS system just hanging which is causing these?
Multiple deliveries (multiple delivering servers), or delivery whilst IMAP/POP(ing). The process(es) seem to go into a frenzied loop of NFS GETATTR calls while waiting for dovecot-uidlist.lock.
Were there other errors before this one? I'd guess there was something about overriding the .lock file?
Feb 5 13:07:36 mailfilter01 deliver(stresstest@stresstest.com): file_dotlock_replace(/var/mailstore/stresstest.com/1/stresstest@stresstest.com/dovecot-uidlist) failed: No such file or directory
Feb 5 13:07:36 mailfilter01 deliver(stresstest@stresstest.com): Our dotlock file /var/mailstore/stresstest.com/1/stresstest@stresstest.com/dovecot-uidlist.lock was deleted (kept it 0 secs)
Feb 5 13:07:36 mailfilter01 deliver(stresstest@stresstest.com): /var/mailstore/stresstest.com/1/stresstest@stresstest.com/dovecot-uidlist: next_uid was lowered (18 -> 17)
Why are you using deliver then? :)
Because it neatly fits the criteria: folder sorting on delivery, quotas, MySQL, auto maildir creation, and a very active developer :) I have considered trying maildrop, but that obviously creates additional work to fulfill the above criteria.
Anyway, there isn't an option to not update the uidlist file.
Hmmmm ok, perhaps there's something that can be done to reduce the chance of it happening by having the MTA (Postfix) limit concurrent deliveries.
Thanks
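As a rough sketch of what that could look like on the Postfix side, not a tested recipe: assuming deliver is run through a pipe transport named "dovecot" in master.cf (the transport name is an assumption, not taken from the posted configs), the per-transport limits in main.cf might be set like this:

```
# main.cf (sketch; "dovecot" is the assumed name of the pipe transport in master.cf)
# Hand deliver one recipient per invocation; with a recipient limit of 1,
# the concurrency limit below applies per recipient instead of per domain.
dovecot_destination_recipient_limit = 1
# Allow at most one simultaneous delivery to the same recipient.
dovecot_destination_concurrency_limit = 1
```

These limits are enforced separately by each Postfix instance, so with two MXing machines two deliveries to the same mailbox can still overlap, and they do nothing about a delivery colliding with IMAP/POP3 access.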
On Tue, 2007-02-06 at 00:38 +1100, Dean Manners wrote:
Feb 5 13:07:36 mailfilter01 deliver(stresstest@stresstest.com): Our dotlock file /var/mailstore/stresstest.com/1/stresstest@stresstest.com/dovecot-uidlist.lock was deleted (kept it 0 secs)
I don't think this looks good. Either the NFS server really lost the file, or something decided to delete it even though the file was just created.
Are the clocks on all the servers synchronized? They should differ by less than 1 second.