[Dovecot] still problem with load
Hi, we've got a serious (and growing) load problem. I've just calculated that there are about 300 mailboxes with 20 GB of mail, and the whole system is _very_ slow and getting slower and slower. It's a fast LAN: every client has a 100 Mb connection to the server, which has a gigabit card and sits on gigabit switches. When I push the delete button on a mail, I usually have to wait 10, 20 or 30 seconds (!!!) for it to be deleted and the next one shown. Dovecot eats so much disk I/O that we can't do much else. Currently about 100 clients are connected at the same time, mostly with OE6. The server is a P4 with 1 GB RAM and about 1 TB of hard disk. Only Samba and Dovecot run on the server, and the load usually looks something like this:
10:47:38 up 3 days, 15:48,  2 users,  load average: 4.64, 3.58, 3.43
230 processes: 225 sleeping, 5 running, 0 zombie, 0 stopped
CPU states:  6.3% user  8.9% system  0.0% nice  0.0% iowait  84.6% idle
Mem:  1022840k av, 1013588k used,    9252k free,       0k shrd,   80320k buff
                    826372k actv,   47324k in_d,   16904k in_c
Swap: 4193608k av,   80612k used, 4112996k free                  783088k cached
As you can see, it uses almost all the RAM (as the 2.4 kernel usually does) but no cache, so the RAM is enough. The more interesting part is the load: idle is usually between 80-90%, so the whole system uses hardly any CPU. BUT at the same time the load is around 3, 4, 5 (!!!), which is too much and more than is acceptable. What's worse, the reason is the I/O load on the hard disks: every process is waiting for I/O. In this case I usually run
  service dovecot stop
  killall imap
  <wait about one minute>
  service dovecot start
and the load goes back to 0.3-0.6. After an hour the same thing happens and I have to do it again. :-((( What's more, when this happens Samba makes all (!!!) clients stop for a few seconds. Since everybody uses OE6, they always download all messages from huge mailboxes, and on top of that Dovecot indexes the whole system all the time. I assume this causes the load. IMHO 300 mailboxes and 100 concurrent users is not such a big thing. Is there any way to restrict Dovecot so it doesn't use so much I/O? Thanks in advance.
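A load average of 3-5 alongside 80-90% idle CPU is consistent with processes blocked on disk: on Linux the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones. A minimal sketch for checking this, assuming a Linux /proc filesystem; the helper names are hypothetical and not part of any tool mentioned here:

```python
import glob
from collections import Counter


def state_from_stat(stat_text):
    """Return the one-letter process state from the text of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split on the LAST ')' and take the next field.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_states(stat_texts):
    """Tally process states (R = runnable, S = sleeping, D = disk wait, ...)."""
    return Counter(state_from_stat(text) for text in stat_texts)


def read_all_stats():
    """Read every readable /proc/<pid>/stat (Linux only)."""
    texts = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as handle:
                texts.append(handle.read())
        except OSError:
            pass  # the process exited between listing and reading
    return texts


if __name__ == "__main__":
    counts = count_states(read_all_stats())
    print("runnable (R):", counts["R"], " disk wait (D):", counts["D"])
```

If the D count tracks the load average while R stays near zero, the bottleneck is I/O rather than CPU.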
-- Levente "Si vis pacem para bellum!"
On Wed, 2003-05-21 at 11:58, Farkas Levente wrote:
> since everybody uses OE6, they always download all messages from huge mailboxes
OE6 shouldn't do that, unless UIDVALIDITY changes. And it shouldn't change.
> what's more, Dovecot indexes the whole system all the time.
Dovecot shouldn't do that either, unless it thinks there's something wrong with the indexes. Don't you get any error messages in the logs?
If it really is rebuilding the indexes all the time without errors, see what happens if you use in-memory indexes, i.e. add :INDEX=MEMORY to the default_mail_env setting. That pretty much requires the 0.99.10-test releases, though, since there were some bugs in it before.
And if OE6 reloads everything, check whether the UIDVALIDITY keeps changing in dovecot-uidlist. It's in the first line, middle number.
> IMHO 300 mailboxes and 100 concurrent users is not such a big thing.
No, it shouldn't be.
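To see whether UIDVALIDITY really stays put, one could record the middle number of the first line of dovecot-uidlist at intervals and compare the values. A minimal sketch, assuming only what is stated above (first line, middle number); the function name is illustrative, and the exact header layout may differ between versions:

```python
def read_uidvalidity(uidlist_path):
    """Return the middle whitespace-separated field of the file's first line.

    Assumption: the first line has an odd number of fields and UIDVALIDITY
    is the middle one, as described in the thread.
    """
    with open(uidlist_path) as handle:
        fields = handle.readline().split()
    if not fields or len(fields) % 2 == 0:
        raise ValueError("unexpected header line: %r" % fields)
    return fields[len(fields) // 2]
```

Calling this on the same mailbox before and after a client reconnects should return the same string; a different value would explain a client re-downloading everything.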
Timo Sirainen wrote:
> On Wed, 2003-05-21 at 11:58, Farkas Levente wrote:
>> since everybody uses OE6, they always download all messages from huge mailboxes
> OE6 shouldn't do that, unless UIDVALIDITY changes. And it shouldn't change.
>> what's more, Dovecot indexes the whole system all the time.
> Dovecot shouldn't do that either, unless it thinks there's something wrong with the indexes. Don't you get any error messages in the logs?
> If it really is rebuilding the indexes all the time without errors, see what happens if you use in-memory indexes, i.e. add :INDEX=MEMORY to the default_mail_env setting. That pretty much requires the 0.99.10-test releases, though, since there were some bugs in it before.
> And if OE6 reloads everything, check whether the UIDVALIDITY keeps changing in dovecot-uidlist. It's in the first line, middle number.
No, it isn't changing. :-(
>> IMHO 300 mailboxes and 100 concurrent users is not such a big thing.
> No, it shouldn't be.
OK, so what can be the reason? If I stop Dovecot, the load falls to about 0.3. If I start Dovecot and about 50 people connect to it, it goes up to 3.x-4.x, which really irritates me; the Samba server stops for a few seconds every 10-20 minutes, and it takes about 20-30 seconds to delete a mail in Mozilla and show me the next (simple text) message. Any tips? How can I test it, or help you find the reason? Is there any way to measure the total amount of reads and writes issued by all Dovecot processes? Or...?
-- Levente "Si vis pacem para bellum!"
On Wed, 2003-05-21 at 13:19, Farkas Levente wrote:
>>> IMHO 300 mailboxes and 100 concurrent users is not such a big thing.
>> No, it shouldn't be.
> OK, so what can be the reason?
Well, check what OE6 is saying to Dovecot. Is it fetching headers or bodies for all messages? It should do that only for new messages.
Rawlog or some network sniffers could tell what exactly is happening. http://dovecot.procontrol.fi/bugreport.html#sniffing
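Once the client's commands have been captured, a quick tally shows whether OE6 is fetching full bodies or just headers. A rough sketch, assuming the capture has already been reduced to one client command per line; that input format and the function names are assumptions here, not Dovecot's rawlog format:

```python
import re
from collections import Counter

# Matches "<tag> FETCH <set> <items>" and "<tag> UID FETCH <set> <items>".
FETCH_RE = re.compile(r"^\S+\s+(?:UID\s+)?FETCH\s+(\S+)\s+(.*)$", re.IGNORECASE)

# Items that pull message content: BODY[], BODY[TEXT], BODY[1.2],
# RFC822 and RFC822.TEXT (but not RFC822.SIZE or RFC822.HEADER).
BODY_RE = re.compile(
    r"BODY(?:\.PEEK)?\[(?:TEXT|[\d.]+)?\]|RFC822(?:\.TEXT)?(?![.\w])"
)


def classify_fetch(command_line):
    """Return 'body', 'headers', 'other-fetch', or None for non-FETCH lines."""
    match = FETCH_RE.match(command_line.strip())
    if match is None:
        return None
    items = match.group(2).upper()
    if BODY_RE.search(items):
        return "body"
    if "HEADER" in items or "ENVELOPE" in items:
        return "headers"
    return "other-fetch"


def tally_fetches(command_lines):
    """Count FETCH commands by kind, ignoring every other command."""
    kinds = (classify_fetch(line) for line in command_lines)
    return Counter(kind for kind in kinds if kind is not None)
```

A large 'body' count over a wide message range on every login would point at the client re-downloading mail rather than at index rebuilds.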
Also is Dovecot really rebuilding the indexes all the time? How do you know? Try the in-memory indexes, at least then it's not rebuilding any indexes to disk.
> Is there any way to measure the total amount of reads and writes issued by all Dovecot processes? Or...?
Dovecot doesn't count them, but your OS might. I don't know really.
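On Linux kernels much newer than the 2.4 series discussed here, the OS does keep per-process I/O counters in /proc/<pid>/io when task I/O accounting is enabled. A minimal sketch under that assumption, summing read_bytes and write_bytes over processes named imap; it does not apply to the 2.4 kernel in this thread, and the helper names and default process name are assumptions:

```python
import glob
import os


def parse_io(io_text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in io_text.splitlines():
        key, _, value = line.partition(":")
        if value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def total_io(process_name="imap"):
    """Sum read_bytes/write_bytes over all processes with the given name."""
    total_read = total_written = 0
    for proc_dir in glob.glob("/proc/[0-9]*"):
        try:
            with open(os.path.join(proc_dir, "comm")) as handle:
                if handle.read().strip() != process_name:
                    continue
            with open(os.path.join(proc_dir, "io")) as handle:
                counters = parse_io(handle.read())
        except OSError:
            continue  # process exited, or no permission (usually needs root)
        total_read += counters.get("read_bytes", 0)
        total_written += counters.get("write_bytes", 0)
    return total_read, total_written
```

Sampling total_io() twice a few seconds apart and subtracting gives a rough bytes-per-second figure for all imap processes together.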
> imap(lfarkas): May 21 12:20:01 Error: Corrupted index file /home/lfarkas/Maildir/.INBOX/.imap.index: index.next_uid (64) > uidlist.next_uid (63)
Hmm.. I was going to suggest trying 0.99.10-test, but I guess you're using it now? Looks like I have similar error messages in my log :) Have to fix those.
You didn't say how long this load problem has been going on, was it only recently or for longer now?
Timo Sirainen wrote:
> Well, check what OE6 is saying to Dovecot. Is it fetching headers or bodies for all messages? It should do that only for new messages.
> Rawlog or some network sniffers could tell what exactly is happening. http://dovecot.procontrol.fi/bugreport.html#sniffing
> Also is Dovecot really rebuilding the indexes all the time? How do you know? Try the in-memory indexes, at least then it's not rebuilding any indexes to disk.
Actually, it was just a guess. What is certain is that if I stop Dovecot, the load falls and there is no CPU load at all.
>> imap(lfarkas): May 21 12:20:01 Error: Corrupted index file /home/lfarkas/Maildir/.INBOX/.imap.index: index.next_uid (64) > uidlist.next_uid (63)
> Hmm.. I was going to suggest trying 0.99.10-test, but I guess you're using it now? Looks like I have similar error messages in my log :) Have to fix those.
This is this morning's CVS.
> You didn't say how long this load problem has been going on, was it only recently or for longer now?
Recently, for a week or two, but e.g. today I see load around 8-9...
-- Levente "Si vis pacem para bellum!"
Timo Sirainen wrote:
> On Wed, 2003-05-21 at 11:58, Farkas Levente wrote:
>> since everybody uses OE6, they always download all messages from huge mailboxes
> OE6 shouldn't do that, unless UIDVALIDITY changes. And it shouldn't change.
>> what's more, Dovecot indexes the whole system all the time.
> Dovecot shouldn't do that either, unless it thinks there's something wrong with the indexes. Don't you get any error messages in the logs?
> If it really is rebuilding the indexes all the time without errors, see what happens if you use in-memory indexes, i.e. add :INDEX=MEMORY to the default_mail_env setting. That pretty much requires the 0.99.10-test releases, though, since there were some bugs in it before.
> And if OE6 reloads everything, check whether the UIDVALIDITY keeps changing in dovecot-uidlist. It's in the first line, middle number.
>> IMHO 300 mailboxes and 100 concurrent users is not such a big thing.
> No, it shouldn't be.
Oops, when I deleted your next message (about drac), I got an error message that my server had disconnected, and the following error message in the log file:
imap(lfarkas): May 21 12:20:01 Error: Corrupted index file /home/lfarkas/Maildir/.INBOX/.imap.index: index.next_uid (64) > uidlist.next_uid (63)
There is only one connection to the server, and I'm the only one reading my mailbox.
-- Levente "Si vis pacem para bellum!"
And a few more error log entries:
----------------------
imap(pbalkanyi): May 21 13:04:17 Error: Corrupted index file /home/pbalkanyi/Maildir/.INBOX/.imap.index: index.next_uid (287) > uidlist.next_uid (286)
imap(pbalkanyi): May 21 13:04:17 Error: Couldn't lock created modify log file /home/pbalkanyi/Maildir/.INBOX/.imap.index.log
imap-login: May 21 13:04:33 Info: Login: pbalkanyi [192.168.1.164]
imap(pbalkanyi): May 21 13:04:33 Error: IndexID mismatch for modify log file /home/pbalkanyi/Maildir/.INBOX/.imap.index.log
imap(zkempf): May 21 13:05:17 Error: Corrupted index file /home/zkempf/Maildir/.INBOX/.imap.index: index.next_uid (180) > uidlist.next_uid (179)
imap(zkempf): May 21 13:05:17 Error: Couldn't lock created modify log file /home/zkempf/Maildir/.INBOX/.imap.index.log
imap-login: May 21 13:05:25 Info: Login: zkempf [192.168.0.134]
imap(zkempf): May 21 13:05:25 Error: IndexID mismatch for modify log file /home/zkempf/Maildir/.INBOX/.imap.index.log
imap(pbalkanyi): May 21 13:05:48 Error: Warning: Inconsistency - Index /home/pbalkanyi/Maildir/.INBOX/.imap.index was rebuilt while we had it open
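To judge how widespread these errors are, the log can be tallied per user and per error kind. A small sketch, assuming log lines shaped like the ones quoted above; the regular expression and function name are illustrative and may need adjusting for other log formats:

```python
import re
from collections import Counter

# Matches e.g. "imap(user): May 21 13:04:17 Error: <message>".
LINE_RE = re.compile(
    r"^imap\((?P<user>[^)]+)\): \w{3} +\d+ [\d:]+ Error: (?P<message>.*)$"
)


def tally_errors(log_lines):
    """Count imap errors per (user, error kind).

    The error kind is the message text before the first file path.
    """
    counts = Counter()
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if match:
            kind = match.group("message").split(" /", 1)[0]
            counts[(match.group("user"), kind)] += 1
    return counts
```

Many users each showing the same "Corrupted index file" kind would suggest a server-side bug rather than one damaged mailbox.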
-- Levente "Si vis pacem para bellum!"
On Wed, May 21, 2003 at 14:44:45 +0300, Timo Sirainen wrote:
> On Wed, 2003-05-21 at 13:26, Farkas Levente wrote:
>> imap(lfarkas): May 21 12:20:01 Error: Corrupted index file /home/lfarkas/Maildir/.INBOX/.imap.index: index.next_uid (64) > uidlist.next_uid (63)
> Fixed in 0.99.10-test5.
It was a bug, not a feature? :)
Confirmed - got rid of these error messages.
participants (3)
- Farkas Levente
- Sebastian Pachuta
- Timo Sirainen