[Dovecot] quick question

David Halik dhalik at jla.rutgers.edu
Fri Jan 22 22:34:18 EET 2010


On 01/22/2010 01:15 PM, Brandon Davidson wrote:
>
> We have a much similar setup - 8 POP/IMAP servers running RHEL 5.4,
> Dovecot 1.2.9 (+ patches), F5 BigIP load balancer cluster
> (active/standby) in a L4 profile distributing connections round-robin,
> maildirs on two Netapp Filers (clustered 3070s with 54k RPM SATA disks),
> 10k peak concurrent connections for 45k total accounts. We used to run
> with the noac mount option, but performance was abysmal, and we were
> approaching 80% CPU utilization on the filers at peak load. After
> removing noac, our CPU is down around 30%, and our NFS ops/sec rate is
> maybe 1/10th of what it used to be.
>    

Wow, that's almost the exact same setup we use, except we have 10
IMAP/POP servers and a clustered pair of FAS920s with 10K drives, which
are getting replaced in a few weeks. We also have a pair of clustered
3050s, but they're not running dovecot (yet).

You're right about noac though, it absolutely destroyed our netapps. Of 
course the corruption was all but eliminated, but the filer performance 
was so bad our users immediately noticed. Definitely not an option.

> The downside to this is that we've started seeing significantly more
> crashing and mailbox corruption. Timo's latest patch seems to have fixed
> the crashing, but the corruption just seems to be the cost of
> distributing users at random across our backend servers.
>    

Yep, I agree. Like I said in the last email, we're going to deal with it
for now and see if anyone really notices. I can live with it if the
users don't care.

Timo, speaking of which, I'm guessing everyone is happy with the latest
patches. Any ETA on 1.2.10? ;)

> We've thought about enabling IP-based session affinity on the load
> balancer, but this would concentrate the load of our webmail clients, as
> well as not really solving the problem for users that leave clients open
> on multiple systems.
>    

We currently have IP session 'sticky' enabled on our L4s and it didn't
help all that much. Yes, it reduces thrashing on the backend, but
ultimately it won't help the corruption. Like you said, multiple logins
will still go to different servers when the IPs are different.
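
To make that concrete, here is a minimal sketch of why source-IP
affinity alone doesn't pin a user to one backend. The backend names, the
hashing scheme and the sample addresses are all made up for
illustration; this is not how a BigIP or any other load balancer
actually picks a pool member. The username-keyed variant is included
only as a contrast and isn't something either of our setups does.

```python
import hashlib

# Hypothetical pool of 10 backend IMAP/POP servers (names are invented).
BACKENDS = [f"imap{n:02d}" for n in range(1, 11)]


def pick_backend(key: str) -> str:
    """Map an affinity key to a backend using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return BACKENDS[int.from_bytes(digest[:8], "big") % len(BACKENDS)]


def route_by_ip(client_ip: str, username: str) -> str:
    """Source-IP 'sticky': the username plays no part in the decision."""
    return pick_backend(client_ip)


def route_by_user(client_ip: str, username: str) -> str:
    """Contrast case (assumption, not from our setup): key on the user."""
    return pick_backend(username)


# One user with clients open on three different machines
# (addresses are from the documentation ranges).
sessions = [
    ("192.0.2.10", "alice"),
    ("198.51.100.7", "alice"),
    ("203.0.113.42", "alice"),
]

ip_backends = {route_by_ip(ip, user) for ip, user in sessions}
user_backends = {route_by_user(ip, user) for ip, user in sessions}

print("IP affinity sends alice to:  ", sorted(ip_backends))
print("User affinity sends alice to:", sorted(user_backends))
```

Each client IP sticks to its own backend, but nothing ties the three
IPs together, so the same mailbox can still be open on several servers
at once.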

How is your webmail architecture set up? We're using imapproxy to spread
the webmail connections out across the same load balancer, so
essentially all traffic from outside and inside gets balanced. The trick
is we have an internal load-balanced virtual IP that spreads the load
out for webmail on private IP space. If they were to go outside they
would get NAT'd as one outbound IP, so we just go inside and get the
benefit of balancing.


> Anyway, that's where we're at with the issue. As a data point for your
> discussion with your boss:
> * With 'noac', we would see maybe 1 or two 'corrupt' errors a day. Most
> of these were related to users going over quota.
> * After removing 'noac', we saw 5-10 'Corrupt' errors and 20-30 crashes
> a day. The crashes were highly visible to the users, as their mailbox
> would appear to be empty until the rebuild completed.
> * Since applying the latest patch, we've seen no crashes, and 60-70
> 'Corrupt' errors a day. We have not had any new user complaints.
>    

That's where we are, and as long as the corruptions stay user invisible, 
I'm fine with it. Crashes seem to be the only user visible issue so far, 
with "noac" being out of the question unless they buy a ridiculously 
expensive filer.

-- 
================================
David Halik
System Administrator
OIT-CSS Rutgers University
dhalik at jla.rutgers.edu
================================


