Thanks for the detailed answer (it helps)!
Cheers,
Jan
-----Original Message-----
From: dovecot-bounces+jan.vandenberg=isp.solcon.nl@dovecot.org [mailto:dovecot-bounces+jan.vandenberg=isp.solcon.nl@dovecot.org] On behalf of mikkel@euro123.dk
Sent: Friday, 22 February 2008 12:36
To: dovecot@dovecot.org
Subject: Re: [Dovecot] Dovecot Sieve scalability
Hi,
So you call a single binary, "deliver". This binary then looks for a .dovecot.sieve file in the user's maildir, _compiles_ it to a .dovecot.sievec file, and then drops the mail according to the rules. Calling a single binary that isn't threaded, doesn't fork, and has to compile a file (every time?) it's called isn't scalable. Multiply this by 100 mails (calls) per second and this is bound for problems.
The Dovecot/Sieve implementation is functional but not very elegant or robust. But this can be explained because Dovecot is built for _retrieving_ mail (IMAP/POP) and _not_ for delivering mail.
Any thoughts on this? Are there people out there with large Dovecot+Sieve implementations (100k+ users)? Are there benchmarks available? How well does it perform under heavy load (mails/sec)?
This does not appear to be a problem. I think Dovecot/Sieve is smart enough to only do the compilation if .dovecot.sieve is newer than .dovecot.sievec.
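To illustrate the kind of check I mean, here is a minimal sketch in Python of a "recompile only when the source is newer" test based on file modification times. It is not taken from the Dovecot source; the function name needs_recompile and the way the paths are passed in are made up for illustration, and the real deliver binary may decide differently.

```python
import os


def needs_recompile(script_path, compiled_path):
    """Return True if the compiled file is missing or older than the script.

    Hypothetical helper for illustration only; not Dovecot's actual logic.
    """
    try:
        compiled_mtime = os.path.getmtime(compiled_path)
    except FileNotFoundError:
        # No compiled file yet, so compilation is required.
        return True
    # Recompile only when the script changed after the last compilation.
    return os.path.getmtime(script_path) > compiled_mtime
```

With a check like this, the cost per delivery is two stat calls rather than a full compile, as long as the user has not edited the script since the last delivery.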
I'm not sure if this qualifies for your definition of a "large" installation, but here is an example of the installation that I manage.
It may not be directly transferable to your setup since the level of activity varies a lot from user base to user base. As of this writing there are some 25 concurrent POP3 sessions and some 75 concurrent IMAP sessions (which I guess is relatively low activity considering the size of the user base).
Anyway, I have ~75,000 e-mail addresses on one Dovecot installation and an estimated 200,000 deliveries a day (no guarantee here since deliveries haven't been measured for a long time). This is all located on two SAN LUNs, each consisting of 5x500GB Hitachi SATA disks (RAID-5). These two LUNs are NFS exported (ZFS on Solaris) over Gbit Ethernet and mounted as storage for Dovecot (both indexes and mail storage). The total size of Maildir storage+indexes+cache files is ~200GB.
Given the fact that all this happens with the spindles of only ~8 disks (not counting the parity drives), each rated at just ~60 IOPS, and the fact that all operations are done through NFS, I must say that I'm impressed with the performance of Dovecot (although the ZFS cache and SAN cache help a bit). Postfix is the MTA, but local delivery, POP3 and IMAP are all Dovecot. The vast majority of users have Sieve scripts in their homedirs, albeit only simple ones like putting mails into Junk if the subject contains one specific keyword.
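For a rough sense of scale, the figures above can be turned into averages with a few lines of Python. This is only back-of-the-envelope arithmetic on the estimated numbers: it assumes deliveries are spread evenly over the day (real peaks are higher), it ignores the ZFS and SAN caches, and the same disks also serve the POP3 and IMAP sessions.

```python
# Estimated figures from the setup described above.
DELIVERIES_PER_DAY = 200_000
DATA_DISKS = 8            # 2 LUNs x 5 disks in RAID-5, parity not counted
IOPS_PER_DISK = 60
SECONDS_PER_DAY = 24 * 60 * 60

# Average delivery rate, assuming an even spread over 24 hours.
avg_deliveries_per_sec = DELIVERIES_PER_DAY / SECONDS_PER_DAY

# Raw spindle capacity with no caching at all.
raw_iops = DATA_DISKS * IOPS_PER_DISK

# Uncached I/O operations available per delivery at the average rate.
iops_budget_per_delivery = raw_iops / avg_deliveries_per_sec

print(f"average deliveries/sec: {avg_deliveries_per_sec:.1f}")
print(f"raw IOPS of the data disks: {raw_iops}")
print(f"raw IOPS budget per delivery: {iops_budget_per_delivery:.0f}")
```

So the average is only a bit over 2 deliveries per second against roughly 480 raw IOPS, which is far below the 100 mails per second asked about in the original question.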
If you are curious about the CPU usage as well: this activity results in 5-7% CPU usage on the server for Dovecot and Postfix, and another 3-5% CPU usage for the MySQL database that holds all the account information (for both Postfix and Dovecot). The server is one Sun T2000 with 8 cores, each running at 1 GHz. Total RAM usage for this setup is about 6-7 gigs (including the MySQL cluster snatching 3 gigs and Postfix smtpd processes eating 2-3 gigs).
That being said, the bottleneck is mail delivery. The Postfix queue is handled on separate local disks (2x73GB 10k RPM SAS in RAID-1) and under heavy load the queue can build up about twice as fast as the delivery to the Maildirs. So if the amount of daily deliveries were to double, for instance, then I'd definitely have to add at least one more LUN to handle it.
I think that the IO penalty for delivery is due more to updating indexes than to handling Sieve, though... anyway, it's performing nicely :)
Hope this helps
Regards, Mikkel