On 04/07/2011 06:49 PM, Stan Hoeppner wrote:
Eric Shubert put forth on 4/7/2011 4:04 PM:
On 04/07/2011 12:45 PM, Stan Hoeppner wrote:
Kádár Tamás (KTamas) put forth on 4/7/2011 9:33 AM:
Hi
Okay so we've been having this issue since forever and I figured why the heck not ask it here since it's mostly related to dovecot. The thing is, we have a huge number of public folders (at the moment, around 1100). Now, with dovecot indexing and caching we're mostly okay, also being careful with things like Thunderbird 3's default 'Download everything' option ("Keep messages for this account on this computer") and such. However, once in a while someone goes rogue: we install a new version of Thunderbird, or someone sets up an email account in Thunderbird and mistakenly leaves the download everything option on. This causes high IO on our server, and a single user can more or less kill the whole server in mere minutes, with load average quickly spiking to 30-40-50 and everything becoming sloooooow (obviously).
Is there any way I can limit the amount of IO a certain user can use?
TTBOMK Dovecot has no data rate limiting controls, neither globally nor per user. As I see it you have a few options:
- Switch to mdbox or mbox storage format to decrease IOs per email
- Beef up the server with many more RAID spindles
- Eliminate all unnecessary filesystem metadata and logging activity (for instance, atime), if not done already
- Switch to a filesystem matched to your workload. EXT3/4/Reiser are not optimal for high concurrency multi-user server workloads; switch to XFS or JFS if currently using EXT3/4
- Install a traffic shaper in front of, or on, the Dovecot server. Configure it to clamp any TCP sessions that exceed, say, 10 Mbit/s for more than 10 seconds down to 1 Mbit/s for a 1 minute duration. Tune to taste until you get the desired results.
Option 5 is the most surefire way to solve the problem you describe. It is the most direct solution, and likely free (not including your time to set it up), assuming netfilter or similar scripts are available to accomplish this.
I liked option 5 as being the best choice for this, at first.
Then I began to wonder, when a client does this, is there a process created for each mail directory? I'm not familiar enough with the IMAP protocol or dovecot to know off hand. If so though, could that be a problem?
This is irrelevant WRT traffic shaping. Traffic shaping simply counts IP packets per unit time for a packet matching source+destination address combo X. When the defined threshold is exceeded, future X packets are delayed Y seconds to achieve max data rate Z. This occurs many layers below IMAP in the OSI model, thus the action is transparent to Dovecot. In layman's terms, this is akin to sticking the offending client PC on the international space station for a minute at a time. The dramatically increased packet latency causes a dramatic drop in throughput, thus a dramatically decreased IO load on the server.
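To make the clamp policy concrete, here is a rough toy model in Python of the rule suggested earlier (over 10 Mbit/s for more than 10 seconds, then 1 Mbit/s for 1 minute). It is only a sketch of the policy, not a netfilter or tc configuration: the function name `shape` and its parameters are made up for illustration, and it models the clamp as a simple per-second rate cap rather than as per-packet delay.

```python
def shape(offered_mbit, limit=10.0, grace=10, clamp=1.0, penalty=60):
    """Toy model: map offered per-second rates (Mbit/s) to allowed rates.

    A session that stays above `limit` for more than `grace` consecutive
    seconds is held to `clamp` Mbit/s for the next `penalty` seconds.
    All names and defaults here are illustrative assumptions.
    """
    allowed = []
    over = 0          # consecutive seconds spent above the limit
    penalty_left = 0  # seconds of clamping still to serve
    for rate in offered_mbit:
        if penalty_left > 0:
            # Session is being clamped: pass at most `clamp` Mbit/s.
            allowed.append(min(rate, clamp))
            penalty_left -= 1
            continue
        allowed.append(rate)
        if rate > limit:
            over += 1
            if over > grace:
                # Threshold exceeded for too long: start the penalty window.
                penalty_left = penalty
                over = 0
        else:
            over = 0
    return allowed


if __name__ == "__main__":
    bulk = shape([400.0] * 100)   # a client pulling everything at 400 Mbit/s
    light = shape([5.0] * 100)    # an ordinary client, never over the limit
    print("bulk client, Mbit moved in 100 s:", sum(bulk))
    print("light client, Mbit moved in 100 s:", sum(light))
```

In this model the bulk client runs unrestricted for the first 11 seconds, is clamped for the next 60, and then gets another short burst before being clamped again, while a client that stays under the limit is never touched.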
Expecting that Timo has probably addressed this already (he's very thorough to say the least), I looked through the (v2.0.11) configuration parameters, and found: #mail_max_userip_connections = 10
This exists in 1.x as well, but it doesn't apply to this situation.
in the 20-imap.conf file. This leads me to believe that yes, there can reasonably be multiple processes per user. With 1100 folders to download, I can see where the downloading of all that might overload a system. I would guess that reducing this value somewhat would tend to limit the impact any one user could have on the system as a whole. At the same time however, overall performance may suffer the lower this value is.
Last I checked, by default, TBird opens 5 IMAP connections per session. In my experience, four of the five go unused 99.999% of the time. Only one is used for over 99.999% of IMAP traffic. AIUI, the extra 4 are used primarily for things like IMAP IDLE notifications on non-selected folders. A quick glance at the TIME+ column in top will demonstrate this. Thus I've been configuring TBird clients to use only 1 cached connection for quite some time.
This doesn't limit client bandwidth or minimize server load, but it cuts down on sleeping processes by a factor of 5, and saves a small amount of memory. I say small because Linux only stores binary code once in memory no matter how many processes you have running. Also, all the IMAP processes will be sharing buffer cache. Thus, you don't really save significant memory by cutting the IMAP processes by 5x, but I find managing one IMAP process per user much more elegant than 5 processes per user.
BL, your bottleneck (high load) might be more related to the number of processes the user has at once, more so than the i/o or bandwidth demands they're causing. Reduce the # of processes, and the others go down too.
This is not correct. Number of same user IMAP processes has no bearing on the OP's issue. See above. Email, whether SMTP or IMAP, is disk seek bound, rarely, if ever, CPU or memory bound, unless a server machine is simply horribly architected or running insanely complex sieve rules etc.
The OP's problem is that a single client PC attempts to download the entire 1100 public folder set and a single user mailbox, apparently many hundreds of MBs or GBs in size. Over a GbE link this could be as much as 50 to 90 MB/s. As the data being read from disk is not sequential, this generates a substantial random IO load on the server, exacerbated by the Maildir storage format, requiring one file read per message.
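A back-of-the-envelope calculation shows why this hurts. The sketch below is only an estimate: the 50 KB average message size is an assumed figure picked for illustration (the OP gave no numbers), and the helper name is hypothetical. The one thing taken from the thread is that Maildir needs one file read per message.

```python
def maildir_file_reads_per_second(throughput_mb_s, avg_message_kb):
    """Estimate file reads per second needed to sustain a download rate.

    Maildir stores one message per file, so reads/s is roughly
    bytes per second divided by bytes per message.
    `avg_message_kb` is an assumption supplied by the caller.
    """
    return throughput_mb_s * 1000 / avg_message_kb


if __name__ == "__main__":
    for mb_s in (50, 90):  # the GbE range mentioned above
        reads = maildir_file_reads_per_second(mb_s, avg_message_kb=50)
        print(f"{mb_s} MB/s at 50 KB/message needs about {reads:.0f} file reads/s")
```

Each of those file reads costs at least one seek when the files are scattered across the disk. As a rough rule of thumb (my assumption, not a measurement of the OP's hardware), a single spinning disk manages on the order of a hundred or two random IOs per second, so a request rate anywhere near these figures keeps a small array saturated for as long as the download lasts.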
The OP is simply running out of disk IOPS in this scenario. The two sledgehammer methods for fixing this are rate limiting the client at the packet level, which in turn reduces the disk seek load, or substantially beefing up the disk subsystem. The latter is cheaper (in $$), more direct, and more predictable. As my address RHS would suggest, I'm all for beefing up hardware. But in this case, it's not the optimal solution.
Thanks for the great explanation, Stan.
Just to clarify, did you mean to say that the former is cheaper in $$?
-- -Eric 'shubes'