[Dovecot] IO rate quotas?

Fri Apr 8 05:54:17 EEST 2011

On 04/07/2011 06:49 PM, Stan Hoeppner wrote:
> Eric Shubert put forth on 4/7/2011 4:04 PM:
>> On 04/07/2011 12:45 PM, Stan Hoeppner wrote:
>>> Kádár Tamás (KTamas) put forth on 4/7/2011 9:33 AM:
>>>> Hi
>>>>
>>>> Okay so we've been having this issue since forever and I figured why
>>>> the heck not ask it here since it's mostly related to dovecot. The
>>>> thing is, we have a huge amount of public folders (at the moment,
>>>> around 1100). Now, with dovecot indexing and caching we're mostly
>>>> okay, also being careful with things like Thunderbird 3's default
>>>> 'Download everything' option ("Keep messages for this account on this
>>>> computer") and such. However once in a while, someone goes rogue, we
>>>> install a new version of thunderbird, someone accidentally sets up an
>>>> email account in thunderbird mistakenly leaving the download
>>>> everything option on. This causes high IO on our server, and a single
>>>> user can quickly more or less kill the whole server in mere minutes,
>>>> load average quickly spiking to 30-40-50 and everything becomes
>>>> sloooooow (obviously).
>>>>
>>>> Is there any way I can limit the amount of IO a certain user can use?
>>>
>>> TTBOMK Dovecot has no data rate limiting controls, neither globally nor
>>> per user.  As I see it you have a few options:
>>>
>>> 1.  Switch to mdbox or mbox storage format to decrease IOs per email
>>> 2.  Beef up the server with many more RAID spindles
>>> 3.  Eliminate all unnecessary filesystem metadata and logging activity
>>>       (for instance, atime), if not done already
>>> 4.  Switch to a filesystem matched to your workload.  EXT3/4/Reiser are
>>>       not optimal for high concurrency multi-user server workloads;
>>>       switch to XFS or JFS if currently using EXT3/4
>>> 5.  Install a traffic shaper in front of, or on, the Dovecot server.
>>>       Configure it to clamp any TCP sessions that exceed, say, 10 Mbit/s
>>>       for more than 10 seconds down to 1 Mbit/s for a 1 minute duration.
>>>       Tune to taste until you get the desired results.
>>>
>>> Option 5 is the most surefire way to solve the problem you describe.
>>> It is the most direct solution, and likely free (not including your time
>>> to set it up), assuming netfilter etc scripts are available to
>>> accomplish this.
>>>
>>
>> I liked option 5 as being the best choice for this, at first.
>>
>> Then I began to wonder, when a client does this, is there a process
>> created for each mail directory? I'm not familiar enough with the IMAP
>> protocol or dovecot to know off hand. If so though, could that be a
>> problem?
>
> This is irrelevant WRT traffic shaping.  Traffic shaping simply counts
> IP packets per unit time for a packet matching source+destination
> address combo X.  When the defined threshold is exceeded, future X
> packets are delayed Y seconds to achieve max data rate Z.  This occurs
> many layers below IMAP in the OSI model, thus the action is transparent
> to Dovecot.  In layman's terms, this is akin to sticking the offending
> client PC on the international space station for a minute at a time.
> The dramatically increased packet latency causes a dramatic drop in
> throughput, thus a dramatically decreased IO load on the server.
>
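
To make the clamp policy from option 5 concrete, here is a minimal simulation sketch in Python. It is not a shaper: a real one works on packets in the network stack, not on per-second totals. The function name `shape` and its parameters are hypothetical; the defaults are simply the "10 Mbit/s for more than 10 seconds, then 1 Mbit/s for 1 minute" figures suggested above.

```python
def shape(offered_mbit, burst_limit=10.0, burst_secs=10,
          clamp_rate=1.0, clamp_secs=60):
    """Simulate the clamp policy one second at a time.

    offered_mbit: rate (Mbit/s) the client tries to pull in each second.
    Returns the rate actually let through in each second.
    """
    allowed = []
    over = 0      # consecutive seconds spent above burst_limit
    penalty = 0   # seconds of clamping still to be served
    for rate in offered_mbit:
        if penalty > 0:
            # Session is clamped: let through at most clamp_rate.
            allowed.append(min(rate, clamp_rate))
            penalty -= 1
            continue
        allowed.append(rate)
        over = over + 1 if rate > burst_limit else 0
        if over > burst_secs:
            # Exceeded the limit for more than burst_secs: start clamping.
            penalty = clamp_secs
            over = 0
    return allowed


# A client pulling 100 Mbit/s flat out for 100 seconds.
rogue = shape([100.0] * 100)
# A well-behaved client at 5 Mbit/s is never touched.
normal = shape([5.0] * 100)
```

In this run the rogue client gets 11 seconds at full speed, then 60 seconds at 1 Mbit/s, and the cycle repeats, so its average rate over the 100 seconds falls from 100 Mbit/s to under 23 Mbit/s. The 5 Mbit/s client passes through unchanged.
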
>> Expecting that Timo has probably addressed this already (he's very
>> thorough to say the least), I looked through the (v2.0.11) configuration
>> parameters, and found:
>>    #mail_max_userip_connections = 10
>
> This exists in 1.x as well, but it doesn't apply to this situation.
>
>> in the 20-imap.conf file. This leads me to believe that yes, there can
>> reasonably be multiple processes per user. With 1100 folders to
>> download, I can see where the downloading of all that might overload a
>> system. I would guess that reducing this value somewhat would tend to
>> limit the impact any one user could have on the system as a whole. At
>> the same time however, overall performance may suffer the lower this
>> value is.
>
> Last I checked, by default, TBird opens 5 IMAP connections per session.
>   In my experience, four of the five go unused 99.999% of the time.  Only
> one is used for over 99.999% of IMAP traffic.  AIUI, the extra 4 are
> used primarily for things like IMAP IDLE notifications on non-selected
> folders.  A quick glance at the TIME+ column in top will demonstrate
> this.  Thus I've been configuring TBird clients to use only 1 cached
> connection for quite some time.
>
> This doesn't limit client bandwidth or minimize server load, but it cuts
> down on sleeping processes by a factor of 5, and saves a small amount of
> memory.  I say small because Linux only stores binary code once in
> memory no matter how many processes you have running.  Also, all the
> IMAP processes will be sharing buffer cache.  Thus, you don't really
> save significant memory by cutting the IMAP processes by 5x, but I find
> managing one IMAP process per user much more elegant than 5 processes
> per user.
>
>> BL, your bottleneck (high load) might be more related to the number of
>> processes the user has at once, more so than the i/o or bandwidth
>> demands they're causing. Reduce the # of processes, and the others go
>> down too.
>
> This is not correct.  The number of same-user IMAP processes has no bearing
> on the OP's issue.  See above.  Email, whether SMTP or IMAP, is disk
> seek bound, rarely, if ever, CPU or memory bound, unless a server
> machine is simply horribly architected or running insanely complex sieve
> rules etc.
>
> The OP's problem is that a single client PC attempts to download the
> entire 1100 public folder set and a single user mailbox, apparently many
> hundreds of MBs or GBs in size.  Over a GbE link this could be as much
> as 50 to 90 MB/s.  As the data being read from disk is not sequential,
> this generates a substantial random IO load on the server, exacerbated
> by the Maildir storage format, requiring one file read per message.
>
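
A rough back-of-the-envelope sketch of why this runs the disks out of IOPS. The 50 KB average message size and the 100 random reads per second per disk are assumptions for illustration (the latter is a common rule of thumb for a single 7,200 rpm drive), not figures from this thread. The function names are hypothetical.

```python
import math


def reads_per_second(throughput_mb_s, avg_message_kb):
    """Maildir needs one file read per message, so message reads per
    second is throughput divided by the average message size."""
    return throughput_mb_s * 1024 / avg_message_kb


def spindles_needed(reads_s, iops_per_disk=100):
    """Disks required if each sustains iops_per_disk random reads/s
    (assumed rule-of-thumb value)."""
    return math.ceil(reads_s / iops_per_disk)


# Assumed 50 KB average message size, at the 50 and 90 MB/s quoted above.
low = reads_per_second(50, 50)
high = reads_per_second(90, 50)
disks_low = spindles_needed(low)
disks_high = spindles_needed(high)
```

Under those assumptions a single client at 50 to 90 MB/s asks for roughly 1,000 to 1,800 file reads per second, which is the random-read capacity of something like 11 to 19 spindles. Metadata lookups and caching are ignored here, so treat the numbers as an order-of-magnitude guide only.
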
> The OP is simply running out of disk IOPS in this scenario.  The two
> sledgehammer methods for fixing this are rate limiting the client at the
> packet level, which in turn reduces the disk seek load, or substantially
> beefing up the disk subsystem.  The latter is cheaper (in $$), more
> direct, and more predictable.  As my address RHS would suggest, I'm all
> for beefing up hardware.  But in this case, it's not the optimal solution.
>

Thanks for the great explanation, Stan.

Just to clarify, did you mean to say that the former is cheaper in $$?

-- 
-Eric 'shubes'