[Dovecot] mdbox compression
I was wondering if I should add compression support to mdbox one mail at a time or one file (~2MB) at a time. The tradeoffs are:
- one mail at a time allows quickly seeking to wanted mail inside the file, but it can't compress mails as well
- one file at a time compresses better, but seeking is slow because it can only be done by uncompressing all the data until the wanted offset is reached
I did a quick test for this with 27 MB of my old INBOX mails:
(note the -b option, so it doesn't count wasted fs space)
mdbox/storage% du -sb .
15120350 .
Maildir/cur% du -sb .
16517320 .
% echo 1-15120350/16517320|bc -l
.08457606924125705623
So, compressed mdboxes take 8.5% less space. This was with regular gzip compression with default level. With bzip2 -9 compression the difference was 10%.
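For anyone who wants to reproduce a similar comparison on their own mail, the two approaches can be roughly approximated on any Maildir with something like this (the path is just an example):

# per-mail: compress each message separately and sum the compressed sizes
% find ~/Maildir/cur -type f -exec sh -c 'gzip -c "$1" | wc -c' _ {} \; | awk '{t += $1} END {print t}'

# per-file: concatenate everything and compress it as one stream
% cat ~/Maildir/cur/* | gzip -c | wc -c

The per-mail total should come out somewhat larger, since gzip can't exploit redundancy (repeated headers etc.) across message boundaries.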
Any thoughts on whether an 8-10% improvement is significant enough to justify the worse seeking performance? Or perhaps I should just implement both ways.. :)
On Fri, Feb 5, 2010 at 4:36 PM, Timo Sirainen <tss@iki.fi> wrote:
I was wondering if I should add compression support to mdbox one mail at a time or one file (~2MB) at a time. The tradeoffs are:
- one mail at a time allows quickly seeking to wanted mail inside the file, but it can't compress mails as well
- one file at a time compresses better, but seeking is slow because it can only be done by uncompressing all the data until the wanted offset is reached
I did a quick test for this with 27 MB of my old INBOX mails:
(note the -b option, so it doesn't count wasted fs space)
mdbox/storage% du -sb .
15120350 .
Maildir/cur% du -sb .
16517320 .
% echo 1-15120350/16517320|bc -l
.08457606924125705623
So, compressed mdboxes take 8.5% less space. This was with regular gzip compression with default level. With bzip2 -9 compression the difference was 10%.
Any thoughts on whether an 8-10% improvement is significant enough to justify the worse seeking performance? Or perhaps I should just implement both ways.. :)
Isn't the real difference even smaller?
15120350/28311552 = .534
16517320/28311552 = .583
So that's just under 5 percentage points (.583 - .534 = .049).
Either way, I'd say go with compressing each mail individually for quick seeking.
Also, if you were compressing the whole file of mails as a single stream, wouldn't you have to recompress and rewrite the whole file for each new mail delivered?
Matt
On 6.2.2010, at 3.03, Matt Reimer wrote:
Isn't the real difference even smaller?
15120350/28311552 = .534
16517320/28311552 = .583
So that's just under 5%.
Well, sure, if you're comparing it to uncompressed data. :) But I think it made more sense to compare the two compression possibilities.
Either way, I'd say go with compressing each mail individually for quick seeking.
Maybe.. but if the I/O time is dominated by disk seeks, it probably wouldn't make much of a difference whether it reads 2 MB or a few kB from the file. Then there's also the extra latency and CPU usage from uncompression, but perhaps that wouldn't be all that much either. And it would be even lower if the file sizes were set smaller, like 200 kB.
But then of course with SSDs the I/O isn't dominated by seeks, so maybe this makes less sense there..
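The decompression latency is easy to ballpark, too. Timing something like this on a 2 MB mdbox-sized file (file name made up):

% time gzip -dc m.123.gz > /dev/null

should land in the tens of milliseconds on current hardware, i.e. the same ballpark as a disk seek or two, which supports the guess that it wouldn't add much on top of seek-bound I/O.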
Also, if you were compressing the whole file of mails as a single stream, wouldn't you have to recompress and rewrite the whole file for each new mail delivered?
I was thinking that the compression would be delayed so that it would be done only after mdbox already decided that it wouldn't write any more data to it. But it's actually possible to append more data to .gz files (the compression wouldn't be any better then though).
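For example (file names made up), gzip members can simply be concatenated and they decompress back to back:

% gzip -c msg1 > m.1.gz
% gzip -c msg2 >> m.1.gz
% gzip -dc m.1.gz   # outputs msg1 followed by msg2

Each appended member is compressed independently, which is exactly why the ratio doesn't improve.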
On 6.2.2010, at 3.23, Timo Sirainen wrote:
I was thinking that the compression would be delayed so that it would be done only after mdbox already decided that it wouldn't write any more data to it.
Oh, and this is actually why I was thinking that maybe it could be a good idea. If it's only done for older mails, they aren't accessed that often. So maybe a hybrid solution would be a good idea for mdbox users with alt storage:
- primary storage: SSD disks, mdbox file size = 100k, compress each mail separately
- alt storage: spinning disks, mdbox file size = 2 MB, compress the entire file
Mails would be moved to alt storage after n days, perhaps dynamically depending on available SSD disk space.
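The per-mail vs. per-file choice itself isn't a config knob yet, but the surrounding pieces exist; a sketch of what such a setup could look like with the zlib plugin (directive names from Dovecot 2.0; paths and the 30-day cutoff are made up):

mail_location = mdbox:~/mdbox:ALT=/spinning/%u/mdbox
mdbox_rotate_size = 2M
mail_plugins = zlib
plugin {
  zlib_save = gz
  zlib_save_level = 6
}

plus a nightly cron job along the lines of doveadm altmove -A mailbox '*' savedbefore 30d to push old mails to the alt storage.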
SSDs can read data pretty fast though, so it would be nice to look at some benchmarks that read tons of emails concurrently compressed vs. uncompressed. Is the bottleneck CPU or I/O? Hmm. A quick test with my Intel SSD shows that it can read 243 MB/s from a single large file, while zlib input is only 100 MB/s with Macbook's one CPU core. Faster CPUs and more cores would make zlib faster though.
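Roughly the same comparison can be done anywhere with two one-liners (file names made up; Linux dd wants bs=1M instead of bs=1m):

% dd if=big.file of=/dev/null bs=1m          # raw sequential read speed
% time gzip -dc big.file.gz > /dev/null      # single-core zlib decompression

dd reports the throughput directly; for the gzip case, divide the compressed size by the user CPU time to get the zlib input rate. Just beware of the page cache inflating the dd number on repeat runs.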
On Fri, Feb 5, 2010 at 5:46 PM, Timo Sirainen <tss@iki.fi> wrote:
On 6.2.2010, at 3.23, Timo Sirainen wrote:
I was thinking that the compression would be delayed so that it would be done only after mdbox already decided that it wouldn't write any more data to it.
Oh, and this is actually why I was thinking that maybe it could be a good idea. If it's only done for older mails, they aren't accessed that often. So maybe a hybrid solution would be a good idea for mdbox users with alt storage:
- primary storage: SSD disks, mdbox file size = 100k, compress each mail separately
- alt storage: spinning disks, mdbox file size = 2 MB, compress the entire file
Mails would be moved to alt storage after n days, perhaps dynamically depending on available SSD disk space.
SSDs can read data pretty fast though, so it would be nice to look at some benchmarks that read tons of emails concurrently compressed vs. uncompressed. Is the bottleneck CPU or I/O? Hmm. A quick test with my Intel SSD shows that it can read 243 MB/s from a single large file, while zlib input is only 100 MB/s with Macbook's one CPU core. Faster CPUs and more cores would make zlib faster though.
Nice!
Matt
It's really a question of where the file is stored. For me, using NFS, compression gave me several times the performance of not using compression, for HTML files.

I would imagine the same benefits with email.

I would say something very close to the same with mdbox: if it's going over the network to a remote server, compression should clearly speed things up.

Now if you're storing it on the local machine, then yes, compression will do nothing but slow things down, but it can give you increased storage space.
Quoting Timo Sirainen <tss@iki.fi>:
On 6.2.2010, at 3.23, Timo Sirainen wrote:
I was thinking that the compression would be delayed so that it would be done only after mdbox already decided that it wouldn't write any more data to it.
Oh, and this is actually why I was thinking that maybe it could be a good idea. If it's only done for older mails, they aren't accessed that often. So maybe a hybrid solution would be a good idea for mdbox users with alt storage:
- primary storage: SSD disks, mdbox file size = 100k, compress each mail separately
- alt storage: spinning disks, mdbox file size = 2 MB, compress the entire file
Mails would be moved to alt storage after n days, perhaps dynamically depending on available SSD disk space.
SSDs can read data pretty fast though, so it would be nice to look at some benchmarks that read tons of emails concurrently compressed vs. uncompressed. Is the bottleneck CPU or I/O? Hmm. A quick test with my Intel SSD shows that it can read 243 MB/s from a single large file, while zlib input is only 100 MB/s with Macbook's one CPU core. Faster CPUs and more cores would make zlib faster though.
Timo Sirainen put forth on 2/5/2010 6:36 PM:
So, compressed mdboxes take 8.5% less space. This was with regular gzip compression with default level. With bzip2 -9 compression the difference was 10%.
Any thoughts on whether an 8-10% improvement is significant enough to justify the worse seeking performance? Or perhaps I should just implement both ways.. :)
Given the cost of mechanical storage today (1TB for less than $100 USD) I can't see why anyone would want to implement compression. The cases I can think of would be folks using strictly SSD (if there are any), those doing backups, or very large sites. Then again, I'm thinking most such backup solutions implement their own compression anyway, so it makes no difference in that case except possibly LAN/SAN bandwidth in moving compressed vs uncompressed data.
I would think only really large sites would consider compression. 10% space savings for 1 million mailboxen might add up to some significant storage hardware dollar savings, not to mention the power savings. This is just a guess as I've never worked in such an environment. If a projected infrastructure build-out is calling for a $1 million back end clustered shared storage array for mailboxen (think NetApp, IBM, SGI), and this compression cuts your number of required spindles by 10%, that's potentially a $100,000 savings. In today's economy, folks would be seriously looking at keeping that $100,000 in their pocketbook.
Very large sites would probably want maximum compression while retaining maximum performance. You didn't state the CPU burn difference between the two methods, or the total CPU burn for either method. If one burns 50% CPU and the other 60%, on a loaded system, say 500 concurrent users, the relative difference is minor, but both are so horrible WRT CPU that no one would use them. If the relative load is 10% for the first method and 12% for the other, then I'd say some people would gladly adopt the 2nd, slightly less efficient method.
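For a quick ballpark of that burn, time(1) on a representative chunk of mail does the job (input file made up):

% time gzip -c mails.concat > /dev/null
% time bzip2 -9 -c mails.concat > /dev/null

The user time is the CPU cost each method pays for the same input; bzip2 -9 typically burns several times the CPU of default gzip for its extra couple percent of compression.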
-- Stan