Greetings,
I noticed a while back someone posted a patch/plugin that allowed Dovecot to
use compressed mbox files. I'm now wondering how far that would put us from having compressed maildir? I have a server with more CPU than disk space, and while I can buy more HDD space, my backup solution doesn't make that practical.
It seems to me that when looking for a message file, if it ends in .gz unpack
it, and otherwise everything acts as normal. Worst case, this is one strcat() and a stat() slower to find.
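A minimal sketch of that lookup in shell, just to make the idea concrete. The name read_message is made up for illustration and is not part of Dovecot; it assumes compressed messages simply gain a .gz suffix.

```shell
# Sketch only: read_message is a hypothetical helper, not Dovecot code.
# Print a message to stdout, falling back to a gzip-compressed copy.
read_message() {
    path=$1
    if [ -f "$path" ]; then
        # Common case: the message is stored uncompressed.
        cat "$path"
    elif [ -f "$path.gz" ]; then
        # Fallback: one extra name check, then decompress on the fly.
        gzip -dc "$path.gz"
    else
        echo "read_message: $path: not found" >&2
        return 1
    fi
}
```

The extra cost in the uncompressed case is nothing; in the compressed case it is one more existence check plus the decompression itself.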
Newly delivered messages could remain unpacked, and a cron job could come by whenever to compress old/large/un-looked-at-for-months messages. So, new and frequently referenced messages would be as fast as ever, and older messages would be slower.
I would love to dive in and do this myself, but 1) my time is very, very limited,
working two jobs, and 2) I'm not running 1.0 yet, as it apparently still doesn't support my Thunderbird users tagging their messages (am I wrong? please tell me I'm wrong... I want to upgrade! :)
-- Curtis Maloney cmaloney@cardgate.net
Curtis Maloney wrote:
in this case it'd be better to use some kind of compressed filesystem: http://lists.q-linux.com/pipermail/plug/2002-September/020687.html
-- Levente "Si vis pacem para bellum!"
Farkas Levente wrote:
in this case it'd be better to use some kind of compressed filesystem: http://lists.q-linux.com/pipermail/plug/2002-September/020687.html
Which might be of some use, if I were running Linux. I'm not.
Next?
-- Curtis
On Wednesday 15 June 2005 09:24, Curtis Maloney wrote:
What are you using? If it's FreeBSD, you could use the md device.
http://www.freebsd.org/cgi/man.cgi?mdconfig
You can use this to create a vnode (file) backed memory disc, put your maildirs and whatever else you like in it, and it will be compressed.
HTH,
Dominic GoodforBusiness.co.uk I.T. Services for SMEs in the UK.
On Wed, Jun 15, 2005 at 04:54:21PM +1000, Curtis Maloney wrote:
Been here, done that :) You can try the attached patch, it's a little ugly, but works on the production server for more than 2 months.
-- Andrey Panin | Linux and UNIX system administrator pazke@donpac.ru | PGP key: wwwkeys.pgp.net
Andrey Panin wrote:
Thanks, Andrey.
You're the only person who hasn't taken this as an opportunity to tell me why I'm wrong or give me a quick lesson in admining. I wanted to float the idea, not have people try to solve my space issues.
Perhaps I should have phrased it as the space constraints inspired the idea... oh well.
Now I just need either to back-port this to 0.99.14, or for Timo to add keywords to 1.0. (keywords is responsible for .customflags, isn't it? nobody wants to confirm this for me ):
-- Curtis Maloney cmaloney@cardgate.net
hi,
i think you waste more space with not fully used clusters and stuff. e.g. i have:
$ du -sh Maildir ; du -sh --apparent-size Maildir
134M Maildir
123M Maildir
this is not even my full mail archive. with gzipped maildir files this should get even worse as the files are even smaller and still block the cluster.
darix
Marcus Rueckert wrote:
I tried this... Solaris du doesn't appear to have --apparent-size..
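For systems whose du lacks --apparent-size, a rough equivalent can be pieced together from find, ls and awk. This is only a sketch: apparent_kb and allocated_kb are made-up helper names, and the size is taken from the fifth column of ls -l, which assumes owner and group names contain no spaces.

```shell
# Sketch only: apparent_kb and allocated_kb are hypothetical helper names.
# Sum of file sizes in KB, from the size column of ls -l.
apparent_kb() {
    find "$1" -type f -exec ls -l {} + |
        awk '{ bytes += $5 } END { printf "%d\n", bytes / 1024 }'
}

# Space actually allocated on disk in KB, as reported by du.
allocated_kb() {
    du -sk "$1" | awk '{ print $1 }'
}
```

Running both against the same Maildir and comparing the numbers gives the slack. Note that du also counts the directories themselves, so the difference is slightly larger than the per-message tail waste alone.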
I see your point, but my users have an unfortunate tendency to keep e-mails with attachments (no matter how big a stick I wield), so there would be a definite saving there.
Since several of my users have over 1GB of mail ('top' user is now around 1.6GB), I think there is a potential to save considerable space.
Besides... when was the last time you saw an e-mail below 512 bytes? There is _some_ saving to be made, if not a lot. But a little saving over a few million e-mails? Well, it all adds up.
-- Curtis Maloney cmaloney@cardgate.net
Hi Curtis,
I tried this... Solaris du doesn't appear to have --apparent-size..
FreeBSD also :-(
Well, the "compressed" maildir system can add a check to see if the file is larger than the FS block size... then try to compress it..
Anyway this can be a very good system for archiving purposes...
And IMHO I think it should be a good idea to add/merge the patch into main stream code (with a flag to allow compressed maildir)...
/Xavier
Xavier Beaudouin wrote:
Well, the "compressed" maildir system can add a check to see if the file is larger than the FS block size... then try to compress it..
In my original post I suggested a cron job compress the files, so it's easy to vary which files get compressed, and when. Something like (untested, and pre-coffee):
find $MAILPATH ! -name "*.gz" -type f -atime +90 -size +1 -exec /usr/bin/gzip -9 {} ";"
And IMHO I think it should be a good idea to add/merge the patch into main stream code (with a flag to allow compressed maildir)...
I certainly wouldn't object, and it sounds like Andrey and others like the idea, too.
-- Curtis Maloney cmaloney@cardgate.net
On 17.6.2005, at 02:50, Curtis Maloney wrote:
The filename shouldn't change or Dovecot treats it as a new maildir file. The safest way would be something like:
touch dovecot-uidlist.lock
gzip -9 -c "cur/$name" > "tmp/$name"
mv "tmp/$name" "cur/$name"
rm dovecot-uidlist.lock
The uidlist lock is needed so that files aren't renamed while compressing. Otherwise you could end up with two files with the same basename, one compressed and one uncompressed. That won't work very nicely.
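The steps above could be wrapped up roughly like this. It is a sketch under assumptions, not a tested tool: compress_in_place is a made-up name, cur/ and tmp/ are assumed to be on the same filesystem, the lock is taken with a plain touch exactly as described (so a lock already held by someone else is not detected), and there is no check for files that are already compressed.

```shell
# Sketch only: compress_in_place is a hypothetical helper. It follows the
# touch/rm locking described above and does not handle a lock that is
# already held, nor files that are already gzip-compressed.
compress_in_place() {
    maildir=$1
    name=$2
    lock=$maildir/dovecot-uidlist.lock

    touch "$lock" || return 1
    # Write the compressed copy under tmp/, then rename it over the
    # original so the file in cur/ keeps its name.
    if gzip -9 -c "$maildir/cur/$name" > "$maildir/tmp/$name"; then
        mv "$maildir/tmp/$name" "$maildir/cur/$name"
        status=$?
    else
        rm -f "$maildir/tmp/$name"
        status=1
    fi
    rm -f "$lock"
    return "$status"
}
```

It would be called as compress_in_place /path/to/Maildir followed by the message's filename in cur/. The -c flag matters: without it gzip replaces the file with a .gz copy instead of writing to stdout.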
It's getting a bit difficult to decide what are plugins and what are not :)
Plugins wouldn't necessarily have to be even dynamically loadable. It could be possible to just have some plugins/builtin/ directory and Dovecot would compile everything in it inside the binary.
Curtis Maloney wrote:
wouldn't it be better if you compressed only the backups? In fact most backup solutions already do that automatically, so if you use such a compressed maildir plugin, you will gain hdd space but you won't gain any backup space, since compressing twice usually doesn't improve the compression ratio.
Matthieu
if space is an issue then mbox is a better choice than maildir... a lot less space wasted on directory/inode metadata and less space wasted on the tails of every message. if you've got a 4096 byte block size filesystem without fragments then you're wasting 2048 bytes per message on average.
depending on the avg message size in your maildirs you might not even save much at all due to the tail problem.
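To put numbers on the tail problem: the space lost per file is whatever is left of its last block. A small sketch of the arithmetic (tail_waste is a made-up helper name; the 2048 byte figure is the average when message sizes are spread evenly across the block size):

```shell
# Sketch only: tail_waste is a hypothetical helper name.
# Bytes left unused in the last block of a file of the given size.
# The block size defaults to 4096 if not given.
tail_waste() {
    size=$1
    block=${2:-4096}
    echo $(( (block - size % block) % block ))
}
```

For example, tail_waste 5000 4096 prints 3192, and a file that is an exact multiple of the block size wastes nothing. Compression only saves disk space when it drops a message into fewer blocks, not merely fewer bytes.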
-dean
On Wed, 15 Jun 2005, Curtis Maloney wrote:
participants (10)
- Andrey Panin
- Curtis Maloney
- dean gaudet
- Dominic Marks
- Farkas Levente
- Marcus Rueckert
- matthieu imbert
- Timo Sirainen
- Tomi Hakala
- Xavier Beaudouin