[Dovecot] Expected size of index-files?
Hello...
I'm considering moving my index-files to a RAID1. How big do these files get? What should I plan for?
Thanks, Tobias
Tobias Balle-Petersen wrote:
Hello...
I'm considering moving my index-files to a RAID1. How big do these files get? What should I plan for?
Thanks, Tobias
Since 1 July 2007 I have stored all mail sent to this list in one folder (Maildir format).
The size of this (and the indexes) is:
31M   cur
80K   dovecot.index
6.8M  dovecot.index.cache
44K   dovecot.index.log
136K  dovecot.index.log.2
4.0K  dovecot-keywords
164K  dovecot-uidlist
4.0K  new
4.0K  tmp
so, on average maybe 20%-30% of the size of your messages ?
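The 20%-30% estimate can be checked against the listing above. The following is a minimal sketch, assuming the sizes are human-readable du output in which "M" means 1024 KiB; the dictionary and function names are made up for illustration.

```python
# Sizes copied from the du listing above, converted to KiB.
# Assumption: du's "M" suffix means 1024 KiB.
MESSAGES_KIB = 31 * 1024  # the "cur" directory

DOVECOT_FILES_KIB = {
    "dovecot.index": 80,
    "dovecot.index.cache": 6.8 * 1024,
    "dovecot.index.log": 44,
    "dovecot.index.log.2": 136,
    "dovecot-keywords": 4,
    "dovecot-uidlist": 164,
}


def overhead_ratio(files_kib, messages_kib):
    """Return the total size of the Dovecot files as a fraction of message size."""
    return sum(files_kib.values()) / messages_kib


ratio = overhead_ratio(DOVECOT_FILES_KIB, MESSAGES_KIB)
cache_share = DOVECOT_FILES_KIB["dovecot.index.cache"] / sum(DOVECOT_FILES_KIB.values())

print(f"Dovecot files vs. messages: {ratio:.1%}")
print(f"dovecot.index.cache share of the Dovecot files: {cache_share:.1%}")
```

For this one folder the result lands near the lower end of the quoted range, and almost all of it comes from dovecot.index.cache.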
That's actually quite a lot. Consider that if you run 100k+ mailboxes (1GB each) and move to Dovecot, this will take a huge chunk out of your storage. Is using Dovecot index files really that much faster than running ":INDEX=MEMORY"? Where could I find some benchmarks about this?
Cheers,
Jan
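To put a number on that concern, here is a rough worst-case sketch. It assumes every one of the 100k mailboxes is actually full at 1GB and that the 20%-30% ratio from a single mailing-list folder holds everywhere; both are assumptions for illustration, not measurements. The function name is hypothetical.

```python
def index_storage_tb(mailboxes, mailbox_gb, overhead_fraction):
    """Index space in TB (decimal, 1 TB = 1000 GB) for a given overhead fraction."""
    return mailboxes * mailbox_gb * overhead_fraction / 1000


# Figures from the thread: 100k mailboxes of 1GB each, 20%-30% overhead.
low = index_storage_tb(100_000, 1, 0.20)
high = index_storage_tb(100_000, 1, 0.30)

print(f"worst case: {low:.0f} to {high:.0f} TB of index files for 100 TB of mail")
```

The estimate scales linearly with the overhead fraction, so a lower measured ratio from a real server can be plugged in directly.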
-----Original message-----
From: dovecot-bounces+jan.vandenberg=isp.solcon.nl@dovecot.org [mailto:dovecot-bounces+jan.vandenberg=isp.solcon.nl@dovecot.org] On behalf of Luuk
Sent: Friday 8 February 2008 17:19
To: dovecot@dovecot.org
Subject: Re: [Dovecot] Expected size of index-files?
On Feb 8, 2008, at 6:25 PM, Jan van den Berg wrote:
That's actually quite a lot. Consider that if you run 100k+ mailboxes (1GB each) and move to Dovecot, this will take a huge chunk out of your storage. Is using Dovecot index files really that much faster than running ":INDEX=MEMORY"? Where could I find some benchmarks about this?
IMAP benchmarks are difficult to produce because they depend so much
on which clients are used (and how many clients a user is using, and how
the user is using them). For webmail type setups indexes help a
lot. For Outlook/Thunderbird they help a lot less.
so, on average maybe 20%-30% of the size of your messages ?
It's actually more to do with the number of messages than the size of
the messages. The dovecot.index.cache file is the largest one and its size
depends completely on which IMAP client is used. It contains only the
information clients are interested in.
Timo Sirainen wrote:
On Feb 8, 2008, at 6:25 PM, Jan van den Berg wrote:
That's actually quite a lot. Consider that if you run 100k+ mailboxes (1GB each) and move to Dovecot, this will take a huge chunk out of your storage. Is using Dovecot index files really that much faster than running ":INDEX=MEMORY"? Where could I find some benchmarks about this?
IMAP benchmarks are difficult to produce because they depend so much on which clients are used (and how many clients a user is using, and how the user is using them). For webmail type setups indexes help a lot. For Outlook/Thunderbird they help a lot less.
so, on average maybe 20%-30% of the size of your messages ?
It's actually more to do with the number of messages than the size of the messages. The dovecot.index.cache file is the largest one and its size depends completely on which IMAP client is used. It contains only the information clients are interested in.
I could not stop myself from verifying.. ;-)
I moved my original indexes to 'old'.
I started Thunderbird, had Dovecot recreate the indexes, moved them to 'old2', and stopped Thunderbird.
I started Outlook Express, had Dovecot recreate the indexes, moved them to 'old3', and stopped Outlook Express.
The 'du -s *' now looks like this:
31M   cur
4.0K  new
7.2M  old
3.7M  old2
2.2M  old3
4.0K  tmp
There is a big difference between the 'thunderbird-indexes' and the 'outlookexpress-indexes', and an even bigger one if you look at my 'old' indexes....
-- Luuk
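Expressed as a share of the 31M of messages, the three index sets compare as follows. This is a small sketch using only the du figures above; the labels and the helper name are illustrative.

```python
MESSAGES_MIB = 31.0  # the "cur" directory from the du output above

# Index sets from the experiment, sizes in MiB as reported by du.
INDEX_SETS_MIB = {
    "old (original indexes)": 7.2,
    "old2 (rebuilt with Thunderbird)": 3.7,
    "old3 (rebuilt with Outlook Express)": 2.2,
}


def percent_of_messages(index_mib, messages_mib=MESSAGES_MIB):
    """Return index size as a percentage of message size."""
    return 100.0 * index_mib / messages_mib


for label, size in INDEX_SETS_MIB.items():
    print(f"{label}: {size} MiB = {percent_of_messages(size):.1f}% of cur")
```

The same mailbox therefore carries roughly a threefold difference in index overhead depending on which client last populated the cache.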
On Fri, 2008-02-08 at 17:40 +0100, Luuk wrote:
The 'du -s *' now looks like this:
31M   cur
4.0K  new
7.2M  old
3.7M  old2
2.2M  old3
4.0K  tmp
There is a big difference between the 'thunderbird-indexes' and the 'outlookexpress-indexes', and an even bigger one if you look at my 'old' indexes....
Dovecot v1.0 is a bit lazy in dropping unused fields. For example if you used a client once it would cache those new fields that the client was interested in. But it would keep them cached even if you never used that client again. v1.1 drops these fields after a month of them not being used.
Also, the cached fields are marked as "permanent" or "temporary". With temporary fields the cache data is dropped for messages older than a week (the actual rule is a bit more complex). So for OE/TB type clients, which fetch a field only once, the data should get dropped from the cache instead of wasting space. There could be some bugs here that cause fields to be permanent or not to be dropped often enough. v1.1 has fixed some of these, I think.
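The two rules described above can be pictured with a toy model. This is only an illustration of the description in this message, not Dovecot's actual code or data structures: the class, the function, and the exact thresholds of 30 and 7 days are assumptions, and the real rule is stated to be more complex.

```python
from dataclasses import dataclass

# Assumed thresholds, taken loosely from the wording "a month" and "a week".
FIELD_UNUSED_LIMIT_DAYS = 30
TEMPORARY_MESSAGE_LIMIT_DAYS = 7


@dataclass
class CachedField:
    """Hypothetical stand-in for one cached field (e.g. a header name)."""
    name: str
    permanent: bool
    days_since_last_use: int


def keep_in_cache(field, message_age_days):
    """Toy decision: should this field stay cached for a message of this age?"""
    if field.days_since_last_use > FIELD_UNUSED_LIMIT_DAYS:
        return False  # no client has asked for this field in over a month
    if not field.permanent and message_age_days > TEMPORARY_MESSAGE_LIMIT_DAYS:
        return False  # temporary field on a message older than a week
    return True


webmail_field = CachedField("subject", permanent=True, days_since_last_use=0)
one_shot_field = CachedField("body-structure", permanent=False, days_since_last_use=2)

print(keep_in_cache(webmail_field, message_age_days=200))
print(keep_in_cache(one_shot_field, message_age_days=200))
```

In this model a field that a webmail client requests every day stays cached for old messages, while a field fetched once by a desktop client ages out.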
(Wow you are fast!)
"For webmail type setups indexes help a lot. For Outlook/Thunderbird they help a lot less."
Very interesting!
I'm scared to use (index) files that go sort of unnoticed (it's not calculated in the maildirsize file) and can potentially grow with no limit.
But I am also curious what ":INDEX=MEMORY" will do with 100k users. How many MB of RAM will one IMAP session take on average? And will this be removed from memory (or stay cached) when an IMAP session is closed?
Cheers,
Jan
-----Original message-----
From: Timo Sirainen [mailto:tss@iki.fi]
Sent: Friday 8 February 2008 17:33
To: Jan van den Berg
CC: Dovecot Mailing List
Subject: Re: [Dovecot] Expected size of index-files?
"For webmail type setups indexes help a lot. For Outlook/Thunderbird they help a lot less."
Very interesting!
I'm scared to use (index) files that go sort of unnoticed (it's not calculated in the maildirsize file) and can potentially grow with no limit.
Disk space is a lot cheaper than memory. I would not worry about index disk size. In the pre-1.0 era we used memory indexes because of regular corruption in the indexes, but I'm very glad we can now use permanent indexes. We bind customers to a specific IMAP server, and on each server (out of 20) I see about 10GB in indexes.
Surely not all of your customers use IMAP. And if you really want to, you could opt to delete old files. Dovecot should be able to handle that gracefully.
Cor
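A per-server figure like the one above can be measured directly rather than estimated. The sketch below walks a directory tree and sums the files whose names start with "dovecot.index" (the names seen earlier in this thread) separately from everything else. The function name is made up, and the path passed to it is whatever root holds the mailboxes or indexes on a given system.

```python
import os


def index_usage(root):
    """Return (index_bytes, other_bytes) for all regular files under root.

    index_bytes counts files named dovecot.index* (index, cache, log);
    other_bytes counts everything else, including messages and files such
    as dovecot-uidlist.
    """
    index_bytes = 0
    other_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.lstat(path).st_size
            except OSError:
                continue  # file disappeared while walking, e.g. mail was moved
            if name.startswith("dovecot.index"):
                index_bytes += size
            else:
                other_bytes += size
    return index_bytes, other_bytes
```

Calling `index_usage()` on one user's Maildir gives the ratio for that user; calling it on the top-level mail directory gives the server-wide total. It reports apparent file sizes, so the numbers can differ slightly from du, which counts allocated blocks.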
On Fri, 2008-02-08 at 17:45 +0100, Jan van den Berg wrote:
I'm scared to use (index) files that go sort of unnoticed (it's not calculated in the maildirsize file) and can potentially grow with no limit.
Only some truly badly behaving clients can cause them to grow infinitely. This would require the client to fetch/search more and more unique message headers. For example field1, field2, field3, .. field1000. I don't think this is worth worrying about, although I guess I should add code to prevent this. Added to TODO.
But I am also curious what ":INDEX=MEMORY" will do with 100k users. How many MB of RAM will one IMAP session take on average?
The problem isn't how much memory it'll take. It's that Dovecot may use a lot more disk I/O because it has to read and parse messages to find some data instead of doing a couple of small reads from cache file to get the same data.
The memory usage is actually less without indexes, because then no fields are ever cached and the space used by dovecot.index.cache isn't mapped to memory.
And will this be removed from memory (or stay cached) when an IMAP session is closed?
The operating system's buffer cache may still contain the contents of all the message files that were read during the session (reads which might have been avoided with indexes on disk), but that's all.
Thanks for the replies; this cleared up a lot.
Right, I think I will be using Dovecot for IMAP (fewer clients) with index files and keep on using Courier for POP3 (most clients). According to my tests this works OK on the same mailbox.
Cheers,
Jan
-----Original message-----
From: Timo Sirainen [mailto:tss@iki.fi]
Sent: Friday 8 February 2008 17:58
To: Jan van den Berg
CC: Dovecot Mailing List
Subject: Re: [Dovecot] Expected size of index-files?
Tobias Balle-Petersen wrote:
Hello...
I'm considering moving my index-files to a RAID1. How big do these files get? What should I plan for?
Thanks, Tobias
Since 1 July 2007 I have stored all mail sent to this list in one folder (Maildir format).
The size of this (and the indexes) is:
31M   cur
80K   dovecot.index
6.8M  dovecot.index.cache
44K   dovecot.index.log
136K  dovecot.index.log.2
4.0K  dovecot-keywords
164K  dovecot-uidlist
4.0K  new
4.0K  tmp
so, on average maybe 20%-30% of the size of your messages ?
On Fri, 8 Feb 2008, Jan van den Berg wrote:
That's actually quite a lot. Consider that if you run 100k+ mailboxes (1GB each) and move to Dovecot, this will take a huge chunk out of your storage. Is using Dovecot index files really that much faster than running ":INDEX=MEMORY"? Where could I find some benchmarks about this?
Cheers,
Jan
I think a lot would depend on your average message size. Here's another data point. At my old company's server:
43,438 messages
2,350,260,769 bytes
per-message ~ 54K

~3,000 folders:
 486 dovecot-keywords
2675 dovecot-uidlist
2967 dovecot.index
2456 dovecot.index.cache
2967 dovecot.index.log
 183 dovecot.index.log.2
 251 maildirfolder
  26 subscriptions
total non-message file sizes: 93,010,227 bytes / 2,350,260,769 bytes =~ 3.9 percent
And, taking into account what Timo wrote about it being client-dependent, most of the users there are using Outlook.
Best, Ben
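Since index size was said earlier in the thread to track the number of messages more than their size, the same figures can also be expressed per message. A short sketch using only the numbers in this message (variable names are illustrative):

```python
# Figures quoted above.
MESSAGES = 43_438
MESSAGE_BYTES = 2_350_260_769
NON_MESSAGE_BYTES = 93_010_227  # all non-message files, not only indexes

avg_message = MESSAGE_BYTES / MESSAGES
overhead_pct = 100 * NON_MESSAGE_BYTES / MESSAGE_BYTES
overhead_per_message = NON_MESSAGE_BYTES / MESSAGES

print(f"average message size: {avg_message:,.0f} bytes")
print(f"non-message overhead: {overhead_pct:.2f}% of message bytes")
print(f"non-message overhead per message: {overhead_per_message:,.0f} bytes")
```

The percentage comes out just under 4 percent, and the per-message figure is a little over 2 KB, which is easier to compare across servers with different average message sizes.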
participants (6)
- Benjamin R. Haskell
- Cor Bosman
- Jan van den Berg
- Luuk
- Timo Sirainen
- Tobias Balle-Petersen