[Dovecot] Courier migrating issues: indexes, maildirsize, update query
Hi,
(my first post to the list)
I'm in the process of testing Dovecot to see whether it meets our needs as a replacement for our current Courier setup, which serves well over 100,000 mailboxes (POP3 and IMAP; MySQL with NFS). So far Dovecot seems pretty straightforward; however, I ran into a couple of things that I'm curious about.
- What's the deal with dovecot.index/dovecot.index.log/dovecot.index.cache? I understand Dovecot uses these primarily for POP3, but this will no doubt cause a lot of overhead on our platform. Also, getting this to work with our NFS setup would be a pain (locking etc.). I've been following this discussion: http://www.dovecot.org/list/dovecot/2006-January/thread.html#10758; it ends with C. Malony suggesting that a disable-index option would be put in place, but no further action was taken. Basically I want to get rid of the index files (Courier doesn't use any either). Any suggestions? Will there be a disable-index option?
- One of the reasons we want to get rid of Courier is that the maildirsize is often incorrect. Whether this is because of a bad Courier implementation, NFS issues or something else, I haven't figured out yet. But I'd like to test this function in Dovecot, although there seems to be very little documentation. I read http://wiki.dovecot.org/Quota/Dict, but the info there is too sparse. Any pointers on where to go?
- In our MySQL setup we have a field with the latest poptime and pophost (IP). This info can come in handy when troubleshooting or filtering out inactive mailboxes. With our Courier setup this field gets updated by a rather elaborate script that checks the logs and runs update queries. I could get this to work for Dovecot too, but with Dovecot being more actively developed, I wonder whether this could be a feature: something like another update query run after the user_query?
Cheers,
Jan
On Wed, 2007-05-23 at 12:57 +0200, Jan van den Berg wrote:
> I'm in the process of testing Dovecot to see whether it meets our needs as a replacement for our current Courier setup, which serves well over 100,000 mailboxes (POP3 and IMAP; MySQL with NFS). So far Dovecot seems pretty straightforward; however, I ran into a couple of things that I'm curious about.
> - What's the deal with dovecot.index/dovecot.index.log/dovecot.index.cache? I understand Dovecot uses these primarily for POP3,
I guess you meant to say "for IMAP", which is correct.
> but this will no doubt cause a lot of overhead on our platform.
Yes, they bring a bit too much extra overhead for standard download+delete POP3 users. v1.1's indexes will work better with POP3 (and most likely also with NFS).
> Also, getting this to work with our NFS setup would be a pain (locking etc.). I've been following this discussion: http://www.dovecot.org/list/dovecot/2006-January/thread.html#10758; it ends with C. Malony suggesting that a disable-index option would be put in place, but no further action was taken. Basically I want to get rid of the index files (Courier doesn't use any either). Any suggestions? Will there be a disable-index option?
Have you read http://wiki.dovecot.org/NFS?
You can disable indexes anyway by appending :INDEX=MEMORY to the mail_location setting, but it's probably not such a good idea. POP3 requires that the messages' virtual sizes are known. Dovecot initially calculates a message's virtual size by reading its whole contents and then stores the size in the dovecot.index.cache file. If you've disabled indexes, all the messages' contents are read at the beginning of each POP3 session.
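To illustrate why this is expensive, here is a minimal Python sketch. It is not Dovecot's code, and `virtual_size` is a made-up helper name. It assumes the virtual size is the message length with every line ending counted as CRLF, which means the whole message has to be scanned to find the bare LFs:

```python
def virtual_size(raw: bytes) -> int:
    """Length of the message if every line ending were CRLF (assumed definition)."""
    total_lf = raw.count(b"\n")
    crlf = raw.count(b"\r\n")
    # Every LF that is not already part of a CRLF pair grows by one byte.
    return len(raw) + (total_lf - crlf)


# Hypothetical message mixing bare LF and CRLF line endings.
message = b"Subject: hi\n\nbody line\r\nlast line\n"
print(len(message), virtual_size(message))
```

Caching this number somewhere (an index, a size list, or the filename) is what avoids re-reading every message at the start of each POP3 session.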
Courier solves this by keeping the message sizes in the courierpop3dsizelist file. I'm thinking about doing something similar for Dovecot v1.1 as well. An alternative would be to add the file's virtual size to the maildir filename itself (see ,W=<vsize> in http://wiki.dovecot.org/MailboxFormat/Maildir), but Dovecot doesn't add it internally (although it does use it if it exists), and I'm not sure how to configure other MDAs to do this.
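As a rough sketch of the filename approach, the following Python pulls size fields out of a maildir filename. It assumes the fields appear as `,S=<size>` and `,W=<vsize>` in the base name before the `:2,` flags, as described on the wiki page above. The filename and the helper name `sizes_from_filename` are invented for illustration:

```python
import re

# Matches ",S=<digits>" and ",W=<digits>" fields in the base name.
_FIELD = re.compile(r",([SW])=(\d+)")


def sizes_from_filename(name: str) -> dict:
    """Return the size fields found in a maildir filename, e.g. {'S': ..., 'W': ...}."""
    base = name.split(":", 1)[0]  # drop the ":2,<flags>" suffix
    return {key: int(value) for key, value in _FIELD.findall(base)}


# Hypothetical filename; real names depend on the delivering MDA.
example = "1179917835.M20046P2807.host,S=4096,W=4180:2,S"
print(sizes_from_filename(example))
```

With the sizes in the name, a POP3 server could list message sizes from a directory listing alone, without opening any message file.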
> - One of the reasons we want to get rid of Courier is that the maildirsize is often incorrect. Whether this is because of a bad Courier implementation, NFS issues or something else, I haven't figured out yet. But
Dovecot's maildirsize implementation works like Courier's, so it's probable that it's just as broken with NFS:
/* We rely on O_APPEND working in here. That isn't NFS-safe, but it
isn't necessarily that bad because the file is recreated once in
a while, and sooner if corruption causes calculations to go
over quota. This is also how Maildir++ spec specifies it should be
done.. */
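For context, a minimal Python sketch of how such a file is read, assuming the Maildir++ layout: a first line with the quota definition (for example `10485760S,1000C`), followed by appended lines of the form `<bytes> <count>` that may be negative. The sample contents and the name `read_maildirsize` are made up:

```python
def read_maildirsize(text: str):
    """Return (quota definition, total bytes, total message count)."""
    lines = text.splitlines()
    if not lines:
        return {}, 0, 0
    quota = {}
    for part in lines[0].split(","):
        part = part.strip()
        if part:
            quota[part[-1]] = int(part[:-1])  # "1000C" -> {"C": 1000}
    total_bytes = total_count = 0
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or truncated lines
        total_bytes += int(fields[0])
        total_count += int(fields[1])
    return quota, total_bytes, total_count


# Hypothetical file: two deliveries, then one message deleted.
sample = "10485760S,1000C\n5000 2\n1200 1\n-1200 -1\n"
print(read_maildirsize(sample))
```

Since the usage is just the sum of the appended lines, a lost or clobbered append over NFS leaves the total wrong until the file is recreated.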
> I'd like to test this function in Dovecot, although there seems to be very little documentation. I read http://wiki.dovecot.org/Quota/Dict, but the info there is too sparse. Any pointers on where to go?
Dict quota doesn't work very well with multiple simultaneous connections in v1.0 either. This has already been fixed for the upcoming v1.1, though.
So, I don't think Dovecot's POP3 implementation is going to help you in any way. Some people are even using a Dovecot IMAP + Courier POP3 combination for now.
> - In our MySQL setup we have a field with the latest poptime and pophost (IP). This info can come in handy when troubleshooting or filtering out inactive mailboxes. With our Courier setup this field gets updated by a rather elaborate script that checks the logs and runs update queries. I could get this to work for Dovecot too, but with Dovecot being more actively developed, I wonder whether this could be a feature: something like another update query run after the user_query?
http://wiki.dovecot.org/PostLoginScripting at least works. I've also thought about some kind of post_login_query, but I don't know if it's worth the trouble.
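A hedged sketch of the query-building half of such a post-login script, in Python. Everything specific here is an assumption: the `USER` and `IP` environment variable names, the `mailbox` table, the `username` column, and the `%s` placeholder style of common Python MySQL drivers. Check the wiki page for what your Dovecot version actually exports.

```python
def build_poptime_update(env, now):
    """Return (sql, params) recording the last login time and client IP.

    env is a mapping such as os.environ; the USER and IP keys are assumed names.
    """
    sql = "UPDATE mailbox SET poptime = %s, pophost = %s WHERE username = %s"
    return sql, (now, env.get("IP", ""), env["USER"])


# Example with a hand-made environment instead of os.environ.
sql, params = build_poptime_update(
    {"USER": "test-05", "IP": "192.0.2.10"}, now=1179917835
)
print(sql)
print(params)
```

A real script would pass `sql` and `params` to a database driver and then hand control over to the actual POP3/IMAP process so that the session continues.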
Hi Timo,
Thanks a lot for the quick response (you sure are active on the list). I wasn't aware of the PostLoginScripting option; I will certainly give it a try. So from what I understand, Dovecot isn't there yet either when it comes to quota handling (maildirsize). I'll give some more thought to what will work best for us.
About the indexes: this is thoroughly confusing. I don't understand why Dovecot IMAP wants to use index files for a Maildir++ implementation (this seems to defeat the point of a maildir). That's why I figured these files were used for POP3. But you've made it clear now that they are indeed used for IMAP.
Still, I can't get it to work (INDEX=MEMORY). This is my line:
mail_location = maildir:/var/spool/mail/%1u/%2u/%u:INDEX=MEMORY
I added INDEX=MEMORY later; all was working fine before. Now, however, no matter what I do with IMAP or POP3, the dovecot.index files are still generated. I kill the dovecot daemon, remove the index files and start the daemon, and either way (POP3 or IMAP) the indexes reappear after connecting to a mailbox. (Note: this isn't an NFS caching thing.) With a trace I can see these files being created:
9634 rename("/var/spool/mail/t/e/test-05/dovecot.index.log.newlock", "/var/spool/mail/r/o/roka-05/dovecot.index.log") = 0
9634 rename("/var/spool/mail/t/e/test-05/dovecot.index.tmp", "/var/spool/mail/r/o/roka-05/dovecot.index") = 0
9634 rename("/var/spool/mail/t/e/test-05/dovecot.index.cache.lock", "/var/spool/mail/r/o/roka-05/dovecot.index.cache") = 0
Cheers,
Jan
-----Original message-----
From: Timo Sirainen [mailto:tss@iki.fi]
Sent: Wednesday, 23 May 2007 13:39
To: Jan van den Berg
CC: dovecot@dovecot.org
Subject: Re: [Dovecot] Courier migrating issues: indexes, maildirsize,update query
On Wed, 2007-05-23 at 14:36 +0200, Jan van den Berg wrote:
> About the indexes: this is thoroughly confusing. I don't understand why Dovecot IMAP wants to use index files for a Maildir++ implementation (this seems to defeat the point of a maildir).
Hmm. I suppose I should write a wiki page about the index files.. Done: http://wiki.dovecot.org/IndexFiles
> Still, I can't get it to work (INDEX=MEMORY). This is my line:
> mail_location = maildir:/var/spool/mail/%1u/%2u/%u:INDEX=MEMORY
> I added INDEX=MEMORY later; all was working fine before. Now, however, no matter what I do with IMAP or POP3, the dovecot.index files are still generated.
What do you use as userdb? If you return "mail" from there it overrides mail_location.
You're right, I had this in my query: concat('maildir:', maildrop, username) as mail
I deleted it and now this seems to work for me: mail_location = maildir:/var/spool/mail/%1u/%1.1u/%u:INDEX=MEMORY
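A small Python sketch of how the `%u` modifiers in these two lines appear to expand. It assumes Dovecot's variable syntax as I understand it, where `%Nu` takes the first N characters and `%O.Wu` takes W characters starting at offset O. Only `%u` is handled, and `expand_user` is a hypothetical helper, not a Dovecot function:

```python
import re

# %u, %<N>u (first N chars) or %<offset>.<width>u (substring); assumed syntax.
_VAR = re.compile(r"%(?:(\d+)(?:\.(\d+))?)?u")


def expand_user(template: str, username: str) -> str:
    def repl(match):
        first, second = match.group(1), match.group(2)
        if first is None:
            return username                          # plain %u
        if second is None:
            return username[: int(first)]            # %Nu
        offset, width = int(first), int(second)
        return username[offset : offset + width]     # %O.Wu

    return _VAR.sub(repl, template)


for template in (
    "maildir:/var/spool/mail/%1u/%2u/%u:INDEX=MEMORY",
    "maildir:/var/spool/mail/%1u/%1.1u/%u:INDEX=MEMORY",
):
    print(expand_user(template, "test-05"))
```

Under these assumptions only the second form yields the `t/e/test-05` layout seen in the trace output earlier in the thread, while the first would give `t/te/test-05`.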
I must say I am _stunned_ by your reaction speed on the mailing list (thanks for the wiki page; it explains a lot).
"Each mailbox has its own separate index files. If the index files are disabled, the same structures are still kept in the memory." I suppose that this is how Courier also works, or would this imply that Dovecot will be a RAM hog?
Cheers,
Jan

-----Original message-----
From: Timo Sirainen [mailto:tss@iki.fi]
Sent: Wednesday, 23 May 2007 15:33
To: Jan van den Berg
CC: dovecot@dovecot.org
Subject: Re: [Dovecot] Courier migrating issues: indexes, maildirsize,update query
On Thu, 2007-05-24 at 09:58 +0200, Jan van den Berg wrote:
> You're right, I had this in my query: concat('maildir:', maildrop, username) as mail
> I deleted it and now this seems to work for me: mail_location = maildir:/var/spool/mail/%1u/%1.1u/%u:INDEX=MEMORY
> I must say I am _stunned_ by your reaction speed on the mailing list (thanks for the wiki page; it explains a lot).
> "Each mailbox has its own separate index files. If the index files are disabled, the same structures are still kept in the memory." I suppose that this is how Courier also works, or would this imply that Dovecot will be a RAM hog?
That's pretty much how Courier works, although I don't know exactly what data it keeps in memory. As for memory usage, I did a small benchmark with a maildir of 5587 mails:
1: (logged in)
2: SELECT INBOX
3: FETCH 1:* ENVELOPE
4: SORT (DATE) US-ASCII ALL
5: SEARCH SUBJECT HELLO

Dovecot v1.0.0, INDEX=MEMORY:
    %CPU    VSZ   RSS  TIME
1:   0.0   6472   696  0:00
2:   0.0   7412  1712  0:00
3:   2.0   7412  1784  0:00
4:   1.8   7604  1932  0:00
5:   1.7   7604  1932  0:00

Courier 4.1.1.20060828-6 (Debian):
    %CPU    VSZ   RSS  TIME
1:   0.0  18248  1208  0:00
2:   0.0  24132  2592  0:00
3:   1.4  24136  2636  0:00
4:   2.5  24664  3276  0:01
5:   3.3  24140  2776  0:02
Dovecot doesn't seem to give up the 200kB of memory it used for SORT, but that probably doesn't matter (it's not a memory leak anyway).
Jan van den Berg wrote:
> mail_location = maildir:/var/spool/mail/%1u/%2u/%u:INDEX=MEMORY
You might be interested in using scratch disk space on the local mail server to store the indexes. I just did a bit of math for you (well OK, a 'bc' script did the math): with 90.5 gigs of email in Maildir folders via NFS, we have 372 megs of indexes on the local disks for performance gains.
mail_location = maildir:~/Maildir:INDEX=/var/spool/dovecot/indexes/%1u/%u
I'm not sure how much space your 100K+ mailboxes take, but maybe you can use the above numbers to do some math and see whether you can do indexes like that.
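That math could look like the following Python sketch. It assumes 1 gig = 1024 megs and that the index-to-mail ratio carries over to another site, which it may well not. The function names are made up, and the 1000 GB figure is only a placeholder input, not a number from this thread:

```python
MAIL_GB = 90.5    # mail stored in Maildir folders over NFS (from the post above)
INDEX_MB = 372.0  # index files on local disk (from the post above)


def index_ratio() -> float:
    """Fraction of the mail volume taken up by index files."""
    return INDEX_MB / (MAIL_GB * 1024)


def estimate_index_mb(mail_gb: float) -> float:
    """Scale the observed ratio to a different mail volume."""
    return mail_gb * 1024 * index_ratio()


print(f"ratio: {index_ratio():.2%}")
print(f"1000 GB of mail -> about {estimate_index_mb(1000):.0f} MB of indexes")
```

The ratio works out to roughly 0.4% of the mail volume, so the local scratch space needed is small compared to the NFS store.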
-te
-- Troy Engel | Systems Engineer Fluid, Inc | http://www.fluid.com
participants (3)
- Jan van den Berg
- Timo Sirainen
- Troy Engel