On 20/06/12 12:05, Timo Sirainen wrote:
On Wed, 2012-06-20 at 11:40 +0200, Angel L. Mateo wrote:
- mmap_disable: both the single and multi server configurations have mmap_disable=yes, but the index files section says it is only needed if your index files are stored on NFS. Mine are stored locally. Do I need mmap_disable=yes? Which is best?
mmap_disable is used only for index files, so with local indexes use "no". (If indexes were on NFS, "no" would probably still work but I'm not sure if the performance would be better or worse. Errors would also trigger SIGBUS crashes.)
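For reference, a minimal sketch of the setting discussed above, assuming index files on local disk:

```
# Index files live on local disk, so mmap() is safe; leaving it
# enabled avoids the extra read()/write() overhead of mmap_disable=yes.
mmap_disable = no
```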
- dotlock_use_excl: it is set to "no" in both configurations, but the comment says it is only needed for NFSv2. Since I have NFSv3, I have set it to "yes".
"yes" is ok.
- mail_nfs_storage: in the single server configuration it is set to "no", but in the multi server one it is set to "yes". Since I have a director in front of my backend servers, what is the recommended value?
With director you can set this to "no".
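Putting the advice so far together, a hedged sketch of the NFS-related settings for a director-backed setup (setting names as in Dovecot 2.x):

```
# The director guarantees each user always lands on the same backend,
# so the expensive NFS cache-flushing workarounds are unnecessary:
mail_nfs_storage = no
mail_nfs_index = no
# NFSv3 supports O_EXCL file creation, so exclusive dotlocks are fine:
dotlock_use_excl = yes
```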
Ok, I'm going to change it.
With this configuration, everything works fine while I have a few connections (about 300-400 IMAP connections), but when I disconnect the old servers and direct all my users' connections to the new servers, I get a lot of errors.
Real errors that show up in Dovecot logs? What kind of errors?
Lots of errors like:
Jun 20 12:42:37 myotis31 dovecot: imap(vlo): Warning: Maildir /home/otros/44/016744/Maildir/.INBOX.PRUEBAS: Synchronization took 278 seconds (0 new msgs, 0 flag change attempts, 0 expunge attempts)
Jun 20 12:42:38 myotis31 dovecot: imap(vlo): Warning: Transaction log file /var/indexes/vlo/.INBOX.PRUEBAS/dovecot.index.log was locked for 279 seconds
and on the relay server, lots of timeout errors delivering over LMTP:
Jun 20 12:38:29 xenon14 postfix/lmtp[12004]: D48D55D4F7: to=dmv@um.es, relay=pop.um.es[155.54.212.106]:24, delay=150, delays=0.09/0/0/150, dsn=4.4.0, status=deferred (host pop.um.es[155.54.212.106] said: 451 4.4.0 Remote server not answering (timeout while waiting for reply to DATA reply) (in reply to end of DATA command))
The server load climbs to over 300, with very high I/O wait. With atop I could see that, of my 6 cores, one was at almost 100% I/O wait and the others were almost 100% idle, yet the load on the server was very, very high.
Does the server's disk IO usage actually go a lot higher, or is it simply waiting without doing much of anything? I wonder if this is related to the inotify problems: http://dovecot.org/list/dovecot/2012-June/066474.html
We have now rolled back to the old servers, so I don't know. Next time we try, I'll check this.
Another thought: Since indexes are stored locally, is it possible that the extra load comes simply from building the indexes on the new servers, while they already exist on the old ones?
I don't think so, because:
- On the old servers we have no "director-like" mechanism. One IP is always directed to the same server (for the duration of a session timeout; today it could be one server and tomorrow a different one), but mail is delivered randomly through one of the servers.
- Since last week (when we started the migration), all mail has been delivered into the mailboxes by the new servers, passing through the director. So the new servers' indexes should be up to date.
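For context, locally stored indexes for NFS-resident maildirs are typically configured via the INDEX parameter of mail_location. A sketch only; the /var/indexes base matches the path seen in the log excerpt above, and %u (the user variable) is an assumption about this installation's layout:

```
# Maildirs on NFS, index files on fast local disk:
mail_location = maildir:~/Maildir:INDEX=/var/indexes/%u
```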
mail_fsync = always
v1.1 did the equivalent of mail_fsync=optimized. You could see if that makes a difference.
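A sketch of the suggested change:

```
# v1.1-equivalent behaviour: fsync() only when needed for safety,
# instead of after every index/mail change (mail_fsync = always).
mail_fsync = optimized
```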
I'll try this.
maildir_stat_dirs = yes
Do you actually need this? It causes unnecessary disk IO and is probably not needed in your case.
My fault. I understood the explanation completely backwards: I thought "yes" did what "no" actually does. I have fixed it.
default_process_limit = 1000
Since you haven't enabled high-performance mode for imap-login processes and haven't otherwise changed the service imap-login settings, this means that you can have max. 1000 simultaneous IMAP SSL/TLS connections.
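For reference, a hedged sketch of the high-performance login mode mentioned above (settings from the Dovecot 2.x service configuration; the numbers are illustrative, not recommendations):

```
service imap-login {
  # High-performance mode: service_count = 0 keeps login processes
  # alive so each one serves many connections, instead of the default
  # one-process-per-connection high-security mode.
  service_count = 0
  # Keep one process ready per CPU core (6 cores in this setup).
  process_min_avail = 6
  # Max simultaneous connections = process_limit * client_limit.
  client_limit = 1000
}
```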
I know it. I have to tune it.
--
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868887590  Fax: 868888337