[Dovecot] Dot Lock timestmap, users disconnections from roundcube

Timo Sirainen tss at iki.fi
Sat Nov 5 18:15:05 EET 2011


Well, it doesn't matter whether it's NFS or not. It still looks as if
the Dovecot process was stuck for 45 seconds, most likely waiting for
disk I/O to finish. What happens is something like:

1. Get the current time ("now")
2. See if lock file exists
3. Create lock file
4. fstat() the created lock file
5. Log a warning if fstat's ctime differs from "now" by more than 30
seconds. (Actually I think the 30-second threshold is way too generous;
the difference should usually be less than 1 second.)
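
The steps above can be sketched roughly as follows. This is only an
illustration in Python, not Dovecot's actual code: the function name
create_dotlock, the constant WARN_THRESHOLD_SECS and the exact form of
the comparison are assumptions made for the example.

```python
import os
import sys
import time

# Threshold from step 5; the exact comparison Dovecot uses is an assumption here.
WARN_THRESHOLD_SECS = 30


def create_dotlock(path, now=None):
    """Hypothetical helper: create `path` exclusively and return the
    difference (in seconds) between its ctime and `now`, or None if the
    lock file already exists."""
    # 1. Get the current time ("now")
    if now is None:
        now = time.time()

    # 2. See if lock file exists
    if os.path.exists(path):
        return None

    # 3. Create lock file (O_EXCL makes the creation fail if someone
    #    else created it between steps 2 and 3)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return None

    # 4. fstat() the created lock file
    try:
        ctime = os.fstat(fd).st_ctime
    finally:
        os.close(fd)

    # 5. Log a warning if ctime differs from "now" by more than the threshold
    diff = abs(ctime - now)
    if diff > WARN_THRESHOLD_SECS:
        print(
            "Warning: Created dotlock file's timestamp is different than "
            "current time (%d vs %d): %s" % (ctime, now, path),
            file=sys.stderr,
        )
    return diff
```

If steps 2 and 3 block on slow storage, "now" from step 1 is already
stale by the time fstat() returns, which is what triggers the warning.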

So steps 2 and 3 took 45 seconds to finish. Basically I guess the disk
I/O load was very high at that time, or alternatively there was some
unintentional delay caused by iSCSI (kernel/network bug/problem).
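
The 45 seconds comes straight from the two numbers in the warning. A
small sketch of that arithmetic, assuming the first number is the lock
file's ctime and the second is "now" (the variable names are just for
illustration):

```python
import re

# Warning text as quoted in the original report
line = (
    "Warning: Created dotlock file's timestamp is different than "
    "current time (1320308587 vs 1320308542): "
    "/buzones/mydomain/03/67/mcrivero/subscriptions"
)

# Pull out the two UNIX timestamps inside the parentheses
match = re.search(r"\((\d+) vs (\d+)\)", line)
file_ctime, now = (int(value) for value in match.groups())

# Assumed order: file ctime first, "now" second
delay = file_ctime - now
print(delay)  # 45
```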

On Sat, 2011-11-05 at 00:57 +0100, Maria Arrea wrote:
> Timo, we are not using NFS, we use remote iSCSI volumes with ext4.
> 
>  Regards
> 
>  Maria
> 
> ----- Original Message -----
> From: Timo Sirainen
> Sent: 11/04/11 09:59 PM
> To: Maria Arrea
> Subject: Re: [Dovecot] Dot Lock timestmap, users disconnections from roundcube
> 
> On Thu, 2011-11-03 at 10:54 +0100, Maria Arrea wrote:
> > Hello.
> >
> > We are running dovecot 2.0.13 with mdbox+zlib on RHEL 5.7 x64, ext4.
> > We use NTP. Indexes are in a iSCSI raid 10, mailboxes in raid5. No
> > NFS. We have detected that sometimes all users get disconnected from
> > roundcube at the same time. In dovecot logs we hundreds of lines
> > like this:
> >
> > Nov 3 09:23:07 buzon dovecot: imap(mcrivero at mydomain): Warning: Created dotlock file's timestamp is different than current time (1320308587 vs 1320308542): /buzones/mydomain/03/67/mcrivero/subscriptions
>
> I did several fixes related to this, but they were already in
> v2.0.10. Note the time difference of 45 seconds.
>
> > Nov 3 09:23:07 buzon dovecot: imap(mcrivero at mydomain): Connection closed bytes=0/295
>
> The dotlock warning isn't related to this. My guess: NFS was being
> extremely slow here, some operation took 45 seconds and Roundcube
> decided to abort before that. The "timestamp is different" check
> doesn't work 100% correctly if the filesystem operations take more
> than a second.




