[Dovecot] Dovecot ontop of glusterfs issue.

Eliezer Croitoru

21 May 2014

11:37 a.m.

Hey,

I am testing GlusterFS as a storage backend for Dovecot, which runs as an LDA and IMAP server. I have seen lines like these in the logs:

May 21 10:46:01 mailgw dovecot: imap(eliezer@ngtech.co.il): Warning: Created dotlock file's timestamp is different than current time (1400658105 vs 1400658361): /home/vmail/ngtech.co.il/eliezer/Maildir/.Mailing_lists.ceph_users/dovecot-uidlist
May 21 10:46:01 mailgw dovecot: imap(eliezer@ngtech.co.il): Error: Transaction log /home/vmail/ngtech.co.il/eliezer/Maildir/dovecot.index.log: duplicate transaction log sequence (2713)

The volume is mounted by only one server, which runs Ubuntu 14.04. I have seen threads and posts about similar issues with NFS. I want to try to debug the issue, but note that with the same server settings NFS worked fine, only slower. dovecot -n output: http://pastebin.centos.org/9626/

The GlusterFS volume is a replicated volume built from two bricks and is mounted on only one Dovecot server. All three servers use the same NTP pool and are synced.

Any direction is better than the state I am in now.

Thanks, Eliezer

Murray Trainer

22 May

7:09 a.m.

Hi Eliezer,

We had the same errors a few weeks ago. It turned out the time on our NFS server was out by over 30 secs because NTP wasn't set up correctly. It looks like the time on one of yours is out by about 250 secs (361-105).

Murray
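
The subtraction above can be repeated for any such warning. The following is a minimal Python sketch, not part of Dovecot: the function name and the regular expression are assumptions made for illustration, and the pattern is derived only from the warning line quoted in this thread.

```python
import re

# Matches the two epoch timestamps in the dotlock warning quoted in this
# thread, e.g. "... different than current time (1400658105 vs 1400658361): ..."
# The pattern is an assumption based on that single log line.
DOTLOCK_RE = re.compile(r"different than current time \((\d+) vs (\d+)\)")


def dotlock_skew_seconds(log_line):
    """Return the difference in seconds between the two timestamps, or None."""
    match = DOTLOCK_RE.search(log_line)
    if match is None:
        return None
    first, second = (int(group) for group in match.groups())
    return second - first


line = (
    "May 21 10:46:01 mailgw dovecot: imap(eliezer@ngtech.co.il): Warning: "
    "Created dotlock file's timestamp is different than current time "
    "(1400658105 vs 1400658361): /home/vmail/ngtech.co.il/eliezer/Maildir/"
    ".Mailing_lists.ceph_users/dovecot-uidlist"
)
print(dotlock_skew_seconds(line))  # 256
```

For the line from the original post this gives 256 seconds, which matches the rough estimate of about 250 secs.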


Eliezer Croitoru

1:48 p.m.

Well, manually running ntpdate from a crontab against a pool of servers should be good enough, right?

Eliezer


Harlan Stenn

1:56 p.m.

On 5/22/14 3:48 AM, Eliezer Croitoru wrote:

Well manually using a crontab with ntpdate to a pool of servers should be good enough right?

Is there a good reason you're not just running ntpd?

Ntpdate has had a number of bugs in it for a long time, they will never be fixed, and ntpdate really isn't designed for what you seem to be doing.

Harlan Stenn http://nwtime.org - Be a member!

Eliezer Croitoru

23 May

11:31 a.m.

On 05/22/2014 01:56 PM, Harlan Stenn wrote:

Is there a good reason you're not just running ntpd?

Ntpdate has had a number of bugs in it for a long time, they will never

OK, so after looking into the issue it seems that I had installed ntp on all of the servers, but due to a failure on one of them ntp was not actually present there. This left only one node of the GlusterFS volume out of sync, so only the file access transactions that came from the unsynced server were delivered with the wrong timestamp.

So it was a fault, but the fact that it affected only one node made it hard to find and identify. ls showed one clock time, while the file got another timestamp when it was fetched.

Thanks, Eliezer
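
One way to look for this kind of mismatch from the client side is to create a file on the mounted volume and compare the timestamp the filesystem reports with the local clock. The following Python sketch does that. It is not how Dovecot performs its check; the function name and the default directory are assumptions for illustration. On a replicated volume with one bad node, a single probe may hit the in-sync brick, so repeating it several times is probably necessary.

```python
import os
import tempfile
import time


def fs_clock_skew(directory):
    """Create a temp file in `directory`; return (file mtime - local time) in seconds."""
    before = time.time()
    fd, path = tempfile.mkstemp(dir=directory, prefix="skewcheck.")
    try:
        os.close(fd)
        mtime = os.stat(path).st_mtime
    finally:
        os.unlink(path)
    after = time.time()
    # Compare against the midpoint of the local interval in which the
    # file was created, to average out the duration of the calls.
    return mtime - (before + after) / 2


if __name__ == "__main__":
    # Hypothetical usage: replace the directory with the mount point to check.
    skew = fs_clock_skew(tempfile.gettempdir())
    print(f"filesystem timestamp is off by {skew:+.3f} s relative to the local clock")
```

A result close to zero suggests the server that stamped the file agrees with the local clock; a value of several seconds or more points at a clock problem on the storage side.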

Darac Marjal

22 May

2:30 p.m.

On Thu, May 22, 2014 at 01:48:23PM +0300, Eliezer Croitoru wrote:

Well manually using a crontab with ntpdate to a pool of servers should be good enough right?

Not really. NTPdate steps the clock forward or backwards instantaneously. Depending on how bad your system clock is, that could be a jump of several seconds. Now, which came first? This midnight or that midnight?

NTPd, on the other hand, delicately adjusts the clock frequency so that the clock drifts back into synchronisation. So your seconds might be 0.01% shorter than real, but they still all happen in the right sequence.

ntpdate is really only any good being run once (at boot), for example if you have a clock that can't keep time while the system is off.
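
The difference between stepping and slewing described above can be illustrated with a toy simulation. This Python sketch models neither ntpdate nor ntpd; the function names, the 3-second offset and the 0.01% slew rate (taken from the figure mentioned above) are illustrative assumptions. It only shows that a backward step breaks the ordering of timestamps while a gradual correction preserves it.

```python
def stepped_clock(true_times, offset, step_at):
    """Clock that runs `offset` seconds fast, then is stepped to true time at index `step_at`."""
    return [t + offset if i < step_at else t for i, t in enumerate(true_times)]


def slewed_clock(true_times, offset, rate=0.0001):
    """Clock that starts `offset` seconds fast and is slowed by `rate` until it matches."""
    readings = []
    remaining = offset
    prev = None
    for t in true_times:
        if prev is not None:
            # Each elapsed second removes `rate` seconds of the remaining error.
            remaining = max(0.0, remaining - rate * (t - prev))
        readings.append(t + remaining)
        prev = t
    return readings


def is_monotonic(readings):
    """True if no reading is earlier than the one before it."""
    return all(b >= a for a, b in zip(readings, readings[1:]))


true_times = [float(t) for t in range(10)]  # one event per true second
stepped = stepped_clock(true_times, offset=3.0, step_at=5)
slewed = slewed_clock(true_times, offset=3.0)

print(is_monotonic(stepped))  # False: the reading drops from 7.0 to 5.0 at the step
print(is_monotonic(slewed))   # True: the clock runs slightly slow but never goes backwards
```

In the stepped case, two events that happened in order get timestamps in the wrong order, which is the "which midnight came first" problem.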


Harlan Stenn

2:36 p.m.

On 5/22/14 4:30 AM, Darac Marjal wrote:

ntpdate is really only any good being run once (at boot), for example if you have a clock that can't keep time while the system is off.

I'm not aware of any cases where one needs to run ntpdate at startup before running ntpd, because one can run 'ntpd -g' at startup which will correct a very large offset. If I'm wrong I'd love to hear about it.

This should be true for ntp-stable (4.2.6) and behaves even better for ntp-dev (4.2.7).
