[Dovecot] mbox locking


Thomas Hummel

26 Jun 2006

7:03 p.m.

Hello,

I'd like to make sure my understanding of the mbox locking strategy is correct:

"mbox_dotlock_change_timeout" directive:

If a process sees that an mbox it wants to access, already locked by another process, hasn't changed for this amount of time, does it allow itself to override the lock? If fcntl is being used, is overriding possible at all?

When a process gives up trying to get a lock after mbox_lock_timeout, does the UA see anything?

Multiple locking methods in "mbox_write_locks":

if I state, for instance,

mbox_write_locks = dotlock fcntl

When exactly does the dotlock method get used, and when does fcntl get used? Are both used simultaneously?

Which leads to the wiki's deadlock situation example :

Program A: fcntl locks the mbox

Program B at the same time: dotlocks the mbox

Program A continues: tries to dotlock the mbox, but since it's already dotlocked by B, it starts waiting

Program B continues: tries to fcntl lock the mbox, but since it's already fcntl locked by A, it starts waiting

Why would A dotlock the mbox when it has already fcntl-locked it successfully? If it's able to use fcntl, it can take a shared or an exclusive lock: why use dotlock at all?

-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Pôle informatique - systèmes et réseau


Timo Sirainen

27 Jun

12:31 p.m.

On Mon, 2006-06-26 at 18:03 +0200, Thomas Hummel wrote:

...

Hello,

I'd like to make sure my understanding of the mbox locking strategy is correct:

"mbox_dotlock_change_timeout" directive:

If a process sees that an mbox it wants to access, already locked by another process, hasn't changed for this amount of time, does it allow itself to override the lock?

Right. It checks that neither the mbox nor the dotlock file has changed.

...

If fcntl is being used, is overriding possible at all?

fcntl locks are never overridden, since an existing fcntl lock means the process still exists. And there isn't really any way to override them either.

...

When a process gives up trying to get a lock after mbox_lock_timeout, does the UA see anything?

Yes, it should see a "Timeout while waiting for lock" error. But whether that is shown to the user depends on the client.

...

Multiple locking methods in "mbox_write_locks":

If I state, for instance,

mbox_write_locks = dotlock fcntl

when exactly does the dotlock method get used, and when does fcntl get used? Are both used simultaneously?

It uses both. First dotlock, then fcntl.
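
As a rough illustration of the "first dotlock, then fcntl" order (this is not Dovecot's actual code), a minimal Python sketch could look like the following. The helper names `lock_mbox` and `unlock_mbox` are hypothetical, and creating the dotlock with `O_CREAT | O_EXCL` is a simplifying assumption that ignores NFS-safe dotlock creation and any waiting or timeout handling.

```python
import fcntl
import os


def lock_mbox(path):
    """Take the dotlock first, then the fcntl lock. Returns the open mbox file."""
    dotlock = path + ".lock"
    # Step 1: dotlock. O_EXCL makes this fail (FileExistsError) if it already exists.
    fd = os.open(dotlock, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)
    try:
        mbox = open(path, "r+b")
        try:
            # Step 2: exclusive fcntl (POSIX record) lock, non-blocking in this sketch.
            fcntl.lockf(mbox, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            mbox.close()
            raise
    except OSError:
        # Could not finish locking: give the dotlock back before reporting the error.
        os.unlink(dotlock)
        raise
    return mbox


def unlock_mbox(path, mbox):
    """Release in reverse order: fcntl lock first, then the dotlock."""
    fcntl.lockf(mbox, fcntl.LOCK_UN)
    mbox.close()
    os.unlink(path + ".lock")
```

A second caller following the same order stops at the dotlock step and never reaches the fcntl step, which is what keeps the two methods from interleaving.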

...

Which leads to the wiki's deadlock situation example :

 # Program A: fcntl locks the mbox
 # Program B at the same time: dotlocks the mbox
 # Program A continues: tries to dotlock the mbox, but since it's already dotlocked by B, it starts waiting
 # Program B continues: tries to fcntl lock the mbox, but since it's already fcntl locked by A, it starts waiting

Right. That's why the ordering must be correct.
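
To make the ordering argument concrete, here is a small Python sketch that checks whether two programs' lock orders can produce the deadlock from the wiki example. The function name `can_deadlock` is hypothetical, and the model is an assumption: each program takes its listed methods strictly in sequence and waits indefinitely on each one.

```python
from itertools import combinations


def can_deadlock(order_a, order_b):
    """True if some pair of shared lock methods is taken in opposite order."""
    # Only methods used by both programs can make them wait on each other.
    shared = [m for m in order_a if m in order_b]
    for first, second in combinations(shared, 2):
        # 'first' precedes 'second' in order_a; check for the reverse in order_b.
        if order_b.index(first) > order_b.index(second):
            return True
    return False


# The wiki example: A takes fcntl then dotlock, B takes dotlock then fcntl.
print(can_deadlock(["fcntl", "dotlock"], ["dotlock", "fcntl"]))
# Both programs use the same order.
print(can_deadlock(["dotlock", "fcntl"], ["dotlock", "fcntl"]))
```

The first call reports a possible deadlock and the second does not, matching the point that the methods themselves are fine as long as every program takes them in the same order.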

...

Why would A dotlock the mbox when it has already fcntl-locked it successfully? If it's able to use fcntl, it can take a shared or an exclusive lock: why use dotlock at all?

Because it's not guaranteed that all programs use the same locking methods. If Dovecot uses only fcntl and another program uses only dotlock, there's practically no locking at all in that case.
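
The point about mismatched methods can be sketched in a few lines of Python. The function name `common_lock_methods` is hypothetical; the only assumption is that two programs exclude each other only through locking methods that both of them use.

```python
def common_lock_methods(methods_a, methods_b):
    """Locking methods that both programs use, in the first program's order."""
    return [m for m in methods_a if m in methods_b]


# Dovecot with fcntl only versus a program with dotlock only: nothing in common.
print(common_lock_methods(["fcntl"], ["dotlock"]))
# Dovecot with both methods versus a dotlock-only program: dotlock still protects.
print(common_lock_methods(["dotlock", "fcntl"], ["dotlock"]))
```

An empty result corresponds to the "practically no locking at all" case described above.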

Thomas Hummel

2:57 p.m.

On Tue, Jun 27, 2006 at 12:31:27PM +0300, Timo Sirainen wrote:

...

On Mon, 2006-06-26 at 18:03 +0200, Thomas Hummel wrote:

...
Hello,

...

...

"mbox_dotlock_change_timeout" directive :

If a process sees that an mbox it wants to access, already locked by another process, hasn't changed for this amount of time, does it allow itself to override the lock?

Right. It checks that neither the mbox nor the dotlock file has changed.

Thanks for your answers.

But I still don't see the point of overriding the dotlock, since the process that created it may well have fcntl-locked the mbox too (it could use a "dotlock + fcntl" exclusive locking method, just as Dovecot does; it could even be another Dovecot mail process).

So don't we risk ending up in a situation where the dotlock is ours but the mailbox is still fcntl-locked by another process?

-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Pôle informatique - systèmes et réseau

Timo Sirainen

3:19 p.m.

On Tue, 2006-06-27 at 13:57 +0200, Thomas Hummel wrote:

...

On Tue, Jun 27, 2006 at 12:31:27PM +0300, Timo Sirainen wrote:

...
On Mon, 2006-06-26 at 18:03 +0200, Thomas Hummel wrote:

...
Hello,

...
...

"mbox_dotlock_change_timeout" directive :

If a process sees that an mbox it wants to access, already locked by another process, hasn't changed for this amount of time, does it allow itself to override the lock?

Right. It checks that neither the mbox nor the dotlock file has changed.

Thanks for your answers.

But I still don't see the point of overriding the dotlock, since the process that created it may well have fcntl-locked the mbox too (it could use a "dotlock + fcntl" exclusive locking method, just as Dovecot does; it could even be another Dovecot mail process).

The idea is to override dotlock files that have been left around by processes that have already died. They don't have fcntl locks or anything else. If Dovecot never overrode them, they'd just keep the mailbox locked until someone manually went and deleted them.

...

So don't we risk ending up in a situation where the dotlock is ours but the mailbox is still fcntl-locked by another process?

Actually not with Dovecot. When Dovecot is about to override the stale dotlock, it first tries to do the fcntl locking. If it succeeds, the dotlock is overridden; if it doesn't, it fails with a lock timeout. These checks have some race conditions, so they're not perfect, but they're good enough to work practically always.
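
The override procedure described above can be sketched roughly as follows. This is not Dovecot's implementation: the function names are hypothetical, staleness is approximated by comparing file modification times against the current time, the fcntl probe is non-blocking rather than waiting for a timeout, and replacing the dotlock via a temporary file is an assumption made for illustration. As noted above, a check like this has race conditions.

```python
import fcntl
import os
import time


def dotlock_is_stale(mbox_path, change_timeout, now=None):
    """True if neither the mbox nor its dotlock changed within change_timeout seconds."""
    now = time.time() if now is None else now
    newest = max(os.stat(mbox_path).st_mtime,
                 os.stat(mbox_path + ".lock").st_mtime)
    return now - newest >= change_timeout


def try_override_dotlock(mbox_path, change_timeout, now=None):
    """Return True if a stale dotlock was taken over, False otherwise."""
    if not dotlock_is_stale(mbox_path, change_timeout, now):
        return False
    with open(mbox_path, "r+b") as mbox:
        try:
            # Probe: fails if a live process still holds an fcntl lock on the mbox.
            fcntl.lockf(mbox, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False
        # Take over the dotlock by atomically replacing it with our own file.
        tmp = mbox_path + ".lock.tmp"  # hypothetical temporary name
        with open(tmp, "w") as f:
            f.write(str(os.getpid()))
        os.replace(tmp, mbox_path + ".lock")
        # The probe lock is dropped again; normal locking then proceeds in order.
        fcntl.lockf(mbox, fcntl.LOCK_UN)
    return True
```

After a successful override the dotlock has a fresh modification time, so a second process running the same check no longer sees it as stale.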

Thomas Hummel

3:46 p.m.

On Tue, Jun 27, 2006 at 03:19:23PM +0300, Timo Sirainen wrote:

...

The idea is to override dotlock files that have been left around by processes

I see.

...

...
So don't we risk ending up in a situation where the dotlock is ours but the mailbox is still fcntl-locked by another process?

Actually not with Dovecot. When Dovecot is about to override the stale dotlock, it first tries to do the fcntl locking. If it succeeds, then the dotlock is overridden

And then, I guess, it fcntl-locks the mbox again (in the case of a "dotlock fcntl" order); otherwise it would break the specified order, right?

-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Pôle informatique - systèmes et réseau

Timo Sirainen

3:58 p.m.

On Tue, 2006-06-27 at 14:46 +0200, Thomas Hummel wrote:

...

...
...
So don't we risk ending up in a situation where the dotlock is ours but the mailbox is still fcntl-locked by another process?

Actually not with Dovecot. When Dovecot is about to override the stale dotlock, it first tries to do the fcntl locking. If it succeeds, then the dotlock is overridden

And then, I guess, it fcntl-locks the mbox again (in the case of a "dotlock fcntl" order); otherwise it would break the specified order, right?

Right. After the test, the fcntl lock is always dropped.

Thomas Hummel

9:43 p.m.

On Tue, Jun 27, 2006 at 03:58:28PM +0300, Timo Sirainen wrote:

Thanks Timo,

These questions come up because I had some issues (with 0.9x, which I know isn't maintained anymore) with dead processes ('D' flag in BSD ps) and fcntl locks that wouldn't go away (unless, of course, I rebooted or changed the mbox inode). I'm testing each 1.0beta since I don't want to invest effort in 0.9x, and I'd like a clear understanding of the mbox locking strategy in case a similar issue arises, and to be able to choose the best locking strategy.

So, one more thing :

"mbox_read_locks" defaults to fcntl and "mbox_write_locks" defaults to dotlock fcntl

Doesn't this assume that non-Dovecot programs use fcntl (maybe among additional methods)? In that case, why wouldn't you assume the same for writes and default mbox_write_locks to fcntl only?

Obviously, fcntl in mbox_write_locks should be exclusive, but is the lock set by fcntl in mbox_read_locks exclusive or shared?

In other words, do you separate read and write locks:

. to allow multiple simultaneous reads, taking the risk of reading inconsistent data (if so, how do you recognize the situation and re-read the mbox)?

. or because there are moments in a write "session" when the write dotlock is set (but the write fcntl lock is not set yet) and reads are still allowed (because of your writing strategy)?

Note:

In the second case, the multiple methods in mbox_write_locks would serve two different purposes: sharing a locking method with foreign programs and providing fine-grained write locking inside Dovecot.

Also, should an mbox read block an mbox write? And should a UA trying to get a lock to read an mbox block, or just return, sleep, and try again a bit later?

When, as written on the wiki, procmail -v reports

dotlocking, fcntl(), lockf(), flock()

does it mean that it uses both dotlocking and fcntl (like Dovecot), or one *or* the other?

-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Pôle informatique - systèmes et réseau

Timo Sirainen

28 Jun

12:30 a.m.

On Tue, 2006-06-27 at 20:43 +0200, Thomas Hummel wrote:

...

On Tue, Jun 27, 2006 at 03:58:28PM +0300, Timo Sirainen wrote:

Thanks Timo,

These questions come up because I had some issues (with 0.9x, which I know isn't maintained anymore) with dead processes ('D' flag in BSD ps) and fcntl locks that wouldn't go away (unless, of course, I rebooted or changed the mbox inode). I'm testing each 1.0beta since I don't want to invest effort in 0.9x, and I'd like a clear understanding of the mbox locking strategy in case a similar issue arises, and to be able to choose the best locking strategy.

I don't think D means dead even in BSDs? Usually it means non-interruptible sleep in kernel. I think it could mean that there's a deadlock between processes (due to different lock ordering). If there's only one process for the user in that state it means the kernel is broken. If you're using NFS it probably means your lockd got broken.

...

So, one more thing :

"mbox_read_locks" defaults to fcntl and "mbox_write_locks" defaults to dotlock fcntl

Doesn't this assume that non-Dovecot programs use fcntl (maybe among additional methods)? In that case, why wouldn't you assume the same for writes and default mbox_write_locks to fcntl only?

Because Dovecot is often the only program reading the mboxes, while there may be several different mail delivery agents, each using its own locking scheme. Pretty much all of them use dotlock, though.

If a mail delivery agent uses only dotlocking, then it means that Dovecot's read locking doesn't work, so Dovecot could see half-written mails at the end of the mbox. But it still protects against mbox getting completely corrupted.

...

Obviously, fcntl in mbox_write_locks should be exclusive, but is the lock set by fcntl in mbox_read_locks exclusive or shared?

It's shared.

...

In other words, do you separate read and write locks:

. to allow multiple simultaneous reads, taking the risk of reading inconsistent data (if so, how do you recognize the situation and re-read the mbox)?

. or because there are moments in a write "session" when the write dotlock is set (but the write fcntl lock is not set yet) and reads are still allowed (because of your writing strategy)?

Writers should also do the fcntl locking. As long as that's done, multiple shared fcntl locks can be acquired, so it's then safe to read the mbox.
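
The shared versus exclusive behaviour of fcntl locks can be observed with a short Python sketch. Because fcntl locks belong to a process, the second locker has to be a separate process, so the hypothetical helper `child_can_lock` forks a child that attempts a non-blocking lock and reports the outcome through its exit status. This assumes a Unix system where `os.fork` is available.

```python
import fcntl
import os


def child_can_lock(path, mode):
    """Fork a child and report whether it can take `mode` on path without blocking."""
    pid = os.fork()
    if pid == 0:
        # Child: fcntl locks held by the parent are not inherited across fork.
        code = 1
        try:
            with open(path, "r+b") as f:
                fcntl.lockf(f, mode | fcntl.LOCK_NB)
            code = 0
        except OSError:
            pass
        os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

While the parent holds `fcntl.LOCK_SH` on the file, `child_can_lock(path, fcntl.LOCK_SH)` succeeds and `child_can_lock(path, fcntl.LOCK_EX)` does not, which is the sense in which several readers can coexist while a writer that also uses fcntl is kept out.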

...

Also, should an mbox read block an mbox write? And should a UA trying to get a lock to read an mbox block, or just return, sleep, and try again a bit later?

Well, I don't really see much difference. If you want the lock anyway, you might as well wait in a blocking fcntl call so you don't waste CPU.

When using flock or lockf instead of fcntl, Dovecot makes a non-blocking call and sleeps for a random time. I'm not actually sure why. I think I couldn't get the call to break after a given time with alarm(), but according to the man page that should have worked.
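
The "non-blocking call plus random sleep" approach mentioned for flock might be sketched like this in Python. It is only an illustration of the polling pattern, not Dovecot's code; the function name `flock_with_timeout` and the `max_sleep` parameter are hypothetical.

```python
import fcntl
import random
import time


def flock_with_timeout(f, timeout, max_sleep=0.1):
    """Poll a non-blocking exclusive flock until it succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except BlockingIOError:
            # Someone else holds the lock: give up at the deadline, else retry later.
            if time.monotonic() >= deadline:
                return False
            time.sleep(random.uniform(0, max_sleep))
```

Compared with waiting inside a blocking call, polling wakes up repeatedly while it waits, but the timeout is enforced by the loop itself and does not depend on a signal interrupting the system call.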

...

When, as written on the wiki, procmail -v reports

dotlocking, fcntl(), lockf(), flock()

does it mean that it uses both dotlocking and fcntl (like Dovecot), or one *or* the other?

It means it uses all four, in that order.

Thomas Hummel

11:52 a.m.

On Wed, Jun 28, 2006 at 12:30:06AM +0300, Timo Sirainen wrote:

...

I don't think D means dead even in BSDs? Usually it means non-interruptible sleep in kernel.

Actually, you're right:

"D Marks a process in disk (or other short term, uninterruptible) wait."

...

I think it could mean that there's a deadlock between processes (due to different lock ordering).

I don't think that's the case, since in my setup only procmail normally writes to the mbox, and its lock method ordering is the same as Dovecot's. Maybe some users write directly via a UA like pine, elm or mutt, though.

...

If there's only one process for the user in that state it means the kernel is broken. If you're using NFS it probably means your lockd got broken.

I should investigate this indeed.

...

Because Dovecot is often the only program reading the mboxes, while there may be several different mail delivery agents, each using its own locking scheme. Pretty much all of them use dotlock, though.

I see. In my situation, I cannot be sure that Dovecot is the only program that reads.

...

If a mail delivery agent uses only dotlocking, then it means that Dovecot's read locking doesn't work, so Dovecot could see half-written mails at the end of the mbox.

In such a case, does it get corrected on the next mbox read, or is it too late (the UA has already seen/cached the half-written message)?

...

...

Obviously, fcntl in mbox_write_locks should be exclusive, but is the lock set by fcntl in mbox_read_locks exclusive or shared?

It's shared.

...

Writers should also do the fcntl locking. As long as that's done, multiple shared fcntl locks can be acquired, so it's then safe to read the mbox.

So you seem to confirm that a read should actually block a write (note: this seems to make sense, but I was wondering whether some priority for writes was implemented)?

Thanks.

-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Pôle informatique - systèmes et réseau

Timo Sirainen

12:47 p.m.

On Wed, 2006-06-28 at 10:52 +0200, Thomas Hummel wrote:

...

...
If a mail delivery agent uses only dotlocking, then it means that Dovecot's read locking doesn't work, so Dovecot could see half-written mails at the end of the mbox.

In such a case, does it get corrected on the next mbox read, or is it too late (the UA has already seen/cached the half-written message)?

Hmm. Actually Dovecot locks the mbox for writing whenever it notices the mbox has changed, so it shouldn't ever see half-written messages. Perhaps I shouldn't do that if mbox_lazy_writes is set, since it's not needed then.

Anyway, if it did only read-lock it and saw a half-written message, at the next sync it would notice that it changed. But I don't think it assigns a new UID for it, so any cached information (in dovecot.index.cache or on the client side) wouldn't get updated.

...

...
...

Obviously, fcntl in mbox_write_locks should be exclusive, but is the lock set by fcntl in mbox_read_locks exclusive or shared?

It's shared.

...
Writers should also do the fcntl locking. As long as that's done, multiple shared fcntl locks can be acquired, so it's then safe to read the mbox.

So you seem to confirm that a read should actually block a write

Right.

...

(note: this seems to make sense, but I was wondering whether some priority for writes was implemented)?

There's nothing like that. If you want something like that use another mailbox format.
