[Dovecot] key -> object mailstore


Damien Churchill

14 Sep 2012

5:46 p.m.

Hi,

I was wondering what would be entailed in modifying dovecot 2.2 to support storing mail in an object store. I've seen a few mails dotted around in the ML history about supporting such a thing and seen it's basically dependent upon some changes in lib-storage to support writing messages without locking. Is this still the case?

Regards, Damien


Timo Sirainen

14 Sep

5:59 p.m.

On 14.9.2012, at 17.46, Damien Churchill wrote:

...

I was wondering what would be entailed in modifying dovecot 2.2 to support storing mail in an object store. I've seen a few mails dotted around in the ML history about supporting such a thing and seen it's basically dependent upon some changes in lib-storage to support writing messages without locking. Is this still the case?

I've a whole new design for it and I was planning on implementing it for v2.2. Do you want to help coding it? :) Which storage would you want to use?

The generic idea is:

only one server accesses one user simultaneously
index files are copied from object storage to local filesystem and accessed there, once in a while uploaded back to object storage
if user is accessed from two servers because of some bug/split brain/something, the changes are merged using dsync
support high latency: asynchronous reads/writes. prefetch mail bodies.
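
As a rough illustration of the last point (asynchronous reads and prefetching mail bodies), here is a minimal Python sketch. It is not Dovecot code: `SlowObjectStore`, `prefetch_bodies`, the key names and the `max_in_flight` limit are all hypothetical, and the store is an in-memory stand-in that only simulates latency.

```python
import asyncio


class SlowObjectStore:
    """Hypothetical in-memory stand-in for a high-latency object store."""

    def __init__(self, objects, latency=0.01):
        self._objects = dict(objects)
        self._latency = latency

    async def get(self, key):
        # Simulate one network round trip per request.
        await asyncio.sleep(self._latency)
        return self._objects[key]


async def prefetch_bodies(store, keys, max_in_flight=4):
    """Fetch several mail bodies concurrently, at most max_in_flight at once."""
    limit = asyncio.Semaphore(max_in_flight)

    async def fetch(key):
        async with limit:
            return key, await store.get(key)

    pairs = await asyncio.gather(*(fetch(k) for k in keys))
    return dict(pairs)


def demo():
    keys = [f"msg-{i}" for i in range(8)]
    store = SlowObjectStore({k: f"body of {k}".encode() for k in keys})
    return asyncio.run(prefetch_bodies(store, keys))
```

With eight messages and four requests in flight, the simulated fetch takes roughly two round trips instead of eight, which is the reason prefetching matters when every read is a network request.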

Damien Churchill

6:16 p.m.

On 14 September 2012 15:59, Timo Sirainen <tss@iki.fi> wrote:

...

On 14.9.2012, at 17.46, Damien Churchill wrote:

...
I was wondering what would be entailed in modifying dovecot 2.2 to support storing mail in an object store. I've seen a few mails dotted around in the ML history about supporting such a thing and seen it's basically dependent upon some changes in lib-storage to support writing messages without locking. Is this still the case?

I've a whole new design for it and I was planning on implementing it for v2.2. Do you want to help coding it? :) Which storage would you want to use?

That's good to hear :) I've been evaluating RADOS as an object store, which is similar to S3, although any distributed storage would be great. I'd be more than happy to help code it!

...

The generic idea is:

only one server accesses one user simultaneously

index files are copied from object storage to local filesystem and accessed there, once in a while uploaded back to object storage

if user is accessed from two servers because of some bug/split brain/something, the changes are merged using dsync

support high latency: asynchronous reads/writes. prefetch mail bodies.

I'm assuming that the director would be used in order to distribute connections to the same server, so it's only within a local instance of dovecot you'd need to be aware of what currently has a connection open for that user?

How are you planning on handling the situation where say node X dies and hasn't uploaded the latest index file? Would that result in missing messages from the mailbox when accessed by another node, or is the local index intended to be more of a write-through cache?

Timo Sirainen

17 Sep

3:57 p.m.

On 14.9.2012, at 18.16, Damien Churchill wrote:

...

On 14 September 2012 15:59, Timo Sirainen <tss@iki.fi> wrote:

...
On 14.9.2012, at 17.46, Damien Churchill wrote:

...
I was wondering what would be entailed in modifying dovecot 2.2 to support storing mail in an object store. I've seen a few mails dotted around in the ML history about supporting such a thing and seen it's basically dependent upon some changes in lib-storage to support writing messages without locking. Is this still the case?

I've a whole new design for it and I was planning on implementing it for v2.2. Do you want to help coding it? :) Which storage would you want to use?

That's good to hear :) I've been evaluating RADOS as an object store, which is similar to S3. Although any distributed storage would be great. I'd be more than happy to help code it!

I think I'll first have to get started with it to see if there are some parts that are easy to give to you. First I'll at least need to do some refactoring to dbox code and lib-fs code. I'm planning on making the generic parts of it be part of Dovecot releases, but I haven't yet fully decided which backends should be..

...

...
The generic idea is:

only one server accesses one user simultaneously

index files are copied from object storage to local filesystem and accessed there, once in a while uploaded back to object storage

if user is accessed from two servers because of some bug/split brain/something, the changes are merged using dsync

support high latency: asynchronous reads/writes. prefetch mail bodies.

I'm assuming that the director would be used in order to distribute connections to the same server, so it's only within a local instance of dovecot you'd need to be aware of what currently has a connection open for that user?

Right. Probably some new process that can do the work of downloading/uploading/deleting index files as needed. That's actually a clearly separate task that you could do? :)

...

How are you planning on handling the situation where say node X dies and hasn't uploaded the latest index file? Would that result in missing messages from the mailbox when accessed by another node, or is the local index intended to be more of a write-through cache?

No messages ever get lost. Recent flag changes and expunges may get lost, at least until the original node comes back up and dsync merges the changes. The idea was that when the index is downloaded, a flag is set on the object storage marking the user as being accessed, and the flag is removed after the user has disconnected and the index has been uploaded back. If the index downloader sees that the flag is already set, it will run some kind of recovery process to find any messages that were uploaded but not indexed. (The message bodies are always immediately uploaded to object storage.)
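
The flag-and-recovery idea described here could be sketched roughly as follows. This is only an illustration under stated assumptions: `ObjectStore` is a hypothetical in-memory stand-in (not a RADOS or S3 client), the key layout (`<user>/index`, `<user>/in-use`, `<user>/mail/<id>`) is invented, and the flag check is not atomic, which a real implementation would have to handle.

```python
class ObjectStore:
    """Hypothetical in-memory stand-in for an object store (not a real client)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, value):
        self.objects[key] = value

    def get(self, key, default=None):
        return self.objects.get(key, default)

    def delete(self, key):
        self.objects.pop(key, None)

    def keys_with_prefix(self, prefix):
        return sorted(k for k in self.objects if k.startswith(prefix))


def save_message(store, user, msg_id, body):
    # Message bodies are uploaded immediately, independent of the index.
    store.put(f"{user}/mail/{msg_id}", body)


def open_user(store, user):
    """Download the index; recover unindexed messages if the flag is already set."""
    index = list(store.get(f"{user}/index", []))
    if store.get(f"{user}/in-use", False):
        # A previous session never uploaded its index: scan for bodies
        # that exist in the store but are missing from the index.
        prefix = f"{user}/mail/"
        for key in store.keys_with_prefix(prefix):
            msg_id = key[len(prefix):]
            if msg_id not in index:
                index.append(msg_id)
    store.put(f"{user}/in-use", True)
    return index


def close_user(store, user, index):
    """Upload the index back, then clear the in-use flag."""
    store.put(f"{user}/index", list(index))
    store.delete(f"{user}/in-use")
```

In this sketch a crash between `save_message` and `close_user` loses only the index update; the next `open_user` finds the flag still set and re-adds the message. Flag changes and expunges made in the lost index are not recovered here, matching the caveat above.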

Damien Churchill

21 Sep

12:36 p.m.

On 17 September 2012 13:57, Timo Sirainen <tss@iki.fi> wrote:

...

On 14.9.2012, at 18.16, Damien Churchill wrote:

...
On 14 September 2012 15:59, Timo Sirainen <tss@iki.fi> wrote:

...
On 14.9.2012, at 17.46, Damien Churchill wrote:

...
I was wondering what would be entailed in modifying dovecot 2.2 to support storing mail in an object store. I've seen a few mails dotted around in the ML history about supporting such a thing and seen it's basically dependent upon some changes in lib-storage to support writing messages without locking. Is this still the case?

I've a whole new design for it and I was planning on implementing it for v2.2. Do you want to help coding it? :) Which storage would you want to use?

That's good to hear :) I've been evaluating RADOS as an object store, which is similar to S3. Although any distributed storage would be great. I'd be more than happy to help code it!

I think I'll first have to get started with it to see if there are some parts that are easy to give to you. First I'll at least need to do some refactoring to dbox code and lib-fs code. I'm planning on making the generic parts of it be part of Dovecot releases, but I haven't yet fully decided which backends should be..

...
...
The generic idea is:

only one server accesses one user simultaneously

index files are copied from object storage to local filesystem and accessed there, once in a while uploaded back to object storage

if user is accessed from two servers because of some bug/split brain/something, the changes are merged using dsync

support high latency: asynchronous reads/writes. prefetch mail bodies.

I'm assuming that the director would be used in order to distribute connections to the same server, so it's only within a local instance of dovecot you'd need to be aware of what currently has a connection open for that user?

Right. Probably some new process that can do the work of downloading/uploading/deleting index files as needed. That's actually a clearly separate task that you could do? :)

Sounds good! I'll spend some time digging through the source code getting familiar.

...

...
How are you planning on handling the situation where say node X dies and hasn't uploaded the latest index file? Would that result in missing messages from the mailbox when accessed by another node, or is the local index intended to be more of a write-through cache?

No messages get ever lost. Recent flag changes and expunges may get lost, at least until the original node comes back up and dsync merges the changes. Idea was that when downloading index a flag on the object storage is set for the user that it's being accessed, and removed after the user is disconnected and index is uploaded back. If index downloader already sees that the flag is set it will run some kind of a recovery process to find any messages that were uploaded but not indexed. (The message bodies are always immediately uploaded to object storage.)

Part of me thinks making this configurable might be a good idea, depending upon what the installation is trying to achieve. Since the recovery process will need to be implemented regardless, the user could be allowed to configure dovecot to write to both the local and the object index whenever a flag or something else is modified.
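
To show what such a setting might look like, here is a small Python sketch contrasting the two policies. Everything in it is hypothetical: `IndexCache`, the `write_through` option and the `"index"` key are invented for illustration, and a plain dict stands in for the object store.

```python
class IndexCache:
    """Hypothetical local index with a configurable upload policy."""

    def __init__(self, remote, write_through=False):
        self.remote = remote          # dict standing in for the object store
        self.local = {}               # msg_id -> set of flags
        self.write_through = write_through
        self.dirty = False

    def set_flag(self, msg_id, flag):
        self.local.setdefault(msg_id, set()).add(flag)
        if self.write_through:
            # Write-through: every change goes to the object store at once.
            self._upload()
        else:
            # Write-back: remember that an upload is pending.
            self.dirty = True

    def flush(self):
        """Upload pending changes, e.g. periodically or on disconnect."""
        if self.dirty:
            self._upload()

    def _upload(self):
        self.remote["index"] = {k: set(v) for k, v in self.local.items()}
        self.dirty = False
```

Write-through trades one object-store write per change for a smaller window in which flag changes can be lost; write-back keeps changes local until `flush()` is called.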

Another thought occurs to me: when using the LDA, how will it be able to update the index upon delivery of a new message if another node is currently accessing the mailbox?

Jeff Gustafson

15 Sep

6:39 a.m.

On Fri, 2012-09-14 at 17:59 +0300, Timo Sirainen wrote:

...

I've a whole new design for it and I was planning on implementing it for v2.2. Do you want to help coding it? :) Which storage would you want to use?

The generic idea is:

only one server accesses one user simultaneously

index files are copied from object storage to local filesystem and accessed there, once in a while uploaded back to object storage

if user is accessed from two servers because of some bug/split brain/something, the changes are merged using dsync

support high latency: asynchronous reads/writes. prefetch mail bodies.

With this system, would the read/write ultimately go to a normal OS file function? If it is a file function, could this be used with a system like glusterfs, ceph, etc? The other option would be to write it against an object store client library and bypass the normal file functions.

		...Jeff
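
The two options in this question can sit behind one interface, so the choice becomes a backend detail. The Python sketch below is an assumption-laden illustration, not how Dovecot does it: `BlobStore`, `PosixBlobStore` and `ClientLibraryBlobStore` are hypothetical names, and the second backend uses a dict where a real one would call an object store client library.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class BlobStore(ABC):
    """Hypothetical minimal storage interface for mail bodies."""

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> bytes: ...


class PosixBlobStore(BlobStore):
    """Uses normal OS file functions; works on any mounted filesystem."""

    def __init__(self, root):
        self.root = root

    def _path(self, key):
        return os.path.join(self.root, key)

    def write(self, key, data):
        path = self._path(key)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(data)
        os.replace(tmp, path)  # rename into place so readers never see a partial file

    def read(self, key):
        with open(self._path(key), "rb") as f:
            return f.read()


class ClientLibraryBlobStore(BlobStore):
    """Stand-in for a backend that talks to an object store client library."""

    def __init__(self):
        self._objects = {}  # a real backend would issue put/get requests here

    def write(self, key, data):
        self._objects[key] = bytes(data)

    def read(self, key):
        return self._objects[key]


def roundtrip(backend, key="alice/m1", data=b"hello"):
    backend.write(key, data)
    return backend.read(key)


def demo():
    with tempfile.TemporaryDirectory() as root:
        return [roundtrip(PosixBlobStore(root)), roundtrip(ClientLibraryBlobStore())]
```

Pointing `PosixBlobStore` at a directory on a mounted distributed filesystem would cover the "normal file function" case, while the second class marks where a native client would plug in.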

Alessio Cecchi

21 Sep

5:13 p.m.

On 15/09/2012 05:39, Jeff Gustafson wrote:

...

On Fri, 2012-09-14 at 17:59 +0300, Timo Sirainen wrote:

...
I've a whole new design for it and I was planning on implementing it for v2.2. Do you want to help coding it? :) Which storage would you want to use?

The generic idea is:

only one server accesses one user simultaneously

index files are copied from object storage to local filesystem and accessed there, once in a while uploaded back to object storage

if user is accessed from two servers because of some bug/split brain/something, the changes are merged using dsync

support high latency: asynchronous reads/writes. prefetch mail bodies.

With this system, would the read/write ultimately go to a normal OS file function? If it is a file function, could this be used with a system like glusterfs, ceph, etc? The other option would be to write it against an object store client library and bypass the normal file functions.

		...Jeff

Other users are also talking about Ceph and Dovecot:

http://www.mail-archive.com/ceph-devel@vger.kernel.org/msg07345.html

--
Alessio Cecchi is:
@ ILS -> http://www.linux.it/~alessice/
on LinkedIn -> http://www.linkedin.com/in/alessice
GNU/Linux Systems Support -> http://www.cecchi.biz/
@ PLUG -> former president, now senator for life, http://www.prato.linux.it

Christian Rohmann

17 Sep

9:52 a.m.

Hey dovecot-users,

On 14.09.2012 16:59, Timo Sirainen wrote:

...

I've a whole new design for it and I was planning on implementing it for v2.2. Do you want to help coding it? :) Which storage would you want to use?

I'd vote for OpenStack's Swift or Ceph's RADOS. They are both gaining momentum with new installations, they are open source and quite active in development. Also, they both maintain Amazon S3-compatible APIs. Ceph even has a Swift-compatible API for that matter.

Regards

Christian

participants (5)

Alessio Cecchi
Christian Rohmann
Damien Churchill
Jeff Gustafson
Timo Sirainen