On 24 Sep 2017, at 02:43, Timo Sirainen wrote:
On 22 Sep 2017, at 14.18, mj <lists@merit.unu.edu> wrote:
First, the Github link: https://github.com/ceph-dovecot/dovecot-ceph-plugin
I am not going to repeat everything that is on Github, but here is a short summary:
- CephFS is used for storing mailbox indexes
- E-mails are stored directly as RADOS objects
- It's a Dovecot plugin
We would like everybody to test librmb and report back issues on Github so that further development can be done.
It's not finalized yet, but all help is welcome to make librmb the best solution for storing your e-mails on Ceph with Dovecot.
It would have been nicer if RADOS support had been implemented as a lib-fs driver, and the fs-API had been used everywhere else. Then 1) LibRadosMailBox wouldn't have relied so much on RADOS specifically and 2) fs-rados could have been used for other purposes. There are already fs-dict and dict-fs drivers, so the RADOS dict driver may not have been necessary if fs-rados had been implemented instead (although I didn't check it closely enough to verify). (We've also had fs-rados on our TODO list for a while.)
Please note: librmb is not Dovecot specific. The goal of this library is to abstract email storage on Ceph independently of Dovecot, so that other mail systems can also store emails in RADOS via one library. This is also the reason why it relies on RADOS.
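To make the layering question above concrete, here is a minimal sketch of mail-level code written against a generic object-store interface, so that a RADOS backend would be just one driver among others. This is neither Dovecot's lib-fs API nor librmb's API: `ObjectStore`, `InMemoryStore`, `MailStorage` and the `mail/` key prefix are hypothetical names used only for illustration.

```python
from abc import ABC, abstractmethod


class ObjectStore(ABC):
    """Hypothetical generic backend interface (a stand-in for an fs-style API)."""

    @abstractmethod
    def write(self, key, data):
        """Store the bytes `data` under `key`."""

    @abstractmethod
    def read(self, key):
        """Return the bytes stored under `key`."""


class InMemoryStore(ObjectStore):
    """Toy backend; a RADOS-backed driver would implement the same two methods."""

    def __init__(self):
        self._objects = {}

    def write(self, key, data):
        self._objects[key] = data

    def read(self, key):
        return self._objects[key]


class MailStorage:
    """Mail-level code that depends only on ObjectStore, not on one backend."""

    def __init__(self, store):
        self._store = store

    def save_mail(self, guid, raw):
        # The "mail/" prefix is an assumed naming scheme, not a real layout.
        self._store.write("mail/" + guid, raw)

    def load_mail(self, guid):
        return self._store.read("mail/" + guid)
```

With this shape, swapping the backend means passing a different `ObjectStore` to `MailStorage`; whether the generic layer sits inside the mail server or inside a shared library is exactly the trade-off discussed above.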
[...]
And using rmb-mailbox format, my main worries would be:
- doesn't store index files (= message flags) - not necessarily a problem, as long as you don't want geo-replication
The index files are stored via Dovecot's lib-index on CephFS. This is only an intermediate step. The goal is to also store index data directly in the RADOS/Ceph omap key-value store. Currently geo-replication isn't an important topic for our PoC setup at Deutsche Telekom.
- index corruption means rebuilding them, which means rescanning the list of mail files, which means rescanning the whole RADOS namespace, which practically means rescanning the RADOS pool. That is most likely a very slow operation, which you want to avoid unless it's absolutely necessary. You need to be very careful to avoid that happening, and in general to avoid losing mails in case of crashes or other bugs.
This can currently be avoided, at least partially, by taking snapshots on CephFS. But we will take a look at it during the PoC phase.
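As a rough illustration of why the rebuild is expensive, the sketch below models a pool as a flat listing of (namespace, object id) pairs and rebuilds one user's mail list from it. It is a toy model, not librmb or librados code: `rebuild_mail_list` and the tuple layout are assumptions. The point is only that the work grows with the size of whatever has to be listed, not with the size of the one mailbox being repaired.

```python
def rebuild_mail_list(pool_objects, namespace):
    """Rebuild the list of mail objects for one namespace by scanning a listing.

    pool_objects: iterable of (namespace, object_id) tuples, a hypothetical
    stand-in for listing the objects in a storage pool.
    Returns (sorted object ids for `namespace`, number of objects visited).
    """
    found = []
    scanned = 0
    for ns, object_id in pool_objects:
        scanned += 1  # every listed object is visited, matching or not
        if ns == namespace:
            found.append(object_id)
    return sorted(found), scanned


# Example: three objects in the pool, two of which belong to "alice".
pool = [("alice", "m1"), ("bob", "m2"), ("alice", "m3")]
mails, visited = rebuild_mail_list(pool, "alice")
```

Note that such a scan only recovers which mails exist; message flags that lived solely in the lost index would not come back this way.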
- I think copying/moving mails physically copies the full data on disk
- Each IMAP/POP3/LMTP/etc process connects to RADOS separately from each other - some connection pooling would likely help here
I'm not deeply familiar with what Dovecot is currently doing here. It's still under heavy development, and any comment and feedback is really welcome, as Wido already pointed out.
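For reference, the connection pooling idea mentioned above can be sketched as a fixed set of connections that callers borrow and return. This is a generic in-process pattern, not something taken from Dovecot or librmb: `ConnectionPool` and the `connect` callable are hypothetical, and since IMAP/POP3/LMTP handlers run as separate processes, a real solution would presumably need a shared proxy or service rather than this exact code.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are created once and then reused."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable returning a connection object.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back
```

A caller would write `with pool.connection() as conn:` around each storage operation, so ten operations on a pool of size two still open only two connections.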
Danny