Hi, I'm bringing this up again to ask one more question:

What would be the recommended locking strategy for dovecot on cephfs?
This is a load-balanced setup using independent director instances, but all dovecot instances on each node share the same storage system (cephfs).

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
Belo Horizonte - Brasil
IRC NICK - WebertRLZ


On Wed, May 16, 2018 at 5:15 PM Webert de Souza Lima <webert.boss@gmail.com> wrote:
Thanks Jack.

That's good to know. It is definitely something to consider.
In a distributed storage scenario we might build a dedicated pool for that and tune the pool as more capacity or performance is needed.

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
Belo Horizonte - Brasil
IRC NICK - WebertRLZ


On Wed, May 16, 2018 at 4:45 PM Jack <ceph@jack.fr.eu.org> wrote:
On 05/16/2018 09:35 PM, Webert de Souza Lima wrote:
> We'll soon do benchmarks of sdbox vs mdbox over cephfs with bluestore
> backend.
> We'll have to do some work on how to simulate user traffic, for writes
> and reads. That seems troublesome.
I would appreciate seeing these results!

> Thanks for the plugin recommendations. I'll take the chance to ask you:
> what is the status of SIS? We have used it in the past and we've had some
> problems with it.

I have been using it since Dec 2016 with mdbox, with no issues at all (I am
currently using Dovecot 2.2.27-3 from Debian Stretch).
The only setting I configure is mail_attachment_dir; the rest is left at the
defaults (mail_attachment_min_size = 128k, mail_attachment_fs = sis posix,
mail_attachment_hash = %{sha1}).
The backend storage is a local filesystem, and there is only one Dovecot
instance.
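
As a rough sketch, the setup described above might look like the following
Dovecot configuration fragment. The attachment directory path is a
hypothetical placeholder, not taken from this thread; the other three values
are simply the defaults quoted above, written out for clarity.

```
# Directory for externally stored attachments (hypothetical path;
# setting this is what enables attachment storage outside the mail files)
mail_attachment_dir = /var/vmail/attachments

# The remaining values are the defaults mentioned above and can be omitted
mail_attachment_min_size = 128k
mail_attachment_fs = sis posix
mail_attachment_hash = %{sha1}
```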

>
> Regards,
>
> Webert Lima
> DevOps Engineer at MAV Tecnologia
> *Belo Horizonte - Brasil*
> *IRC NICK - WebertRLZ*
>
>
> On Wed, May 16, 2018 at 4:19 PM Jack <ceph@jack.fr.eu.org> wrote:
>
>> Hi,
>>
>> Many (most?) filesystems do not store multiple files in the same block
>>
>> Thus, with sdbox, every single mail (you know, that kind of mail with 10
>> lines in it) will eat an inode, and a block (4k here)
>> mdbox is more compact in this respect
>>
>> Another difference: on deletion, sdbox removes the message file, while
>> mdbox does not: a single metadata update is performed, which may be packed
>> with others if many messages are deleted at once
>>
>> That said, I do not have experience with dovecot + cephfs, nor have I run
>> tests for sdbox vs mdbox
>>
>> However, and this is a bit off topic, I recommend you look at the
>> following dovecot features (if not already done), as they are awesome
>> and will help you a lot:
>> - Compression (classic, https://wiki.dovecot.org/Plugins/Zlib)
>> - Single-Instance-Storage (aka sis, aka "attachment deduplication" :
>> https://www.dovecot.org/list/dovecot/2013-December/094276.html)
>>
>> Regards,
>> On 05/16/2018 08:37 PM, Webert de Souza Lima wrote:
>>> I'm sending this message to both dovecot and ceph-users ML so please don't
>>> mind if something seems too obvious for you.
>>>
>>> Hi,
>>>
>>> I have a question for both dovecot and ceph lists and below I'll explain
>>> what's going on.
>>>
>>> Regarding dbox format (https://wiki2.dovecot.org/MailboxFormat/dbox), when
>>> using sdbox, a new file is stored for each email message.
>>> When using mdbox, multiple messages are appended to a single file until it
>>> reaches/passes the rotate limit.
>>>
>>> I would like to understand better how the mdbox format affects IO
>>> performance.
>>> I think it's generally expected that fewer, larger files translate to less
>>> IO and more throughput when compared to many small files, but how does
>>> dovecot handle that with mdbox?
>>> If dovecot flushes data to storage each time a new email arrives and is
>>> appended to the corresponding file, would that mean that it generates the
>>> same amount of IO as it would with one file per message?
>>> Also, when using mdbox, many messages will be appended to a given file
>>> before a new file is created. That should mean that a file descriptor is
>>> kept open for some time by the dovecot process.
>>> Using cephfs as backend, how would this impact cluster performance
>>> regarding MDS caps and cached inodes when files from thousands of users
>>> are opened and appended to all over?
>>>
>>> I would like to understand this better.
>>>
>>> Why?
>>> We are a small Business Email Hosting provider with bare metal, self-hosted
>>> systems, using dovecot for servicing mailboxes and cephfs for email storage.
>>>
>>> We are currently working on dovecot and storage redesign to be in
>>> production ASAP. The main objective is to serve more users with better
>>> performance, high availability and scalability.
>>> * high availability and load balancing is extremely important to us *
>>>
>>> In our current model, we're using mdbox format with dovecot, having
>>> dovecot's INDEXes stored in a replicated pool of SSDs, and messages stored
>>> in a replicated pool of HDDs (under a Cache Tier with a pool of SSDs).
>>> All using cephfs / filestore backend.
>>>
>>> Currently there are 3 clusters running dovecot 2.2.34 and ceph Jewel
>>> (10.2.9-4).
>>>  - ~25K users from a few thousand domains per cluster
>>>  - ~25TB of email data per cluster
>>>  - ~70GB of dovecot INDEX [meta]data per cluster
>>>  - ~100MB of cephfs metadata per cluster
>>>
>>> Our goal is to build a single ceph cluster for storage that could expand in
>>> capacity, be highly available and perform well enough. I know, that's what
>>> everyone wants.
>>>
>>> Cephfs is an important choice because:
>>>  - there can be multiple mountpoints, thus multiple dovecot instances on
>>> different hosts
>>>  - the same storage backend is used for all dovecot instances
>>>  - no need to shard domains
>>>  - dovecot is easily load balanced (with director sticking users to the
>>> same dovecot backend)
>>>
>>> In the upcoming upgrade we intend to:
>>>  - upgrade ceph to 12.X (Luminous)
>>>  - drop the SSD Cache Tier (because it's deprecated)
>>>  - use bluestore engine
>>>
>>> I was told on freenode/#dovecot that there are many cases where SDBOX would
>>> perform better with NFS sharing.
>>> In the case of cephfs, at first, I wouldn't think that would be true because
>>> more files == more generated IO, but thinking about what I said in the
>>> beginning regarding sdbox vs mdbox, that could be wrong.
>>>
>>> Any thoughts will be highly appreciated.
>>>
>>> Regards,
>>>
>>> Webert Lima
>>> DevOps Engineer at MAV Tecnologia
>>> *Belo Horizonte - Brasil*
>>> *IRC NICK - WebertRLZ*
>>>
>>>
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>