[ceph-users] dovecot + cephfs - sdbox vs mdbox

Webert de Souza Lima webert.boss at gmail.com
Thu Oct 4 15:43:05 EEST 2018


Hi, bringing this up again to ask one more question:

what would be the best recommended locking strategy for dovecot against
cephfs?
this is a load-balanced setup using independent director instances, but all
dovecot instances on each node share the same storage system (cephfs).
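
For context, these are the knobs I'm currently looking at, borrowed from
dovecot's shared-filesystem guidance (the values below are only what I'm
considering, not a recommendation; whether they fit cephfs is exactly my
question):

  lock_method = fcntl
  mmap_disable = yes
  mail_fsync = optimized
  mail_nfs_storage = no
  mail_nfs_index = no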

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


On Wed, May 16, 2018 at 5:15 PM Webert de Souza Lima <webert.boss at gmail.com>
wrote:

> Thanks Jack.
>
> That's good to know. It is definitely something to consider.
> In a distributed storage scenario we might build a dedicated pool for that
> and tune the pool as more capacity or performance is needed.
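>
> Roughly the kind of thing I have in mind (just a sketch; the pool name,
> PG count, replica size and path are made-up values):
>
>   ceph osd pool create mail-attachments 128
>   ceph osd pool set mail-attachments size 3
>   ceph fs add_data_pool cephfs mail-attachments
>   setfattr -n ceph.dir.layout.pool -v mail-attachments /mnt/cephfs/attachments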
>
> Regards,
>
> Webert Lima
> DevOps Engineer at MAV Tecnologia
> *Belo Horizonte - Brasil*
> *IRC NICK - WebertRLZ*
>
>
> On Wed, May 16, 2018 at 4:45 PM Jack <ceph at jack.fr.eu.org> wrote:
>
>> On 05/16/2018 09:35 PM, Webert de Souza Lima wrote:
>> > We'll soon do benchmarks of sdbox vs mdbox over cephfs with the
>> > bluestore backend.
>> > We'll have to do some work on how to simulate user traffic, for writes
>> > and reads. That seems troublesome.
>> I would appreciate seeing these results !
>>
>> > Thanks for the plugin recommendations. I'll take the chance and ask:
>> > how is the SIS status? We have used it in the past and had some
>> > problems with it.
>>
>> I have been using it since Dec 2016 with mdbox, with no issues at all (I
>> am currently using Dovecot 2.2.27-3 from Debian Stretch).
>> The only setting I change is mail_attachment_dir; the rest is left at the
>> defaults (mail_attachment_min_size = 128k, mail_attachment_fs = sis posix,
>> mail_attachment_hash = %{sha1})
>> The backend storage is a local filesystem, and there is only one Dovecot
>> instance
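>>
>> For reference, the relevant part of my configuration looks roughly like
>> this (the mail_attachment_dir path is just an example, not my real one):
>>
>>   mail_attachment_dir = /var/vmail/attachments
>>   mail_attachment_min_size = 128k
>>   mail_attachment_fs = sis posix
>>   mail_attachment_hash = %{sha1}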
>>
>> >
>> > Regards,
>> >
>> > Webert Lima
>> > DevOps Engineer at MAV Tecnologia
>> > *Belo Horizonte - Brasil*
>> > *IRC NICK - WebertRLZ*
>> >
>> >
>> > On Wed, May 16, 2018 at 4:19 PM Jack <ceph at jack.fr.eu.org> wrote:
>> >
>> >> Hi,
>> >>
>> >> Many (most?) filesystems do not store multiple files in the same
>> >> block.
>> >>
>> >> Thus, with sdbox, every single mail (you know, that kind of mail with
>> >> 10 lines in it) will eat an inode and a block (4k here); mdbox is more
>> >> compact in this regard.
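>> >> (Back-of-the-envelope example: one million such small mails in sdbox
>> >> means one million inodes plus roughly 4 GB of 4k blocks, even if the
>> >> messages themselves add up to far less.)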
>> >>
>> >> Another difference: on deletion, sdbox removes the file while mdbox
>> >> does not: a single metadata update is performed, which may be batched
>> >> with others if many messages are deleted at once.
>> >>
>> >> That said, I have no experience with dovecot + cephfs, nor have I run
>> >> tests of sdbox vs mdbox.
>> >>
>> >> However, and this is a bit off topic, I recommend you look at the
>> >> following dovecot features (if you have not already), as they are
>> >> awesome and will help you a lot:
>> >> - Compression (classic, https://wiki.dovecot.org/Plugins/Zlib; a
>> >>   minimal config sketch follows this list)
>> >> - Single-Instance-Storage (aka SIS, aka "attachment deduplication":
>> >>   https://www.dovecot.org/list/dovecot/2013-December/094276.html)
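>> >>
>> >> For the compression plugin, a minimal config sketch, roughly what the
>> >> wiki page above describes (the compression level value is just an
>> >> example):
>> >>
>> >>   mail_plugins = $mail_plugins zlib
>> >>   plugin {
>> >>     zlib_save = gz
>> >>     zlib_save_level = 6
>> >>   }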
>> >>
>> >> Regards,
>> >> On 05/16/2018 08:37 PM, Webert de Souza Lima wrote:
>> >>> I'm sending this message to both the dovecot and ceph-users mailing
>> >>> lists, so please don't mind if something seems too obvious to you.
>> >>>
>> >>> Hi,
>> >>>
>> >>> I have a question for both the dovecot and ceph lists; below I'll
>> >>> explain what's going on.
>> >>>
>> >>> Regarding the dbox format
>> >>> (https://wiki2.dovecot.org/MailboxFormat/dbox): when using sdbox, a
>> >>> new file is stored for each email message. When using mdbox, multiple
>> >>> messages are appended to a single file until it reaches/passes the
>> >>> rotate limit.
>> >>>
>> >>> I would like to better understand how the mdbox format impacts IO
>> >>> performance.
>> >>> I think it's generally expected that fewer, larger files translate to
>> >>> less IO and more throughput compared to many small files, but how
>> >>> does dovecot handle that with mdbox?
>> >>> If dovecot flushes data to storage every time a new email arrives and
>> >>> is appended to the corresponding file, would that mean it generates
>> >>> the same amount of IO as it would with one file per message?
>> >>> Also, with mdbox many messages will be appended to a given file
>> >>> before a new file is created. That should mean that a file descriptor
>> >>> is kept open for some time by the dovecot process.
>> >>> Using cephfs as the backend, how would this impact cluster
>> >>> performance regarding MDS caps and cached inodes when files from
>> >>> thousands of users are opened and appended to all over?
>> >>>
>> >>> I would like to understand this better.
>> >>>
>> >>> Why?
>> >>> We are a small business email hosting provider with bare-metal,
>> >>> self-hosted systems, using dovecot to serve mailboxes and cephfs for
>> >>> email storage.
>> >>>
>> >>> We are currently working on a dovecot and storage redesign to be in
>> >>> production ASAP. The main objective is to serve more users with better
>> >>> performance, high availability and scalability.
>> >>> * high availability and load balancing are extremely important to us *
>> >>>
>> >>> In our current model, we're using the mdbox format with dovecot, with
>> >>> dovecot's INDEXes stored in a replicated pool of SSDs and messages
>> >>> stored in a replicated pool of HDDs (behind a cache tier with a pool
>> >>> of SSDs). All of this uses cephfs with the filestore backend.
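>> >>> (The split is done roughly like this; the paths below are
>> >>> illustrative, not our real layout:
>> >>>   mail_location = mdbox:/cephfs/mail/%d/%n:INDEX=/cephfs/index/%d/%n
>> >>> with /cephfs/index mapped to the SSD pool and /cephfs/mail mapped to
>> >>> the HDD pool via the ceph.dir.layout.pool attribute.)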
>> >>>
>> >>> Currently there are 3 clusters running dovecot 2.2.34 and ceph Jewel
>> >>> (10.2.9-4).
>> >>>  - ~25K users from a few thousand domains per cluster
>> >>>  - ~25TB of email data per cluster
>> >>>  - ~70GB of dovecot INDEX [meta]data per cluster
>> >>>  - ~100MB of cephfs metadata per cluster
>> >>>
>> >>> Our goal is to build a single ceph cluster for storage that could
>> >>> expand in capacity, be highly available and perform well enough. I
>> >>> know, that's what everyone wants.
>> >>>
>> >>> Cephfs is an important choice because:
>> >>>  - there can be multiple mountpoints, thus multiple dovecot instances
>> >>>    on different hosts
>> >>>  - the same storage backend is used for all dovecot instances
>> >>>  - no need to shard domains
>> >>>  - dovecot is easily load balanced (with director sticking users to
>> >>>    the same dovecot backend; see the small sketch after this list)
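>> >>> (Director sketch, with placeholder addresses:
>> >>>   director_servers = 10.0.0.1 10.0.0.2
>> >>>   director_mail_servers = 10.0.1.1 10.0.1.2
>> >>> so each user is consistently hashed to the same dovecot backend.)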
>> >>>
>> >>> In the upcoming upgrade we intend to:
>> >>>  - upgrade ceph to 12.X (Luminous)
>> >>>  - drop the SSD cache tier (because it's deprecated)
>> >>>  - use the bluestore engine
>> >>>
>> >>> I was told on freenode/#dovecot that there are many cases where sdbox
>> >>> would perform better with NFS sharing.
>> >>> In the case of cephfs, at first I wouldn't think that would be true,
>> >>> because more files == more generated IO, but considering what I said
>> >>> at the beginning regarding sdbox vs mdbox, that could be wrong.
>> >>>
>> >>> Any thoughts will be highly appreciated.
>> >>>
>> >>> Regards,
>> >>>
>> >>> Webert Lima
>> >>> DevOps Engineer at MAV Tecnologia
>> >>> *Belo Horizonte - Brasil*
>> >>> *IRC NICK - WebertRLZ*
>> >>>
>> >>>
>> >>>
>> >>
>> >
>>
>>