<div dir="ltr">Thanks Jack.<div><br></div><div>That's good to know. It is definitely something to consider.</div><div>In a distributed storage scenario we might build a dedicated pool for that and tune the pool as more capacity or performance is needed.</div><div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><br></div>Regards,<div><br></div><div>Webert Lima</div><div>DevOps Engineer at MAV Tecnologia</div><div><b>Belo Horizonte - Brasil</b></div><div><b>IRC NICK - WebertRLZ</b></div></div></div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Wed, May 16, 2018 at 4:45 PM Jack <<a href="mailto:ceph@jack.fr.eu.org">ceph@jack.fr.eu.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 05/16/2018 09:35 PM, Webert de Souza Lima wrote:<br>
> We'll soon do benchmarks of sdbox vs mdbox over cephfs with a bluestore
> backend.
> We'll have to do some work on how to simulate user traffic, for writes
> and reads. That seems troublesome.
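
(For simulating that kind of user traffic, one option is Dovecot's own
imaptest tool; a minimal invocation sketch, with placeholder host,
credentials and source mailbox, and with flags to be double-checked against
the imaptest documentation:

  imaptest host=127.0.0.1 user=testuser pass=secret mbox=testmbox \
    clients=50 secs=600

The client count, duration and message mix would need tuning to match real
read/write ratios.)
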
I would appreciate seeing these results!

> Thanks for the plugin recommendations. I'll take the chance and ask you:
> what is the SIS status? We have used it in the past and we had some
> problems with it.

I have been using it since Dec 2016 with mdbox, with no issues at all (I am
currently using Dovecot 2.2.27-3 from Debian Stretch).
The only config I set is mail_attachment_dir; the rest is left at the
defaults (mail_attachment_min_size = 128k, mail_attachment_fs = sis posix,
mail_attachment_hash = %{sha1}).
The backend storage is a local filesystem, and there is only one Dovecot
instance.
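
(Written out as dovecot.conf lines, that setup is roughly the sketch below;
the attachment directory path is only an example:

  mail_attachment_dir = /srv/mail/attachments
  # the rest are the defaults mentioned above
  mail_attachment_min_size = 128k
  mail_attachment_fs = sis posix
  mail_attachment_hash = %{sha1}
)
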

>
> Regards,
>
> Webert Lima
> DevOps Engineer at MAV Tecnologia
> *Belo Horizonte - Brasil*
> *IRC NICK - WebertRLZ*
>
>
> On Wed, May 16, 2018 at 4:19 PM Jack <ceph@jack.fr.eu.org> wrote:
>
>> Hi,
>>
>> Many (most?) filesystems do not store multiple files in the same block.
>>
>> Thus, with sdbox, every single mail (you know, that kind of mail with 10
>> lines in it) will eat an inode and a block (4k here);
>> mdbox is more compact in this regard.
>>
>> Another difference: sdbox removes the message file, mdbox does not: a single
>> metadata update is performed, which may be packed with others if many
>> messages are deleted at once.
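
(A rough illustration with made-up numbers: 1 million mails of ~2 KiB each
under sdbox means ~1 million inodes plus ~4 GB of 4k blocks, whereas mdbox
packs the same mails into far fewer, larger files totalling roughly 2 GB,
plus only a handful of inodes.)
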
>>
>> That said, I do not have experience with dovecot + cephfs, nor have I run
>> tests of sdbox vs mdbox.
>>
>> However, and this is a bit off topic, I recommend you look at the
>> following Dovecot features (if you have not already), as they are awesome
>> and will help you a lot (a quick config sketch follows this list):
>> - Compression (classic, https://wiki.dovecot.org/Plugins/Zlib)
>> - Single-Instance-Storage (aka SIS, aka "attachment deduplication":
>> https://www.dovecot.org/list/dovecot/2013-December/094276.html)
>>
>> Regards,
>> On 05/16/2018 08:37 PM, Webert de Souza Lima wrote:
>>> I'm sending this message to both the dovecot and ceph-users MLs, so please
>>> don't mind if something seems too obvious to you.
>>>
>>> Hi,
>>>
>>> I have a question for both the dovecot and ceph lists, and below I'll
>>> explain what's going on.
>>>
>>> Regarding the dbox format (https://wiki2.dovecot.org/MailboxFormat/dbox),
>>> when using sdbox, a new file is stored for each email message.
>>> When using mdbox, multiple messages are appended to a single file until it
>>> reaches/passes the rotate limit.
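
(For context, a minimal sketch of the relevant dovecot.conf knobs; the path
and sizes are examples only, not recommendations:

  mail_location = mdbox:~/mdbox
  mdbox_rotate_size = 10M
  mdbox_rotate_interval = 0
)
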
>>>
>>> I would like to better understand how the mdbox format impacts IO
>>> performance.
>>> I think it's generally expected that fewer, larger files translate to less
>>> IO and more throughput compared to many small files, but how does dovecot
>>> handle that with mdbox?
>>> If dovecot flushes data to storage each time a new email arrives and is
>>> appended to the corresponding file, would that mean it generates the same
>>> amount of IO as it would with one file per message?
>>> Also, when using mdbox, many messages are appended to a given file before a
>>> new file is created. That should mean that a file descriptor is kept open
>>> for some time by the dovecot process.
>>> Using cephfs as the backend, how would this impact cluster performance
>>> regarding MDS caps and cached inodes when files from thousands of users are
>>> opened and appended to all over the place?
>>>
>>> I would like to understand this better.
>>>
>>> Why?
>>> We are a small Business Email Hosting provider with bare-metal, self-hosted
>>> systems, using dovecot to serve mailboxes and cephfs for email storage.
>>>
>>> We are currently working on a dovecot and storage redesign to be in
>>> production ASAP. The main objective is to serve more users with better
>>> performance, high availability and scalability.
>>> * high availability and load balancing are extremely important to us *
>>>
>>> In our current model, we're using the mdbox format with dovecot, with
>>> dovecot's INDEXes stored in a replicated pool of SSDs, and messages stored
>>> in a replicated pool of HDDs (under a cache tier with a pool of SSDs).
>>> Everything uses the cephfs / filestore backend.
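
(One way to reproduce that split on a single cephfs, sketched with made-up
pool and path names and untested here, is a per-directory file layout for
the index tree plus dovecot's INDEX location:

  ceph fs add_data_pool cephfs mail-index-ssd
  setfattr -n ceph.dir.layout.pool -v mail-index-ssd /mnt/cephfs/indexes

and in dovecot.conf:

  mail_location = mdbox:/mnt/cephfs/mail/%d/%n:INDEX=/mnt/cephfs/indexes/%d/%n
)
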
>>>
>>> Currently there are 3 clusters running dovecot 2.2.34 and ceph Jewel
>>> (10.2.9-4).
>>> - ~25K users from a few thousand domains per cluster
>>> - ~25TB of email data per cluster
>>> - ~70GB of dovecot INDEX [meta]data per cluster
>>> - ~100MB of cephfs metadata per cluster
>>>
>>> Our goal is to build a single ceph cluster for storage that can expand in
>>> capacity, be highly available and perform well enough. I know, that's what
>>> everyone wants.
>>>
>>> Cephfs is an important choice because:
>>> - there can be multiple mountpoints, thus multiple dovecot instances on
>>> different hosts
>>> - the same storage backend is used for all dovecot instances
>>> - no need to shard domains
>>> - dovecot is easily load balanced (with the director sticking users to the
>>> same dovecot backend; a minimal sketch follows this list)
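
(A minimal sketch of the director side of that, assuming Dovecot 2.2 and
made-up addresses:

  # ring of director processes
  director_servers = 10.0.0.11 10.0.0.12
  # dovecot backends the director may assign users to
  director_mail_servers = 10.0.1.21 10.0.1.22

Login proxying itself is then enabled through a passdb that returns proxy=y,
as described in the Dovecot director documentation.)
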
>>>
>>> In the upcoming upgrade we intend to:
>>> - upgrade ceph to 12.X (Luminous)
>>> - drop the SSD cache tier (because it's deprecated; see the sketch below)
>>> - use the bluestore engine
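
(Dropping a writeback cache tier generally boils down to the steps below;
pool names are placeholders and this should obviously be rehearsed on a test
cluster first:

  ceph osd tier cache-mode cache-ssd forward --yes-i-really-mean-it
  rados -p cache-ssd cache-flush-evict-all
  ceph osd tier remove-overlay mail-hdd
  ceph osd tier remove mail-hdd cache-ssd
)
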
>>>
>>> I was told on freenode/#dovecot that there are many cases where sdbox
>>> performs better with NFS sharing.
>>> In the case of cephfs, at first I wouldn't think that would be true,
>>> because more files == more generated IO, but given what I said at the
>>> beginning regarding sdbox vs mdbox, that assumption could be wrong.
>>>
>>> Any thoughts will be highly appreciated.
>>>
>>> Regards,
>>>
>>> Webert Lima
>>> DevOps Engineer at MAV Tecnologia
>>> *Belo Horizonte - Brasil*
>>> *IRC NICK - WebertRLZ*