On 17 Oct 2023, at 13:12, Marc <Marc@f1-outsourcing.eu> wrote:
Isn't S3 too slow for this?
I think the clue is in the name "s3-compatible".
Clearly calling out to "real" (AWS) S3 would be a non-starter.
But a local installation of something like Ceph, MinIO or whatever on the same LAN? I'd think that should be workable, no?
Do you know of anything that does this reliably?
I tested this a few years ago with Ceph[1], but at that point there were some issues where it had a 2x write amplification (on top of the 3x), if I remember correctly.
All of this, if not a dead end, will bring a lot of complexity and inefficiency and waste a lot of money. Only the application knows how to do things efficiently and consistently.
S3-compatible storage is very good for multi-server installations where you need redundancy and availability. S3 is basically an HTTP server, so you can code your own logic on top of the stored emails: balancers, caches, deduplication, compression, encryption. It doesn't need to be off-the-shelf storage.
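To make "code your own logic" a bit more concrete, here is a minimal Python sketch of an application-side layer that adds whole-message deduplication (content-addressed keys) and compression in front of an object store. Everything in it is an assumption for illustration: `InMemoryBucket` is a hypothetical stand-in for an S3-compatible bucket so the example runs without a server, and `MailStore`, the `mail/` key prefix and the choice of SHA-256 and zlib are not taken from any existing product.

```python
import hashlib
import zlib


class InMemoryBucket:
    """Hypothetical stand-in for an S3-compatible bucket (PUT/GET by key)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, body):
        self.objects[key] = body

    def get(self, key):
        return self.objects[key]

    def exists(self, key):
        return key in self.objects


class MailStore:
    """Application-side layer: content-addressed keys plus compression."""

    def __init__(self, bucket):
        self.bucket = bucket

    def save(self, raw_message):
        # The key is derived from the content, so an identical message
        # maps to the same object and is stored only once.
        key = "mail/" + hashlib.sha256(raw_message).hexdigest()
        if not self.bucket.exists(key):
            self.bucket.put(key, zlib.compress(raw_message))
        return key

    def load(self, key):
        return zlib.decompress(self.bucket.get(key))
```

Saving the same raw message twice returns the same key and leaves a single compressed object in the bucket. With a real S3-compatible endpoint, the `put`/`get`/`exists` calls would become HTTP requests, which is where the latency question raised earlier in the thread comes in.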
The problem is partly what everyone understands by S3. I also associate it with an HTTP endpoint on object storage. But the Ceph plugin skips HTTP and talks directly to the object store. I don't think you would want to operate at the HTTP level. Looking at this Ceph page[1], it also looks like you do not want to get yourself involved in deduplication.
[1] https://docs.ceph.com/en/reef/dev/deduplication/
Moreover, following Filip's remark about block deduplication, any kind of deduplication that is not optimized for the email case (where attachments are always embedded in slightly different documents) would be ineffective. Is it really worth bothering to deploy a whole Ceph cluster for that?
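A small Python sketch of what "optimized for the email case" could mean, using only the standard library `email` package: two messages carry the same attachment inside different surrounding text, so a hash over the whole message differs, while a hash over the decoded attachment matches. The helper names `build_message` and `attachment_digests`, the addresses and the file name are made up for the illustration; this is not how any particular mail server does it.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_message(body_text, attachment):
    """Build a raw message with one binary attachment (illustrative only)."""
    msg = EmailMessage()
    msg["From"] = "alice@example.org"
    msg["To"] = "bob@example.org"
    msg["Subject"] = "report"
    msg.set_content(body_text)
    msg.add_attachment(
        attachment,
        maintype="application",
        subtype="octet-stream",
        filename="report.bin",
    )
    return msg.as_bytes()


def attachment_digests(raw):
    """Return the SHA-256 of each decoded attachment in a raw message."""
    msg = message_from_bytes(raw, policy=policy.default)
    return [
        hashlib.sha256(part.get_payload(decode=True)).hexdigest()
        for part in msg.iter_attachments()
    ]
```

Calling `attachment_digests` on two messages built with different body text but the same attachment bytes gives equal lists, even though the raw messages themselves are different. A storage layer that only sees opaque objects or blocks has no access to this MIME structure, which is the point made above.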