Buckets with different replication factor #838
4 participants
Reference: Deuxfleurs/garage#838
Hi,
I have some data I would like to store in the object store that does not need to be writable at all times, but that should ideally be readable at all times. For that use case, replication_factor = 2 would be enough (if one replica fails, the data becomes read-only: reads still work, writes do not). Currently, however, this can only be set per Garage deployment, not per bucket.
Please add a feature that allows specifying different replication factors for different buckets, so users can choose how important their data is ;)
Thanks!
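To make the availability trade-off in the request above concrete, here is a minimal sketch of the quorum arithmetic behind it. The quorum values in the table are assumptions for illustration: the row for a replication factor of 2 matches the behaviour described above (read-only when one replica fails), while the rows for 1 and 3 are not stated anywhere in this thread. The names `Quorums`, `ASSUMED_QUORUMS` and `availability` are hypothetical and are not part of Garage.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Quorums:
    replicas: int      # number of copies kept of each piece of data
    read_quorum: int   # replicas that must answer for a read to succeed
    write_quorum: int  # replicas that must acknowledge for a write to succeed


# ASSUMPTION: illustrative quorum values per replication factor.
# Only the factor-2 row is backed by the description in this thread.
ASSUMED_QUORUMS = {
    1: Quorums(replicas=1, read_quorum=1, write_quorum=1),
    2: Quorums(replicas=2, read_quorum=1, write_quorum=2),
    3: Quorums(replicas=3, read_quorum=2, write_quorum=2),
}


def availability(replication_factor: int, failed_replicas: int) -> Tuple[bool, bool]:
    """Return (can_read, can_write) when `failed_replicas` replicas are down."""
    q = ASSUMED_QUORUMS[replication_factor]
    alive = max(q.replicas - failed_replicas, 0)
    return alive >= q.read_quorum, alive >= q.write_quorum


if __name__ == "__main__":
    for factor in sorted(ASSUMED_QUORUMS):
        can_read, can_write = availability(factor, failed_replicas=1)
        print(f"factor={factor}: one replica down -> read={can_read}, write={can_write}")
```

Under these assumptions, a bucket with a factor of 2 stores each object twice instead of three times and stays readable with one replica down, which is the trade-off the request asks to be able to choose per bucket.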
An additional thought/use case to this:
This would improve storage efficiency for some large buckets that are fully reproducible and therefore need no replicas, e.g. repository mirrors.
Do you plan on configuring this with S3 Storage Class?
Do you have any idea of the steps involved to implement this feature in Garage?
How could we split this feature request into small pull requests?
What could be the drawbacks of implementing this?
Title changed from "[Feature request] Buckets with different replication factor" to "Buckets with different replication factor"
@quentin to be honest, I'm not sure whether those questions are addressed to me. I don't know how Garage works internally or how it splits the data, so I can't say what work would be needed here :(
As for configuring this feature: from my perspective, I don't care whether it uses an S3 Storage Class compatible API or is configured through Garage's own tooling (the way API keys are configured through Garage rather than through an AWS-compatible API).
Replication factor is not in the S3 API, right? If implemented, the bucket create command might be a good place to have a flag for this.
Alternatively, it could be done by deploying multiple Garage clusters.
@quentin does Garage have an opinion yet on how this should be done? What we've been doing is running multiple Garage clusters as needed. In our case, we're dividing up storage using whole disks, but the design of Garage seems like it might be amenable to shared storage.
I think the overall use case of needing different replication values for different data sets is valid and probably common (we have this use case as well).
IMO the project should cover this in the docs and design intent, and if the intent of Garage is that it should be done by creating multiple clusters, then ideally Garage would have good support for sharing underlying disks/filesystems with other processes (which I think it already does).