Allow anonymous read access to buckets to enable website hosting #6
Reference: Deuxfleurs/garage#6
To enable website hosting, we should start by allowing anonymous access to the bucket.
This could be done either by creating a specific/fake API key or by adding an option to buckets.
I will try to be more precise after reading the code and seeing how AWS does it.
Suggested architecture: open a third endpoint (HTTP server) only for anonymous/public access to buckets configured to serve as static websites. This would clearly separate the semantics of the S3 API (read/write/list files, etc., authenticated) from those of public website access (read-only, no auth).
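A minimal sketch of the access rule such a public endpoint could apply, assuming a per-bucket "website enabled" flag. The names `Bucket`, `website_enabled`, and `web_endpoint_status` are hypothetical and do not come from Garage's code; the status codes are one possible choice, not a decided behavior.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical names throughout: this is not Garage's actual API.
READ_ONLY_METHODS = {"GET", "HEAD"}


@dataclass
class Bucket:
    name: str
    website_enabled: bool = False  # assumed per-bucket option


def web_endpoint_status(method: str, bucket: Optional[Bucket]) -> int:
    """Return the HTTP status the public web endpoint would answer with."""
    if method not in READ_ONLY_METHODS:
        # The web endpoint is read-only: no write, no list, no delete.
        return 405
    if bucket is None or not bucket.website_enabled:
        # Same answer for "missing" and "private", so that the endpoint
        # does not reveal which private buckets exist.
        return 404
    return 200
```

The authenticated S3 API endpoint would keep its own, separate permission checks; this function only covers the anonymous side.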
AWS seems to use an independent endpoint for websites, similar to your recommendation:
https://docs.aws.amazon.com/fr_fr/AmazonS3/latest/dev/WebsiteEndpoints.html
Next, I will see how website management is implemented in S3.
It might be possible to implement this as an S3 API (not necessarily the option we want).
Here are the relevant endpoints:
Some other endpoints, not mandatory, but that could be of interest later:
If we choose this solution, creating a website would be as simple as running:
where example-bucket is the bucket.
We could then expose websites on a specific port, as suggested by lx; that port would be bound to a specific domain in our reverse proxy. It is not so simple, however...
If we publish buckets as site.deuxfleurs.fr/example-bucket, we might be exposed to security risks if people use cookies and similar features that are bound to a domain name.
If we publish buckets as example-bucket.site.deuxfleurs.fr, we could use wildcards at the proxy and ACME/Let's Encrypt level. We would need to check that this can be done in practice.
Finally, if we want to be as generic as possible and support arbitrary websites like example.com, we should use the Consul Catalog feature of Traefik: when PutBucketWebsite is called, an entry is inserted in Consul KV to inform Traefik of the website. In particular, two URLs must be registered: example.com and example.com.site.deuxfleurs.fr. Similarly, for example-bucket, both example-bucket and example-bucket.site.deuxfleurs.fr must be registered. In the second case, we can see that Traefik will never be able to register for the domain name example-bucket. We must be careful with that and check that it will not exhaust the Let's Encrypt rate limits.
Otherwise, we could check, before adding the domain name, that its DNS entry is example.com CNAME site.deuxfleurs.fr.
Not related to this issue, but here are some other endpoints we could implement:
Not sure it is a good idea.
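Coming back to the host-name schemes discussed above, here is a rough sketch of how a request's Host header could be mapped to a bucket name, and which names would be worth registering with the proxy. The function names are hypothetical, and the rule "only register the bare name if it contains a dot" is an assumption meant to illustrate the example-bucket case, not an agreed design.

```python
from typing import List

ROOT_DOMAIN = "site.deuxfleurs.fr"  # domain used in the discussion above


def bucket_for_host(host: str) -> str:
    """Map a Host header value to the bucket name it designates."""
    # Drop an optional port, normalise case and a trailing dot.
    # (IPv6 literals are not handled in this sketch.)
    host = host.split(":", 1)[0].lower().rstrip(".")
    suffix = "." + ROOT_DOMAIN
    if host.endswith(suffix):
        # example-bucket.site.deuxfleurs.fr -> example-bucket
        return host[: -len(suffix)]
    # example.com -> example.com (bucket named after the domain)
    return host


def names_to_register(bucket: str) -> List[str]:
    """List the host names to announce to the reverse proxy for a bucket."""
    names = [f"{bucket}.{ROOT_DOMAIN}"]
    if "." in bucket:
        # Assumption: a bucket name without a dot (e.g. example-bucket)
        # cannot be a public domain, so requesting a certificate for it
        # would only produce failed ACME attempts.
        names.insert(0, bucket)
    return names
```

For instance, `names_to_register("example-bucket")` yields only the name under site.deuxfleurs.fr, which avoids asking Let's Encrypt for a certificate that can never be issued.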
So my minimal proposition to start:
What do you think of this plan, LX? If you validate it, I will open a [WIP] PR to track my progress :-)
In my opinion, a minimal proposition would not even contain the implementation of the {Put,Get,Delete}BucketWebsite endpoints, and would only allow configuration using the command line interface. This is currently what we are doing to create buckets and configure access keys (we don't have PutBucket or such). This requires manual intervention to configure every new website; however, given the small number of websites hosted on Deuxfleurs, this is probably an acceptable cost to begin with.
Exposing API endpoints that allow the user to create or configure buckets should be a separate issue. We probably need to give more thought to the permission model we want to implement before we do that.
So my plan for a minimal implementation would only be:
Alternatively, a 1-indirection-layer option exists: use a separate table to store website configuration, so that website host names do not need to match bucket names, and several website host names can be served by the same bucket.
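To make the difference concrete, here is a small sketch contrasting the two lookups. All names (`resolve_direct`, `resolve_indirect`, `website_table`) are hypothetical, and in-memory Python containers stand in for whatever table storage would actually be used.

```python
from typing import Dict, Optional, Set


def resolve_direct(host: str, buckets: Set[str]) -> Optional[str]:
    """No indirection: the website host name IS the bucket name."""
    return host if host in buckets else None


def resolve_indirect(host: str, website_table: Dict[str, str]) -> Optional[str]:
    """One indirection layer: a separate table maps host names to buckets."""
    return website_table.get(host)


# With the indirection table, several host names can share one bucket,
# and host names no longer need to match bucket names.
website_table = {
    "example.com": "example-bucket",
    "www.example.com": "example-bucket",
}
```

The direct variant needs no extra table, at the cost of forcing one bucket per host name.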
Agreed, let's start without implementing the S3 API. To stay as close as possible to S3, I will start by not adding the 1-indirection-layer. We will see later if we need it, but having the same website listening on multiple domain names is bad practice in terms of SEO (redirecting to a main domain name is preferred).