All questions / remarks we have (FOSDEM, HN, etc.) #228

Closed
opened 2022-02-06 10:28:11 +00:00 by quentin · 0 comments
Owner

> Do you support active-active across different zones?

**My suggestion**: We might add a section to our website docs explaining that Garage's design is totally different from traditional software. Traditional designs are based on local clusters, one per zone, that are sometimes federated/reconciled/replicated across different zones. So these systems are designed locally and then patched up to work across distant locations.

Instead, Garage is designed from the start to be distributed across distant locations. We do not have "active-active" or "active-passive" replication, because such mechanisms exist to fix local-first clusters.

> But will the retrieval of a recently synced file take longer on a GET if there is a network delay?

We had a lot of questions about delays. They should be answered by our blog post on delays with the minio benchmark. The title could be something like "The network is slow, so do not worsen it with your app".

> Very interesting! Does it only support copies, or can it do erasure coding?

Maybe we could do a FAQ with these questions?

> It recommends at least a 50 Mbps connection, but how would it handle asymmetric connections with very low upload (10-20 Mbit/s) when performance is not that important? Does it download a file from multiple nodes at once in parallel?

I put that figure there without thinking about it too much. We could say 10 Mbit/s up and down, and more if you want better performance.

> Is there a downside to having just one node per location?

I think we should have a page in our docs with various deployment examples and their properties: a single-node deployment, a 3-zones-one-server-per-zone deployment, a deployment with many zones and the drawbacks it has, etc.

> Do I need to trust all node hosts? Or just the endpoint which I'm talking to?

We could have a security model page.

> So if in a 3-node cluster no write can happen, can we still read at that point? To be clear, in the 3-node cluster, if 2 nodes are down.

We could have a page explaining "read-after-write" consistency and a page explaining how requests are handled.

> Would you consider adding parallel downloading, or is it not possible with the way the system is designed? I imagine it would improve performance a lot when downloading from remote locations.

Should be covered by a write-up about our download/request scheduling policy.

> FolderSync on Android does not work (`AuthorizationHeaderMalformed; Request ID: null)`).

You must configure Garage with your region set to `us-east-1`, as it is hardcoded in the application. We might document that.
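For reference, a minimal sketch of the relevant section of `garage.toml`, assuming the `s3_region` key under `[s3_api]` controls the region Garage advertises (check the reference configuration for the exact key names and defaults):

```toml
# Sketch of the S3 API section of garage.toml (key names assumed,
# see the reference configuration). Setting the region to "us-east-1"
# accommodates clients such as FolderSync that hardcode that region.
[s3_api]
api_bind_addr = "[::]:3900"
s3_region = "us-east-1"
root_domain = ".s3.garage"
```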

> Is the network encrypted? What protocol do you use between nodes?

Garage already encrypts the traffic between nodes. The protocol used is named [Secret Handshake](https://dominictarr.github.io/secret-handshake-paper/shs.pdf). Your secret is the `rpc_secret` you define in your configuration. To join a Garage cluster, you must know this `rpc_secret` and the ID of one of the nodes of the network. There is no way to deactivate encryption in Garage (but if you copy-paste the `rpc_secret` from our example config, your encryption will not be very strong).

Nodes communicate over TCP sockets; they use MessagePack to serialize/deserialize messages, and scheduling on the network is handled by [netapp](https://git.deuxfleurs.fr/lx/netapp), our internal network library.
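As an illustration, a minimal sketch of the RPC-related part of `garage.toml`; the key names (`rpc_secret`, `rpc_bind_addr`, `bootstrap_peers`) and the `<node id>@<host>:<port>` peer format should be checked against the reference configuration, and the values below are placeholders:

```toml
# Sketch of the node-to-node (RPC) configuration (placeholder values).
# Every node of the cluster must share the same rpc_secret; generating
# it with something like `openssl rand -hex 32` is typical.
rpc_secret = "4425f5c26c5e11581d3223904324dcb5b5d5dfb14e5e7f35e38c595424f5f1e6"
rpc_bind_addr = "[::]:3901"

# To join an existing cluster, list at least one known node as
# "<node id>@<ip or hostname>:<rpc port>".
bootstrap_peers = [
    "563e1ac825ee3323aa441e72c26d1030d6d4414aeb3dd25287c531e7fc2bc95d@10.0.0.1:3901",
]
```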

> Why does exposing a website created through CreateBucket not work? What are "scoped bucket names"? What are bucket aliases? Are they specific to Garage?

Because the bucket name is scoped to your key. Compared to other S3 daemons, you can create bucket names that are valid only for a single key, so multiple keys can create buckets with the same name without any conflict. Internally, buckets are identified by a unique identifier; the human-readable names, whether global or scoped to a key, are aliases pointing to this identifier.

> Is Garage different from SeaweedFS? Why not contribute to it instead? etc.

SeaweedFS requires a Raft server. Either writes will be super slow in the presence of latency, or you will lose data during a crash.

> Is Garage different from IPFS? Why not contribute to it?

Garage is closer to IPFS Cluster, which is not IPFS per se. IPFS is a distribution network, a way to share a file efficiently with many people. It is not a tool to store data, despite its misleading advertising. IPFS Cluster is less efficient than Garage (both on the algorithmic and the system side).

> How can you configure Garage if you are behind a NAT? Have you heard about libp2p/pinecone/yggdrasil? Could you integrate them in Garage?

We should add a new cookbook entry named "NAT & firewall", "Coping with network restrictions" or something similar. We could mention diplonat (with a "tech preview" warning) and yggdrasil. We could also have a "manual" section mentioning WireGuard/iptables/routers.

quentin added the Documentation label 2022-02-06 10:28:11 +00:00
quentin changed title from Fosdem questions to All questions / remarks we have 2022-02-10 13:37:08 +00:00
quentin changed title from All questions / remarks we have to All questions / remarks we have (FOSDEM, HN, etc.) 2022-02-10 13:37:18 +00:00
lx closed this issue 2023-06-14 08:54:11 +00:00