forked from Deuxfleurs/garage
Some work on documentation towards v0.8
parent
89b8087ba8
commit
f6aebefcc9
15 changed files with 151 additions and 71 deletions
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Benchmarks"
|
||||
weight = 10
|
||||
weight = 40
|
||||
+++
|
||||
|
||||
With Garage, we wanted to build a software-defined storage service that follows the [KISS principle](https://en.wikipedia.org/wiki/KISS_principle),
|
||||
|
|
|
@ -1,13 +1,13 @@
|
|||
+++
|
||||
title = "Goals and use cases"
|
||||
weight = 5
|
||||
weight = 10
|
||||
+++
|
||||
|
||||
## Goals and non-goals
|
||||
|
||||
Garage is a lightweight geo-distributed data store that implements the
|
||||
[Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/API/Welcome.html)
|
||||
object storage protocole. It enables applications to store large blobs such
|
||||
object storage protocol. It enables applications to store large blobs such
|
||||
as pictures, video, images, documents, etc., in a redundant multi-node
|
||||
setting. S3 is versatile enough to also be used to publish a static
|
||||
website.
|
||||
|
|
|
@ -20,6 +20,49 @@ In the meantime, you can find some information at the following links:
|
|||
- [an old design draft](@/documentation/working-documents/design-draft.md)
|
||||
|
||||
|
||||
## Request routing logic
|
||||
|
||||
Data retrieval requests to Garage endpoints (S3 API and websites) are resolved
|
||||
to an individual object in a bucket. Since objects are replicated to multiple nodes,
|
||||
Garage must ensure consistency before answering the request.
|
||||
|
||||
### Using quorum to ensure consistency
|
||||
|
||||
Garage ensures consistency by attempting to establish a quorum with the
|
||||
data nodes responsible for the object. When a majority of the data nodes
|
||||
have provided metadata on an object, Garage can then answer the request.
|
||||
|
||||
When a request arrives, Garage will, assuming the recommended 3 replicas, perform the following actions:
|
||||
|
||||
- Make a request to the two preferred nodes for object metadata
|
||||
- Try the third node if one of the two initial requests fails
|
||||
- Check that the metadata from at least 2 nodes match
|
||||
- Check that the object hasn't been marked deleted
|
||||
- Answer the request with inline data from metadata if object is small enough
|
||||
- Or get data blocks from the preferred nodes and answer using the assembled object
|
||||
|
||||
Garage dynamically determines which nodes to query based on health, preference, and
|
||||
which nodes actually host the requested data. Garage has no concept of a "primary" node, so any
|
||||
healthy node with the data can be used as long as a quorum is reached for the metadata.
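
The read path described above can be sketched as follows. This is a minimal illustration, not Garage's actual implementation: the `Metadata` fields, `quorum_read`, `describe_answer` and the `fetch` callback are hypothetical names, and the nodes are queried one after another here, whereas the text describes contacting the two preferred nodes first.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Metadata:
    """Hypothetical object metadata; frozen so that equal replies can be counted."""
    version: int
    deleted: bool = False
    inline_data: Optional[bytes] = None  # only set for small objects


def quorum_read(
    nodes: Sequence[str],
    fetch: Callable[[str], Optional[Metadata]],
    quorum: int = 2,
) -> Optional[Metadata]:
    """Ask nodes in preference order until `quorum` of them return matching metadata."""
    votes: Counter = Counter()
    for node in nodes:
        meta = fetch(node)
        if meta is None:
            continue  # request failed: fall through to the next node
        votes[meta] += 1
        if votes[meta] >= quorum:
            return meta
    return None  # not enough matching answers


def describe_answer(meta: Optional[Metadata]) -> str:
    """Decide how to answer once the quorum step is done."""
    if meta is None:
        return "error: no quorum"
    if meta.deleted:
        return "not found"
    if meta.inline_data is not None:
        return "answer with inline data"
    return "fetch data blocks"
```

With three replicas and `quorum=2`, the third node is only contacted when one of the first two fails or when their metadata differs.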
|
||||
|
||||
### Node health
|
||||
|
||||
Garage keeps a TCP session open to each node in the cluster and periodically pings them. If a connection
|
||||
cannot be established, or a node fails to answer a number of pings, the target node is marked as failed.
|
||||
Failed nodes are not used for quorum or other internal requests.
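
As an illustration of the health rule above, here is a small sketch. The threshold `MAX_MISSED_PINGS`, the `PeerHealth` class and the choice to reset the counter after a successful ping are assumptions made for the example; the text only says "a number of pings".

```python
from dataclasses import dataclass
from typing import Dict, List

MAX_MISSED_PINGS = 3  # hypothetical threshold, not taken from Garage


@dataclass
class PeerHealth:
    connected: bool = True  # False when the TCP session cannot be established
    missed_pings: int = 0   # consecutive pings left unanswered

    def record_ping(self, answered: bool) -> None:
        # Assumption: one answered ping clears the failure counter.
        self.missed_pings = 0 if answered else self.missed_pings + 1

    @property
    def failed(self) -> bool:
        return not self.connected or self.missed_pings >= MAX_MISSED_PINGS


def usable_nodes(peers: Dict[str, PeerHealth]) -> List[str]:
    """Nodes that may take part in quorums and other internal requests."""
    return [name for name, peer in peers.items() if not peer.failed]
```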
|
||||
|
||||
### Node preference
|
||||
|
||||
Garage prioritizes which nodes to query according to a few criteria:
|
||||
|
||||
- A node always prefers itself if it can answer the request
|
||||
- Then the node prioritizes nodes in the same zone
|
||||
- Finally the nodes with the lowest latency are prioritized
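
The three criteria above amount to a lexicographic ordering, which can be sketched with a sort key. The `Candidate` type and its fields are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    zone: str
    latency_ms: float


def preference_order(
    candidates: List[Candidate], self_name: str, self_zone: str
) -> List[Candidate]:
    """Sort candidates: the local node first, then same-zone nodes, then by latency."""
    return sorted(
        candidates,
        # False sorts before True, so matching nodes come first.
        key=lambda c: (c.name != self_name, c.zone != self_zone, c.latency_ms),
    )
```

For example, a node `n1` in zone `paris` would rank itself first, then another `paris` node, then remote nodes by increasing latency.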
|
||||
|
||||
|
||||
For further reading on the cluster structure look at the [gateway](@/documentation/cookbook/gateways.md)
|
||||
and [cluster layout management](@/documentation/reference-manual/layout.md) pages.
|
||||
|
||||
## Garbage collection
|
||||
|
||||
A faulty garbage collection procedure has been the cause of
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Related work"
|
||||
weight = 15
|
||||
weight = 50
|
||||
+++
|
||||
|
||||
## Context
|
||||
|
|
|
@ -9,6 +9,15 @@ Let's start your Garage journey!
|
|||
In this chapter, we explain how to deploy Garage as a single-node server
|
||||
and how to interact with it.
|
||||
|
||||
## What is Garage?
|
||||
|
||||
Before jumping in, you might be interested in reading the following pages:
|
||||
|
||||
- [Goals and use cases](@/documentation/design/goals.md)
|
||||
- [List of features](@/documentation/reference-manual/features.md)
|
||||
|
||||
## Scope of this tutorial
|
||||
|
||||
Our goal is to introduce you to Garage's workflows.
|
||||
Following this guide is recommended before moving on to
|
||||
[configuring a multi-node cluster](@/documentation/cookbook/real-world.md).
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Administration API"
|
||||
weight = 16
|
||||
weight = 60
|
||||
+++
|
||||
|
||||
The Garage administration API is accessible through a dedicated server whose
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Garage CLI"
|
||||
weight = 15
|
||||
weight = 30
|
||||
+++
|
||||
|
||||
The Garage CLI is mostly self-documented. Make use of the `help` subcommand
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Configuration file format"
|
||||
weight = 5
|
||||
weight = 20
|
||||
+++
|
||||
|
||||
Here is an example `garage.toml` configuration file that illustrates all of the possible options:
|
||||
|
@ -10,7 +10,6 @@ metadata_dir = "/var/lib/garage/meta"
|
|||
data_dir = "/var/lib/garage/data"
|
||||
|
||||
block_size = 1048576
|
||||
block_manager_background_tranquility = 2
|
||||
|
||||
replication_mode = "3"
|
||||
|
||||
|
@ -87,17 +86,6 @@ files will remain available. This however means that chunks from existing files
|
|||
will not be deduplicated with chunks from newly uploaded files, meaning you
|
||||
might use more storage space than is optimally possible.
|
||||
|
||||
### `block_manager_background_tranquility`
|
||||
|
||||
This parameter tunes the activity of the background worker responsible for
|
||||
resyncing data blocks between nodes. The higher the tranquility value is set,
|
||||
the more the background worker will wait between iterations, meaning the load
|
||||
on the system (including network usage between nodes) will be reduced. The
|
||||
minimal value for this parameter is `0`, where the background worker will
|
||||
always work at maximal throughput to resynchronize blocks. The default value
|
||||
is `2`, where the background worker will try to spend at most 1/3 of its time
|
||||
working, and 2/3 sleeping in order to reduce system load.
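
The 1/3 and 2/3 figures can be reproduced with a small calculation. The sketch below assumes that after each iteration the worker sleeps for `tranquility` times the duration of that iteration; this model is consistent with the numbers in the text but is an assumption, and the function names are hypothetical.

```python
def sleep_after(work_seconds: float, tranquility: int) -> float:
    """Assumed rule: sleep `tranquility` times as long as the last iteration took."""
    return tranquility * work_seconds


def working_fraction(tranquility: int) -> float:
    """Share of time spent working under the assumed rule: 1 / (1 + tranquility)."""
    return 1 / (1 + tranquility)
```

Under this model, `tranquility = 0` gives a working fraction of 1 (no sleeping), and `tranquility = 2` gives 1/3 working and 2/3 sleeping.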
|
||||
|
||||
### `replication_mode`
|
||||
|
||||
Garage supports the following replication modes:
|
||||
|
|
85
doc/book/reference-manual/features.md
Normal file
|
@ -0,0 +1,85 @@
|
|||
+++
|
||||
title = "List of Garage features"
|
||||
weight = 10
|
||||
+++
|
||||
|
||||
|
||||
### S3 API
|
||||
|
||||
The main goal of Garage is to provide an object storage service that is compatible with the
|
||||
[S3 API](https://docs.aws.amazon.com/AmazonS3/latest/API/Welcome.html) from Amazon Web Services.
|
||||
We try to adhere as strictly as possible to the semantics of the API as implemented by Amazon
|
||||
and other vendors such as Minio or CEPH.
|
||||
|
||||
Of course Garage does not implement the full span of API endpoints that AWS S3 does;
|
||||
the exact list of S3 features implemented by Garage can be found [on our S3 compatibility page](@/documentation/reference-manual/s3-compatibility.md).
|
||||
|
||||
### Geo-distribution
|
||||
|
||||
Garage allows you to store copies of your data in multiple geographical locations in order to maximize resilience
|
||||
to adverse events, such as network/power outages or hardware failures.
|
||||
This allows Garage to run very well even at home, using consumer-grade Internet connectivity
|
||||
(such as FTTH) and power, as long as cluster nodes can be spawned at several physical locations.
|
||||
Garage exploits knowledge of the capacity and physical location of each storage node to design
|
||||
a storage plan that best exploits the available storage capacity while satisfying the geo-distributed replication constraint.
|
||||
|
||||
To learn more about geo-distributed Garage clusters,
|
||||
read our documentation on [setting up a real-world deployment](@/documentation/cookbook/real-world.md).
|
||||
|
||||
### Flexible topology
|
||||
|
||||
A Garage cluster can very easily evolve over time, as storage nodes are added or removed.
|
||||
Garage will automatically rebalance data between nodes as needed to ensure the desired number of copies.
|
||||
Read about cluster layout management [here](@/documentation/reference-manual/layout.md).
|
||||
|
||||
### No RAFT slowing you down
|
||||
|
||||
It might seem strange to tout the absence of something as a desirable feature,
|
||||
but this is in fact a very important point! Garage does not use RAFT or another
|
||||
consensus algorithm internally to order incoming requests: this means that all requests
|
||||
directed to a Garage cluster can be handled independently of one another instead
|
||||
of going through a central bottleneck (the leader node).
|
||||
As a consequence, requests can be handled much faster, even in cases where latency
|
||||
between cluster nodes is high (see our [benchmarks](@/documentation/design/benchmarks/index.md) for data on this).
|
||||
This is particularly useful when nodes are far from one another and talk to one another through standard Internet connections.
|
||||
|
||||
### Several replication modes
|
||||
|
||||
Garage supports a variety of replication modes, with 1 copy, 2 copies or 3 copies of your data,
|
||||
and with various levels of consistency.
|
||||
Read our reference page on [supported replication modes](@/documentation/reference-manual/configuration.md#replication-mode)
|
||||
to select the replication mode best suited to your use case (hint: in most cases, `replication_mode = "3"` is what you want).
|
||||
|
||||
### Web server for static websites
|
||||
|
||||
A storage bucket can easily be configured to be served directly by Garage as a static web site.
|
||||
Domain names for multiple websites directly map to bucket names, making it easy to build
|
||||
a platform for your users to autonomously build and host their websites over Garage.
|
||||
Surprisingly, none of the other alternative S3 implementations we surveyed (such as Minio
|
||||
or CEPH) support publishing static websites from S3 buckets, a feature that is however
|
||||
directly inherited from S3 on AWS.
|
||||
|
||||
### Bucket names as aliases
|
||||
|
||||
- the same bucket may have multiple names (useful when exposing websites for example)
|
||||
|
||||
- bucket renaming is possible
|
||||
|
||||
- scoped buckets: two users can each have a different bucket with the same name, which avoids collisions. This is helpful if you want to write an application that always creates a per-user bucket with the same name.
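
To make the naming rules above concrete, here is a minimal sketch of name resolution. The `BucketNames` class, its two dictionaries and the rule that a user's own names are looked up before shared names are assumptions for illustration, not a description of Garage's internals.

```python
from typing import Dict, Optional, Tuple


class BucketNames:
    """Hypothetical name table: shared names plus names scoped to one user."""

    def __init__(self) -> None:
        self.global_aliases: Dict[str, str] = {}             # name -> bucket id
        self.local_aliases: Dict[Tuple[str, str], str] = {}  # (user, name) -> bucket id

    def resolve(self, user: str, name: str) -> Optional[str]:
        # Assumption: a user's scoped name takes precedence over a shared name.
        local = self.local_aliases.get((user, name))
        return local if local is not None else self.global_aliases.get(name)


names = BucketNames()
# The same bucket reachable under two names.
names.global_aliases["example.com"] = "bucket-1"
names.global_aliases["www.example.com"] = "bucket-1"
# Two users each own a different bucket called "data".
names.local_aliases[("alice", "data")] = "bucket-2"
names.local_aliases[("bob", "data")] = "bucket-3"
```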
|
||||
|
||||
### Standalone/self contained
|
||||
|
||||
|
||||
### Integration with Kubernetes and Nomad
|
||||
|
||||
Many node discovery methods: Kubernetes integration, Nomad integration through Consul
|
||||
|
||||
### Support for changing IP addresses
|
||||
|
||||
(as long as all nodes don't change their IP at the same time)
|
||||
|
||||
### Cluster administration API
|
||||
|
||||
### Metrics and traces
|
||||
|
||||
### (experimental) K2V API
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "K2V"
|
||||
weight = 30
|
||||
weight = 70
|
||||
+++
|
||||
|
||||
Starting with version 0.7.2, Garage introduces an optional feature, K2V,
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Cluster layout management"
|
||||
weight = 10
|
||||
weight = 50
|
||||
+++
|
||||
|
||||
The cluster layout in Garage is a table that assigns to each node a role in
|
||||
|
|
|
@ -1,45 +0,0 @@
|
|||
+++
|
||||
title = "Request routing logic"
|
||||
weight = 10
|
||||
+++
|
||||
|
||||
Data retrieval requests to Garage endpoints (S3 API and websites) are resolved
|
||||
to an individual object in a bucket. Since objects are replicated to multiple nodes,
|
||||
Garage must ensure consistency before answering the request.
|
||||
|
||||
## Using quorum to ensure consistency
|
||||
|
||||
Garage ensures consistency by attempting to establish a quorum with the
|
||||
data nodes responsible for the object. When a majority of the data nodes
|
||||
have provided metadata on an object, Garage can then answer the request.
|
||||
|
||||
When a request arrives, Garage will, assuming the recommended 3 replicas, perform the following actions:
|
||||
|
||||
- Make a request to the two preferred nodes for object metadata
|
||||
- Try the third node if one of the two initial requests fails
|
||||
- Check that the metadata from at least 2 nodes match
|
||||
- Check that the object hasn't been marked deleted
|
||||
- Answer the request with inline data from metadata if object is small enough
|
||||
- Or get data blocks from the preferred nodes and answer using the assembled object
|
||||
|
||||
Garage dynamically determines which nodes to query based on health, preference, and
|
||||
which nodes actually host the requested data. Garage has no concept of a "primary" node, so any
|
||||
healthy node with the data can be used as long as a quorum is reached for the metadata.
|
||||
|
||||
## Node health
|
||||
|
||||
Garage keeps a TCP session open to each node in the cluster and periodically pings them. If a connection
|
||||
cannot be established, or a node fails to answer a number of pings, the target node is marked as failed.
|
||||
Failed nodes are not used for quorum or other internal requests.
|
||||
|
||||
## Node preference
|
||||
|
||||
Garage prioritizes which nodes to query according to a few criteria:
|
||||
|
||||
- A node always prefers itself if it can answer the request
|
||||
- Then the node prioritizes nodes in the same zone
|
||||
- Finally the nodes with the lowest latency are prioritized
|
||||
|
||||
|
||||
For further reading on the cluster structure look at the [gateway](@/documentation/cookbook/gateways.md)
|
||||
and [cluster layout management](@/documentation/reference-manual/layout.md) pages.
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "S3 Compatibility status"
|
||||
weight = 20
|
||||
weight = 40
|
||||
+++
|
||||
|
||||
## DISCLAIMER
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Design draft"
|
||||
weight = 25
|
||||
title = "Design draft (obsolete)"
|
||||
weight = 50
|
||||
+++
|
||||
|
||||
**WARNING: this documentation is a design draft which was written before Garage's actual implementation.
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Load balancing data"
|
||||
weight = 10
|
||||
title = "Load balancing data (obsolete)"
|
||||
weight = 60
|
||||
+++
|
||||
|
||||
**This is still being improved in release 0.5. The working document has not been updated yet; it still only applies to Garage 0.2 through 0.4.**
|
||||
|
|