forked from Deuxfleurs/garage
Some work on documentation towards v0.8
parent
89b8087ba8
commit
f6aebefcc9
15 changed files with 151 additions and 71 deletions
|
@ -1,6 +1,6 @@
|
||||||
+++
|
+++
|
||||||
title = "Benchmarks"
|
title = "Benchmarks"
|
||||||
weight = 10
|
weight = 40
|
||||||
+++
|
+++
|
||||||
|
|
||||||
With Garage, we wanted to build a software defined storage service that follow the [KISS principle](https://en.wikipedia.org/wiki/KISS_principle),
|
With Garage, we wanted to build a software defined storage service that follow the [KISS principle](https://en.wikipedia.org/wiki/KISS_principle),
|
||||||
|
|
|
@ -1,13 +1,13 @@
|
||||||
+++
|
+++
|
||||||
title = "Goals and use cases"
|
title = "Goals and use cases"
|
||||||
weight = 5
|
weight = 10
|
||||||
+++
|
+++
|
||||||
|
|
||||||
## Goals and non-goals
|
## Goals and non-goals
|
||||||
|
|
||||||
Garage is a lightweight geo-distributed data store that implements the
|
Garage is a lightweight geo-distributed data store that implements the
|
||||||
[Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/API/Welcome.html)
|
[Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/API/Welcome.html)
|
||||||
object storage protocole. It enables applications to store large blobs such
|
object storage protocol. It enables applications to store large blobs such
|
||||||
as pictures, video, images, documents, etc., in a redundant multi-node
|
as pictures, video, images, documents, etc., in a redundant multi-node
|
||||||
setting. S3 is versatile enough to also be used to publish a static
|
setting. S3 is versatile enough to also be used to publish a static
|
||||||
website.
|
website.
|
||||||
|
|
|
@ -20,6 +20,49 @@ In the meantime, you can find some information at the following links:
|
||||||
- [an old design draft](@/documentation/working-documents/design-draft.md)
|
- [an old design draft](@/documentation/working-documents/design-draft.md)
|
||||||
|
|
||||||
|
|
||||||
|
## Request routing logic
|
||||||
|
|
||||||
|
Data retrieval requests to Garage endpoints (S3 API and websites) are resolved
|
||||||
|
to an individual object in a bucket. Since objects are replicated to multiple nodes,
|
||||||
|
Garage must ensure consistency before answering the request.
|
||||||
|
|
||||||
|
### Using quorum to ensure consistency
|
||||||
|
|
||||||
|
Garage ensures consistency by attempting to establish a quorum with the
|
||||||
|
data nodes responsible for the object. When a majority of the data nodes
|
||||||
|
have provided metadata on an object, Garage can then answer the request.
|
||||||
|
|
||||||
|
When a request arrives Garage will, assuming the recommended 3 replicas, perform the following actions:
|
||||||
|
|
||||||
|
- Make a request to the two preferred nodes for object metadata
|
||||||
|
- Try the third node if one of the two initial requests fails
|
||||||
|
- Check that the metadata from at least 2 nodes match
|
||||||
|
- Check that the object hasn't been marked deleted
|
||||||
|
- Answer the request with inline data from metadata if object is small enough
|
||||||
|
- Or get data blocks from the preferred nodes and answer using the assembled object
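
The steps above can be sketched as follows. This is a simplified illustration in Python, not Garage's actual implementation: the names `read_object`, `fetch_metadata`, `fetch_block` and `QuorumError`, as well as the layout of the metadata dictionary, are hypothetical, and nodes are queried one after the other to keep the example short.

```python
from typing import Callable, Optional


class QuorumError(Exception):
    """Raised when too few nodes answer, or when their metadata disagree."""


def read_object(
    preferred_nodes: list[str],
    fetch_metadata: Callable[[str], Optional[dict]],
    fetch_block: Callable[[str], bytes],
    quorum: int = 2,
) -> Optional[bytes]:
    """Read one object from nodes listed in order of preference.

    Assumed (hypothetical) metadata layout:
    {"deleted": bool, "inline": bytes or None, "blocks": list of block ids}.
    `fetch_metadata` returns None when a node fails to answer.
    """
    answers = []
    for node in preferred_nodes:
        if len(answers) >= quorum:
            break  # enough answers: the remaining nodes are not contacted
        meta = fetch_metadata(node)
        if meta is not None:
            answers.append(meta)

    # A quorum needs enough answers, and those answers must match.
    if len(answers) < quorum:
        raise QuorumError("not enough nodes answered")
    if any(meta != answers[0] for meta in answers[1:]):
        raise QuorumError("metadata mismatch between nodes")

    meta = answers[0]
    if meta["deleted"]:
        return None  # the object has been marked deleted
    if meta["inline"] is not None:
        return meta["inline"]  # small object: data is stored in the metadata
    # Larger object: fetch every data block and assemble the object.
    return b"".join(fetch_block(block_id) for block_id in meta["blocks"])
```

With three replicas and `quorum=2`, the third node is only contacted when one of the first two fails to answer, which mirrors the list above.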
|
||||||
|
|
||||||
|
Garage dynamically determines which nodes to query based on health, preference, and
|
||||||
|
which nodes actually host a given piece of data. Garage has no concept of a "primary" node, so any
|
||||||
|
healthy node with the data can be used as long as a quorum is reached for the metadata.
|
||||||
|
|
||||||
|
### Node health
|
||||||
|
|
||||||
|
Garage keeps a TCP session open to each node in the cluster and periodically pings them. If a connection
|
||||||
|
cannot be established, or a node fails to answer a number of pings, the target node is marked as failed.
|
||||||
|
Failed nodes are not used for quorum or other internal requests.
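
As a rough illustration of this bookkeeping, here is a minimal Python sketch. It is not Garage's code: the class `NodeHealth`, its method names, and the threshold of 3 missed pings are assumptions made for the example (the text above only says "a number of pings"), as is the choice to mark a node healthy again as soon as it answers a ping.

```python
from dataclasses import dataclass, field


@dataclass
class NodeHealth:
    # Hypothetical threshold; the real value is not stated in this document.
    max_missed_pings: int = 3
    missed: dict[str, int] = field(default_factory=dict)
    failed: set[str] = field(default_factory=set)

    def record_ping(self, node: str, answered: bool) -> None:
        """Update the state of `node` after one ping attempt."""
        if answered:
            # Assumption: a single answered ping restores the node.
            self.missed[node] = 0
            self.failed.discard(node)
            return
        self.missed[node] = self.missed.get(node, 0) + 1
        if self.missed[node] >= self.max_missed_pings:
            self.failed.add(node)

    def record_connection_failure(self, node: str) -> None:
        """A connection that cannot be established marks the node as failed."""
        self.failed.add(node)

    def usable(self, nodes: list[str]) -> list[str]:
        """Keep only the nodes that may be used for quorum or internal requests."""
        return [n for n in nodes if n not in self.failed]
```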
|
||||||
|
|
||||||
|
### Node preference
|
||||||
|
|
||||||
|
Garage prioritizes which nodes to query according to a few criteria:
|
||||||
|
|
||||||
|
- A node always prefers itself if it can answer the request
|
||||||
|
- Then the node prioritizes nodes in the same zone
|
||||||
|
- Finally the nodes with the lowest latency are prioritized
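
These three criteria can be expressed as a single sort key. The following Python sketch is only an illustration under assumed names (`Node`, `order_by_preference`, and the `latency_ms` field are hypothetical), not the code Garage uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    zone: str
    latency_ms: float  # measured latency from the local node (assumed field)


def order_by_preference(local: Node, candidates: list[Node]) -> list[Node]:
    """Sort candidates: the local node first, then same-zone nodes, then by latency."""
    return sorted(
        candidates,
        key=lambda n: (
            n.name != local.name,  # False sorts first: prefer the local node
            n.zone != local.zone,  # then nodes in the same zone
            n.latency_ms,          # then the lowest latency
        ),
    )
```

Because tuples compare element by element, latency only breaks ties between nodes that are equal on the first two criteria.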
|
||||||
|
|
||||||
|
|
||||||
|
For further reading on the cluster structure look at the [gateway](@/documentation/cookbook/gateways.md)
|
||||||
|
and [cluster layout management](@/documentation/reference-manual/layout.md) pages.
|
||||||
|
|
||||||
## Garbage collection
|
## Garbage collection
|
||||||
|
|
||||||
A faulty garbage collection procedure has been the cause of
|
A faulty garbage collection procedure has been the cause of
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
+++
|
+++
|
||||||
title = "Related work"
|
title = "Related work"
|
||||||
weight = 15
|
weight = 50
|
||||||
+++
|
+++
|
||||||
|
|
||||||
## Context
|
## Context
|
||||||
|
|
|
@ -9,6 +9,15 @@ Let's start your Garage journey!
|
||||||
In this chapter, we explain how to deploy Garage as a single-node server
|
In this chapter, we explain how to deploy Garage as a single-node server
|
||||||
and how to interact with it.
|
and how to interact with it.
|
||||||
|
|
||||||
|
## What is Garage?
|
||||||
|
|
||||||
|
Before jumping in, you might be interested in reading the following pages:
|
||||||
|
|
||||||
|
- [Goals and use cases](@/documentation/design/goals.md)
|
||||||
|
- [List of features](@/documentation/reference-manual/features.md)
|
||||||
|
|
||||||
|
## Scope of this tutorial
|
||||||
|
|
||||||
Our goal is to introduce you to Garage's workflows.
|
Our goal is to introduce you to Garage's workflows.
|
||||||
Following this guide is recommended before moving on to
|
Following this guide is recommended before moving on to
|
||||||
[configuring a multi-node cluster](@/documentation/cookbook/real-world.md).
|
[configuring a multi-node cluster](@/documentation/cookbook/real-world.md).
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
+++
|
+++
|
||||||
title = "Administration API"
|
title = "Administration API"
|
||||||
weight = 16
|
weight = 60
|
||||||
+++
|
+++
|
||||||
|
|
||||||
The Garage administration API is accessible through a dedicated server whose
|
The Garage administration API is accessible through a dedicated server whose
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
+++
|
+++
|
||||||
title = "Garage CLI"
|
title = "Garage CLI"
|
||||||
weight = 15
|
weight = 30
|
||||||
+++
|
+++
|
||||||
|
|
||||||
The Garage CLI is mostly self-documented. Make use of the `help` subcommand
|
The Garage CLI is mostly self-documented. Make use of the `help` subcommand
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
+++
|
+++
|
||||||
title = "Configuration file format"
|
title = "Configuration file format"
|
||||||
weight = 5
|
weight = 20
|
||||||
+++
|
+++
|
||||||
|
|
||||||
Here is an example `garage.toml` configuration file that illustrates all of the possible options:
|
Here is an example `garage.toml` configuration file that illustrates all of the possible options:
|
||||||
|
@ -10,7 +10,6 @@ metadata_dir = "/var/lib/garage/meta"
|
||||||
data_dir = "/var/lib/garage/data"
|
data_dir = "/var/lib/garage/data"
|
||||||
|
|
||||||
block_size = 1048576
|
block_size = 1048576
|
||||||
block_manager_background_tranquility = 2
|
|
||||||
|
|
||||||
replication_mode = "3"
|
replication_mode = "3"
|
||||||
|
|
||||||
|
@ -87,17 +86,6 @@ files will remain available. This however means that chunks from existing files
|
||||||
will not be deduplicated with chunks from newly uploaded files, meaning you
|
will not be deduplicated with chunks from newly uploaded files, meaning you
|
||||||
might use more storage space that is optimally possible.
|
might use more storage space that is optimally possible.
|
||||||
|
|
||||||
### `block_manager_background_tranquility`
|
|
||||||
|
|
||||||
This parameter tunes the activity of the background worker responsible for
|
|
||||||
resyncing data blocks between nodes. The higher the tranquility value is set,
|
|
||||||
the more the background worker will wait between iterations, meaning the load
|
|
||||||
on the system (including network usage between nodes) will be reduced. The
|
|
||||||
minimal value for this parameter is `0`, where the background worker will
|
|
||||||
allways work at maximal throughput to resynchronize blocks. The default value
|
|
||||||
is `2`, where the background worker will try to spend at most 1/3 of its time
|
|
||||||
working, and 2/3 sleeping in order to reduce system load.
|
|
||||||
|
|
||||||
### `replication_mode`
|
### `replication_mode`
|
||||||
|
|
||||||
Garage supports the following replication modes:
|
Garage supports the following replication modes:
|
||||||
|
|
85
doc/book/reference-manual/features.md
Normal file
|
@ -0,0 +1,85 @@
|
||||||
|
+++
|
||||||
|
title = "List of Garage features"
|
||||||
|
weight = 10
|
||||||
|
+++
|
||||||
|
|
||||||
|
|
||||||
|
### S3 API
|
||||||
|
|
||||||
|
The main goal of Garage is to provide an object storage service that is compatible with the
|
||||||
|
[S3 API](https://docs.aws.amazon.com/AmazonS3/latest/API/Welcome.html) from Amazon Web Services.
|
||||||
|
We try to adhere as strictly as possible to the semantics of the API as implemented by Amazon
|
||||||
|
and other vendors such as Minio or CEPH.
|
||||||
|
|
||||||
|
Of course Garage does not implement the full span of API endpoints that AWS S3 does;
|
||||||
|
the exact list of S3 features implemented by Garage can be found [on our S3 compatibility page](@/documentation/reference-manual/s3-compatibility.md).
|
||||||
|
|
||||||
|
### Geo-distribution
|
||||||
|
|
||||||
|
Garage allows you to store copies of your data in multiple geographical locations in order to maximize resilience
|
||||||
|
to adverse events, such as network/power outages or hardware failures.
|
||||||
|
This allows Garage to run very well even at home, using consumer-grade Internet connectivity
|
||||||
|
(such as FTTH) and power, as long as cluster nodes can be spawned at several physical locations.
|
||||||
|
Garage exploits knowledge of the capacity and physical location of each storage node to design
|
||||||
|
a storage plan that best exploits the available storage capacity while satisfying the geo-distributed replication constraint.
|
||||||
|
|
||||||
|
To learn more about geo-distributed Garage clusters,
|
||||||
|
read our documentation on [setting up a real-world deployment](@/documentation/cookbook/real-world.md).
|
||||||
|
|
||||||
|
### Flexible topology
|
||||||
|
|
||||||
|
A Garage cluster can very easily evolve over time, as storage nodes are added or removed.
|
||||||
|
Garage will automatically rebalance data between nodes as needed to ensure the desired number of copies.
|
||||||
|
Read about cluster layout management [here](@/documentation/reference-manual/layout.md).
|
||||||
|
|
||||||
|
### No RAFT slowing you down
|
||||||
|
|
||||||
|
It might seem strange to tout the absence of something as a desirable feature,
|
||||||
|
but this is in fact a very important point! Garage does not use RAFT or another
|
||||||
|
consensus algorithm internally to order incoming requests: this means that all requests
|
||||||
|
directed to a Garage cluster can be handled independently of one another instead
|
||||||
|
of going through a central bottleneck (the leader node).
|
||||||
|
As a consequence, requests can be handled much faster, even in cases where latency
|
||||||
|
between cluster nodes is high (see our [benchmarks](@/documentation/design/benchmarks/index.md) for data on this).
|
||||||
|
This is particularly useful when nodes are far from one another and talk to one another through standard Internet connections.
|
||||||
|
|
||||||
|
### Several replication modes
|
||||||
|
|
||||||
|
Garage supports a variety of replication modes, with 1 copy, 2 copies or 3 copies of your data,
|
||||||
|
and with various levels of consistency.
|
||||||
|
Read our reference page on [supported replication modes](@/documentation/reference-manual/configuration.md#replication-mode)
|
||||||
|
to select the replication mode best suited to your use case (hint: in most cases, `replication_mode = "3"` is what you want).
|
||||||
|
|
||||||
|
### Web server for static websites
|
||||||
|
|
||||||
|
A storage bucket can easily be configured to be served directly by Garage as a static web site.
|
||||||
|
Domain names for multiple websites directly map to bucket names, making it easy to build
|
||||||
|
a platform for your users to autonomously build and host their websites over Garage.
|
||||||
|
Surprisingly, none of the other alternative S3 implementations we surveyed (such as Minio
|
||||||
|
or CEPH) support publishing static websites from S3 buckets, a feature that is however
|
||||||
|
directly inherited from S3 on AWS.
|
||||||
|
|
||||||
|
### Bucket names as aliases
|
||||||
|
|
||||||
|
- the same bucket may have multiple names (useful when exposing websites for example)
|
||||||
|
|
||||||
|
- bucket renaming is possible
|
||||||
|
|
||||||
|
- Scoped buckets: two users can each have a different bucket with the same name, which avoids collisions. This is helpful if you want to write an application that always creates a per-user bucket with the same name.
|
||||||
|
|
||||||
|
### Standalone/self contained
|
||||||
|
|
||||||
|
|
||||||
|
### Integration with Kubernetes and Nomad
|
||||||
|
|
||||||
|
Many node discovery methods: Kubernetes integration, Nomad integration through Consul
|
||||||
|
|
||||||
|
### Support for changing IP addresses
|
||||||
|
|
||||||
|
(as long as all nodes don't change their IP at the same time)
|
||||||
|
|
||||||
|
### Cluster administration API
|
||||||
|
|
||||||
|
### Metrics and traces
|
||||||
|
|
||||||
|
### (experimental) K2V API
|
|
@ -1,6 +1,6 @@
|
||||||
+++
|
+++
|
||||||
title = "K2V"
|
title = "K2V"
|
||||||
weight = 30
|
weight = 70
|
||||||
+++
|
+++
|
||||||
|
|
||||||
Starting with version 0.7.2, Garage introduces an optionnal feature, K2V,
|
Starting with version 0.7.2, Garage introduces an optionnal feature, K2V,
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
+++
|
+++
|
||||||
title = "Cluster layout management"
|
title = "Cluster layout management"
|
||||||
weight = 10
|
weight = 50
|
||||||
+++
|
+++
|
||||||
|
|
||||||
The cluster layout in Garage is a table that assigns to each node a role in
|
The cluster layout in Garage is a table that assigns to each node a role in
|
||||||
|
|
|
@ -1,45 +0,0 @@
|
||||||
+++
|
|
||||||
title = "Request routing logic"
|
|
||||||
weight = 10
|
|
||||||
+++
|
|
||||||
|
|
||||||
Data retrieval requests to Garage endpoints (S3 API and websites) are resolved
|
|
||||||
to an individual object in a bucket. Since objects are replicated to multiple nodes
|
|
||||||
Garage must ensure consistency before answering the request.
|
|
||||||
|
|
||||||
## Using quorum to ensure consistency
|
|
||||||
|
|
||||||
Garage ensures consistency by attempting to establish a quorum with the
|
|
||||||
data nodes responsible for the object. When a majority of the data nodes
|
|
||||||
have provided metadata on a object Garage can then answer the request.
|
|
||||||
|
|
||||||
When a request arrives Garage will, assuming the recommended 3 replicas, perform the following actions:
|
|
||||||
|
|
||||||
- Make a request to the two preferred nodes for object metadata
|
|
||||||
- Try the third node if one of the two initial requests fail
|
|
||||||
- Check that the metadata from at least 2 nodes match
|
|
||||||
- Check that the object hasn't been marked deleted
|
|
||||||
- Answer the request with inline data from metadata if object is small enough
|
|
||||||
- Or get data blocks from the preferred nodes and answer using the assembled object
|
|
||||||
|
|
||||||
Garage dynamically determines which nodes to query based on health, preference, and
|
|
||||||
which nodes actually host a given data. Garage has no concept of "primary" so any
|
|
||||||
healthy node with the data can be used as long as a quorum is reached for the metadata.
|
|
||||||
|
|
||||||
## Node health
|
|
||||||
|
|
||||||
Garage keeps a TCP session open to each node in the cluster and periodically pings them. If a connection
|
|
||||||
cannot be established, or a node fails to answer a number of pings, the target node is marked as failed.
|
|
||||||
Failed nodes are not used for quorum or other internal requests.
|
|
||||||
|
|
||||||
## Node preference
|
|
||||||
|
|
||||||
Garage prioritizes which nodes to query according to a few criteria:
|
|
||||||
|
|
||||||
- A node always prefers itself if it can answer the request
|
|
||||||
- Then the node prioritizes nodes in the same zone
|
|
||||||
- Finally the nodes with the lowest latency are prioritized
|
|
||||||
|
|
||||||
|
|
||||||
For further reading on the cluster structure look at the [gateway](@/documentation/cookbook/gateways.md)
|
|
||||||
and [cluster layout management](@/documentation/reference-manual/layout.md) pages.
|
|
|
@ -1,6 +1,6 @@
|
||||||
+++
|
+++
|
||||||
title = "S3 Compatibility status"
|
title = "S3 Compatibility status"
|
||||||
weight = 20
|
weight = 40
|
||||||
+++
|
+++
|
||||||
|
|
||||||
## DISCLAIMER
|
## DISCLAIMER
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
+++
|
+++
|
||||||
title = "Design draft"
|
title = "Design draft (obsolete)"
|
||||||
weight = 25
|
weight = 50
|
||||||
+++
|
+++
|
||||||
|
|
||||||
**WARNING: this documentation is a design draft which was written before Garage's actual implementation.
|
**WARNING: this documentation is a design draft which was written before Garage's actual implementation.
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
+++
|
+++
|
||||||
title = "Load balancing data"
|
title = "Load balancing data (obsolete)"
|
||||||
weight = 10
|
weight = 60
|
||||||
+++
|
+++
|
||||||
|
|
||||||
**This is being yet improved in release 0.5. The working document has not been updated yet, it still only applies to Garage 0.2 through 0.4.**
|
**This is being yet improved in release 0.5. The working document has not been updated yet, it still only applies to Garage 0.2 through 0.4.**
|
||||||
|
|