forked from Deuxfleurs/garage
New replication modes and their documentation
parent
8f9cf3a5d1
commit
0091002ef2
2 changed files with 81 additions and 19 deletions
|
@ -48,7 +48,6 @@ root_domain = ".web.garage"
|
||||||
[admin]
|
[admin]
|
||||||
api_bind_addr = "0.0.0.0:3903"
|
api_bind_addr = "0.0.0.0:3903"
|
||||||
trace_sink = "http://localhost:4317"
|
trace_sink = "http://localhost:4317"
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
||||||
The following gives details about each available configuration option.
|
The following gives details about each available configuration option.
|
||||||
|
@ -89,20 +88,47 @@ might use more storage space than is optimally possible.
|
||||||
|
|
||||||
Garage supports the following replication modes:
|
Garage supports the following replication modes:
|
||||||
|
|
||||||
- `none` or `1`: data stored on Garage is stored on a single node. There is no redundancy,
|
- `none` or `1`: data stored on Garage is stored on a single node. There is no
|
||||||
and data will be unavailable as soon as one node fails or its network is disconnected.
|
redundancy, and data will be unavailable as soon as one node fails or its
|
||||||
Do not use this for anything else than test deployments.
|
network is disconnected. Do not use this for anything other than test
|
||||||
|
deployments.
|
||||||
|
|
||||||
- `2`: data stored on Garage will be stored on two different nodes, if possible in different
|
- `2`: data stored on Garage will be stored on two different nodes, if possible
|
||||||
zones. Garage tolerates one node failure before losing data. Data should be available
|
in different zones. Garage tolerates one node failure, or several nodes
|
||||||
read-only when one node is down, but write operations will fail.
|
failing but all in a single zone (in a deployment with at least two zones),
|
||||||
Use this only if you really have to.
|
before losing data. Data remains available in read-only mode when one node is
|
||||||
|
down, but write operations will fail.
|
||||||
|
|
||||||
- `3`: data stored on Garage will be stored on three different nodes, if possible each in
|
- `2-dangerous`: a variant of mode `2`, where written objects are written to
|
||||||
a different zones.
|
the second replica asynchronously. This means that Garage will return `200
|
||||||
Garage tolerates two node failure before losing data. Data should be available
|
OK` to a PutObject request before the second copy is fully written (or even
|
||||||
read-only when two nodes are down, and writes should be possible if only a single node
|
before it even starts being written). This means that data can more easily
|
||||||
is down.
|
be lost if the node crashes before a second copy can be completed. This
|
||||||
|
also means that written objects might not be visible immediately in read
|
||||||
|
operations. In other words, this mode severely breaks the consistency and
|
||||||
|
durability guarantees of standard Garage cluster operation. The benefit of
|
||||||
|
this mode: you can still write to your cluster when one node is
|
||||||
|
unavailable.
|
||||||
|
|
||||||
|
- `3`: data stored on Garage will be stored on three different nodes, if
|
||||||
|
possible each in a different zone. Garage tolerates two node failures, or
|
||||||
|
several node failures but in no more than two zones (in a deployment with at
|
||||||
|
least three zones), before losing data. As long as only a single node fails,
|
||||||
|
or node failures are only in a single zone, reading and writing data to
|
||||||
|
Garage can continue normally.
|
||||||
|
|
||||||
|
- `3-degraded`: a variant of replication mode `3` that lowers the read
|
||||||
|
quorum to `1`, to allow you to read data from your cluster when several
|
||||||
|
nodes (or nodes in several zones) are unavailable. In this mode, Garage
|
||||||
|
does not provide read-after-write consistency anymore. The write quorum is
|
||||||
|
still 2, ensuring that data successfully written to Garage is stored on at
|
||||||
|
least two nodes.
|
||||||
|
|
||||||
|
- `3-dangerous`: a variant of replication mode `3` that lowers both the read
|
||||||
|
and write quorums to `1`, to allow you to both read and write to your
|
||||||
|
cluster when several nodes (or nodes in several zones) are unavailable. It
|
||||||
|
is the least consistent mode of operation offered by Garage, and also one
|
||||||
|
that should probably never be used.
|
||||||
|
|
||||||
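As a rough illustration of the availability statements in the list above, the following sketch computes whether one object can still be read or written when some of the nodes holding its replicas are unreachable. The `Quorums` struct and the `availability` function are hypothetical names chosen for this example, not part of Garage's code, and the quorum values are the ones this section gives for each mode.

```rust
/// Hypothetical helper type: replica count and quorums of one replication mode.
struct Quorums {
    replicas: usize,
    write: usize,
    read: usize,
}

/// Returns `(can_read, can_write)` for a single object, given how many of
/// the nodes holding its replicas are currently unreachable.
fn availability(q: &Quorums, replicas_down: usize) -> (bool, bool) {
    // Replicas that can still answer; never underflows below zero.
    let up = q.replicas.saturating_sub(replicas_down);
    (up >= q.read, up >= q.write)
}

// Quorum values as documented for each mode in this section.
const MODE_2: Quorums = Quorums { replicas: 2, write: 2, read: 1 };
const MODE_2_DANGEROUS: Quorums = Quorums { replicas: 2, write: 1, read: 1 };
const MODE_3: Quorums = Quorums { replicas: 3, write: 2, read: 2 };
const MODE_3_DEGRADED: Quorums = Quorums { replicas: 3, write: 2, read: 1 };
```

For instance, `availability(&MODE_2, 1)` reports that reads succeed and writes fail, which matches the read-only behaviour described for mode `2`, while `availability(&MODE_2_DANGEROUS, 1)` reports that both succeed.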
Note that in modes `2` and `3`,
|
Note that in modes `2` and `3`,
|
||||||
if at least the same number of zones are available, an arbitrary number of failures in
|
if at least the same number of zones are available, an arbitrary number of failures in
|
||||||
|
@ -111,8 +137,35 @@ any given zone is tolerated as copies of data will be spread over several zones.
|
||||||
**Make sure `replication_mode` is the same in the configuration files of all nodes.
|
**Make sure `replication_mode` is the same in the configuration files of all nodes.
|
||||||
Never run a Garage cluster where that is not the case.**
|
Never run a Garage cluster where that is not the case.**
|
||||||
|
|
||||||
Changing the `replication_mode` of a cluster might work (make sure to shut down all nodes
|
The quorums associated with each replication mode are described below:
|
||||||
and changing it everywhere at the time), but is not officially supported.
|
|
||||||
|
| `replication_mode` | Number of replicas | Write quorum | Read quorum | Read-after-write consistency? |
|
||||||
|
| ------------------ | ------------------ | ------------ | ----------- | ----------------------------- |
|
||||||
|
| `none` or `1` | 1 | 1 | 1 | yes |
|
||||||
|
| `2` | 2 | 2 | 1 | yes |
|
||||||
|
| `2-dangerous` | 2 | 1 | 1 | NO |
|
||||||
|
| `3` | 3 | 2 | 2 | yes |
|
||||||
|
| `3-degraded` | 3 | 2 | 1 | NO |
|
||||||
|
| `3-dangerous` | 3 | 1 | 1 | NO |
|
||||||
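The last column of the table is consistent with the usual quorum-overlap rule: a read is guaranteed to see the latest acknowledged write when every read quorum intersects every write quorum, that is, when read quorum + write quorum > number of replicas. The sketch below checks that rule against the rows of the table. The names `read_after_write`, `TABLE` and `table_matches_rule` are made up for this example, and the rule is offered as an explanation of the table, not as a description of how Garage computes anything internally.

```rust
/// Quorum-overlap rule: any set of `read_quorum` replicas shares at least
/// one node with any set of `write_quorum` replicas exactly when their sum
/// exceeds the number of replicas.
fn read_after_write(replicas: usize, write_quorum: usize, read_quorum: usize) -> bool {
    read_quorum + write_quorum > replicas
}

/// Rows of the table above:
/// (mode, replicas, write quorum, read quorum, documented consistency).
const TABLE: [(&str, usize, usize, usize, bool); 6] = [
    ("none", 1, 1, 1, true),
    ("2", 2, 2, 1, true),
    ("2-dangerous", 2, 1, 1, false),
    ("3", 3, 2, 2, true),
    ("3-degraded", 3, 2, 1, false),
    ("3-dangerous", 3, 1, 1, false),
];

/// True when the rule reproduces the documented answer for every row.
fn table_matches_rule() -> bool {
    TABLE
        .iter()
        .all(|&(_, n, w, r, documented)| read_after_write(n, w, r) == documented)
}
```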
|
|
||||||
|
Changing the `replication_mode` between modes with the same number of replicas
|
||||||
|
(e.g. from `3` to `3-degraded`, or from `2-dangerous` to `2`) can be done easily by
|
||||||
|
just changing the `replication_mode` parameter in your config files and restarting all your
|
||||||
|
Garage nodes.
|
||||||
|
|
||||||
|
It is also technically possible to change the replication mode to a mode with a
|
||||||
|
different number of replicas, although it's a dangerous operation that is not
|
||||||
|
officially supported. This requires you to delete the existing cluster layout
|
||||||
|
and create a new layout from scratch, meaning that a full rebalancing of your
|
||||||
|
cluster's data will be needed. To do it, shut down your cluster entirely,
|
||||||
|
delete the `cluster_layout` files in the meta directories of all your nodes,
|
||||||
|
update all your configuration files with the new `replication_mode` parameter,
|
||||||
|
restart your cluster, and then create a new layout with all the nodes you want
|
||||||
|
to keep. Rebalancing data will take some time, and data might temporarily
|
||||||
|
appear unavailable to your users. It is recommended to shut down public access
|
||||||
|
to the cluster while rebalancing is in progress. In theory, no data should be
|
||||||
|
lost as rebalancing is a routine operation for Garage, although we cannot
|
||||||
|
guarantee you that everything will go right in such an extreme scenario.
|
||||||
|
|
||||||
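The distinction made above between the two kinds of mode change can be sketched as a small classification function. This is only an illustration: `ModeChange`, `replicas_for` and `classify_change` are hypothetical names that do not exist in Garage, and the replica counts are taken from the quorum table.

```rust
/// Hypothetical classification of a `replication_mode` change.
#[derive(Debug, PartialEq)]
enum ModeChange {
    /// Same number of replicas: edit the config files and restart all nodes.
    ConfigOnly,
    /// Different number of replicas: unsupported, requires deleting the
    /// cluster layout and rebalancing all data.
    LayoutRebuild,
}

/// Number of replicas for each documented mode string, `None` if unknown.
fn replicas_for(mode: &str) -> Option<usize> {
    match mode {
        "none" | "1" => Some(1),
        "2" | "2-dangerous" => Some(2),
        "3" | "3-degraded" | "3-dangerous" => Some(3),
        _ => None,
    }
}

/// Classifies a change from one mode string to another.
/// Returns `None` if either string is not a documented mode.
fn classify_change(from: &str, to: &str) -> Option<ModeChange> {
    let (old, new) = (replicas_for(from)?, replicas_for(to)?);
    Some(if old == new {
        ModeChange::ConfigOnly
    } else {
        ModeChange::LayoutRebuild
    })
}
```

With these definitions, going from `3` to `3-degraded` is classified as `ConfigOnly`, whereas going from `2` to `3` is classified as `LayoutRebuild`.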
### `compression_level`
|
### `compression_level`
|
||||||
|
|
||||||
|
|
|
@ -1,7 +1,10 @@
|
||||||
pub enum ReplicationMode {
|
pub enum ReplicationMode {
|
||||||
None,
|
None,
|
||||||
TwoWay,
|
TwoWay,
|
||||||
|
TwoWayDangerous,
|
||||||
ThreeWay,
|
ThreeWay,
|
||||||
|
ThreeWayDegraded,
|
||||||
|
ThreeWayDangerous,
|
||||||
}
|
}
|
||||||
|
|
||||||
impl ReplicationMode {
|
impl ReplicationMode {
|
||||||
|
@ -9,7 +12,10 @@ impl ReplicationMode {
|
||||||
match v {
|
match v {
|
||||||
"none" | "1" => Some(Self::None),
|
"none" | "1" => Some(Self::None),
|
||||||
"2" => Some(Self::TwoWay),
|
"2" => Some(Self::TwoWay),
|
||||||
|
"2-dangerous" => Some(Self::TwoWayDangerous),
|
||||||
"3" => Some(Self::ThreeWay),
|
"3" => Some(Self::ThreeWay),
|
||||||
|
"3-degraded" => Some(Self::ThreeWayDegraded),
|
||||||
|
"3-dangerous" => Some(Self::ThreeWayDangerous),
|
||||||
_ => None,
|
_ => None,
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -24,16 +30,17 @@ impl ReplicationMode {
|
||||||
pub fn replication_factor(&self) -> usize {
|
pub fn replication_factor(&self) -> usize {
|
||||||
match self {
|
match self {
|
||||||
Self::None => 1,
|
Self::None => 1,
|
||||||
Self::TwoWay => 2,
|
Self::TwoWay | Self::TwoWayDangerous => 2,
|
||||||
Self::ThreeWay => 3,
|
Self::ThreeWay | Self::ThreeWayDegraded | Self::ThreeWayDangerous => 3,
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
pub fn read_quorum(&self) -> usize {
|
pub fn read_quorum(&self) -> usize {
|
||||||
match self {
|
match self {
|
||||||
Self::None => 1,
|
Self::None => 1,
|
||||||
Self::TwoWay => 1,
|
Self::TwoWay | Self::TwoWayDangerous => 1,
|
||||||
Self::ThreeWay => 2,
|
Self::ThreeWay => 2,
|
||||||
|
Self::ThreeWayDegraded | Self::ThreeWayDangerous => 1,
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -41,7 +48,9 @@ impl ReplicationMode {
|
||||||
match self {
|
match self {
|
||||||
Self::None => 1,
|
Self::None => 1,
|
||||||
Self::TwoWay => 2,
|
Self::TwoWay => 2,
|
||||||
Self::ThreeWay => 2,
|
Self::TwoWayDangerous => 1,
|
||||||
|
Self::ThreeWay | Self::ThreeWayDegraded => 2,
|
||||||
|
Self::ThreeWayDangerous => 1,
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|