ReplicationMode -> ConsistencyMode+ReplicationFactor #750

Merged
lx merged 2 commits from yuka/garage:split-consistency-mode into next-0.10 2024-03-07 16:32:52 +00:00
Contributor

Split out from #748 to make it easier to review

Split out from https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/748 to make it easier to review
yuka force-pushed split-consistency-mode from 017dcfe703 to 386476bbc6 2024-03-03 13:28:14 +00:00 Compare
lx requested changes 2024-03-03 14:21:50 +00:00
lx left a comment
Owner

Some minor things to fix here and there.

IMO the biggest thing to worry about is that this changes the format of the config file. See my comment on the section in config.rs, I think we should support users that continue using replication_mode, or at least have a safeguard that prevents them from inadvertently changing the replication factor of their cluster. Also, the reference documentation page on the config options has to be updated.

Some minor things to fix here and there. IMO the biggest thing to worry about is that this changes the format of the config file. See my comment on the section in config.rs, I think we should support users that continue using `replication_mode`, or at least have a safeguard that prevents them from inadvertently changing the replication factor of their cluster. Also, the reference documentation page on the config options has to be updated.
@ -141,0 +144,4 @@
}
pub fn write_quorum(self: &Arc<Self>) -> usize {
self.replication_factor.read_quorum(self.consistency_mode)
Owner

Did you mean write_quorum instead of read_quorum?

Did you mean write_quorum instead of read_quorum?
yuka marked this conversation as resolved
@ -20,1 +18,3 @@
_ => None,
impl ConsistencyMode {
pub fn parse(s: &str) -> Option<Self> {
serde_json::from_value(serde_json::Value::String(s.to_string())).ok()
Owner

Does this work for values written in lowercase? Because the enum variants start with a capital letter.

Does this work for values written in lowercase? Because the enum variants start with a capital letter.
@ -21,4 +34,4 @@
}
}
pub fn control_write_max_faults(&self) -> usize {
Owner

We don't need this function anymore, right?

We don't need this function anymore, right?
yuka marked this conversation as resolved
@ -62,0 +56,4 @@
match consistency_mode {
ConsistencyMode::Dangerous => 1,
ConsistencyMode::Degraded | ConsistencyMode::Consistent => {
(self.replication_factor() + 1) - self.read_quorum(consistency_mode)
Owner

In degraded mode, we will have write_quorum = replication_mode, but that doesn't seem right?

Can be fixed by:

ConsistencyMode::Degraded | ConsistencyMode::Consistent => {
    (self.replication_factor() + 1) - self.read_quorum(ConsistencyMode::Consistent)
}
In degraded mode, we will have write_quorum = replication_mode, but that doesn't seem right? Can be fixed by: ```rust ConsistencyMode::Degraded | ConsistencyMode::Consistent => { (self.replication_factor() + 1) - self.read_quorum(ConsistencyMode::Consistent) } ```
yuka marked this conversation as resolved
@ -63,2 +62,4 @@
}
}
impl Serialize for ReplicationFactor {
Owner

The same can be achieved with #[derive(Serialize, Deserialize)] + #[serde(transparent)]

The same can be achieved with `#[derive(Serialize, Deserialize)]` + `#[serde(transparent)]`
yuka marked this conversation as resolved
@ -46,3 +39,1 @@
}
fn max_write_errors(&self) -> usize {
self.replication_factor - self.write_quorum
self.layout_manager.read_quorum()
Owner

Did you mean write_quorum instead of read_quorum ?

Did you mean write_quorum instead of read_quorum ?
yuka marked this conversation as resolved
@ -37,2 +35,3 @@
/// - 3 is the recommended and supported setting.
// (we can add more aliases for this later)
pub replication_mode: String,
pub replication_factor: usize,
Owner

We probably want a migration path, or at least a safeguard, for people that don't update their config files and continue using the old replication_mode variable.

Option 1, migration path : if replication_mode is set, check that replication_factor and consistency_mode are not set, and populate them from the legacy replication_mode value.

Option 2, safeguard : if replication_mode is set, throw an error when initializing Garage

We probably want a migration path, or at least a safeguard, for people that don't update their config files and continue using the old `replication_mode` variable. Option 1, migration path : if replication_mode is set, check that replication_factor and consistency_mode are not set, and populate them from the legacy replication_mode value. Option 2, safeguard : if replication_mode is set, throw an error when initializing Garage
@ -39,0 +40,4 @@
/// - Degraded -> Disable read quorum
/// - Dangerous -> Disable read and write quorum
#[serde(default)]
pub consistency_mode: String,
Owner

Is the default value (empty string) parsed the same as "consistent" ?

Maybe we can move the ConsistencyMode enum here and use it directly in the config struct

Is the default value (empty string) parsed the same as "consistent" ? Maybe we can move the ConsistencyMode enum here and use it directly in the config struct
yuka force-pushed split-consistency-mode from 386476bbc6 to 537bd5bb97 2024-03-03 14:27:42 +00:00 Compare
Author
Contributor

My original intention was for the config option to be case-sensitive, uppercase ("Degraded", "Dangerous"), and the default when no value was specified should be "Consistent". I broke the default while rebasing/splitting the commit.

We can easily change it to case-sensitive lowercase. For case-insensitive we'd have to have a custom mapping or add a new dependency (serde-aux).

Also I don't have a strong opinion whether this should live in the config module or in its own module. So far I saw other enums (like ReplicationMode before) placed outside of the config module, so I followed that.

My original intention was for the config option to be case-sensitive, uppercase ("Degraded", "Dangerous"), and the default when no value was specified should be "Consistent". I broke the default while rebasing/splitting the commit. We can easily change it to case-sensitive lowercase. For case-insensitive we'd have to have a custom mapping or add a new dependency (serde-aux). Also I don't have a strong opinion whether this should live in the config module or in its own module. So far I saw other enums (like ReplicationMode before) placed outside of the config module, so I followed that.
yuka force-pushed split-consistency-mode from 537bd5bb97 to 8f1f87e2b1 2024-03-03 14:36:43 +00:00 Compare
yuka force-pushed split-consistency-mode from 8f1f87e2b1 to 56b793b959 2024-03-03 15:41:05 +00:00 Compare
Author
Contributor

I have changed the consistency mode setting to lower-case (still case-sensitive), restored the behavior for the default consistency mode, and added a migration path.

I have changed the consistency mode setting to lower-case (still case-sensitive), restored the behavior for the default consistency mode, and added a migration path.
yuka force-pushed split-consistency-mode from 56b793b959 to 1fb70df026 2024-03-03 16:00:58 +00:00 Compare
yuka force-pushed split-consistency-mode from 1fb70df026 to 5cd9e7e36b 2024-03-03 16:02:24 +00:00 Compare
yuka force-pushed split-consistency-mode from 5cd9e7e36b to df469ed97a 2024-03-03 16:14:47 +00:00 Compare
yuka force-pushed split-consistency-mode from df469ed97a to 975bcc9cd0 2024-03-03 18:41:16 +00:00 Compare
yuka force-pushed split-consistency-mode from 975bcc9cd0 to e4eab91c5a 2024-03-03 18:41:25 +00:00 Compare
yuka requested review from lx 2024-03-03 18:42:57 +00:00
lx reviewed 2024-03-04 15:16:31 +00:00
lx left a comment
Owner

LGTM, there is just a conflict in one of the documentation files

LGTM, there is just a conflict in one of the documentation files
@ -9,3 +9,3 @@
```toml
replication_mode = "3"
replication_factor = 3
Owner

For completeness we should include consistency_mode = "consistent" here (we want an example for all configuration lines)

For completeness we should include `consistency_mode = "consistent"` here (we want an example for all configuration lines)
Author
Contributor

Done

Done
@ -161,4 +153,3 @@
- `3`: data stored on Garage will be stored on three different nodes, if
possible each in a different zones. Garage tolerates two node failure, or
several node failures but in no more than two zones (in a deployment with at
Owner

Add a comment saying that arbitrarily large replication factors are now supported?

Add a comment saying that arbitrarily large replication factors are now supported?
Author
Contributor

Added a note about replication factors higher than 3 in the listing

Added a note about replication factors higher than 3 in the listing
lx marked this conversation as resolved
yuka force-pushed split-consistency-mode from e4eab91c5a to 7acd75dc26 2024-03-04 15:37:22 +00:00 Compare
yuka force-pushed split-consistency-mode from 7acd75dc26 to 9c6e1bf933 2024-03-04 15:40:57 +00:00 Compare
yuka force-pushed split-consistency-mode from 9c6e1bf933 to 060f83edf1 2024-03-04 17:43:20 +00:00 Compare
yuka changed title from ReplicationMode -> ConsistencyMode+ReplicationFactor to WIP: ReplicationMode -> ConsistencyMode+ReplicationFactor 2024-03-04 17:59:10 +00:00
yuka force-pushed split-consistency-mode from 060f83edf1 to 3670f61779 2024-03-04 18:02:36 +00:00 Compare
yuka force-pushed split-consistency-mode from 3670f61779 to 5a12279dc1 2024-03-04 18:03:06 +00:00 Compare
yuka force-pushed split-consistency-mode from 5a12279dc1 to ce042e48d4 2024-03-04 18:52:35 +00:00 Compare
yuka force-pushed split-consistency-mode from ce042e48d4 to 6bd595e3c5 2024-03-04 18:58:36 +00:00 Compare
yuka force-pushed split-consistency-mode from 6bd595e3c5 to 6af4a9d4b0 2024-03-05 21:57:19 +00:00 Compare
yuka changed title from WIP: ReplicationMode -> ConsistencyMode+ReplicationFactor to ReplicationMode -> ConsistencyMode+ReplicationFactor 2024-03-05 21:59:32 +00:00
Owner

I think changing all of the config files in all of the testing scripts to use replication_factor instead of the old replication_mode config param is going to cause issues, for example:

  • in the Jepsen tests, if we want to run them on older Garage versions that still only have the replication_mode configuration variable (I think we want to keep that possibility as much as possible)

  • in the upgrade test (script/test-upgrade.sh), where we are testing an upgrade path from v0.8.x to v0.9 / v0.10.

In the upgrade test we have some logic to differenciate between Garage versions (0.8 vs 0.9) because some CLI commands have changed (in dev-configure.sh and dev-bucket.sh). For the config file syntax change, either we add some similar logic, or we could just keep the old syntax with replication_mode = "3" for those tests.

I've launched a full CI run that should do the upgrade test, which might fail because of this (we'll see)

I think changing all of the config files in all of the testing scripts to use `replication_factor` instead of the old `replication_mode` config param is going to cause issues, for example: - in the Jepsen tests, if we want to run them on older Garage versions that still only have the `replication_mode` configuration variable (I think we want to keep that possibility as much as possible) - in the upgrade test (`script/test-upgrade.sh`), where we are testing an upgrade path from v0.8.x to v0.9 / v0.10. In the upgrade test we have some logic to differenciate between Garage versions (0.8 vs 0.9) because some CLI commands have changed (in `dev-configure.sh` and `dev-bucket.sh`). For the config file syntax change, either we add some similar logic, or we could just keep the old syntax with `replication_mode = "3"` for those tests. I've launched a full CI run that should do the upgrade test, which might fail because of this (we'll see)
yuka force-pushed split-consistency-mode from 6af4a9d4b0 to c1769bbe69 2024-03-07 11:45:42 +00:00 Compare
lx merged commit 20c0b4ffb2 into next-0.10 2024-03-07 16:32:52 +00:00
Owner

Thanks !

Thanks !
Sign in to join this conversation.
No description provided.