New model for buckets #172

Merged
lx merged 19 commits from new-buckets into main 2022-01-10 11:32:42 +00:00
2 changed files with 12 additions and 0 deletions
Showing only changes of commit 87121dce9d - Show all commits

View file

@ -638,6 +638,7 @@ in
rand = rustPackages."registry+https://github.com/rust-lang/crates.io-index".rand."0.8.4" { inherit profileName; };
rmp_serde = rustPackages."registry+https://github.com/rust-lang/crates.io-index".rmp-serde."0.15.5" { inherit profileName; };
serde = rustPackages."registry+https://github.com/rust-lang/crates.io-index".serde."1.0.130" { inherit profileName; };
serde_bytes = rustPackages."registry+https://github.com/rust-lang/crates.io-index".serde_bytes."0.11.5" { inherit profileName; };
sled = rustPackages."registry+https://github.com/rust-lang/crates.io-index".sled."0.34.7" { inherit profileName; };
structopt = rustPackages."registry+https://github.com/rust-lang/crates.io-index".structopt."0.3.23" { inherit profileName; };
tokio = rustPackages."registry+https://github.com/rust-lang/crates.io-index".tokio."1.12.0" { inherit profileName; };

View file

@ -28,6 +28,17 @@ pub trait Crdt {
fn merge(&mut self, other: &Self);
}
/// Option<T> implements Crdt for any type T, even if T doesn't implement CRDT itself: when
/// different values are detected, they are always merged to None. This can be used for value
/// types which shoulnd't be merged, instead of trying to merge things when we know we don't want
/// to merge them (which is what the AutoCrdt trait is used for most of the time). This cases
/// arises very often, for example with a Lww or a LwwMap: the value type has to be a CRDT so that
/// we have a rule for what to do when timestamps aren't enough to disambiguate (in a distributed
lx marked this conversation as resolved Outdated

I find this implementation surprising. I'd expect Some(true) and Some(false) to be merged to Some(true), not None.

I find this implementation surprising. I'd expect `Some(true)` and `Some(false)` to be merged to `Some(true)`, not `None`.
Outdated
Review

So I haven't written a doc comment for this yet, but the idea of this impl is that we can make a CRDT of any type T that doesn't implement CRDT, by declaring that different values merge to None. We probably should use this more often in fact, instead of trying to merge things when we know we don't want to merge them (which is what the AutoCrdt trait is used for most of the time). This cases arises very often, for example with a Lww or a LwwMap: the value type has to be a CRDT so that we have a rule for what to do when timestamps aren't enough to disambiguate (in a distributed system, anything can happen!), and with AutoCrdt the rule is to make an arbitrary (but determinstic) choice between the two. When using an option instead with this impl, ambiguity cases are explicitely stored as None, which allows us to detect the ambiguity and handle it in the way we want. This truly depends on the semantics of the application: the crdt module is just a toolbox of things that can be taken when needed when building models. In the precise case where we are using it here, i.e. for storing the website configuration, the logic is that if we have two website configurations and we don't know which one is the correct one, we can just disable website access until an admin gives a new configuration (we try to have a safe behavior instead of an unpredictable one -- but anyways this is an extreme edge case as it mostly should never happen that the timestamps are the same).

So I haven't written a doc comment for this yet, but the idea of this impl is that we can make a CRDT of any type T that doesn't implement CRDT, by declaring that different values merge to None. We probably should use this more often in fact, instead of trying to merge things when we know we don't want to merge them (which is what the AutoCrdt trait is used for most of the time). This cases arises very often, for example with a `Lww` or a `LwwMap`: the value type has to be a CRDT so that we have a rule for what to do when timestamps aren't enough to disambiguate (in a distributed system, anything can happen!), and with AutoCrdt the rule is to make an arbitrary (but determinstic) choice between the two. When using an option instead with this impl, ambiguity cases are explicitely stored as None, which allows us to detect the ambiguity and handle it in the way we want. This truly depends on the semantics of the application: the `crdt` module is just a toolbox of things that can be taken when needed when building models. In the precise case where we are using it here, i.e. for storing the website configuration, the logic is that if we have two website configurations and we don't know which one is the correct one, we can just disable website access until an admin gives a new configuration (we try to have a safe behavior instead of an unpredictable one -- but anyways this is an extreme edge case as it mostly should never happen that the timestamps are the same).

I understand now, but I'm not sure it's well fit to some place it's used.
For website configuration, if two updates happen at the same time, I think it makes more sense to consider an arbitrary one happened an instant before the other, and got overwritten, than to merge both into the state "website disabled"

I understand now, but I'm not sure it's well fit to some place it's used. For website configuration, if two updates happen at the same time, I think it makes more sense to consider an arbitrary one happened an instant before the other, and got overwritten, than to merge both into the state "website disabled"
/// system, anything can happen!), and with AutoCrdt the rule is to make an arbitrary (but
/// determinstic) choice between the two. When using an Option<T> instead with this impl, ambiguity
/// cases are explicitely stored as None, which allows us to detect the ambiguity and handle it in
/// the way we want. (this can only work if we are happy with losing the value when an ambiguity
/// arises)
impl<T> Crdt for Option<T>
where
T: Eq,