jepsen: reg2 failure seems to happen only with deleteobject
parent 4b93ce179a
commit d148b83d4f
3 changed files with 30 additions and 7 deletions
@@ -69,6 +69,8 @@ Command: `lein run test --nodes-file nodes.vagrant --time-limit 60 --rate 100 -
 Results:
 
 - Failures with clock-scramble nemesis + partition nemesis ???? TODO INVESTIGATE
+-> the issue seems to be only after DeleteObject (deletions are not always taken into account),
+the issue does not appear if we are using only PutObject with an actual object content
 - TODO: layout reconfiguration nemesis
 
 
@@ -86,7 +88,7 @@ Results:
 TODO
 
 
-## Investigating (and fixing) wierd behavior
+## Investigating (and fixing) errors
 
 ### Segfaults
 
@@ -107,6 +109,22 @@ Finally found out that this was due to closures not correctly capturing their co
 Not sure exactly where it came from but it seems to have been fixed by making list-inner a separate function and not a sub-function,
 and passing all values that were previously in the context (creds and prefix) as additional arguments.
 
+### `reg2` test inconsistency, even with timestamp fix
+
+The reg2 test is our custom checker for CRDT read-after-write on individual object keys, acting as registers which can be updated.
+The test fails without the timestamp fix, which is expected as the clock scrambler will prevent nodes from having a correct ordering of objects.
+
+With the timestamp fix, the happened-before relationship should at least be respected, meaning that when a PutObject call starts
+after another PutObject call has ended, the second call should overwrite the value of the first call, and that value should not be
+readable by future GetObject calls.
+However, we observed inconsistencies even with the timestamp fix.
+
+The inconsistencies always seemed to happen after writing a nil value, which translates to a DeleteObject call
+instead of a PutObject. By removing the possibility of writing nil values, therefore only doing
+PutObject calls, the issue disappears. There is therefore an issue to fix in DeleteObject.
+
+
+
 ## License
 
 Copyright © 2023 Alex Auvolat
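The happened-before expectation described in the `reg2` section of the README change above can be sketched in plain Clojure. This is not the actual `reg2` checker: the history format (maps with `:type`, `:value`, `:start`, `:end`) and the function names `overwritten-before?`, `stale-read?` and `stale-reads` are assumptions made for illustration, and a `nil` write stands in for a DeleteObject call.

```clojure
;; Hypothetical, simplified model of a single-key history.
;; Each op is a map: {:type :write|:read, :value v, :start t0, :end t1}.
;; A write of nil models a DeleteObject call; any other value models PutObject.

(defn overwritten-before?
  "True if some write w2 with a different value started after write w ended
  and itself ended before read r started (w -> w2 -> r in happened-before order)."
  [writes w r]
  (boolean
   (some (fn [w2]
           (and (not= (:value w2) (:value w))
                (< (:end w) (:start w2))
                (< (:end w2) (:start r))))
         writes)))

(defn stale-read?
  "True if every write that could explain the value seen by r was already
  overwritten before r started. Reads with no matching write at all
  (e.g. the initial empty state) are not flagged in this sketch."
  [writes r]
  (let [candidates (filter (fn [w]
                             (and (= (:value w) (:value r))
                                  (< (:start w) (:end r))))
                           writes)]
    (boolean
     (and (seq candidates)
          (every? #(overwritten-before? writes % r) candidates)))))

(defn stale-reads
  "Returns the reads of the history that violate happened-before ordering."
  [history]
  (let [writes (filter #(= :write (:type %)) history)
        reads  (filter #(= :read (:type %)) history)]
    (filter #(stale-read? writes %) reads)))

;; Example: a PutObject, then a DeleteObject that starts after the put ended,
;; then a GetObject that still returns the old value.
(def history
  [{:type :write :value 1   :start 0 :end 1}
   {:type :write :value nil :start 2 :end 3}
   {:type :read  :value 1   :start 4 :end 5}])

(println (count (stale-reads history))) ; prints 1
```

The sketch only flags a read when no concurrent or later write could justify the observed value, so operations that overlap in time are never reported. That matches the README's claim that only strictly ordered calls are expected to be consistent.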
@@ -20,10 +20,16 @@
 "set1" set/workload1
 "set2" set/workload2})
 
+(def patches
+"A map of patch names to Garage builds"
+{"default" "v0.9.0"
+"tsfix1" "d146cdd5b66ca1d3ed65ce93ca42c6db22defc09"})
+
 (def cli-opts
 "Additional command line options."
-[["-I" "--increasing-timestamps" "Garage version with increasing timestamps on PutObject"
-:default false]
+[["-p" "--patch NAME" "Garage patch to use"
+:default "default"
+:validate [patches (cli/one-of patches)]]
 ["-r" "--rate HZ" "Approximate number of requests per second, per thread."
 :default 10
 :parse-fn read-string
@@ -41,9 +47,7 @@
 :concurrency, ...), constructs a test map."
 [opts]
 (let [workload ((get workloads (:workload opts)) opts)
-garage-version (if (:increasing-timestamps opts)
-"d146cdd5b66ca1d3ed65ce93ca42c6db22defc09"
-"v0.9.0")]
+garage-version (get patches (:patch opts))]
 (merge tests/noop-test
 opts
 {:pure-generators true
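The change above replaces a boolean flag with a lookup in the `patches` map. A Clojure map can be called as a function of its keys, returning the value or `nil`, which is why `patches` itself can serve as the validation predicate. The sketch below illustrates that lookup in isolation. The helper `resolve-patch` is hypothetical and not part of the commit; it only makes the failure case explicit.

```clojure
;; The map from the diff: patch name -> Garage build (tag or commit hash).
(def patches
  {"default" "v0.9.0"
   "tsfix1"  "d146cdd5b66ca1d3ed65ce93ca42c6db22defc09"})

;; Hypothetical helper, for illustration only: resolves the :patch option
;; to a Garage build and fails loudly on an unknown name.
(defn resolve-patch
  [opts]
  (let [patch-name (get opts :patch "default")]
    (or (get patches patch-name)
        (throw (ex-info "Unknown patch"
                        {:patch patch-name
                         :known (sort (keys patches))})))))

;; A map used as a function returns the value for a known key, nil otherwise,
;; so it is truthy exactly for the valid patch names.
(println (patches "tsfix1"))
(println (patches "no-such-patch"))
(println (resolve-patch {:patch "default"}))
```

Adding a further Garage build to test then only requires a new entry in the map, rather than another command line flag.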
@@ -112,7 +112,8 @@
 (range)
 (fn [k]
 (->>
-(gen/mix [op-get op-put op-del])
+; (gen/mix [op-get op-put op-del])
+(gen/mix [op-get op-put])
 (gen/limit (:ops-per-key opts)))))})
 
 (defn workload1