Jepsen testing (NLnet task 3 subtask 1) #544

Merged
lx merged 41 commits from jepsen into main 2024-01-11 10:52:13 +00:00
3 changed files with 13 additions and 3 deletions
Showing only changes of commit 5b1f50be65

@ -97,10 +97,10 @@ Command: `lein run test --nodes-file nodes.vagrant --time-limit 60 --rate 100 -
Results:
- For now, no failures with clock-scramble nemesis + partition nemesis -> TODO long test run
- No failures with clock-scramble nemesis + db nemesis + partition nemesis (`--scenario cdp`) (0 failures in 10 runs).
- **Fails with layout reconfiguration nemesis** (`--scenario r`).
Example of a failed run: `garage set2/20231025T115033.553+0200` (2 failures in 2 runs).
- **Fails with just layout reconfiguration nemesis** (`--scenario r`).
Example of a failed run: `garage set2/20231025T141940.198+0200` (10 failures in 10 runs).
TODO: investigate.
This is the failure mode we are looking for and trying to fix for NLnet task 3.

@ -28,6 +28,7 @@
   "r" grgNemesis/scenario-r
   "pr" grgNemesis/scenario-pr
   "cpr" grgNemesis/scenario-cpr
   "cdp" grgNemesis/scenario-cdp
   "dpr" grgNemesis/scenario-dpr})

(def patches

@ -124,6 +124,14 @@
     (combined/partition-package {:db (:db opts), :interval 1, :faults #{:partition}})
     (reconfiguration-package {:interval 1})]))

(defn scenario-cdp
  "Clock modifying + db + partition scenario"
  [opts]
  (combined/compose-packages
    [(combined/clock-package {:db (:db opts), :interval 1, :faults #{:clock}})
     (combined/db-package {:db (:db opts), :interval 1, :faults #{:db :pause :kill}})
     (combined/partition-package {:db (:db opts), :interval 1, :faults #{:partition}})]))

(defn scenario-dpr
  "Db + partition + cluster reconfiguration scenario"
  [opts]
@ -131,3 +139,4 @@
    [(combined/db-package {:db (:db opts), :interval 1, :faults #{:db :pause :kill}})
     (combined/partition-package {:db (:db opts), :interval 1, :faults #{:partition}})
     (reconfiguration-package {:interval 1})]))
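
Note for readers of this diff: the scenario table maps each `--scenario` value (for example `cdp`) to a function that builds a nemesis package. The sketch below illustrates that lookup in isolation. The map name `scenarios`, the helper `select-scenario`, the `:reconfiguration` keyword, and the stubbed scenario bodies are all hypothetical stand-ins, not code from this PR. How the real test harness consumes the table is not shown in this commit.

```clojure
;; Hypothetical stand-ins: the real scenarios return composed Jepsen
;; nemesis packages; here each one only reports the faults it would enable.
(defn scenario-r [_opts]
  {:faults #{:reconfiguration}})

(defn scenario-cdp [_opts]
  {:faults #{:clock :db :pause :kill :partition}})

;; Same shape as the table in the diff: CLI name -> scenario function.
;; The name `scenarios` is an assumption.
(def scenarios
  {"r"   scenario-r
   "cdp" scenario-cdp})

(defn select-scenario
  "Returns the scenario function for a --scenario value, or throws on an unknown name."
  [scenario-name]
  (or (get scenarios scenario-name)
      (throw (ex-info "Unknown scenario"
                      {:scenario scenario-name
                       :known    (sort (keys scenarios))}))))

;; Build the (stubbed) package for `--scenario cdp`.
(def cdp-package ((select-scenario "cdp") {:db nil}))
```

Failing loudly on an unknown name is one possible design choice; it avoids silently running a test with no nemesis when the scenario flag is mistyped.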