Test strategy #114

Closed
opened 2021-10-04 13:12:39 +00:00 by quentin · 4 comments
Owner

Define our test strategy. Testing is a research field in its own right.
I think we should aggregate some references to make an informed decision.

About testing distributed systems:

  • [Jepsen](https://jepsen.io/) is a testing framework designed to test distributed systems. It can mock some parts of the system, such as time and the network.
  • [FoundationDB Testing Approach](https://www.micahlerner.com/2021/06/12/foundationdb-a-distributed-unbundled-transactional-key-value-store.html#what-is-unique-about-foundationdbs-testing-framework). They chose a design where "all sources of nondeterminism and communication are abstracted, including network, disk, time, and pseudo random number generator", so that tests can run while simulating faults.
  • [Testing Distributed Systems](https://asatarin.github.io/testing-distributed-systems/) - a curated list of resources on testing distributed systems
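
As an illustration of the FoundationDB-style approach (which we may or may not want for Garage, see the discussion below), nondeterminism can be hidden behind a trait. This is only a sketch; the trait and names are hypothetical, not Garage code:

```rust
use std::cell::Cell;
use std::time::Instant;

// A minimal sketch of abstracting one source of nondeterminism (time),
// in the spirit of FoundationDB's simulation testing.
trait Clock {
    fn now_ms(&self) -> u64;
}

// Production implementation: real wall-clock time.
struct RealClock {
    start: Instant,
}

impl Clock for RealClock {
    fn now_ms(&self) -> u64 {
        self.start.elapsed().as_millis() as u64
    }
}

// Test implementation: time only advances when the test says so,
// which makes timeout-related code fully deterministic.
struct SimClock {
    now: Cell<u64>,
}

impl SimClock {
    fn advance(&self, ms: u64) {
        self.now.set(self.now.get() + ms);
    }
}

impl Clock for SimClock {
    fn now_ms(&self) -> u64 {
        self.now.get()
    }
}
```

Code that takes a `&dyn Clock` can then be driven through timeout scenarios in a test without ever sleeping.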

About S3 compatibility:

  • [ceph/s3-tests](https://github.com/ceph/s3-tests)
  • (deprecated) [minio/s3verify](https://blog.min.io/s3verify-a-simple-tool-to-verify-aws-s3-api-compatibility/)
  • [minio/mint](https://github.com/minio/mint)

About benchmarking S3 (I think it is not necessarily very relevant for this iteration):

  • [minio/warp](https://github.com/minio/warp)
  • [wasabi-tech/s3-benchmark](https://github.com/wasabi-tech/s3-benchmark)
  • [dvassallo/s3-benchmark](https://github.com/dvassallo/s3-benchmark)
  • [intel-cloud/cosbench](https://github.com/intel-cloud/cosbench) - used by Ceph

Engineering blog posts:

  • [Quincy @ Scale: A Tale of Three Large-Scale Clusters](https://ceph.io/en/news/blog/2022/three-large-scale-clusters/)

Misc:

  • I would really like to rewrite our `test-smoke.sh` script as a proper Rust integration test.
  • Writing an integration test can be done in two ways:
    • Running the binary from the test (i.e. running `execve`)
    • Writing `garage` as a library, with the entrypoint being a simple call to this library. From the test, we could then import and call this library.
  • [mutagen](https://github.com/llogiq/mutagen) - mutation testing is a way to assess our test quality by mutating the code and checking whether the mutation makes the tests fail
  • [fuzzing](https://rust-fuzz.github.io/book/) - cargo supports fuzzing; it could be a way to test our software's reliability in the presence of garbage data.
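
For the `execve` route, here is a sketch of what the harness of a Rust integration test could look like. The helper names are hypothetical, and a real test would launch the `garage` binary with a test configuration rather than the stand-in command shown here:

```rust
use std::process::{Child, Command};

// Guard that kills the child process when it goes out of scope,
// so a failing test does not leave a daemon running.
struct Daemon(Child);

impl Drop for Daemon {
    fn drop(&mut self) {
        let _ = self.0.kill();
        let _ = self.0.wait();
    }
}

// Spawn the binary under test with the given arguments.
fn spawn_daemon(bin: &str, args: &[&str]) -> std::io::Result<Daemon> {
    Command::new(bin).args(args).spawn().map(Daemon)
}
```

A test would call `spawn_daemon` with the path to the `garage` binary, wait for it to accept connections, and then issue S3 requests against it.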

My current opinion on this subject:

  1. We should rewrite our tests in Rust to have proper integration tests; our bash scripts are fragile and harder to extend.
  2. It should not be too complicated to add minio/mint, and it would add value to Garage by providing a compatibility score and reference that can be trusted.
  3. As a last step, we may slowly integrate Jepsen.

Feel free to contribute references.

quentin added the Ideas label 2021-10-04 13:12:39 +00:00
Owner

My opinion is that we should try to test in the least invasive way, i.e. minimize the impact of the testing framework on Garage's source code. This means, for example:

  • Not abstracting IO/nondeterminism in the source code
  • Not making `garage` a shared library (launching it with `execve` is perfectly fine)

Instead, we should focus on building a clean outer interface for the `garage` binary, for example loading configuration from environment variables instead of the configuration file if that is helpful for writing the tests.

There are two reasons for this:

  • Keep the source code clean and focused
  • Test something that is as close as possible to the real garage that will actually be running
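
As a concrete sketch of the environment-variable idea, the binary's outer interface could resolve its configuration path like this. `GARAGE_CONFIG_FILE` and the default path are hypothetical names chosen for illustration:

```rust
use std::env;

// Sketch: accept the configuration path from an environment variable,
// falling back to a default file. A test can then point the binary at a
// generated config without patching the code under test.
fn config_path() -> String {
    env::var("GARAGE_CONFIG_FILE").unwrap_or_else(|_| "/etc/garage/garage.toml".to_string())
}
```

This keeps the testing hook entirely at the process boundary, so the tested binary is identical to the one users run.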

Reminder: rules of simplicity apply to changes to Garage's source code. Always question what we are doing. Never do anything just because it looks nice or because we "think" it might be useful at some later point without knowing precisely why/when. Only do things that make perfect sense in the context of what we currently know.

Regarding testing strategies, let's make a list of things we want to try, sorted from the most promising to the least promising. Try them in that order, aborting early if an approach is too complicated or does not give good results.

Owner

Interesting blog posts on the blog of the Sled database:

  • https://sled.rs/simulation.html
  • https://sled.rs/perf.html
quentin added the Testing label 2021-11-16 08:52:45 +00:00
Author
Owner

I think this review should be put in our documentation and this issue closed.

Owner

Agreed, I'll make a PR

lx closed this issue 2022-11-14 12:38:29 +00:00
Reference: Deuxfleurs/garage#114