New article: Bringing theoretical design and observed performances face to face #12

Merged
quentin merged 24 commits from perf into master 2022-09-29 11:16:04 +00:00
Showing only changes of commit be6aaffa0c - Show all commits

View file

@ -4,30 +4,30 @@ date=2022-09-26
+++
*We*
*For the past years, we have extensively analyzed possible design decisions and their theoretical tradeoffs on Garage, being it on the network, data structure, or scheduling side. And it worked well enough for our production cluster at Deuxfleurs, but we also knew that people started discovering some unexpected behaviors. We thus started a round of benchmark and performance improvement to make Garage more versatile and better understand what we can expect from it.*
<!-- more -->
---
## ⚠️ Disclaimer
The following results must be taken with a critical grain of salt due to some limitations that are inherent to any benchmark. We try to reference them in the following.
The following results must be taken with a critical grain of salt due to some limitations that are inherent to any benchmark. We try to reference them in this section, some limitations might be missing.
Most of our tests are done on simulated networks that can not represent all the diversity of real networks (dynamic drop, jitter, latency, all of them could possibly be correlated with throughput or any other external event). We also limited ourselves to very small workloads that are not representative of a production cluster.
For some benchmarks, we used Minio as a reference. It must be noted that we did not try to optimize its configuration as we have done on Garage, and more generally, we have way less knowledge on Minio than on Garage.
It must also be noted that Gare and Minio are systems with different feature set, *eg.* Minio supports erasure coding for better data density while Garage doesn't.
For some benchmarks, we used Minio as a reference. It must be noted that we did not try to optimize its configuration as we have done on Garage, and more generally, we have way less knowledge on Minio than on Garage, which can lead to underrated performance measurements for Minio.
It must also be noted that Garage and Minio are systems with different feature sets, *eg.* Minio supports erasure coding for better data density while Garage doesn't, Minio implements way more S3 endpoints than Garage, etc. Such feature have necessarily a cost that you must keep in mind when reading plots.
Impact of the testing environment is also not evaluated (kernel patches, configuration, parameters, filesystem, etc.), some of these configurations could favor one configuration/software over another. Finally, our results are also provided without statistical tests to check their significance, and thus might be statistically not significative.
Impact of the testing environment is also not evaluated (kernel patches, configuration, parameters, filesystem, hardware configuration, etc.), some of these configurations could favor one configuration/software over another. Especially, it must be noted that most of the tests were done on a consumer-grade computer and SSD only, which will be different from most production setups. Finally, our results are also provided without statistical tests to check their significance, and thus might be statistically not significative.
In any case, we are not making a business or technical recommendation, we only share bits of our development process.
Read [benchmarking crimes](https://gernot-heiser.org/benchmarking-crimes.html) and make your own tests if you need to take a decision!
When reading this post, please keep in mind that **we are not making any business or technical recommendation here**, we only share bits of our development process.
Read [benchmarking crimes](https://gernot-heiser.org/benchmarking-crimes.html), make your own tests if you need to take a decision, and remain supportive and caring with your peers...
## About our testing environment
- Grid 5k.
We started a batch of tests on [Grid5000](https://www.grid5000.fr/w/Grid5000:Home), a large-scale and flexible testbed for experiment-driven research in all areas of computer science, under the [Open Access](https://www.grid5000.fr/w/Grid5000:Open-Access) program. During our tests, we used part of the following clusters: [nova](https://www.grid5000.fr/w/Lyon:Hardware#nova), [paravance](https://www.grid5000.fr/w/Rennes:Hardware#paravance), and [econome](https://www.grid5000.fr/w/Nantes:Hardware#econome) to make a geo-distributed topology. We used the Grid5000 testbed only during our preliminary tests to identify issues when running Garage on many powerful servers, issues that we then reproduced in a controlled environment; don't be surprised then if Grid5000 is not mentioned often on our plots.
- My own computer.
To reproduce some environments locally, we have a small set of Python scripts named [mknet](https://git.deuxfleurs.fr/Deuxfleurs/mknet) tailored to our needs[^1]. Most of the following tests where thus run locally with mknet on a single computer: a Dell Inspiron 27" 7775 AIO, with a Ryzen 5 1400, 16GB of RAM, a 512GB SSD. In term of software, NixOS 22.05 with the 5.15.50 kernel is used with an ext4 encrypted filesystem. The `vm.dirty_background_ratio` and `vm.dirty_ratio` have been reduce to `2` and `1` respectively as, otherwise, the system tends to freeze when it is under heavy I/O load.
## Efficient I/O
@ -73,3 +73,5 @@ Read [benchmarking crimes](https://gernot-heiser.org/benchmarking-crimes.html) a
- analysis and comparison of Garage at scale
- try to better understand ecosystem (riak cs, minio, ceph, swift) -> some knowledge to get
[^1]: Yes, we are aware of [Jepsen](https://github.com/jepsen-io/jepsen) existence. This tool is far more complex than our set of scripts, but we know that it is also way more versatile.