forked from Deuxfleurs/mknet
Add some doc about our own bench tool
parent 86ab9d7c00
commit 430259d050
1 changed files with 22 additions and 1 deletions
23
liveness.md
@@ -42,9 +42,30 @@ It starts to really look like a congestion control/control flow error/scheduler
## Write a custom client exhibiting the issue

We know how to trigger the issue with `warp`, Minio's benchmark tool, but we do not yet understand well what kind of load it puts on the cluster, except that it sends Multipart and PutObject requests concurrently. So, before investigating the issue in more depth, we want to know:
- Can a single large PUT request trigger this issue?
- How many parallel requests are needed to trigger this issue?
- Are Multipart transfers more impacted by this issue?

To answer these questions, we wrote a dedicated benchmark.
It is named s3concurrent and is available here: https://git.deuxfleurs.fr/quentin/s3concurrent
The benchmark starts by sending 1 file, then 2 files concurrently,
then 3, then 4, and so on up to 16 (this limit is hardcoded for now).

When run on our mknet cluster, the issue appears as soon as we send 2 files at once:
```
$ ./s3concurrent
2022/08/11 20:35:28 created bucket 3ffd6798-bdab-4218-b6d0-973a07e46ea9
2022/08/11 20:35:28 start concurrent loop with 1 coroutines
2022/08/11 20:35:55 done, 1 coroutines returned
2022/08/11 20:35:55 start concurrent loop with 2 coroutines
2022/08/11 20:36:34 1/2 failed with Internal error: Could not reach quorum of 2. 1 of 3 request succeeded, others returned errors: ["Timeout", "Timeout"]
2022/08/11 20:36:37 done, 2 coroutines returned
2022/08/11 20:36:37 start concurrent loop with 3 coroutines
2022/08/11 20:37:13 1/3 failed with Internal error: Could not reach quorum of 2. 1 of 3 request succeeded, others returned errors: ["Netapp error: Not connected: 92c7fb74ed89f289", "Netapp error: Not connected: 3cb7ed98f7c66a55"]
2022/08/11 20:37:51 2/3 failed with Internal error: Could not reach quorum of 2. 1 of 3 request succeeded, others returned errors: ["Netapp error: Not connected: 92c7fb74ed89f289", "Netapp error: Not connected: 3cb7ed98f7c66a55"]
2022/08/11 20:37:51 3/3 failed with Internal error: Could not reach quorum of 2. 1 of 3 request succeeded, others returned errors: ["Netapp error: Not connected: 92c7fb74ed89f289", "Netapp error: Not connected: 3cb7ed98f7c66a55"]
2022/08/11 20:37:51 done, 3 coroutines returned
2022/08/11 20:37:51 start concurrent loop with 4 coroutines
```