RPC performance changes #387

Merged
lx merged 3 commits from configurable-timeouts into main 3 months ago
lx commented 3 months ago
Owner
  • configurable ping timeout
  • single, much higher, configurable RPC timeout
  • no more concurrency semaphore
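To make the three points above concrete, here is a minimal Python sketch of the general pattern, not Garage's actual Rust code: a short timeout for pings, one larger timeout for every other RPC, and no limit on how many calls are in flight. The names `RpcTimeouts`, `call_with_timeout` and `fake_node`, and all the timeout values, are hypothetical and chosen only for illustration.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class RpcTimeouts:
    # Hypothetical names and values, for illustration only.
    ping_timeout_s: float = 0.05
    rpc_timeout_s: float = 0.5


async def call_with_timeout(make_call, timeout_s):
    """Run one call; return ("ok", result) or ("timeout", None)."""
    try:
        return "ok", await asyncio.wait_for(make_call(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return "timeout", None


async def fake_node(delay_s, reply):
    """Stand-in for a remote node that answers after delay_s seconds."""
    await asyncio.sleep(delay_s)
    return reply


async def demo(cfg):
    # All calls start at once: no semaphore limits how many are in flight.
    return await asyncio.gather(
        call_with_timeout(lambda: fake_node(0.01, "pong"), cfg.ping_timeout_s),
        call_with_timeout(lambda: fake_node(0.2, "pong"), cfg.ping_timeout_s),
        call_with_timeout(lambda: fake_node(0.2, "block stored"), cfg.rpc_timeout_s),
    )


results = asyncio.run(demo(RpcTimeouts()))
print(results)
```

In this sketch the second ping times out because the simulated node needs 0.2 s and the ping timeout is 0.05 s, while the same 0.2 s delay is acceptable for a regular RPC under the larger timeout.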
lx added 1 commit 3 months ago
e9836a9747
RPC performance changes
lx force-pushed configurable-timeouts from e9836a9747 to ef3f7d6f76 3 months ago
lx force-pushed configurable-timeouts from ef3f7d6f76 to 56592e1853 3 months ago
Poster
Owner
Release build is here: <https://garagehq.deuxfleurs.fr/_releases/56592e18538b379ccaaa7b7c1990a599ac83b191/x86_64-unknown-linux-musl/garage> (ping @quentin)
Poster
Owner

Warp on v0.8.0-beta1 (before this patch)

# warp-mixed-2022-09-20[114353]-6dcL.csv.zst
$ warp mixed --host=localhost:3991 --objects 500 --obj.size 1MiB

Mixed operations.
Operation: DELETE, 10%, Concurrency: 20, Ran 4m45s.
 * Throughput: 2.09 obj/s

Operation: GET, 45%, Concurrency: 20, Ran 4m48s.
 * Throughput: 9.41 MiB/s, 9.41 obj/s

Operation: PUT, 15%, Concurrency: 20, Ran 4m48s.
 * Throughput: 3.13 MiB/s, 3.13 obj/s

Operation: STAT, 30%, Concurrency: 20, Ran 4m48s.
 * Throughput: 6.27 obj/s

Cluster Total: 12.51 MiB/s, 20.86 obj/s over 4m47s.
# warp-mixed-2022-09-20[115328]-pVlR.csv.zst
$ warp mixed --host=localhost:3991 --objects 500 --obj.size 1MiB

Mixed operations.
Operation: DELETE, 10%, Concurrency: 20, Ran 4m30s.
 * Throughput: 2.27 obj/s

Operation: GET, 45%, Concurrency: 20, Ran 4m32s.
 * Throughput: 10.19 MiB/s, 10.19 obj/s

Operation: PUT, 15%, Concurrency: 20, Ran 4m32s.
 * Throughput: 3.40 MiB/s, 3.40 obj/s

Operation: STAT, 30%, Concurrency: 20, Ran 4m33s.
 * Throughput: 6.80 obj/s

Cluster Total: 13.60 MiB/s, 22.64 obj/s over 4m30s.
# warp-get-2022-09-20[120439]-vzLH.csv.zst
$ warp get --host=localhost:3991 --objects 500 --obj.size 1MiB --autoterm

----------------------------------------
Operation: PUT
* Average: 2.06 MiB/s, 2.06 obj/s

Throughput, split into 70 x 1s:
 * Fastest: 12.3MiB/s, 12.29 obj/s
 * 50% Median: 498.4KiB/s, 0.49 obj/s
 * Slowest: 498.4KiB/s, 0.49 obj/s

----------------------------------------
Operation: GET
* Average: 234.43 MiB/s, 234.43 obj/s

Throughput, split into 208 x 1s:
 * Fastest: 289.4MiB/s, 289.43 obj/s
 * 50% Median: 241.6MiB/s, 241.61 obj/s
 * Slowest: 152.1MiB/s, 152.10 obj/s
# warp-put-2022-09-20[121052]-UY96.csv.zst
$ warp put --host=localhost:3991 --obj.size 1MiB --autoterm

----------------------------------------
Operation: PUT
* Average: 3.65 MiB/s, 3.65 obj/s

Throughput, split into 295 x 1s:
 * Fastest: 28.0MiB/s, 28.03 obj/s
 * 50% Median: 1041.6KiB/s, 1.02 obj/s
 * Slowest: 692.0KiB/s, 0.68 obj/s
Poster
Owner

Warp with this patch

# warp-mixed-2022-09-20[130846]-qwTO.csv.zst
$ warp mixed --host=localhost:3991 --objects 500 --obj.size 1MiB

Mixed operations.
Operation: DELETE, 10%, Concurrency: 20, Ran 4m59s.
 * Throughput: 11.16 obj/s

Operation: GET, 45%, Concurrency: 20, Ran 5m0s.
Errors: 5
 * Throughput: 50.22 MiB/s, 50.22 obj/s

Operation: PUT, 15%, Concurrency: 20, Ran 5m0s.
 * Throughput: 16.73 MiB/s, 16.73 obj/s

Operation: STAT, 30%, Concurrency: 20, Ran 4m59s.
 * Throughput: 33.47 obj/s

Cluster Total: 66.93 MiB/s, 111.56 obj/s, 5 errors over 5m0s.
Total Errors:5.
# warp-mixed-2022-09-20[131803]-sp1P.csv.zst
$ warp mixed --host=localhost:3991 --objects 500 --obj.size 1MiB

Mixed operations.
Operation: DELETE, 10%, Concurrency: 20, Ran 4m58s.
 * Throughput: 10.93 obj/s

Operation: GET, 45%, Concurrency: 20, Ran 4m58s.
Errors: 10
 * Throughput: 49.14 MiB/s, 49.14 obj/s

Operation: PUT, 15%, Concurrency: 20, Ran 5m0s.
 * Throughput: 16.38 MiB/s, 16.38 obj/s

Operation: STAT, 30%, Concurrency: 20, Ran 4m58s.
 * Throughput: 32.79 obj/s

Cluster Total: 65.50 MiB/s, 109.21 obj/s, 10 errors over 4m59s.
Total Errors:10.
$ warp get --host=localhost:3991 --objects 500 --obj.size 1MiB --autoterm
# warp-get-2022-09-20[132629]-lNHt.csv.zst

Throughput 297.5MiB/s within 7.500000% for 16.457s. Assuming stability. Terminating benchmark.

----------------------------------------
Operation: PUT
* Average: 22.60 MiB/s, 22.60 obj/s

Throughput, split into 5 x 1s:
 * Fastest: 31.3MiB/s, 31.28 obj/s
 * 50% Median: 25.3MiB/s, 25.29 obj/s
 * Slowest: 13.9MiB/s, 13.94 obj/s

----------------------------------------
Operation: GET
* Average: 205.91 MiB/s, 205.91 obj/s

Throughput, split into 58 x 1s:
 * Fastest: 301.4MiB/s, 301.41 obj/s
 * 50% Median: 271.8MiB/s, 271.80 obj/s
 * Slowest: 1696.8KiB/s, 1.66 obj/s
# warp-put-2022-09-20[132906]-KpRZ.csv.zst
$ warp put --host=localhost:3991 --obj.size 1MiB --autoterm

----------------------------------------
Operation: PUT
* Average: 22.08 MiB/s, 22.08 obj/s

Throughput, split into 298 x 1s:
 * Fastest: 34.9MiB/s, 34.90 obj/s
 * 50% Median: 23.1MiB/s, 23.12 obj/s
 * Slowest: 3.1MiB/s, 3.12 obj/s
# warp-put-2022-09-20[141352]-HdyN.csv.zst
$ warp put --host=localhost:3991 --obj.size 1MiB --autoterm

Throughput 23.4MiB/s within 7.500000% for 27.104s. Assuming stability. Terminating benchmark.

----------------------------------------
Operation: PUT
* Average: 23.18 MiB/s, 23.18 obj/s

Throughput, split into 97 x 1s:
 * Fastest: 34.4MiB/s, 34.42 obj/s
 * 50% Median: 23.6MiB/s, 23.57 obj/s
 * Slowest: 11.1MiB/s, 11.08 obj/s
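For a quick before/after comparison, the averages quoted in the two sets of runs can be turned into ratios. The sketch below only does arithmetic on figures copied from the warp output in this thread. With one or two runs per benchmark, the ratios are indicative, not statistically solid.

```python
from statistics import mean

# MiB/s figures copied from the warp runs quoted in this thread.
before = {
    "mixed_total": [12.51, 13.60],  # "Cluster Total" of the two mixed runs
    "put_avg": [3.65],              # average of the `warp put` run
    "get_avg": [234.43],            # GET average of the `warp get` run
}
after = {
    "mixed_total": [66.93, 65.50],
    "put_avg": [22.08, 23.18],
    "get_avg": [205.91],
}

# Ratio of mean throughput with the patch to mean throughput before it.
speedup = {name: mean(after[name]) / mean(before[name]) for name in before}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x")
# mixed_total: 5.07x
# put_avg: 6.20x
# get_avg: 0.88x
```

The lower GET average with the patch comes from a run that auto-terminated after 58 samples and includes one very slow second (1.66 obj/s). The median GET throughput in that same run is higher than before the patch (271.8 MiB/s versus 241.6 MiB/s).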
lx added 1 commit 3 months ago
357b72f4ff
Merge branch 'main' into configurable-timeouts
lx added 1 commit 3 months ago
ded444f6c9
Ability to have custom timeouts in request strategy (not used)
Poster
Owner

Consequences of this PR:

  • Much higher PutObject throughput
  • Higher RAM usage when data that cannot yet be sent to storage nodes accumulates in the send buffer
  • Reacting to a dead node may take longer. We'll have to see whether this makes cases where Garage takes a long time to respond more frequent, or whether failure detection in Netapp is sufficient for good failover (I guess we'll be testing this in prod)
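The RAM trade-off in the second point can be illustrated with a toy model. This is a sketch under simplifying assumptions and not a model of Garage or Netapp internals: each pending send holds its payload in memory until a slow simulated node accepts it, and an optional semaphore caps how many sends run at once. The helper names, the 1 MiB payload (matching the warp object size used above), the count of 50 blocks and the limit of 4 are all made up for illustration.

```python
import asyncio

MIB = 1024 * 1024


async def send_block(state, size, limiter=None):
    """Hold `size` bytes "in the send buffer" while a slow node accepts them."""
    if limiter is not None:
        await limiter.acquire()
    state["in_flight"] += size
    state["peak"] = max(state["peak"], state["in_flight"])
    try:
        await asyncio.sleep(0.01)  # simulated slow storage node
    finally:
        state["in_flight"] -= size
        if limiter is not None:
            limiter.release()


async def peak_buffered(n_blocks, size, max_concurrent=None):
    """Return the peak number of bytes held at once while sending n_blocks."""
    state = {"in_flight": 0, "peak": 0}
    limiter = asyncio.Semaphore(max_concurrent) if max_concurrent else None
    await asyncio.gather(
        *(send_block(state, size, limiter) for _ in range(n_blocks))
    )
    return state["peak"]


unbounded = asyncio.run(peak_buffered(50, MIB))
bounded = asyncio.run(peak_buffered(50, MIB, max_concurrent=4))
print(unbounded // MIB, bounded // MIB)  # 50 4
```

In this model, removing the limiter lets all 50 payloads sit in memory at the same time, which is the kind of accumulation the RAM-usage point describes. Keeping the limiter bounds memory but serializes the sends.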
lx changed title from WIP: RPC performance changes to RPC performance changes 3 months ago
lx merged commit 7a901f7aab into main 3 months ago
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
The pull request has been merged as 7a901f7aab.