diff --git a/README.md b/README.md
index 92abc61..5801916 100644
--- a/README.md
+++ b/README.md
@@ -26,7 +26,7 @@ you must add to your path some tools.
 ```bash
 # see versions on https://garagehq.deuxfleurs.fr/_releases.html
 export GRG_ARCH=x86_64-unknown-linux-musl
-export GRG_VERSION=v0.5.0
+export GRG_VERSION=v0.7.2.1
 
 sudo wget https://garagehq.deuxfleurs.fr/_releases/${GRG_VERSION}/${GRG_ARCH}/garage -O /usr/local/bin/garage
 sudo chmod +x /usr/local/bin/garage
diff --git a/example/deploy_garage.sh b/example/deploy_garage.sh
index 0dc9b38..4e661d4 100755
--- a/example/deploy_garage.sh
+++ b/example/deploy_garage.sh
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 set -euo pipefail
 IFS=$'\n\t'
 
diff --git a/liveness.md b/liveness.md
new file mode 100644
index 0000000..60f2fea
--- /dev/null
+++ b/liveness.md
@@ -0,0 +1,35 @@
+# Liveness issues
+
+We know that some people have reported timeouts when putting load on their Garage cluster.
+On our production cluster, which runs without pressure, we do not observe this behavior.
+
+But when I started a benchmark created by the MinIO developers, warp, I hit the same problem.
+So I wanted to reproduce this behavior in a more controlled environment.
+
+I thus chose to use mknet to emulate a simple network with close to zero latency but with a very small bandwidth, 1M, as configured in slow-net.yml. The idea is that the network, and not the CPU, the memory, or the disk, will be the bottleneck.
+
+We quickly observe that the cluster is not reacting well:
+
+```
+[nix-shell:/home/quentin/Documents/dev/deuxfleurs/mknet]# warp get --host=[fc00:9a7a:9e::1]:3900 --obj.size 100MiB --obj.randsize --duration=10m --concurrent 8 --objects 200 --access-key=GKc1e16da48142bdb95d98a4e4 --secret-key=c4ef5d5f7ee24ccae12a98912bf5b1fda28120a7e3a8f90cb3710c8683478b31
+Creating Bucket "warp-benchmark-bucket"...Element { tag_name: {http://s3.amazonaws.com/doc/2006-03-01/}LocationConstraint, attributes: [], namespaces: [Namespace { name: None, uri: "http://s3.amazonaws.com/doc/2006-03-01/" }] }
+warp: upload error: Internal error: Could not reach quorum of 2. 1 of 3 request succeeded, others returned errors: ["Timeout", "Timeout"]
+warp: upload error: Internal error: Could not reach quorum of 2. 1 of 3 request succeeded, others returned errors: ["Netapp error: Not connected: 3cb7ed98f7c66a55", "Netapp error: Not connected: 92c7fb74ed89f289"]
+warp: upload error: Put "http://[fc00:9a7a:9e::1]:3900/warp-benchmark-bucket/xVdzjy23/1.KximayVLlhLwfE5f.rnd": dial tcp [fc00:9a7a:9e::1]:3900: i/o timeout
+warp: upload error: Put "http://[fc00:9a7a:9e::1]:3900/warp-benchmark-bucket/N4zQvKhs/1.XkkO6DJ%28hVpGIrMj.rnd": dial tcp [fc00:9a7a:9e::1]:3900: i/o timeout
+warp: upload error: Internal error: Could not reach quorum of 2. 1 of 3 request succeeded, others returned errors: ["Timeout", "Timeout"]
+warp: upload error: Internal error: Could not reach quorum of 2. 1 of 3 request succeeded, others returned errors: ["Netapp error: Not connected: 3cb7ed98f7c66a55", "Netapp error: Not connected: 92c7fb74ed89f289"]
+warp: upload error: Put "http://[fc00:9a7a:9e::1]:3900/warp-benchmark-bucket/GQrsevhN/1.7hglGIP%28mXTJMgFE.rnd": read tcp [fc00:9a7a:9e:ffff:ffff:ffff:ffff:ffff]:57008->[fc00:9a7a:9e::1]:3900: read: connection reset by peer
+
+warp: Error preparing server: upload error: Internal error: Could not reach quorum of 2. 1 of 3 request succeeded, others returned errors: ["Timeout", "Timeout"].
+```
+
+We observe several different types of errors:
+ - [RPC] Quorum timeout errors, which are probably generated by pings between nodes
+ - [RPC] Not connected errors: after a timeout, a reconnection is not triggered immediately
+ - [S3 Gateway] The gateway took too much time to answer, and a timeout was triggered in the client
+ - [S3 Gateway] The S3 gateway closes the TCP connection before answering
+
+As a first conclusion, we have clearly reduced the scope of the problem by identifying that this undesirable behavior is triggered by a network bottleneck.
+
+
diff --git a/slow-net.yml b/slow-net.yml
new file mode 100644
index 0000000..75fdfc2
--- /dev/null
+++ b/slow-net.yml
@@ -0,0 +1,25 @@
+links:
+  - &slow
+    bandwidth: 1M
+    latency: 500us
+  - &1000
+    bandwidth: 1000M
+    latency: 100us
+
+servers:
+  - name: node1
+    <<: *slow
+  - name: node2
+    <<: *slow
+  - name: node3
+    <<: *slow
+
+global:
+  subnet:
+    base: 'fc00:9a7a:9e::'
+    local: 64
+    zone: 16
+  latency-offset: 3ms
+  upstream:
+    ip: fc00:9a7a:9e:ffff:ffff:ffff:ffff:ffff
+    conn: *1000
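
Since the whole experiment rests on slow-net.yml, a quick note on its YAML mechanics: `&slow` and `&1000` are anchors naming the two link definitions, and `<<: *slow` is the standard YAML 1.1 merge key, which copies the anchored mapping's entries into the server entry. Expanded by hand, node1's entry is equivalent to this illustrative sketch:

```yaml
servers:
  - name: node1
    # entries copied from the `&slow` link definition by `<<: *slow`
    bandwidth: 1M     # the small cap that makes the network the bottleneck
    latency: 500us    # close-to-zero latency, per the experiment design
```

Only the upstream connection references the fast `*1000` link, so the three Garage nodes are the only bandwidth-constrained elements of the emulated network.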