Add latency amplification part

Quentin 2022-09-27 16:20:00 +02:00
parent af589aacd6
commit d99406fe63
Signed by: quentin
GPG key ID: E9602264D639FF68
2 changed files with 16 additions and 6 deletions

Binary image file not shown. Size before: 142 KiB; size after: 144 KiB.

@ -117,19 +117,29 @@ Such freezes could lead to request timeouts and failures. If it occures on our t
document how to avoid it or change how we handle our I/O.
At the same time, this was a very stressful test that will probably not be encountered in many setups: we were adding 273 objects per second for 30 minutes!
As a conclusion, Garage can ingest 1 million tiny objects in 30 minutes in a very restricted environment. As a comparison, our production cluster at [deuxfleurs.fr](https://deuxfleurs) manages a bucket with 116k objects. This bucket contains real data as it is used by our Matrix instance to store people's media files (profile picture, shared pictures, videos, audios, documents...). Thanks to this benchmark, we have identified two points of vigilance: putting object duration seems linear with the number of existing objects in the cluster, and we have some volatility in our measured data that could be a symptom of our system freezing under the load. Despite these 2 points, we are confident that Garage could scale way above 1M+ objects, but it remains to be proved!
In conclusion, Garage can ingest 1 million tiny objects in 30 minutes in a very restricted environment. For comparison, our production cluster at [deuxfleurs.fr](https://deuxfleurs) manages a bucket with 116k objects. This bucket contains real data: it is used by our Matrix instance to store people's media files (profile pictures, shared pictures, videos, audio files, documents...). Thanks to this benchmark, we have identified two points of vigilance: the duration of a PutObject request seems to grow linearly with the number of objects already in the cluster, and our measurements show some volatility that could be a symptom of our system freezing under load. Despite these two points, we are confident that Garage could scale well beyond 1M objects, but this remains to be proven!
## In an unpredictable world, stay resilient
- low bandwidth
**Latency amplification** - We designed Garage with low-tech geo-distributed setups in mind. For example, our production cluster is hosted [on old Lenovo ThinkCentre Tiny desktop computers](https://guide.deuxfleurs.fr/img/serv_neptune.jpg) behind consumer-grade fiber links across France and Belgium. With this kind of network, the latency we observe between nodes is in the 50 ms range.
![]()
When latency is not negligible, you will observe that the completion time of your requests is a multiple of the latency you observe. That's expected: in many cases, the node of the cluster you are contacting cannot answer your request directly and needs to reach other nodes of the cluster to get your information. Each sequential request it makes adds to the final request duration, which can quickly become expensive.
This ratio between request duration and network latency is what we refer to as *latency amplification*.
- high network latency. phenomenon we name amplification
For example, in Garage, a GetObject request makes two sequential calls: first, it asks for the descriptor of the requested object, which contains the object's block list; then, it retrieves the blocks.
We can therefore expect the duration of a small GetObject request to be close to twice the network latency.
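
To make this reasoning concrete, here is a minimal sketch of the arithmetic. It is not Garage code: the function names are hypothetical, and the model simply assumes that each sequential call costs one network latency and that local processing and transfer time are negligible. The 50 ms value is the order of magnitude mentioned above.

```python
def expected_duration_ms(sequential_calls: int, latency_ms: float) -> float:
    """Simplified model: each sequential inter-node call costs one network latency.

    Assumption: processing and data transfer times are ignored.
    """
    return sequential_calls * latency_ms


def amplification(duration_ms: float, latency_ms: float) -> float:
    """Ratio between the request duration and the network latency."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return duration_ms / latency_ms


# GetObject: one call for the object descriptor, then one call for the blocks.
GET_OBJECT_SEQUENTIAL_CALLS = 2
LATENCY_MS = 50.0  # illustrative value, taken from the 50 ms range above

duration = expected_duration_ms(GET_OBJECT_SEQUENTIAL_CALLS, LATENCY_MS)
print(f"expected duration: {duration} ms")
print(f"amplification: {amplification(duration, LATENCY_MS)}")
```

Under these assumptions, a small GetObject request would take about 100 ms, that is, an amplification of 2. Measured values will differ, since real requests also pay for processing and data transfer.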
![](amplification.png)
The following graph shows experimental measurements for some standard endpoints, including GetObject:
- complexity (constant time)
![Latency amplification](amplification.png)
As Garage has been optimized for this use case from the beginning, we don't see any significant change from one version to another (Garage v0.7.3 and Garage v0.8.0 beta here).
Compared to MinIO, these values are either similar (for ListObjects and ListBuckets) or far better (for GetObject, PutObject, and RemoveObject).
This is understandable: MinIO has not been designed for environments with high latencies. You are expected to build each cluster within a single datacenter, and then possibly connect clusters together with its asynchronous [Bucket Replication](https://min.io/docs/minio/linux/administration/bucket-replication.html?ref=docs-redirect) feature.
*MinIO also has a [Multi-Site Active-Active Replication System](https://blog.min.io/minio-multi-site-active-active-replication/), but it is even more sensitive to latency: "Multi-site replication has increased latency sensitivity, as MinIO does not consider an object as replicated until it has synchronized to all configured remote targets. Replication latency is therefore dictated by the slowest link in the replication mesh."*
**Node count amplification** -
![](complexity.png)