From fdf908e8452fc57291c03eb85aa63de5c1b8093d Mon Sep 17 00:00:00 2001
From: Alex Auvolat <alex@adnab.me>
Date: Thu, 25 Feb 2021 11:23:22 +0100
Subject: [PATCH] Add precisions

---
 doc/Load_Balancing.md | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/doc/Load_Balancing.md b/doc/Load_Balancing.md
index ead7eb920..808bb4b35 100644
--- a/doc/Load_Balancing.md
+++ b/doc/Load_Balancing.md
@@ -2,30 +2,37 @@ I have conducted a quick study of different methods to load-balance data over di
 
 ### Requirements
 
-- good balancing: two nodes that have the same announced capacity should receive close to the same number of items
-- multi-datacenter: the replicas of a partition should be distributed over as many datacenters as possible
-- minimal disruption: when adding or removing a node, as few partitions as possible should have to move around
+- *good balancing*: two nodes that have the same announced capacity should receive close to the same number of items
+
+- *multi-datacenter*: the replicas of a partition should be distributed over as many datacenters as possible
+
+- *minimal disruption*: when adding or removing a node, as few partitions as possible should have to move around
+
+- *order-agnostic*: the same set of nodes (associated with a datacenter name
+  and a capacity) should always return the same distribution of partition
+  replicas, independently of the order in which nodes were added/removed (this
+  is to keep the implementation simple)
 
 ### Methods
 
 #### Naive multi-DC ring walking strategy
 
-This strategy can be used with any ring-linke algorithm to make it aware of the *multi-datacenter* requirement:
+This strategy can be used with any ring-like algorithm to make it aware of the *multi-datacenter* requirement:
 
 - the ring is a list of positions, each associated with a single node in the cluster
 - look up position of item on ring
 - select the node for that position
 - go clockwise, skipping nodes that:
   - we halve already selected
-  - are in a datacenter of a node we have selected, except if we already have nodes from all available datacenters
+  - are in a datacenter of a node we have selected, except if we already have nodes from all possible datacenters
 
 In this way the selected nodes will always be distributed over
 `min(n_datacenters, n_replicas)` different datacenters, which is the best we
 can do.
 
-This method was implemented in the first iteration of Garage, with the basic
-ring construction that consists in associating `n_token` random positions to
-each node.
+This method was implemented in the first version of Garage, with the basic
+ring construction from Dynamo DB that consists in associating `n_token` random positions to
+each node (I know it's not optimal, the Dynamo paper already studies this).
 
 #### Better rings
 
@@ -43,7 +50,7 @@ To solve this, we want to apply a second method for partitionning our dataset:
 
 I have studied two ways to do the attribution, in a way that is deterministic:
 
-- Custom: take `argmin_node(hash(node, partition_number))`
+- Min-hash: for each partition, select node that minimizes `hash(node, partition_number)`
 - MagLev: see [here](https://blog.acolyer.org/2016/03/21/maglev-a-fast-and-reliable-software-network-load-balancer/)
 
 MagLev provided significantly better balancing, as it guarantees that the exact