better consistent hashing #31
Labels
No Label
AdminAPI
Bug
Check AWS
CI
Correctness
Critical
Documentation
Ideas
Improvement
Low priority
Newcomer
Performance
S3 Compatibility
Testing
Usability
No Milestone
No Assignees
1 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: Deuxfleurs/garage#31
Loading…
Reference in New Issue
No description provided.
Delete Branch "%!s(<nil>)"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Current technique:
New technique:
partition_bits is a constant of the system that defines how many partitions of the data exist, typically partition_bits = 10 and we have 1024 partitions.
Two advantages to the new technique:
One inconvenient:
EDIT: Maglev is even better, the load distribution is perfect and it has the same other properties as the method described above
better consistentto better consistent hashingScript
simulate_ring.py
shows a simulation of four hashing algorihtms:The last one is the best, although it uses 3x more RAM since the list of replicas for a partition is stored in the partition and not obtained by walking the ring of partitions. (i.e. a partition stores 3 node IDs instead of just one, but it is totally worth it because the walking strategy was a major source of imbalance). Also note that the last method provides very good balancing for 256 partitions only, even in a test with 10 nodes, whereas methods 2 and 3 need much more partitions (typically 4096) to counter the imbalance due to the ring walking.
Mostly done. Todo: rename
n_tokens
intocapacity
Done