improve how nodes roles are assigned in garage #152
Reference: Deuxfleurs/garage#152
No description provided.
--replace
Currently, this PR breaks the format of the network configuration: when migrating, the cluster will be in a state where no roles are assigned. All roles must be re-assigned and committed at once. This migration should not pose an issue.
Changed title from "WIP: improve how nodes roles are assigned in garage" to "improve how nodes roles are assigned in garage".
I reviewed your code and tested it locally on my computer.
I did not review your partition algorithm in depth, nor did I try a bigger deployment.
I have some minor remarks/questions, the main one being that the "Number of partitions that move" entry is not crystal clear to me, especially during the initialization phase. Thanks for this PR :)
@@ -5,12 +5,12 @@
- [Quick start](./quick_start/index.md)
- [Cookbook](./cookbook/index.md)
- [Production Deployment](./cookbook/real_world.md)
Interesting that you changed the order here.
I put this guide way below because I thought that it was important to master the different concepts of Garage and its deployment before considering a production deployment ;)
The reasoning to me is that this is one of the most important pages in the documentation, because it's not just about "production deployments" but more generally about how to run Garage in a multi-node setup, a core feature of Garage. So it makes sense that it's on top, just to be more visible. Maybe we should rename it to "multi-node deployment" if that makes more sense.
@@ -30,2 +32,4 @@
aws --endpoint-url http://127.0.0.1:3900 s3 ls
```
If a newly added gateway node does not seem to be working, do a full table resync to ensure that the bucket and key lists are correctly propagated:
What are the conditions that require manually triggering a resync?
Is it only in case of a bug?
The tables should be resynced regularly, so if you just let the nodes do their thing, it will eventually work. What can happen is that a newly added node does not receive the content of these tables before the first resync, so for some time the gateway node might be inoperable. (This is probably worth opening an issue for.)
@@ -0,0 +38,4 @@
that either takes into account the proposed changes or cancels them:
```bash
garage layout apply --version <new_version_number>
For a future release, could we consider a system similar to Nomad's?
We could compute an ID for each layout, either random or a hash of the data structure.
Then, when you run apply, you pass the ID of the layout configuration you want to apply.
This ID could be obtained from `garage layout show`. It would prevent the following bug: a version number is not necessarily bound to a specific layout, whereas an ID would be?
True, but we also need a way to make sure that version numbers form an increasing sequence, otherwise nodes have no way to know which version is the latest (the one they should use).
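To make this proposal a bit more concrete, here is a minimal sketch of how a content-derived layout ID could coexist with the increasing version number. Nothing here is from the PR: the `StagedLayout` struct, its fields, and the hashing scheme are all assumptions made for illustration only.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical staged layout: the version number stays an increasing
// sequence (so nodes can agree on which layout is the latest), while the
// content-derived ID identifies one exact set of staged role changes.
#[derive(Hash)]
struct StagedLayout {
    version: u64,
    // (node id, zone, capacity) triples, simplified for this sketch
    roles: Vec<(String, String, u32)>,
}

impl StagedLayout {
    // A real implementation would rather use a stable cryptographic hash of
    // a canonical serialization; DefaultHasher only keeps the sketch
    // dependency-free.
    fn id(&self) -> String {
        let mut h = DefaultHasher::new();
        self.hash(&mut h);
        format!("{:016x}", h.finish())
    }
}

fn main() {
    let staged = StagedLayout {
        version: 12,
        roles: vec![("4ec6a7b9".into(), "dc1".into(), 100)],
    };
    // `garage layout show` would print this ID, and `apply` would refuse to
    // proceed unless the operator passes back the ID they just reviewed.
    println!("staged layout v{} id {}", staged.version, staged.id());
}
```

Keeping both fields would address the two concerns at once: the ID protects against applying a layout other than the one that was reviewed, while the version number still gives nodes a total order.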
@@ -0,0 +164,4 @@
true
}
/// Calculate an assignation of partitions to nodes
So this is our new stateful partition algorithm, the one used when a modification is made to the cluster?
Yes :)
@@ -0,0 +224,4 @@
// Shuffle partitions between nodes so that nodes will reach (or better approach)
// their target number of stored partitions
loop {
let mut usefull = false;
Is there any reason you wrote "useful" with two l's (it seems to be an old way of writing it)?
I don't speak English very well, sorry.
@@ -0,0 +303,4 @@
}
println!("Number of partitions that move:");
for ((nminus, nplus), npart) in diffcount {
println!("\t-{}\t+{}\t{}", nminus, nplus, npart);
When I initialize a cluster, I have:
I do not understand what these values mean :s
I think this line misses at least the id of the affected node as the first value, something like:
And I think that the +1 is a strange edge case of the initial algorithm; I think it should be either:
or:
OK, I will make this part more readable and have it contain more information.
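As a purely illustrative sketch of the suggestion above (none of this is from the PR, and the example values are made up), the summary could print one line per affected node, with the node identifier first, followed by the number of partition copies it loses and gains under the new assignation:

```rust
// Hypothetical, more explicit transfer summary: one line per node,
// node identifier first, then partition copies lost and gained.
fn print_transfer_summary(diff: &[(&str, usize, usize)]) {
    println!("Partitions that move:");
    println!("\tnode\t-lost\t+gained");
    for (node_id, lost, gained) in diff {
        println!("\t{}\t-{}\t+{}", node_id, lost, gained);
    }
}

fn main() {
    // Example values only, not taken from a real cluster.
    print_transfer_summary(&[("4ec6a7b9", 0, 86), ("a1b2c3d4", 12, 4)]);
}
```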
@@ -0,0 +317,4 @@
true
}
fn initial_partition_assignation(&self) -> Option<Vec<PartitionAss<'_>>> {
And this one is our initial partition algorithm, used when the cluster is initialized for the first time?
Yes, it is used:
Basically, we use this function to ensure that we have an initial assignation of three (or n) nodes to each partition. Once we have it, we just run an iterative optimization algorithm (the other function, above) that tries to better balance the number of partitions between nodes, using elementary operations that consist only of replacing one node by another somewhere in the assignation.
@@ -0,0 +391,4 @@
Some(partitions)
}
We might want to put the pseudocode of these two partition computation algorithms in the design page.
Ideally, it would be someone other than you, LX, who writes it: that would allow this part to be proof-read more in depth and would spread the knowledge a bit more :)
True
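In the meantime, here is a rough sketch of the two-phase idea described above, as I understand it: an initial assignation that gives every partition its full set of distinct nodes, followed by an iterative optimization that repeatedly replaces one node by another whenever that brings the node counts closer to their targets. This is my own reconstruction for illustration, not the actual Garage code, and it ignores zones and per-node capacities, which the real algorithm takes into account.

```rust
/// Sketch only: zones and per-node capacities are ignored for brevity.
fn assign_partitions(n_nodes: usize, n_partitions: usize, replication: usize) -> Vec<Vec<usize>> {
    assert!(replication <= n_nodes);

    // Phase 1 (initial assignation): round-robin, so that every partition
    // gets `replication` distinct nodes.
    let mut assignation: Vec<Vec<usize>> = (0..n_partitions)
        .map(|p| (0..replication).map(|r| (p * replication + r) % n_nodes).collect())
        .collect();

    // Each node should end up holding roughly this many partition copies.
    let target = n_partitions * replication / n_nodes;
    let mut count = vec![0usize; n_nodes];
    for part in &assignation {
        for &n in part {
            count[n] += 1;
        }
    }

    // Phase 2 (iterative optimization): the elementary operation replaces one
    // node by another in a single partition, and is applied only when it is
    // useful, i.e. it moves an overloaded and an underloaded node towards the
    // target while keeping the nodes of that partition distinct.
    loop {
        let mut useful = false;
        for part in assignation.iter_mut() {
            for slot in 0..part.len() {
                let from = part[slot];
                if count[from] <= target {
                    continue;
                }
                let replacement =
                    (0..n_nodes).find(|&to| count[to] < target && !part.contains(&to));
                if let Some(to) = replacement {
                    part[slot] = to;
                    count[from] -= 1;
                    count[to] += 1;
                    useful = true;
                }
            }
        }
        if !useful {
            // No elementary operation improves the balance any further.
            break;
        }
    }
    assignation
}

fn main() {
    // Example: 4 nodes, 256 partitions, 3 copies of each partition.
    let assignation = assign_partitions(4, 256, 3);
    println!("partition 0 is stored on nodes {:?}", assignation[0]);
}
```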
@@ -6,3 +6,4 @@
license = "AGPL-3.0"
description = "Utility crate for the Garage object store"
repository = "https://git.deuxfleurs.fr/Deuxfleurs/garage"
readme = "../../README.md"
:P
971c5ca66b to 4752046990
4752046990 to cd378622b4
cd378622b4 to 3685bd91e9
3685bd91e9 to a3871f2251
a3871f2251 to c94406f428