replace RPC stack with netapp #123
No description provided.
Changed title from "WIP replace RPC stack with netapp" to "WIP: replace RPC stack with netapp"
Force-pushed from 8ae40088d9 to 97c5d5c77f
Force-pushed from 4a555c0c7d to d9c52e9a9c
@ -127,0 +110,4 @@
```
563e1ac825ee3323aa441e72c26d1030d6d4414aeb3dd25287c531e7fc2bc95d@[fc00:1::1]:3901
Venus$ garage node-id
86f0f26ae4afbd59aaf9cfb059eefac844951efd5b8caeec0d53f4ed6c85f332[fc00:1::2]:3901
```
Maybe a missing `@` here?
@ -127,0 +120,4 @@
```toml
bootstrap_peers = [
  "563e1ac825ee3323aa441e72c26d1030d6d4414aeb3dd25287c531e7fc2bc95d@[fc00:1::1]:3901",
  "86f0f26ae4afbd59aaf9cfb059eefac844951efd5b8caeec0d53f4ed6c85f332[fc00:1::2]:3901",
```
Idem here?
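To make the role of the `@` in both snippets above concrete, here is a minimal sketch of how such a peer string could be parsed (illustrative only; garage's actual `parse_and_resolve_peer_addr` may well differ):

```rust
use std::net::SocketAddr;

// Hypothetical parser for "<pubkey>@<ip>:<port>" peer strings.
fn parse_peer_addr(s: &str) -> Option<(String, SocketAddr)> {
    // Without the '@' separator there is no way to tell where the
    // public key ends and the socket address begins.
    let (pubkey, addr) = s.split_once('@')?;
    // Node identifiers are hex-encoded 32-byte public keys (64 chars).
    if pubkey.len() != 64 || !pubkey.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    // SocketAddr parsing accepts bracketed IPv6 such as "[fc00:1::1]:3901".
    addr.parse().ok().map(|a| (pubkey.to_owned(), a))
}
```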
@ -64,1 +65,3 @@
```diff
-(garage server -c /tmp/config.$count.toml 2>&1|while read r; do echo -en "$LABEL $r\n"; done) &
+(garage -c /tmp/config.$count.toml server 2>&1|while read r; do echo -en "$LABEL $r\n"; done) &
+done
+# >>>>>>>>>>>>>>>> END FOR LOOP ON NODES
```
check socat multiple launch. Tracked in #124
Force-pushed from 832602d7f3 to 1dc7cc7936
You need to run `cargo2nix -f` again after updating netapp :)

I have reviewed your merge request and agree to merge it.
Before merging it, I think our main requirement is to make `garage node-id` work when the `meta` directory has not been created yet (please refer to my comment).

It seems to be an already existing bug, but in some cases we display an empty failed-node list, which is quite useless when debugging (please refer to my comment). We must at least track this.
It might be for a future PR, but we should improve error messages, especially when `rpc_secret` does not match.

It might also be for a future PR, but the secret handshake protocol will be a bit new for our users, so we should write some documentation about securing a cluster. We must mention:
I have not:

If you are not confident about one of these two points, I can test them before you merge.
@ -391,3 +388,2 @@
```rust
if failure_case_1 || failure_case_2 {
println!("\nFailed nodes:");
for adv in status.iter().filter(|x| !x.is_up) {
println!("\n==== FAILED NODES ====");
```
If we are on `failure_case_2`, we display the failed nodes section but it is empty. It can be reproduced, e.g., by spawning 2 garage instances and then connecting the first instance to the second one.
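One possible shape for the fix, sketched from the excerpt above (`status` and `is_up` are the names visible in the diff; this is not the actual patch):

```rust
// Collect the failed nodes first, and print the section header only
// when the list is non-empty, so that failure_case_2 no longer
// produces an empty "FAILED NODES" section.
let failed: Vec<_> = status.iter().filter(|x| !x.is_up).collect();
if !failed.is_empty() {
    println!("\n==== FAILED NODES ====");
    for adv in failed {
        // ... print one line per failed node, as before ...
    }
}
```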
@ -80,0 +97,4 @@
```rust
// Find and parse the address of the target host
let (id, addr) = if let Some(h) = opt.rpc_host {
    let (id, addrs) = parse_and_resolve_peer_addr(&h).expect("Invalid RPC host");
```
We should replace this error message with something like this:

The most important thing, in my opinion, is to drop the "rpc host" and "host" terminology in favor of more abstract wording such as "remote peer identifier", which cannot be confused with layer 3 or layer 4 terminology/patterns.
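As an illustration of that terminology change (hypothetical wording; this assumes `parse_and_resolve_peer_addr` returns an `Option`, which the `.expect(...)` above suggests but does not prove):

```rust
// Hypothetical: fail with a message built around "remote peer
// identifier" instead of panicking with "Invalid RPC host".
let (id, addrs) = match parse_and_resolve_peer_addr(&h) {
    Some(x) => x,
    None => {
        eprintln!(
            "`{}` is not a valid remote peer identifier; expected format: <pubkey>@<ip>:<port>",
            h
        );
        std::process::exit(1);
    }
};
```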
@ -98,0 +144,4 @@
```rust
    format!("{}@127.0.0.1:3901", idstr)
};
if !quiet {
```
I think we can improve the wording of this very helpful and critical message :)

We speak about "nodes" indifferently; we could qualify them as "remote" or "local", "existing" or "new", etc., to help the user identify the new/local node they are configuring and the remote/existing nodes they have already configured.

We could better separate concerns (on this node // on a cluster node; your cluster is live // you are configuring it) and prioritize our examples.

Does it mean that we can join any cluster? If yes, this is a security issue.

I will propose an alternative text after gaining some more knowledge of this process.
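Purely as a rough illustration of those suggestions, not a concrete proposal (the variable names and the example address are assumed from the surrounding diff):

```rust
// Hypothetical rewording: qualify nodes as LOCAL vs. REMOTE and
// separate "on this node" from "on an existing cluster node".
eprintln!("Identifier of this LOCAL node: {}@127.0.0.1:3901", idstr);
eprintln!();
eprintln!("To attach this NEW node to an EXISTING cluster, you will need the");
eprintln!("identifier of a REMOTE node that is already part of that cluster,");
eprintln!("in the format <pubkey>@<ip>:<port>.");
```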
@ -98,0 +158,4 @@
```rust
    idstr
);
eprintln!(
    "where <remote_node> is their own node identifier in the format: <pubkey>@<ip>:<port>"
```
Ok, I missed this point the first time I tested: I tried with `127.0.0.1:3911` and got the following error:

I think here we should directly put `<pubkey>@<address>:<port>` instead of `<remote_node>`.
@ -0,0 +128,4 @@
```rust
} else {
    let (_, key) = ed25519::gen_keypair();
    let mut f = std::fs::File::create(key_file.as_path())?;
```
I created `/etc/garage.toml` with the content given in the Quickstart, then ran `garage node-id` (this is similar to the steps advertised in "Cookbook > Deploying Garage"). 2 points:
@ -0,0 +130,4 @@
```rust
} else {
    let (_, key) = ed25519::gen_keypair();
    let mut f = std::fs::File::create(key_file.as_path())?;
```
There is a good chance that the `garage node-id` command will fail, as the key will be stored in the `meta` folder, which will very likely not have been created yet, and it will throw the cryptic error I diagnosed earlier through strace.
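A possible fix, sketched from the excerpt above (assuming `key_file` lives inside the `meta` directory):

```rust
// Create the parent directory first, so that `garage node-id` also
// works on a fresh install where `meta` does not exist yet.
if let Some(parent) = key_file.parent() {
    std::fs::create_dir_all(parent)?;
}
let (_, key) = ed25519::gen_keypair();
let mut f = std::fs::File::create(key_file.as_path())?;
```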
#[error(display = "PKI error: {}", _0)]
Pki(#[error(source)] webpki::Error),
#[error(display = "Netapp error: {}", _0)]
When the wrong `rpc_secret` is used and it leads to a failed handshake, we should add a hint telling the user to check their secret key.
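For instance (illustrative wording only; the surrounding error type and call site are assumptions):

```rust
// Hypothetical helper: append a hint to handshake failures pointing
// at the most common cause, a mismatched rpc_secret.
fn handshake_failure_message(cause: &dyn std::fmt::Display) -> String {
    format!(
        "Handshake failed: {}.\nHint: check that `rpc_secret` is identical \
         in the configuration of every node of the cluster.",
        cause
    )
}
```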
#[error(display = "Remote error: {} (status code {})", _0, _1)]
RemoteError(String, StatusCode),
#[error(display = "Too many errors: {:?}", _0)]
We discussed renaming this error to `FailedQuorumError` or something similar, as this is the only case in which it is fired.
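A minimal sketch of what that rename could look like, following the err-derive style of the variants above (the payload type is a guess):

```rust
#[error(display = "Could not reach quorum: {:?}", _0)]
FailedQuorumError(Vec<String>),
```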
Fixes #101
Fixes #36
Force-pushed from 0a45a2b8ee to dbe457d3fa
Changed title from "WIP: replace RPC stack with netapp" to "replace RPC stack with netapp"
Force-pushed from d838d604eb to f4d246ee24
Force-pushed from f4d246ee24 to 991fe2032f
Force-pushed from 991fe2032f to 2a1dc24710
Force-pushed from 2a1dc24710 to db2924ad80
Force-pushed from db2924ad80 to 28c3c27c26
Force-pushed from 28c3c27c26 to 2eba3d6d62
Force-pushed from 2eba3d6d62 to 6d8b74cf8d
Force-pushed from 6d8b74cf8d to 88a91fe648
Force-pushed from e84432181f to 060ed9dcd0
Force-pushed from 060ed9dcd0 to cda4523872
Force-pushed from eb8c93d50d to 6ab65b6ef4
Force-pushed from 6ab65b6ef4 to d5a8ce7fc7
Force-pushed from d5a8ce7fc7 to df8a4068d9