feature: Register consul services with agent API #567

Merged
lx merged 10 commits from unrob/garage:roberto/consul-agent-registration into main 2023-06-02 14:35:00 +00:00
Contributor

Implements service registration against the Consul agent API as a new feature, complementary to the current catalog API based one.

This allows the RPC service to be registered as a consul service (with additional tags/service metadata, if any are configured) through the higher level API through plain old TLS (not mTLS) and an optional ACL token. I've been running this at home for a few days now and really happy with it.

The necessary consul ACL policy looks like:

service_prefix "" {
  policy = "read"
}

service "garage" {
  policy = "write"
}

# needed to list catalog services for some reason
# https://developer.hashicorp.com/consul/docs/security/acl/acl-rules#node-rules
node_prefix "" {
  policy = "read"
}

This is my first time using rust and/or nix so I'm sure there's stuff I've missed, happy to continue working on it and would appreciate any feedback!

Implements service registration against the Consul [agent API](https://developer.hashicorp.com/consul/api-docs/agent/service#register-service) as a new feature, complementary to the current [catalog API](https://developer.hashicorp.com/consul/api-docs/catalog#register-entity) based one. This allows the RPC service to be registered as a consul service (with additional tags/service metadata, if any are configured) through the higher level API through plain old TLS (not mTLS) and an optional ACL token. I've been running this [at home](https://github.com/unRob/nidito/blob/657fa7f6b5be23f5316abef369868750824caa83/services/garage/garage.nomad) for a few days now and really happy with it. The necessary consul ACL policy looks like: ```hcl service_prefix "" { policy = "read" } service "garage" { policy = "write" } # needed to list catalog services for some reason # https://developer.hashicorp.com/consul/docs/security/acl/acl-rules#node-rules node_prefix "" { policy = "read" } ``` This is my first time using rust and/or nix so I'm sure there's stuff I've missed, happy to continue working on it and would appreciate any feedback!
lx requested changes 2023-05-09 09:10:46 +00:00
lx left a comment
Owner

This is interesting but it looks to me like a lot of duplicate code:

  • there is no need for a new feature flag, consul service discovery should just be a different mode for consul discovery than the default one

  • there is no need to duplicate the configuration options, just add a mode = "node|service" option to the existing [consul_discovery] section (and add to that section any new parameters that are required by the service discovery mode)

  • a lot of the definitions in consul_service.rs are the same as in consul.rs, please merge both files (your changes should just consist in adding the smallest possible number of struct declarations and new methods in the consul.rs file)

  • we now have two versions of the advertise_to_consul function, both of which are incompatible because they have the same name (meaning garage can't build with both feature flags activated). Please merge these two functions into one, and add the logic there to distinguish between the case of regular discovery as we had before and service discovery you are adding

Your final patch should be about 25% the size of this one.

This is interesting but it looks to me like a lot of duplicate code: - there is no need for a new feature flag, consul service discovery should just be a different mode for consul discovery than the default one - there is no need to duplicate the configuration options, just add a `mode = "node|service"` option to the existing `[consul_discovery]` section (and add to that section any new parameters that are required by the service discovery mode) - a lot of the definitions in `consul_service.rs` are the same as in `consul.rs`, please merge both files (your changes should just consist in adding the smallest possible number of struct declarations and new methods in the `consul.rs` file) - we now have two versions of the `advertise_to_consul` function, both of which are incompatible because they have the same name (meaning garage can't build with both feature flags activated). Please merge these two functions into one, and add the logic there to distinguish between the case of regular discovery as we had before and service discovery you are adding Your final patch should be about 25% the size of this one.
Author
Contributor

Thanks for the thorough feedback Alex!

Not sure I can make it 1/4th of the original size—I did briefly explore separating the catalog/agent logic into traits but that seemed to make the patch larger. Is there a particular approach to expressing these differences more succinctly you had in mind? As I mentioned, this is my first time writing rust so any examples or inline comments would be great.

Went with the separate feature in order to ensure I wasn't affecting the original implementation; an approach I could think of, but deemed too aggressive, was to delete the /v1/catalog-based implementation, only keeping the option for mTLS (I personally disable verify_incoming on the https listener for convenience, but perhaps other folks don't). That looks like this, +118/-80, but still not 1/4th of the PR's size.

I named the switch consul_http_api as I thought that maps best to what it toggles, and named the "modes" catalog and agent accordingly.

Let me know what you think!

Thanks for the thorough feedback Alex! Not sure I can make it 1/4th of the original size—I did briefly explore separating the catalog/agent logic into traits but that seemed to make the patch larger. Is there a particular approach to expressing these differences more succinctly you had in mind? As I mentioned, this is my first time writing rust so any examples or inline comments would be great. Went with the separate feature in order to ensure I wasn't affecting the original implementation; an approach I could think of, but deemed too aggressive, was to delete the `/v1/catalog`-based implementation, only keeping the option for mTLS (I personally disable `verify_incoming` on the `https` listener for convenience, but perhaps other folks don't). That looks like [this](https://git.deuxfleurs.fr/unrob/garage/pulls/1/files), `+118/-80`, but still not 1/4th of the PR's size. I named the switch `consul_http_api` as I thought that maps best to what it toggles, and named the "modes" `catalog` and `agent` accordingly. Let me know what you think!
lx reviewed 2023-05-11 10:06:11 +00:00
lx left a comment
Owner

This new version looks very good! The 25% size target was an exageration but you get the idea. Indeed, using traits is not always the best solution, and in this case I think it would have been huge overkill.

I've made a few remarks for simplifying the code. I'm suggesting to also add META_PREFIX to the old code for consistency/simplicity and because it's a good practice to have, even if it will break things when Garage daemons of different versions are runing together (we're still beta so that's fine).

This new version looks very good! The 25% size target was an exageration but you get the idea. Indeed, using traits is not always the best solution, and in this case I think it would have been huge overkill. I've made a few remarks for simplifying the code. I'm suggesting to also add `META_PREFIX` to the old code for consistency/simplicity and because it's a good practice to have, even if it will break things when Garage daemons of different versions are runing together (we're still beta so that's fine).
@ -35,12 +35,18 @@ bootstrap_peers = [
[consul_discovery]
mode = "node"
Owner

Is this the old name of consul_http_api ?

I'd suggest calling this parameter api, as it is already under the [consul_discovery] section so there is no need to specify that it is related to Consul.

For all parameters: take care of making sure the names in the docs are consistent with the actual code

Is this the old name of `consul_http_api` ? I'd suggest calling this parameter `api`, as it is already under the `[consul_discovery]` section so there is no need to specify that it is related to Consul. For all parameters: take care of making sure the names in the docs are consistent with the actual code
unrob marked this conversation as resolved
@ -41,2 +42,4 @@
client_cert = "/etc/consul/consul-client.crt"
client_key = "/etc/consul/consul-key.crt"
# for `service` mode, unset client_cert and client_key, and optionally enable `consul_http_token`
# consul_http_token = "abcdef-01234-56789"
Owner

Similarly, I'd suggest calling this simply token

Similarly, I'd suggest calling this simply `token`
unrob marked this conversation as resolved
@ -41,2 +46,4 @@
#[serde(rename = "Tags")]
tags: Vec<String>,
#[serde(rename = "Meta")]
service_meta: HashMap<String, String>,
Owner

It's called service_meta here but meta in the other definition below. Let's just call it meta everywhere

It's called `service_meta` here but `meta` in the other definition below. Let's just call it `meta` everywhere
unrob marked this conversation as resolved
@ -87,3 +80,1 @@
.use_rustls_tls()
.identity(identity)
.build()?
let mut builder: reqwest::ClientBuilder = reqwest::Client::builder();
Owner

add .use_rustls_tls() here and remove all other instances below (we are using rustls in all cases)

add `.use_rustls_tls()` here and remove all other instances below (we are using rustls in all cases)
unrob marked this conversation as resolved
@ -90,1 +109,4 @@
_ => return Err(ConsulError::InvalidTLSConfig),
}
builder.build()?
Owner

refactoring: move builder.build()? after the match

This means that there is no let client = match, just a match block that mutates builder

and after the match block there is let client = builder.build()?;

refactoring: move `builder.build()?` after the match This means that there is no `let client = match`, just a match block that mutates `builder` and after the match block there is `let client = builder.build()?;`
unrob marked this conversation as resolved
@ -118,0 +145,4 @@
let pubkey = match &self.config.consul_http_api {
ConsulDiscoveryAPI::Catalog => ent.node_meta.get("pubkey"),
ConsulDiscoveryAPI::Agent => {
ent.service_meta.get(&format!("{}-pubkey", META_PREFIX))
Owner

I think we should also add the META_PREFIX to the old code that uses the Catalog API, it's a good practice to have and it will simplify the code. It will make Consul discovery incompatible between versions but that's fine, we are still beta.

I think we should also add the `META_PREFIX` to the old code that uses the Catalog API, it's a good practice to have and it will simplify the code. It will make Consul discovery incompatible between versions but that's fine, we are still beta.
unrob marked this conversation as resolved
@ -141,0 +182,4 @@
let meta_prefix: String = match &self.config.consul_http_api {
ConsulDiscoveryAPI::Catalog => "".to_string(),
ConsulDiscoveryAPI::Agent => format!("{}-", META_PREFIX),
};
Owner

this can also be simplified by using META_PREFIX in all cases

this can also be simplified by using META_PREFIX in all cases
unrob marked this conversation as resolved
@ -141,0 +193,4 @@
for (key, value) in global_meta.into_iter() {
meta.insert(key.clone(), value.clone());
}
}
Owner

I'd replace 187-195 by the following:

let mut meta = self.config.meta.clone().unwrap_or_default();
meta.insert(format!("{}-pubkey", META_PREFIX), hex::encode(node_id));
meta.insert(format!("{}-hostname", META_PREFIX), hostname.to_string());
I'd replace 187-195 by the following: ```rust let mut meta = self.config.meta.clone().unwrap_or_default(); meta.insert(format!("{}-pubkey", META_PREFIX), hex::encode(node_id)); meta.insert(format!("{}-hostname", META_PREFIX), hostname.to_string()); ```
unrob marked this conversation as resolved
@ -162,2 +229,2 @@
let http = self.client.put(&url).json(&advertisement).send().await?;
}),
})
Owner

I don't like the big (...) around the match. Can you let the result of the match to a separate variable req_json, and then do let http = req_json.send().await?;

I don't like the big `(`...`)` around the match. Can you `let` the result of the match to a separate variable `req_json`, and then do `let http = req_json.send().await?;`
Author
Contributor

Went with an untagged enum here, since I couldn't think of a type for the advertisement structs; thought I could use serde::Serialize but that didn't work out and I'm still reading up to understand why.

Went with an untagged enum here, since I couldn't think of a type for the advertisement structs; thought I could use `serde::Serialize` but that didn't work out and I'm still reading up to understand why.
unrob marked this conversation as resolved
@ -137,1 +137,4 @@
#[derive(Deserialize, Debug, Clone)]
pub enum ConsulDiscoveryAPI {
#[serde(rename_all = "lowercase")]
Owner

I think the rename_all line has to go above the pub enum ConsulDiscoveryAPI line?

I think the `rename_all` line has to go above the `pub enum ConsulDiscoveryAPI` line?
unrob marked this conversation as resolved
@ -138,0 +141,4 @@
Catalog,
Agent,
}
impl ConsulDiscoveryAPI {
Owner

This whole impl can be avoided by deriving Default on the enum. The default variant has to be marked with a #[default] tag.

This whole impl can be avoided by deriving `Default` on the enum. The default variant has to be marked with a `#[default]` tag.
unrob marked this conversation as resolved
Author
Contributor

Really appreciate your comments, gave it another go and will be deploying a build of this at home to double check. Will report back later this week.

Really appreciate your comments, gave it another go and will be deploying a build of this at home to double check. Will report back later this week.
Author
Contributor

Been running this build for a few days and things are looking good, missed a default for the api config key, but I think it's ready for review.

Been running this build for a few days and things are looking good, missed a default for the `api` config key, but I think it's ready for review.
lx approved these changes 2023-05-22 09:12:24 +00:00
lx left a comment
Owner

LGTM, just some small inconsistencies in the documentation

LGTM, just some small inconsistencies in the documentation
@ -40,3 +41,4 @@
ca_cert = "/etc/consul/consul-ca.crt"
client_cert = "/etc/consul/consul-client.crt"
client_key = "/etc/consul/consul-key.crt"
# for `catalog` API mode, unset client_cert and client_key, and optionally enable `token`
Owner

I think you meant for agent API mode instead?

I think you meant for `agent` API mode instead?
unrob marked this conversation as resolved
@ -319,0 +326,4 @@
Two APIs for service registration are supported: `catalog` and `agent`. `catalog`, the default, will register a service using
the `/v1/catalog` endpoints, enabling mTLS if `client_cert` and `client_key` are provided. The `agent` API uses the
`v1/agent` endpoints instead, where an optional `consul_http_token` may be provided.
Owner

token instead of consul_http_token

`token` instead of `consul_http_token`
unrob marked this conversation as resolved
unrob force-pushed roberto/consul-agent-registration from e8af3a7144 to 32ad4538ee 2023-05-22 14:57:25 +00:00 Compare
Owner

Thanks for the work, I'll merge now, sorry for the wait!

Thanks for the work, I'll merge now, sorry for the wait!
lx merged commit 44548a9114 into main 2023-06-02 14:35:00 +00:00
unrob deleted branch roberto/consul-agent-registration 2023-06-07 05:20:27 +00:00
Sign in to join this conversation.
No description provided.