From 6942355d439d2c4e3a1628a6b104ac9b98c6e6e5 Mon Sep 17 00:00:00 2001 From: Quentin Dufour Date: Sun, 16 Oct 2022 11:04:36 +0200 Subject: [PATCH 1/4] update readme.md --- README.md | 29 --------------------------- cluster/prod/app/core/deploy/core.hcl | 2 +- 2 files changed, 1 insertion(+), 30 deletions(-) diff --git a/README.md b/README.md index ef3f082..11b0346 100644 --- a/README.md +++ b/README.md @@ -58,35 +58,6 @@ To upgrade NixOS, use the `./upgrade_nixos` script instead (it has the same synt **When adding a node to the cluster:** just do `./deploy_nixos ` -### Deploying Wesher - -We use Wesher to provide an encrypted overlay network between nodes in the cluster. -This is usefull in particular for securing services that are not able to do mTLS, -but as a security-in-depth measure, we make all traffic go through Wesher even when -TLS is done correctly. It is thus mandatory to have a working Wesher installation -in the cluster for it to run correctly. - -First, if no Wesher shared secret key has been generated for this cluster yet, -generate it with: - -``` -./gen_wesher_key -``` - -This key will be stored in `pass`, so you must have a working `pass` installation -for this script to run correctly. - -Then, deploy the key on all nodes with: - -``` -./deploy_wesher_key -``` - -This should be done after `./deploy_nixos` has run successfully on all nodes. -You should now have a working Wesher network between all your nodes! - -**When adding a node to the cluster:** just do `./deploy_wesher_key ` - ### Generating and deploying a PKI for Consul and Nomad This is very similar to how we do for Wesher. 
diff --git a/cluster/prod/app/core/deploy/core.hcl b/cluster/prod/app/core/deploy/core.hcl index 7449740..5c9f9c0 100644 --- a/cluster/prod/app/core/deploy/core.hcl +++ b/cluster/prod/app/core/deploy/core.hcl @@ -90,7 +90,7 @@ EOH } resources { - cpu = 2000 + cpu = 500 memory = 200 } From 9a8cbf91215317a571ed3714f76230a751a91896 Mon Sep 17 00:00:00 2001 From: Quentin Dufour Date: Sun, 16 Oct 2022 11:14:50 +0200 Subject: [PATCH 2/4] WIP doc --- README.md | 134 +++++++------------------------------ README.more.md | 129 +++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 148 insertions(+), 115 deletions(-) create mode 100644 README.more.md diff --git a/README.md b/README.md index 11b0346..f796b3e 100644 --- a/README.md +++ b/README.md @@ -8,130 +8,34 @@ It sets up the following: - Consul, with TLS - Nomad, with TLS -## Configuring the OS -This repo contains a bunch of scripts to configure NixOS on all cluster nodes. -Most scripts are invoked with the following syntax: +## How to welcome a new administrator -- for scripts that generate secrets: `./gen_ ` to generate the secrets to be used on cluster `` -- for deployment scripts: - - `./deploy_ ` to run the deployment script on all nodes of the cluster `` - - `./deploy_ ...` to run the deployment script only on nodes `node1, node2, ...` of cluster ``. +See: https://guide.deuxfleurs.fr/operations/acces/pass/ -All deployment scripts can use the following parameters passed as environment variables: +Basically: + - The new administrator generates a GPG key and publishes it on Gitea + - All existing administrators pull the new key and sign it + - An existing administrator re-encrypts the keystore with this new key and pushes it + - The new administrator clones the repo and checks that they can decrypt the secrets -- `SUDO_PASS`: optionnally, the password for `sudo` on cluster nodes. If not set, it will be asked at the begninning. -- `SSH_USER`: optionnally, the user to try to login using SSH.
If not set, the username from your local machine will be used. +## How to create files for a new zone -### Assumptions (how to setup your environment) +*The documentation is written for the production cluster, the same applies to other clusters.* -- you have an SSH access to all of your cluster nodes (listed in `cluster//ssh_config`) +Basically: + - Create your `site` file in `cluster/prod/site/` folder + - Create your `node` files in `cluster/prod/node/` folder + - Add your wireguard configuration to `cluster/prod/cluster.nix` -- your account is in group `wheel` and you know its password (you need it to become root using `sudo`); - the password is the same on all cluster nodes (see below for password management tools) +## How to deploy a Nix configuration on a fresh node -- you have a clone of the secrets repository in your `pass` password store, for instance at `~/.password-store/deuxfleurs` - (scripts in this repo will read and write all secrets in `pass` under `deuxfleurs/cluster//`) +*To be written* -### Deploying the NixOS configuration +## How to operate a node -The NixOS configuration makes use of a certain number of files: +*To be written* -- files in `nix/` that are the same for all deployments on all clusters -- the file `cluster//cluster.nix`, a Nix configuration file that is specific to the cluster but is copied the same on all cluster nodes -- files in `cluster//site/`, which are specific to the various sites on which Nix nodes are deployed -- files in `cluster//node/` which are specific to each node - -To deploy the NixOS configuration on the cluster, simply do: - -``` -./deploy_nixos -``` - -or to deploy only on a single node: - -``` -./deploy_nixos -``` - -To upgrade NixOS, use the `./upgrade_nixos` script instead (it has the same syntax). - -**When adding a node to the cluster:** just do `./deploy_nixos ` - -### Generating and deploying a PKI for Consul and Nomad - -This is very similar to how we do for Wesher.
- -First, if the PKI has not yet been created, create it with: - -``` -./gen_pki -``` - -Then, deploy the PKI on all nodes with: - -``` -./deploy_pki -``` - -**When adding a node to the cluster:** just do `./deploy_pki ` - -### Adding administrators and password management - -Adminstrators are defined in the `cluster.nix` file for each cluster (they could also be defined in the site-specific Nix files if necessary). -This is where their public SSH keys for remote access are put. - -Administrators will also need passwords to administrate the cluster, as we are not using passwordless sudo. -To set the password for a new administrator, they must have a working `pass` installation as specified above. -They must then run: - -``` -./passwd -``` - -to set their password in the `pass` database (the password is hashed, so other administrators cannot learn their password even if they have access to the `pass` db). - -Then, an administrator that already has root access must run the following (after syncing the `pass` db) to set the password correctly on all cluster nodes: - -``` -./deploy_passwords -``` - -## Deploying stuff on Nomad - -### Connecting to Nomad - -Connect using SSH to one of the cluster nodes, forwarding port 14646 to port 4646 on localhost, and port 8501 to port 8501 on localhost. - -You can for instance use an entry in your `~/.ssh/config` that looks like this: - -``` -Host caribou - HostName 2a01:e0a:c:a720::23 - LocalForward 14646 127.0.0.1:4646 - LocalForward 8501 127.0.0.1:8501 - LocalForward 1389 bottin.service.staging.consul:389 -``` - -Then, in a separate window, launch `./tlsproxy `: this will -launch `socat` proxies that strip the TLS layer and allow you to simply access -Nomad and Consul on the regular, unencrypted URLs: `http://localhost:4646` for -Nomad and `http://localhost:8500` for Consul. Keep this terminal window for as -long as you need to access Nomad and Consul on the cluster. 
- -### Launching services - -Stuff should be started in this order: - -1. `app/core` -2. `app/frontend` -3. `app/telemetry` -4. `app/garage-staging` -5. `app/directory` - -Then, other stuff can be started in any order: - -- `app/im` (cluster `staging` only) -- `app/cryptpad` (cluster `prod` only) -- `app/drone-ci` +## More +Please read `README.more.md` for more detailed information. diff --git a/README.more.md b/README.more.md new file mode 100644 index 0000000..8a9579f --- /dev/null +++ b/README.more.md @@ -0,0 +1,129 @@ +# Additional README + +## Configuring the OS + +This repo contains a bunch of scripts to configure NixOS on all cluster nodes. +Most scripts are invoked with the following syntax: + +- for scripts that generate secrets: `./gen_ ` to generate the secrets to be used on cluster `` +- for deployment scripts: + - `./deploy_ ` to run the deployment script on all nodes of the cluster `` + - `./deploy_ ...` to run the deployment script only on nodes `node1, node2, ...` of cluster ``. + +All deployment scripts can use the following parameters passed as environment variables: + +- `SUDO_PASS`: optionally, the password for `sudo` on cluster nodes. If not set, it will be asked at the beginning. +- `SSH_USER`: optionally, the user to log in as using SSH. If not set, the username from your local machine will be used.
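
The calling convention above is shared by all the `gen_*` and `deploy_*` scripts. As a minimal sketch of the argument handling, here is a hypothetical `deploy_example` function that only illustrates the convention (it is not a script from this repo; the real scripts are driven by `sshtool`):

```shell
#!/bin/sh
# Hypothetical stand-in showing the deploy_* calling convention only:
# the first argument selects the cluster, any further arguments restrict
# the run to the listed nodes. Not a real script from this repo.
deploy_example() {
    cluster="$1"; shift
    if [ "$#" -eq 0 ]; then
        echo "deploying on all nodes of cluster $cluster"
    else
        echo "deploying on nodes $* of cluster $cluster"
    fi
}

deploy_example prod                # all nodes of cluster 'prod'
deploy_example prod node1 node2    # only node1 and node2
```

The environment variables apply the same way to every script, e.g. `SSH_USER=rick ./deploy_pki prod node1`.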
+ +### Assumptions (how to set up your environment) + +- you have SSH access to all of your cluster nodes (listed in `cluster//ssh_config`) + +- your account is in group `wheel` and you know its password (you need it to become root using `sudo`); - the password is the same on all cluster nodes (see below for password management tools) + +- you have a clone of the secrets repository in your `pass` password store, for instance at `~/.password-store/deuxfleurs` + (scripts in this repo will read and write all secrets in `pass` under `deuxfleurs/cluster//`) + +### Deploying the NixOS configuration + +The NixOS configuration makes use of a certain number of files: + +- files in `nix/` that are the same for all deployments on all clusters +- the file `cluster//cluster.nix`, a Nix configuration file that is specific to the cluster but is copied identically to all cluster nodes +- files in `cluster//site/`, which are specific to the various sites on which Nix nodes are deployed +- files in `cluster//node/` which are specific to each node + +To deploy the NixOS configuration on the cluster, simply do: + +``` +./deploy_nixos +``` + +or to deploy only on a single node: + +``` +./deploy_nixos +``` + +To upgrade NixOS, use the `./upgrade_nixos` script instead (it has the same syntax). + +**When adding a node to the cluster:** just do `./deploy_nixos ` + +### Generating and deploying a PKI for Consul and Nomad + +This is very similar to how we do it for Wesher. + +First, if the PKI has not yet been created, create it with: + +``` +./gen_pki +``` + +Then, deploy the PKI on all nodes with: + +``` +./deploy_pki +``` + +**When adding a node to the cluster:** just do `./deploy_pki ` + +### Adding administrators and password management + +Administrators are defined in the `cluster.nix` file for each cluster (they could also be defined in the site-specific Nix files if necessary). +This is where their public SSH keys for remote access are put.
+ +Administrators will also need passwords to administrate the cluster, as we are not using passwordless sudo. +To set the password for a new administrator, they must have a working `pass` installation as specified above. +They must then run: + +``` +./passwd +``` + +to set their password in the `pass` database (the password is hashed, so other administrators cannot learn their password even if they have access to the `pass` db). + +Then, an administrator who already has root access must run the following (after syncing the `pass` db) to set the password correctly on all cluster nodes: + +``` +./deploy_passwords +``` + +## Deploying stuff on Nomad + +### Connecting to Nomad + +Connect to one of the cluster nodes using SSH, forwarding local port 14646 to the node's port 4646 and local port 8501 to the node's port 8501. + +You can for instance use an entry in your `~/.ssh/config` that looks like this: + +``` +Host caribou + HostName 2a01:e0a:c:a720::23 + LocalForward 14646 127.0.0.1:4646 + LocalForward 8501 127.0.0.1:8501 + LocalForward 1389 bottin.service.staging.consul:389 +``` + +Then, in a separate window, launch `./tlsproxy `: this will +launch `socat` proxies that strip the TLS layer and allow you to simply access +Nomad and Consul on the regular, unencrypted URLs: `http://localhost:4646` for +Nomad and `http://localhost:8500` for Consul. Keep this terminal window for as +long as you need to access Nomad and Consul on the cluster. + +### Launching services + +Stuff should be started in this order: + +1. `app/core` +2. `app/frontend` +3. `app/telemetry` +4. `app/garage-staging` +5.
`app/directory` + +Then, other stuff can be started in any order: + +- `app/im` (cluster `staging` only) +- `app/cryptpad` (cluster `prod` only) +- `app/drone-ci` + From d442b9a068b39e0180027398faa7011bbbe3c3c9 Mon Sep 17 00:00:00 2001 From: Quentin Dufour Date: Sun, 16 Oct 2022 11:58:11 +0200 Subject: [PATCH 3/4] Update README --- README.md | 13 +++++++++++-- deploy_nixos | 4 ---- deploy_wg | 6 ++++++ 3 files changed, 17 insertions(+), 6 deletions(-) create mode 100755 deploy_wg diff --git a/README.md b/README.md index f796b3e..3c2a505 100644 --- a/README.md +++ b/README.md @@ -18,6 +18,7 @@ Basically: - All existing administrators pull the new key and sign it - An existing administrator re-encrypts the keystore with this new key and pushes it - The new administrator clones the repo and checks that they can decrypt the secrets + - Finally, the new administrator must choose a password to operate over SSH with `./passwd prod rick`, where `rick` is the target username ## How to create files for a new zone @@ -26,11 +27,19 @@ Basically: - Create your `site` file in `cluster/prod/site/` folder - Create your `node` files in `cluster/prod/node/` folder - - Add your wireguard configuration to `cluster/prod/cluster.nix` + - Add your wireguard configuration to `cluster/prod/cluster.nix` (you will have to edit your NAT config manually) ## How to deploy a Nix configuration on a fresh node -*To be written* +We suppose that the node name is `datura`. +Start by deploying one node at a time; you will have plenty of time +in your operator's life to break everything through automation.
+ +Run: + - `./deploy_wg prod datura` - to generate the WireGuard keys + - `./deploy_nixos prod datura` - to deploy the Nix configuration files (they need to be redeployed on all nodes, as the new WireGuard conf is needed everywhere) + - `./deploy_passwords prod datura` - to deploy users' passwords + - `./deploy_pki prod datura` - to deploy Nomad's and Consul's PKI ## How to operate a node diff --git a/deploy_nixos b/deploy_nixos index f62843d..0bd1b4c 100755 --- a/deploy_nixos +++ b/deploy_nixos @@ -7,8 +7,4 @@ copy cluster/$CLUSTER/cluster.nix /etc/nixos/cluster.nix copy cluster/$CLUSTER/node/$NIXHOST.nix /etc/nixos/node.nix copy cluster/$CLUSTER/node/$NIXHOST.site.nix /etc/nixos/site.nix -cmd 'mkdir -p /var/lib/deuxfleurs/wireguard-keys' -cmd 'test -f /var/lib/deuxfleurs/wireguard-keys/private || (wg genkey > /var/lib/deuxfleurs/wireguard-keys/private; chmod 600 /var/lib/deuxfleurs/wireguard-keys/private)' -cmd 'echo "Public key: $(wg pubkey < /var/lib/deuxfleurs/wireguard-keys/private)"' - cmd nixos-rebuild switch --show-trace diff --git a/deploy_wg b/deploy_wg new file mode 100755 index 0000000..ba67b2e --- /dev/null +++ b/deploy_wg @@ -0,0 +1,6 @@ +#!/usr/bin/env ./sshtool + +cmd 'nix-env -i wireguard' +cmd 'mkdir -p /var/lib/deuxfleurs/wireguard-keys' +cmd 'test -f /var/lib/deuxfleurs/wireguard-keys/private || (wg genkey > /var/lib/deuxfleurs/wireguard-keys/private; chmod 600 /var/lib/deuxfleurs/wireguard-keys/private)' +cmd 'echo "Public key: $(wg pubkey < /var/lib/deuxfleurs/wireguard-keys/private)"' From 45a0e850ce7c498e5ef1d281fb67b2f34dc00e8c Mon Sep 17 00:00:00 2001 From: Quentin Dufour Date: Sun, 16 Oct 2022 12:02:55 +0200 Subject: [PATCH 4/4] Improve deployment doc --- README.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 3c2a505..4e0cd6f 100644 --- a/README.md +++ b/README.md @@ -20,6 +20,7 @@ Basically: - The new administrator clones the repo and checks that they can decrypt the secrets - Finally, the new
administrator must choose a password to operate over SSH with `./passwd prod rick`, where `rick` is the target username + +## How to create files for a new zone *The documentation is written for the production cluster, the same applies to other clusters.* @@ -27,7 +28,9 @@ Basically: - Create your `site` file in `cluster/prod/site/` folder - Create your `node` files in `cluster/prod/node/` folder - - Add your wireguard configuration to `cluster/prod/cluster.nix` (you will have to edit your NAT config manually) + - Add your wireguard configuration to `cluster/prod/cluster.nix` + - You will have to edit your NAT config manually + - To get your node's wg public key, you must run `./deploy_wg prod `; see the next section for more information ## How to deploy a Nix configuration on a fresh node
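
The key-generation command that `deploy_wg` runs on each node is idempotent thanks to its `test -f … ||` guard: an existing private key is never overwritten, so re-running the script on a node that already has a key does not rotate it. A stand-alone sketch of that pattern, with a dummy generator and a temporary directory standing in for `wg genkey` and `/var/lib/deuxfleurs/wireguard-keys` (since `wg` may not be available locally):

```shell
#!/bin/sh
# Sketch of deploy_wg's idempotent key generation. gen_key is a dummy
# stand-in for `wg genkey`, and KEYDIR replaces the real key directory.
KEYDIR=$(mktemp -d)
gen_key() { head -c 32 /dev/urandom | base64; }

# Generate the key only if it does not exist yet, and keep it private:
test -f "$KEYDIR/private" || { gen_key > "$KEYDIR/private"; chmod 600 "$KEYDIR/private"; }
first=$(cat "$KEYDIR/private")

# A second run is a no-op: the existing key is never overwritten.
test -f "$KEYDIR/private" || { gen_key > "$KEYDIR/private"; chmod 600 "$KEYDIR/private"; }
[ "$first" = "$(cat "$KEYDIR/private")" ] && echo "key preserved"
```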