From 04f2bd48bb3d9a33e36409b8eddbad05e21807c1 Mon Sep 17 00:00:00 2001
From: Alex Auvolat
Date: Wed, 20 Apr 2022 16:13:14 +0200
Subject: [PATCH] Add some readme

---
 README.md                  | 136 +++++++++++++++++++++++++++++++++++++
 cluster/prod/ssh_config    |   6 +-
 cluster/staging/ssh_config |   6 +-
 ssh_known_hosts            |   3 +
 4 files changed, 145 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index d993362..854ee41 100644
--- a/README.md
+++ b/README.md
@@ -17,6 +17,142 @@ The following scripts are available here:
 
 - `tlsproxy.sh`, a script that allows non-TLS access to the TLS-secured Consul and Nomad, by running a simple local proxy with socat
 - `tlsenv.sh`, a script to be sourced (`source tlsenv.sh`) that configures the correct environment variables to use the Nomad and Consul CLI tools with TLS
+
+## Configuring the OS
+
+This repo contains a bunch of scripts to configure NixOS on all cluster nodes.
+Most scripts are invoked with the following syntax:
+
+- for scripts that generate secrets: `./gen_<something> <cluster_name>` to generate the secrets to be used on cluster `<cluster_name>`
+- for deployment scripts:
+  - `./deploy_<something> <cluster_name>` to run the deployment script on all nodes of the cluster `<cluster_name>`
+  - `./deploy_<something> <cluster_name> <node1> <node2> ...` to run the deployment script only on nodes `node1, node2, ...` of cluster `<cluster_name>`.
+
+### Assumptions (how to set up your environment)
+
+- you have SSH access to all of your cluster nodes (listed in `cluster/<cluster_name>/ssh_config`)
+
+- your account is in group `wheel` and you know its password (you need it to become root using `sudo`)
+
+- you have a clone of the secrets repository in your `pass` password store, for instance at `~/.password-store/deuxfleurs`
+  (scripts in this repo will read and write all secrets in `pass` under `deuxfleurs/cluster/<cluster_name>/`)
+
+### Deploying the NixOS configuration
+
+The NixOS configuration makes use of a certain number of files:
+
+- files in `nix/`, which are the same for all deployments on all clusters
+- the file `cluster/<cluster_name>/cluster.nix`, a Nix configuration file that is specific to the cluster but is copied identically to all cluster nodes
+- files in `cluster/<cluster_name>/site/`, which are specific to the various sites on which Nix nodes are deployed
+- files in `cluster/<cluster_name>/node/`, which are specific to each node
+
+To deploy the NixOS configuration on the cluster, simply do:
+
+```
+./deploy_nixos <cluster_name>
+```
+
+or, to deploy only on a single node:
+
+```
+./deploy_nixos <cluster_name> <node_name>
+```
+
+To upgrade NixOS, use the `./upgrade_nixos` script instead (it has the same syntax).
+
+**When adding a node to the cluster:** just do `./deploy_nixos <cluster_name> <node_name>`
+
+### Deploying Wesher
+
+We use Wesher to provide an encrypted overlay network between nodes in the cluster.
+This is useful in particular for securing services that are not able to do mTLS,
+but as a defense-in-depth measure, we make all traffic go through Wesher even when
+TLS is done correctly. A working Wesher installation in the cluster is therefore
+mandatory for it to run correctly.
+
+First, if no Wesher shared secret key has been generated for this cluster yet,
+generate it with:
+
+```
+./gen_wesher_key <cluster_name>
+```
+
+This key will be stored in `pass`, so you must have a working `pass` installation
+for this script to run correctly.
+
+Then, deploy the key on all nodes with:
+
+```
+./deploy_wesher_key <cluster_name>
+```
+
+This should be done after `./deploy_nixos` has run successfully on all nodes.
+You should now have a working Wesher network between all your nodes!
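+
+To check that the overlay network is up, one quick way (assuming Wesher's default
+WireGuard interface name, `wgoverlay`; adjust it if your setup names the interface
+differently) is to list the WireGuard peers on one of the nodes:
+
+```
+# run on any cluster node; every other node should appear as a peer
+sudo wg show wgoverlay
+```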
+
+**When adding a node to the cluster:** just do `./deploy_wesher_key <cluster_name> <node_name>`
+
+### Generating and deploying a PKI for Consul and Nomad
+
+This is very similar to what we do for Wesher.
+
+First, if the PKI has not yet been created, create it with:
+
+```
+./gen_pki <cluster_name>
+```
+
+Then, deploy the PKI on all nodes with:
+
+```
+./deploy_pki <cluster_name>
+```
+
+**When adding a node to the cluster:** just do `./deploy_pki <cluster_name> <node_name>`
+
+### Adding administrators
+
+Administrators are defined in the `cluster.nix` file for each cluster (they could also be defined in the site-specific Nix files if necessary).
+This is where their public SSH keys for remote access are put.
+
+Administrators will also need passwords to administer the cluster, as we are not using passwordless sudo.
+To set the password for a new administrator, they must have a working `pass` installation as specified above.
+They must then run:
+
+```
+./passwd <cluster_name>
+```
+
+to set their password in the `pass` database (the password is hashed, so other administrators cannot learn it even if they have access to the `pass` db).
+
+Then, an administrator who already has root access must run the following (after syncing the `pass` db) to set the password correctly on all cluster nodes:
+
+```
+./deploy_passwords <cluster_name>
+```
+
+## Deploying stuff on Nomad
+
+### Connecting to Nomad
+
+Connect using SSH to one of the cluster nodes, forwarding port 14646 to port 4646 on localhost, and port 8501 to port 8501 on localhost.
+
+You can for instance use an entry in your `~/.ssh/config` that looks like this:
+
+```
+Host caribou
+  HostName 2a01:e0a:c:a720::23
+  LocalForward 14646 127.0.0.1:4646
+  LocalForward 8501 127.0.0.1:8501
+```
+
+Then, in a separate window, launch `./tlsproxy <cluster_name>`: this will
+launch `socat` proxies that strip the TLS layer and allow you to simply access
+Nomad and Consul on the regular, unencrypted URLs: `http://localhost:4646` for
+Nomad and `http://localhost:8500` for Consul. Keep this terminal window open for
+as long as you need to access Nomad and Consul on the cluster.
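+
+For instance, with the proxies running, you can point the Nomad and Consul CLI
+tools at these local endpoints (the exact commands below are just an illustration):
+
+```
+export NOMAD_ADDR=http://localhost:4646
+export CONSUL_HTTP_ADDR=http://localhost:8500
+nomad status     # list the jobs running on the cluster
+consul members   # list the nodes of the Consul cluster
+```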
+
+### Launching services
+
 Stuff should be started in this order:
 
 - `app/core`
diff --git a/cluster/prod/ssh_config b/cluster/prod/ssh_config
index 266d77f..cb4841f 100644
--- a/cluster/prod/ssh_config
+++ b/cluster/prod/ssh_config
@@ -1,10 +1,10 @@
 UserKnownHostsFile ./ssh_known_hosts
 
 Host concombre
-  HostName 10.42.1.31
+  HostName 2a01:e0a:c:a720::31
 
 Host courgette
-  HostName 10.42.1.32
+  HostName 2a01:e0a:c:a720::32
 
 Host celeri
-  HostName 10.42.1.33
+  HostName 2a01:e0a:c:a720::33
diff --git a/cluster/staging/ssh_config b/cluster/staging/ssh_config
index 8fae8ab..9bc4e6e 100644
--- a/cluster/staging/ssh_config
+++ b/cluster/staging/ssh_config
@@ -1,13 +1,13 @@
 UserKnownHostsFile ./ssh_known_hosts
 
 Host caribou
-  HostName 10.42.2.23
+  HostName 2a01:e0a:c:a720::23
 
 Host carcajou
-  HostName 10.42.2.22
+  HostName 2a01:e0a:c:a720::22
 
 Host cariacou
-  HostName 10.42.2.21
+  HostName 2a01:e0a:c:a720::21
 
 Host spoutnik
   HostName 10.42.0.2
diff --git a/ssh_known_hosts b/ssh_known_hosts
index 7e224a3..e3181cf 100644
--- a/ssh_known_hosts
+++ b/ssh_known_hosts
@@ -6,3 +6,6 @@
 10.42.2.21 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPXTUrXRFhudJBESCqjHCOttzqYPyIzpPOMkI8+SwLRx
 10.42.2.22 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMf/ioVSSb19Slu+HZLgKt4f1/XsL+K9uMxazSWb/+nQ
 10.42.2.23 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDsYD1gNmGyb6c9wjGR6tC69fHP6+FpPHTBT6laPTHeD
+2a01:e0a:c:a720::22 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMf/ioVSSb19Slu+HZLgKt4f1/XsL+K9uMxazSWb/+nQ
+2a01:e0a:c:a720::21 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPXTUrXRFhudJBESCqjHCOttzqYPyIzpPOMkI8+SwLRx
+2a01:e0a:c:a720::23 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDsYD1gNmGyb6c9wjGR6tC69fHP6+FpPHTBT6laPTHeD