Nix system configuration for Deuxfleurs clusters
Alex 1e32bebd38
Document used port numbers
2022-12-02 12:14:55 +01:00
cluster Staging: let nodes use each other as Nix caches (only inside same site) 2022-12-02 11:59:32 +01:00
doc Document used port numbers 2022-12-02 12:14:55 +01:00
experimental SSB experiment 2022-09-21 19:29:08 +02:00
nix Clean stuff up and update nix driver 2022-11-29 16:21:38 +01:00
secretmgr Clone core module in staging and prod, move bad stuff to experimental 2022-08-24 15:48:18 +02:00
.gitignore Modularize and prepare to support multiple clusters 2022-02-09 12:09:49 +01:00
README.md edited README: added more info to 'how to operate a node' 2022-11-09 18:57:49 +01:00
README.more.md WIP doc 2022-10-16 11:14:50 +02:00
deploy_nixos Remove old nomad-driver-nix 2022-11-29 15:41:35 +01:00
deploy_passwords Add scripts to manage passwords 2022-04-20 15:41:54 +02:00
deploy_pki Complete telemetry configuration 2022-10-16 18:12:57 +02:00
deploy_wg Reinstall caribou 2022-11-03 19:25:28 +01:00
gen_pki Fix access to consul for non-server nodes 2022-08-24 16:58:50 +02:00
passwd edited passwd command to set bash as interpreter 2022-11-09 19:02:02 +01:00
restic-summary Move cryptpad backup job to backup-daily.hcl 2022-09-26 13:02:38 +02:00
ssh_known_hosts Reinstall caribou 2022-11-03 19:25:28 +01:00
sshtool Don't make diplotaxis and doradille raft servers, fix sshtool 2022-08-24 14:29:56 +02:00
tlsproxy changed shebang of tlsproxy file to bash, because trap failed with sh (trap is a builtin of bash) 2022-11-09 18:53:21 +01:00
upgrade_nixos Staging: ability to run Nix jobs using exec2 driver 2022-11-28 22:58:39 +01:00

README.md

Deuxfleurs on NixOS!

This repository contains the code to run Deuxfleurs' infrastructure on NixOS.

It sets up the following:

  • A WireGuard mesh between all nodes
  • Consul, with TLS
  • Nomad, with TLS

How to welcome a new administrator

See: https://guide.deuxfleurs.fr/operations/acces/pass/

Basically:

  • The new administrator generates a GPG key and publishes it on Gitea
  • All existing administrators pull the new key and sign it
  • An existing administrator re-encrypts the keystore with this new key and pushes it
  • The new administrator clones the repo and checks that they can decrypt the secrets
  • Finally, the new administrator must choose a password to operate over SSH by running ./passwd prod rick, where rick is the target username

How to create files for a new zone

This documentation is written for the production cluster; the same applies to the other clusters.

Basically:

  • Create your site file in cluster/prod/site/ folder
  • Create your node files in cluster/prod/node/ folder
  • Add your WireGuard configuration to cluster/prod/cluster.nix
    • You will have to edit your NAT config manually
    • To get your node's WireGuard public key, run ./deploy_wg prod <node> (see the next section for more information)
  • Add your nodes to cluster/prod/ssh_config, which is used by the various SSH scripts
    • If you use ssh directly, use ssh -F ./cluster/prod/ssh_config
    • Add User root for the first connection, as your own user will not yet be declared on the system
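
As an illustration of the last two points, here is a minimal sketch that prints a first-boot stanza for cluster/prod/ssh_config. The helper name ssh_config_stanza is hypothetical, datura is only an example node name, and the HostName pattern is an assumption based on the dahlia.machine.deuxfleurs.fr name used later in this README; the real file may be organised differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an ssh_config stanza for a freshly installed node.
# Assumption: nodes are reachable as <node>.machine.deuxfleurs.fr.
ssh_config_stanza() {
  local node="$1"
  local domain="${2:-machine.deuxfleurs.fr}"
  # "User root" is only needed until your own user is declared on the node.
  printf 'Host %s\n  HostName %s.%s\n  User root\n' "$node" "$node" "$domain"
}

ssh_config_stanza datura
```

The printed stanza can be appended to cluster/prod/ssh_config and then used with ssh -F ./cluster/prod/ssh_config datura. Once your user exists on the node, the User root line can be removed.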

How to deploy a Nix configuration on a fresh node

We assume that the node is named datura. Start by deploying one node at a time; you will have plenty of time in your operator's life to break everything through automation.

Run:

  • ./deploy_wg prod datura - to generate the WireGuard keys
  • ./deploy_nixos prod datura - to deploy the Nix configuration files
    • must be redeployed on all nodes, as the new WireGuard configuration is needed everywhere
  • ./deploy_passwords prod datura - to deploy users' passwords
    • must be redeployed on all nodes, so that the passwords are set up everywhere
  • ./deploy_pki prod datura - to deploy Nomad's and Consul's PKI
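
The sequence above can be sketched as a small wrapper that runs the four steps in order for a single node. This is only an illustration: the run and deploy_fresh_node helpers and the DRY_RUN variable are hypothetical and not part of this repository, and the script names follow the repository's file listing. By default the wrapper only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: print (or run) the deployment steps for one fresh node, in order.
set -euo pipefail

# Print each command; execute it only when DRY_RUN=0 (hypothetical switch).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

deploy_fresh_node() {
  local cluster="$1" node="$2"
  run ./deploy_wg "$cluster" "$node"         # generate the WireGuard keys
  run ./deploy_nixos "$cluster" "$node"      # deploy the Nix configuration
  run ./deploy_passwords "$cluster" "$node"  # deploy users' passwords
  run ./deploy_pki "$cluster" "$node"        # deploy the Nomad and Consul PKI
}

deploy_fresh_node prod datura
```

Run it from the repository root with DRY_RUN=0 to actually execute the steps. The sketch does not cover redeploying the Nix configuration and the passwords to the other nodes, which still has to be done as described above.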

How to operate a node

Edit your ~/.ssh/config file:

Host dahlia
  HostName dahlia.machine.deuxfleurs.fr
  LocalForward 14646 127.0.0.1:4646
  LocalForward 8501 127.0.0.1:8501
  LocalForward 1389 bottin.service.prod.consul:389
  LocalForward 5432 psql-proxy.service.prod.consul:5432

Then run the TLS proxy and leave it running:

./tlsproxy prod

SSH to a production machine (e.g. dahlia) and leave it running:

ssh dahlia
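
Once the SSH session is up, a quick way to confirm that the forwards are in place is to test the local ports from another terminal. The sketch below is an assumption-laden helper, not part of this repository: port_open and check_forwards are hypothetical names, and the port list is simply the set of LocalForward ports from the ~/.ssh/config example above. It relies on bash's /dev/tcp redirection.

```shell
#!/usr/bin/env bash
# Sketch: report whether each locally forwarded port accepts TCP connections.

port_open() {
  local host="$1" port="$2"
  # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT;
  # the subshell keeps the descriptor from leaking into this shell.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

check_forwards() {
  local port
  for port in "$@"; do
    if port_open 127.0.0.1 "$port"; then
      echo "$port open"
    else
      echo "$port closed"
    fi
  done
}

# Local ports taken from the LocalForward lines above.
check_forwards 14646 8501 1389 5432
```

An open port only shows that the local SSH listener exists; it does not prove that the remote service behind it is reachable.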

Finally, you should be able to access the production Nomad and Consul by browsing:

More

Please read README.more.md for more detailed information.