Document Wireguard config

2020-05-21 15:50:14 +02:00 · 2020-05-21 15:50:14 +02:00 · bee7e10256
commit bee7e10256
parent a4f9aa2d98
1 changed files with 56 additions and 0 deletions
--- a/ansible/README.md
+++ b/ansible/README.md
@@ -13,3 +13,59 @@ For each machine, **one by one** do:
  - Reboot
  - Check that cluster is healthy

+## New configuration with Wireguard
+
+This configuration places all of the cluster nodes in a single virtual private
+network, enabling them to communicate on all ports even if they are behind
+NATs at different locations. The VPN also provides a layer of security by
+encrypting all communications that occur over the internet.
+
+### Prerequisites
+
+Every node must have two publicly accessible ports (possibly forwarded through a NAT):
+
+- A port that maps to the SSH port (port 22) of the machine, allowing TCP connections
+- A port that maps to the Wireguard port (port 51820) of the machine, allowing UDP connections
+
+
+### Configuration
+
+The network role sets up a Wireguard interface, called `wgdeuxfleurs`, and
+establishes a full mesh between all cluster machines. The following
+configuration variables are necessary in the node list:
+
+- `ansible_host`: hostname that Ansible connects to, usually the same as `public_ip`
+- `ansible_user`: username that Ansible uses to connect and run commands over SSH
+- `ansible_port`: if SSH is not bound publicly on port 22, set the port here
+- `public_ip`: the public IP of the machine, or of the NAT router that the machine sits behind
+- `public_vpn_port`: the public port number on `public_ip` that maps to port 51820 of the machine
+- `vpn_ip`: the IP address to assign to the node on the VPN (each node must have a different one)
+- `dns_server`: any DNS resolver, typically your ISP's DNS or a public one such as OpenDNS
+
+The new iptables configuration now prevents direct communication between
+cluster machines, except on port 51820 which is used to transmit VPN packets.
+All intra-cluster communications must now go through the VPN interface (thus
+machines refer to one another using their VPN IP addresses and never their
+public or LAN addresses).
+
+### Restarting Nomad
+
+When switching to the Wireguard configuration, machines stop using their LAN
+addresses and start using their VPN addresses. Consul seems to handle this
+correctly, but Nomad does not. For Nomad to restart correctly, its Raft
+protocol module must be informed of the new IP addresses of the cluster
+members. This is done by creating, on every node, the file
+`/var/lib/nomad/server/raft/peers.json`, which lists the address (VPN IP and
+port) of each cluster member. Here is an example of such a file:
+
+```
+["10.68.70.11:4647","10.68.70.12:4647","10.68.70.13:4647"]
+```
+
+Once this file is created and is the same on all nodes, restart Nomad on all
+nodes. The cluster should resume operation normally.
+
+The same procedure can also be applied to fix Consul; however, my tests showed
+that it did not break when IP addresses changed (it just took a bit longer to
+come back up).
+