forked from Deuxfleurs/infrastructure
# ANSIBLE

## How to proceed

For each machine, **one by one** do:
- Check that the cluster is healthy
  - `sudo gluster peer status`
  - `sudo gluster volume status all` (check the `Online` column: only `Y` must appear)
  - Check that Nomad is healthy
  - Check that Consul is healthy
  - Check that Postgres is healthy
- Run `ansible-playbook -i production --limit <machine> site.yml`
- Reboot
- Check that the cluster is healthy
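The Gluster part of the health check above can be scripted. The sketch below is a hypothetical helper (the name `gluster_all_online` is not part of this repository) that reads the output of `sudo gluster volume status all` on standard input and fails if any process reports `N` in the `Online` column. It assumes the usual table layout, in which `Online` is the second-to-last column.

```shell
# Hypothetical helper: succeeds only if every process row read on stdin
# has "Y" in the Online column of `gluster volume status all`.
# Assumption: Online is the second-to-last whitespace-separated field.
gluster_all_online() {
    awk '
        # Only rows whose second-to-last field is Y or N are process rows;
        # headers, separator lines and task sections are skipped.
        NF >= 2 && ($(NF-1) == "Y" || $(NF-1) == "N") {
            rows++
            if ($(NF-1) == "N") { offline++; print "offline: " $0 }
        }
        END {
            if (rows == 0) { print "no process rows found"; exit 2 }
            exit (offline > 0 ? 1 : 0)
        }
    '
}
```

It could be used as `sudo gluster volume status all | gluster_all_online && echo "Gluster OK"`. For Nomad and Consul, `nomad server members` and `consul members` list the known servers and their status. How to check Postgres depends on how it is deployed.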

## New configuration with WireGuard

This configuration makes all cluster nodes appear in a single virtual private
network, enabling them to communicate on all ports even if they are behind NATs
at different locations. The VPN also provides a layer of security by encrypting
all communications that occur over the internet.

### Prerequisites

All nodes must have two publicly accessible ports (potentially routed through a NAT):

- A port that maps to the SSH port (port 22) of the machine, allowing TCP connections
- A port that maps to the WireGuard port (port 51820) of the machine, allowing UDP connections
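The TCP prerequisite can be checked from another machine before running the playbook. The sketch below is a hypothetical helper (`tcp_port_open` is an illustrative name) that assumes `bash` and the coreutils `timeout` command are available on the machine running the check.

```shell
# Hypothetical helper: succeeds if a TCP connection to host:port can be
# opened within 3 seconds. Relies on bash's /dev/tcp pseudo-device.
# Usage: tcp_port_open <host> <port>
tcp_port_open() {
    # Inside `bash -c`, $0 is the host and $1 is the port.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example, `tcp_port_open 203.0.113.10 22 && echo "SSH reachable"`, where the address is a placeholder. The UDP port cannot be probed this way, since WireGuard does not reply to unauthenticated packets. It can only be verified once the VPN is up, for instance by looking at the handshake times reported by `sudo wg show wgdeuxfleurs`.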

### Configuration

The network role sets up a WireGuard interface called `wgdeuxfleurs` and
establishes a full mesh between all cluster machines. The following
configuration variables must be set for each node in the node list:

- `ansible_host`: hostname that Ansible connects to, usually the same as `public_ip`
- `ansible_user`: username that Ansible uses to run commands through SSH
- `ansible_port`: the SSH port, if SSH is not publicly bound to port 22
- `public_ip`: the public IP of the machine, or of the NAT router that the machine sits behind
- `public_vpn_port`: the public port number on `public_ip` that maps to port 51820 of the machine
- `vpn_ip`: the IP address to assign to the node on the VPN (each node must have a different one)
- `dns_server`: any DNS resolver, typically your ISP's DNS or a public one such as OpenDNS

The new iptables configuration prevents direct communication between cluster
machines, except on port 51820, which is used to transmit VPN packets. All
intra-cluster communications must now go through the VPN interface: machines
refer to one another using their VPN IP addresses, never their public or LAN
addresses.
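To illustrate how the variables listed in this section fit together, the sketch below prints one node entry in Ansible's INI inventory syntax. It is only an illustration: the helper name `inventory_line` is hypothetical, and the source does not say whether the `production` inventory uses the INI or the YAML format.

```shell
# Hypothetical helper: prints one INI-style inventory line for a node.
# Arguments: name public_ip ssh_port public_vpn_port vpn_ip ansible_user dns_server
# Assumption: ansible_host is the same as public_ip, the usual case.
inventory_line() {
    printf '%s ansible_host=%s ansible_port=%s ansible_user=%s public_ip=%s public_vpn_port=%s vpn_ip=%s dns_server=%s\n' \
        "$1" "$2" "$3" "$6" "$2" "$4" "$5" "$7"
}
```

Calling `inventory_line node1 203.0.113.10 22 51820 10.68.70.11 deploy 208.67.222.222` prints a single line that sets all seven variables for a node named `node1`. Every value here is a placeholder, except that `10.68.70.11` follows the VPN addressing used in the `peers.json` example of this document.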

### Restarting Nomad

When switching to the WireGuard configuration, machines stop using their LAN
addresses and switch to their VPN addresses. Consul seems to handle this
correctly, but Nomad does not. For Nomad to restart correctly, its Raft
protocol module must be informed of the new IP addresses of the cluster
members. This is done by creating, on all nodes, the file
`/var/lib/nomad/server/raft/peers.json`, which contains the list of IP
addresses (with port) of the cluster members. Here is an example of such a file:

```
["10.68.70.11:4647","10.68.70.12:4647","10.68.70.13:4647"]
```
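Since the file must be identical on all nodes, it can help to generate it from the list of VPN addresses instead of writing it by hand. The following is a minimal sketch: the helper name `nomad_peers_json` is hypothetical, and the port 4647 is taken from the example above.

```shell
# Hypothetical helper: prints a peers.json array for the given VPN IPs,
# appending port 4647 to each address as in the example above.
# Usage: nomad_peers_json <vpn_ip>...
nomad_peers_json() {
    sep=''
    printf '['
    for ip in "$@"; do
        printf '%s"%s:4647"' "$sep" "$ip"
        sep=','
    done
    printf ']\n'
}
```

For instance, `nomad_peers_json 10.68.70.11 10.68.70.12 10.68.70.13 | sudo tee /var/lib/nomad/server/raft/peers.json` writes the file shown above. Note that this plain array of addresses is the format used with older Raft protocol versions. Nomad releases that use Raft protocol version 3 expect an object-based format, so check the outage recovery documentation for the version in use.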

Once this file is created and is identical on all nodes, restart Nomad on all
nodes. The cluster should resume normal operation.

The same procedure can also be applied to fix Consul. However, my tests showed
that it did not break when IP addresses changed (it just took a while to come
back up).