# ANSIBLE
## How to proceed
For each machine, **one by one** do:
- Check that the cluster is healthy
  - Check garage
    - Check that all nodes are online: `docker exec -ti <container> /garage status`
    - Check that tables are in sync: `docker exec -ti <container> /garage repair --yes tables`
    - Check the garage logs
      - No unknown errors should appear and no resync should be in progress
      - The following line must appear: `INFO garage_util::background > Worker exited: Repair worker`
  - Check that Nomad is healthy
    - `nomad server members`
    - `nomad node status`
  - Check that Consul is healthy
    - `consul members`
  - Check that Postgres is healthy
- Run `ansible-playbook -i production.yml --limit <machine> -u <username> site.yml`
- Run `nomad node drain -enable -force -self`
- Reboot the machine
- Run `nomad node drain -self -disable`
- Check that the cluster is healthy again (repeat the whole first step)
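
The health check is run twice per machine, so it may be convenient to wrap it in a small script. The following is a minimal sketch, not part of this repository: the function names `repair_finished` and `cluster_healthy` and the container argument are hypothetical, and it assumes `docker`, `nomad` and `consul` are available on the node. It does not cover Postgres, and it cannot judge "unknown errors" in the logs, so those two checks stay manual.

```shell
#!/bin/sh
# Sketch only: function names and the container argument are illustrative.

# Read a garage log on stdin; succeed only if the repair worker has exited.
repair_finished() {
    grep -q 'Worker exited: Repair worker'
}

# Run the health checks from the list above, stopping at the first failure.
# $1 is the ID or name of the garage container (an assumption of this sketch).
cluster_healthy() {
    container="$1"
    docker exec "$container" /garage status || return 1
    docker exec "$container" /garage repair --yes tables || return 1
    docker logs "$container" 2>&1 | repair_finished || return 1
    nomad server members || return 1
    nomad node status || return 1
    consul members || return 1
}
```

After sourcing the file, `cluster_healthy <container>` returns a non-zero status as soon as one command fails. The repair runs in the background, so the log line may take a while to appear, and a line left over from an earlier repair would also match; reading the logs by hand remains the reference check.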