# ANSIBLE

## How to proceed

For each machine, **one by one**, do the following:
  - Check that the cluster is healthy
    - Check Garage (in the commands below, replace `xxx` and `63a4d7ecd795` with the ID or name of your Garage container)
      - check that all nodes are online: `docker exec -ti xxx /garage status`
      - check that tables are in sync: `docker exec -ti 63a4d7ecd795 /garage repair --yes tables`
      - check the Garage logs
        - there should be no unknown errors and no resync in progress
        - the following line must appear: `INFO  garage_util::background > Worker exited: Repair worker`
    - Check that Nomad is healthy
      - `nomad server members`
      - `nomad node status`
    - Check that Consul is healthy
      - `consul members`
    - Check that Postgres is healthy
  - Run `ansible-playbook -i production.yml --limit <machine> -u <username> site.yml`
  - Run `nomad node drain -enable -force -self`
  - Reboot
  - Run `nomad node drain -self -disable`
  - Check that the cluster is healthy again (repeat the whole first step)
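
The command-based health checks above can be grouped into a small shell function so that they are run the same way before and after each upgrade. This is only a sketch under assumptions: `check_cluster` and its `<garage-container>` argument are hypothetical names, the function is assumed to run on a cluster node where `docker`, `nomad` and `consul` are on the `PATH`, and the `-ti` flags are dropped so that it also works without a terminal. It does not read the Garage logs or check Postgres, which still need a manual look.

```shell
#!/bin/sh
# Sketch: run the health-check commands from the list above, stopping at the first failure.
# Assumption: executed on a cluster node with docker, nomad and consul available.
# "check_cluster" and its argument are hypothetical names, not part of this repository.

check_cluster() {
    container="$1"   # name or ID of the local Garage container
    if [ -z "$container" ]; then
        echo "usage: check_cluster <garage-container>" >&2
        return 2
    fi

    # -ti is omitted so the checks also work without a terminal
    # (for example over non-interactive SSH).
    echo "== Garage: node status"
    docker exec "$container" /garage status || return 1

    echo "== Garage: table repair"
    docker exec "$container" /garage repair --yes tables || return 1

    echo "== Nomad"
    nomad server members || return 1
    nomad node status || return 1

    echo "== Consul"
    consul members || return 1

    # A zero exit code only means the commands ran; their output must still be read.
    echo "Commands succeeded; still review the Garage logs and Postgres by hand."
}
```

After sourcing the file, `check_cluster <garage-container>` prints the output of each command and returns a non-zero status as soon as one of them fails.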