This repository has been archived on 2023-03-15. You can view files and clone it, but cannot push or open issues or pull requests.

ANSIBLE

How to proceed

For each machine, one at a time, do the following:

  • Check that the cluster is healthy
    • Check that GlusterFS is healthy
      • sudo gluster peer status
      • sudo gluster volume status all (in the Online column, only Y must appear)
    • Check that Nomad is healthy
      • nomad server members
      • nomad node status
    • Check that Consul is healthy
      • consul members
    • Check that Postgres is healthy
  • Run ansible-playbook -i production.yml --limit <machine> -u <username> site.yml
  • Run nomad node drain -enable -force -self
  • Reboot
  • Run nomad node drain -self -disable
  • Check that the cluster is healthy again (same checks as in the first step)
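
The health checks in the list can be grouped into a small helper run on the node before and after the reboot. The following is only a sketch under stated assumptions: the function names `gluster_offline_rows` and `check_cluster` are hypothetical, and the parsing assumes that Online is the second-to-last column of `gluster volume status all` output. The Nomad and Consul commands only print their status, so their output still has to be read by a human. Postgres is left out because the procedure does not name a command for it.

```shell
#!/usr/bin/env bash
# Sketch of the "cluster is healthy" checks; helper names are hypothetical.

# Read `gluster volume status all` output on stdin and print every row whose
# Online column is "N". Assumption: Online is the second-to-last column.
gluster_offline_rows() {
  awk 'NF >= 2 && $(NF-1) == "N"'
}

# Run the checks from the list in order and stop at the first failing command.
check_cluster() {
  local offline

  sudo gluster peer status || return 1

  offline=$(sudo gluster volume status all | gluster_offline_rows)
  if [ -n "$offline" ]; then
    echo "Gluster processes not online:" >&2
    echo "$offline" >&2
    return 1
  fi

  # These commands print the state for manual review; a zero exit code
  # alone does not prove that every member is healthy.
  nomad server members || return 1
  nomad node status || return 1
  consul members || return 1

  # Postgres: no command is given in the procedure, so check it manually.
}
```

Calling `check_cluster` before the `ansible-playbook` run and again after `nomad node drain -self -disable` covers the first and last steps of the list.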