forked from Deuxfleurs/nixcfg
Nix system configuration for Deuxfleurs clusters
7db40a8dcf
Coturn was failing to start with the following error:
failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "/usr/local/bin/docker-entrypoint.sh": permission denied: unknown
It seems to be caused by the recent NixOS update: either Docker/runc is now stricter when checking whether the entrypoint is executable [1], or Nomad may mount the secrets directory with "noexec" [2], or both. In any case, the "local" directory [2] looks more appropriate, because it is shared with the task while not being accessible to other tasks.
[1] https://github.com/opencontainers/runc/issues/3715
[2] https://developer.hashicorp.com/nomad/docs/concepts/filesystem
- cluster
- doc
- experimental
- nix
- .gitignore
- deploy_nixos
- deploy_passwords
- deploy_pki
- gen_pki
- passwd
- README.md
- restic_restore_gen
- restic_summary
- secretmgr
- sshtool
- tlsproxy
- upgrade_nixos
Deuxfleurs on NixOS!
This repository contains code to run Deuxfleurs' infrastructure on NixOS.
Our abstraction stack
We try to build a generic abstraction stack between our different resources (CPU, RAM, disk, etc.) and our services (Chat, Storage, etc.). We develop our own tools when needed.
Our first abstraction level is the NixOS level, which installs a bunch of standard components:
- WireGuard: provides encrypted communication between remote nodes
- Nomad: schedules containers and handles their lifecycle
- Consul: distributed key-value store + lock + service discovery
- Docker: packages, distributes and isolates applications
Then, inside our Nomad+Consul orchestrator, we deploy a number of base services:
- Data management
  - Garage: S3-compatible lightweight object store for self-hosted geo-distributed deployments
  - Stolon + PostgreSQL: distributed relational database
- Network Control Plane
  - DiploNAT: network automation (firewalling, UPnP IGD)
  - D53: updates DNS entries (A and AAAA) dynamically based on Nomad service scheduling and local node info
  - Tricot: a dynamic reverse proxy for Nomad+Consul inspired by Traefik
  - wgautomesh: a dynamic WireGuard mesh configurator
- User Management
- Observability
  - Prometheus + Grafana: monitoring
Some services we provide based on this abstraction:
- Websites: Garage (static) + fediverse blog (Plume)
- Chat: Synapse + Element Web (Matrix protocol)
- Email: Postfix SMTP + Dovecot IMAP + OpenDKIM DKIM + SOGo webmail or Alps webmail (experimental)
- Aerogramme: an encrypted IMAP server
- Videoconferencing: Jitsi
- Collaboration: CryptPad
As a generic abstraction is provided, deploying new services should be easy.
How to use this?
See the following documentation topics:
- Quick start and onboarding for new administrators
- How to add new nodes to a cluster (rapid overview)
- Architecture of this repo, how the scripts work
- List of TCP and UDP ports used by services
- Why not Ansible?
Got personal services in addition to Deuxfleurs at home?
Go check cluster/prod/register_external_services.sh. In bash, we register a redirect from Tricot to your own services or your personal reverse proxy.
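As a rough illustration of what such a registration could look like, here is a minimal sketch in bash. It is not the content of register_external_services.sh. It assumes the external service is declared through Consul's catalog registration endpoint (`PUT /v1/catalog/register`) and that Tricot picks up a service tag of the form `tricot <hostname>`. The helper name `build_payload`, the node name, the address, the port and the hostname are all hypothetical placeholders.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: print a Consul catalog registration payload for a
# service that runs outside of Nomad (e.g. a personal reverse proxy).
# Arguments: node name, IP address, service name, port, public hostname.
build_payload() {
  local node="$1" address="$2" service="$3" port="$4" hostname="$5"
  cat <<EOF
{
  "Node": "${node}",
  "Address": "${address}",
  "Service": {
    "Service": "${service}",
    "Tags": ["tricot ${hostname}"],
    "Address": "${address}",
    "Port": ${port}
  }
}
EOF
}

# Only print the payload here (placeholder values). To actually register it,
# it could be piped to a local Consul agent, assuming it listens on port 8500:
#   build_payload ... | curl -X PUT --data @- http://localhost:8500/v1/catalog/register
build_payload "my-home-server" "192.0.2.10" "my-nginx" 8080 "www.example.org"
```

Printing the payload instead of sending it keeps the sketch safe to run anywhere. Check the actual script in cluster/prod for the tags and fields used in production.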