Baptiste Jonglez
66b2a88826
prod: Allow bespin for all services
2025-04-16 00:29:42 +02:00
Baptiste Jonglez
2d86cd239f
email: move service to corrin (fiber outage in scorpio)
2025-04-15 22:13:56 +02:00
Baptiste Jonglez
b4d04e12fe
prod: move cryptpad to bespin (fiber outage in scorpio)
2025-04-15 21:17:25 +02:00
Armaël Guéneau
1c9d01db87
staging: deploy tricot with jemalloc heap profiling
2025-04-13 11:03:20 +02:00
Armaël Guéneau
5fd5f72c81
garage job: tag tricot-site-lb has been renamed to tricot-local-lb
2025-04-12 21:05:21 +02:00
Armaël Guéneau
2dbfc3f796
grafana dashboard "tricot global": add in-flight requests
2025-04-12 18:49:54 +02:00
Armaël Guéneau
66278fd7fb
upgrade tricot (now with log rate limiting and in-flight requests meter)
2025-04-12 18:29:13 +02:00
Armaël Guéneau
477840cf66
deploy tricot on bespin
2025-04-12 18:29:03 +02:00
Armaël Guéneau
f010eb1296
staging: upgrade tricot, now with log rate limiting
...
rate limit is set at a generous 30 log/s, since not much is happening
on staging
2025-04-12 18:10:02 +02:00
Armaël Guéneau
72647fa83a
staging: upgrade tricot (new feature: meter for in-flight requests)
2025-04-12 18:10:02 +02:00
03ecd58c55
add a comment on how to put a node in maintenance mode
2025-04-12 15:02:19 +02:00
Baptiste Jonglez
fc7bf04a9c
jitsi: health check videobridge
2025-04-12 13:31:25 +02:00
Baptiste Jonglez
1d817399eb
coturn: Fix wrong DNS config for IPv6 (the CNAME pointed to the machine hosting Tricot, which is incorrect for TURN)
2025-04-12 12:34:26 +02:00
eb373986fb
drop deuxfleurs redirect hack
2025-04-11 12:20:22 +02:00
be1d80c99d
deploy update to prod
2025-04-11 11:15:23 +02:00
06b92742ab
update matrix, fix cve
2025-04-11 11:11:25 +02:00
fa5b564771
rollback bespin
2025-04-11 10:46:39 +02:00
f7cbf12400
update tricot & garage deployment
2025-04-11 10:34:56 +02:00
455b568940
Remove dummy gitea d53 job, as the IPs are now fully static
2025-04-08 21:53:37 +02:00
4d07858941
add the domain of user "n"
2025-04-07 07:36:28 +02:00
8cd50a2a57
prevent plm search init from hanging
2025-04-04 16:00:23 +02:00
8d1ff96cbf
set a 1 day default cache (instead of 120sec)
2025-04-04 15:26:27 +02:00
27293b19a2
upgrade plume
2025-04-04 15:09:19 +02:00
ef1735f781
update backups too!
2025-03-31 22:01:50 +02:00
6a4a2528f1
migrate emails from ananas to abricot
2025-03-31 21:48:13 +02:00
8c2c0cf949
email: dkim signing table
2025-03-31 21:11:58 +02:00
Baptiste Jonglez
fe68fdf54a
plume: increase memory again
2025-03-26 20:21:57 +01:00
Baptiste Jonglez
187d36eb9b
deploy_nixos: add help to apply changes without rebooting in production
2025-03-26 00:17:59 +01:00
Baptiste Jonglez
fd6275f5bc
prod: Fix vim configuration syntax (different between staging and prod due to NixOS version difference)
2025-03-26 00:17:08 +01:00
Baptiste Jonglez
fc88a063b1
node_exporter: avoid using network mode host
2025-03-25 22:21:35 +01:00
Baptiste Jonglez
bb8c9db2ed
telemetry: avoid network mode host, and poll less often
2025-03-25 22:12:42 +01:00
451068d716
Merge pull request 'prod: telemetry: Add smartctl_exporter based on staging work' (#53) from prod_smartctl_monitoring into main
...
Reviewed-on: #53
2025-03-25 21:09:08 +00:00
Baptiste Jonglez
797f946578
prod: telemetry: Add smartctl_exporter based on staging work
2025-03-24 17:53:17 +01:00
Baptiste Jonglez
596b7ab966
prod: telemetry: rename node-exporter job
2025-03-24 17:51:55 +01:00
Baptiste Jonglez
ec1fa3e540
staging: telemetry: Use an init task to create fake disk devices for smartctl_exporter
2025-03-24 17:47:05 +01:00
67230dd60c
guichet now advertises the correct dxfl login command
2025-03-24 16:48:18 +01:00
305c160899
guichet upgrade
2025-03-21 00:27:05 +01:00
Baptiste Jonglez
8d9aa00de5
staging: harden config of smartctl exporter
...
It currently requires all nodes to have /dev/sda (the device passthrough is hardcoded for now)
2025-03-19 23:46:55 +01:00
Baptiste Jonglez
5790453ff1
nix: Allow all capabilities in Nomad
...
This will be necessary for the smartctl exporter since it needs Linux
capabilities that are not allowed by default in Nomad.
We only have trusted Nomad jobs, and we already allow privileged
containers anyway, so there is no security impact.
2025-03-19 23:39:04 +01:00
Baptiste Jonglez
a2a470ac3d
staging: promote piranha to Nomad server (caribou is dead)
2025-03-19 23:08:49 +01:00
Baptiste Jonglez
2009572fea
prod: telemetry: move storage from bespin/scorpio to bespin/corrin
2025-03-12 21:22:56 +01:00
Baptiste Jonglez
8f0a45f03e
staging: telemetry: add smartctl exporter
2025-03-12 21:06:56 +01:00
Baptiste Jonglez
b98e72af96
staging: telemetry: Fix metric collection due to faulty Consul connection
2025-03-12 20:51:49 +01:00
Baptiste Jonglez
e805cf5cf6
Increase Prometheus storage
...
The current limit corresponds to about 2 months of Prometheus history,
which is sometimes too little to identify long-term trends.
2025-03-11 23:10:07 +01:00
6b52ccd374
Merge pull request 'upgrade garage to v1.99.1' (#49) from garage-1.99 into main
...
Reviewed-on: #49
2025-03-09 09:48:50 +00:00
Armaël Guéneau
c5a0577cbf
upgrade garage to v1.99.1
2025-03-09 10:44:12 +01:00
Armaël Guéneau
40da5ccca2
nixos config: tweak
2025-03-07 11:43:49 +01:00
Armaël Guéneau
0051891ff0
staging: upgrade garage to v1.99-internal (support for redirections)
2025-03-07 11:43:06 +01:00
Armaël Guéneau
41961df583
woodpecker: change site neptune->corrin
2025-03-01 22:34:55 +01:00
Armaël Guéneau
e61c7449c1
matrix: allow running on site 'corrin' and remove 'neptune' (not a prod site anymore)
2025-03-01 22:32:10 +01:00