Health endpoint reports OK status when node fails with no file descriptors available #902

Open
opened 2024-11-15 20:30:49 +00:00 by jonah · 0 comments
Contributor
Nov 15 15:21:03 GarageGW1 garage[306333]: 2024-11-15T15:21:03.765720Z  WARN garage_net::netapp: Error in listener.accept: No file descriptors available (os error 24)
Nov 15 15:21:03 GarageGW1 garage[306333]: 2024-11-15T15:21:03.765729Z  WARN garage_net::netapp: Error in listener.accept: No file descriptors available (os error 24)
Nov 15 15:21:03 GarageGW1 garage[306333]: 2024-11-15T15:21:03.765737Z  WARN garage_net::netapp: Error in listener.accept: No file descriptors available (os error 24)

I ran into an issue this morning where a Garage gateway ran out of file descriptors, but the health endpoint kept reporting an OK status, so my Caddy reverse proxy did not fail over properly :(
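
To illustrate the kind of check being asked for, here is a minimal sketch of a health probe that takes file descriptor headroom into account. It is not Garage's code (Garage is written in Rust, and its health endpoint logic is not shown in this issue); the function names `fd_usage`, `has_fd_headroom`, and `health_status`, and the `reserve` threshold of 32, are hypothetical and chosen only for illustration.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit) for the current process."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    # /proc/self/fd is Linux-specific; /dev/fd is a fallback on other Unix systems.
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    # Listing the directory briefly opens one extra descriptor, so the count
    # may be off by one; that is close enough for a headroom check.
    open_fds = len(os.listdir(fd_dir))
    return open_fds, soft_limit


def has_fd_headroom(open_fds, soft_limit, reserve=32):
    """True if at least `reserve` descriptors are still free (hypothetical threshold)."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit - open_fds >= reserve


def health_status():
    """Report "degraded" instead of "ok" when the process is close to its FD limit."""
    open_fds, soft_limit = fd_usage()
    return "ok" if has_fd_headroom(open_fds, soft_limit) else "degraded"


if __name__ == "__main__":
    print(health_status())
```

With a check along these lines, an active health check from the reverse proxy would see a non-OK status before `listener.accept` starts failing with os error 24, rather than after.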

I switched to passive health checks, but I think this case is worth testing, and it might be worth adding a LimitNOFILE=1048576 configuration line to the systemd docs (https://garagehq.deuxfleurs.fr/documentation/cookbook/systemd/), since the open-file limit defaults to 1024 on Debian 12.
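
For reference, the suggested setting could look like the following systemd drop-in. This is a sketch: the unit name `garage.service` and the drop-in file path are assumptions, and only the `LimitNOFILE=1048576` value comes from this report.

```ini
# /etc/systemd/system/garage.service.d/limits.conf
# (assumed unit name and drop-in path; adjust to the actual unit)
[Service]
# Raise the per-process open file limit for the Garage daemon.
LimitNOFILE=1048576
```

After adding the file, `systemctl daemon-reload` followed by a restart of the unit would be needed for the new limit to apply.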

maximilien added the kind/improvement, action/discussion-needed, scope/ops labels 2024-11-19 22:33:57 +00:00
Reference: Deuxfleurs/garage#902