Changing IP address of a node leads to a half-connected and broken cluster #652
Reference: Deuxfleurs/garage#652
I had to change the IP address of a node, so I changed both rpc_bind_addr and rpc_public_addr for this node. There's no NAT.
Old config of node A:
New config of node A:
After restarting node A, here is the status on node A, which says it's correctly connected again to node B:
But on node B, it says that node A is still disconnected:
Note how node B still has the previous IP address of node A.
When I look at the logs of node B, it even accepts the connection from node A:
But this is never reflected in the status of node B.
This issue is not transient: I waited about 20 minutes and nothing changed. It also prevents node B from reaching a quorum when it receives queries.
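To make the symptom concrete, here is a toy model of the peer table on node B. It is only a sketch of the behaviour described above, not Garage's actual implementation: the class names, the `node-a` identifier, the placeholder addresses, and the quorum value of 2 are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Peer:
    addr: str                # last address recorded for this peer
    connected: bool = False  # what the status output would report


@dataclass
class PeerTable:
    """Toy peer table of one node, keyed by node ID (not by address)."""
    peers: Dict[str, Peer] = field(default_factory=dict)

    def accept_without_refresh(self, node_id: str, addr: str) -> bool:
        # Symptom from the report: the connection is accepted,
        # but the stored address and state are left untouched.
        return node_id in self.peers

    def accept_and_refresh(self, node_id: str, addr: str) -> bool:
        # Behaviour one would expect: a known node ID arriving from a
        # new address updates the entry and marks the peer connected.
        peer = self.peers.get(node_id)
        if peer is None:
            return False
        peer.addr = addr
        peer.connected = True
        return True

    def reachable_nodes(self) -> int:
        # The local node counts itself, plus every connected peer.
        return 1 + sum(p.connected for p in self.peers.values())


def has_quorum(table: PeerTable, quorum: int) -> bool:
    return table.reachable_nodes() >= quorum


# Placeholder addresses, not the ones from the actual cluster.
OLD_ADDR, NEW_ADDR = "192.0.2.10:3901", "192.0.2.20:3901"

node_b = PeerTable({"node-a": Peer(OLD_ADDR)})
node_b.accept_without_refresh("node-a", NEW_ADDR)
print(node_b.peers["node-a"], has_quorum(node_b, quorum=2))

node_b.accept_and_refresh("node-a", NEW_ADDR)
print(node_b.peers["node-a"], has_quorum(node_b, quorum=2))
```

In this model, the first call leaves node A listed at its old address and disconnected, so node B cannot count it towards a quorum. That matches what the status on node B shows. The second call is what I would have expected to happen once node B accepted the connection.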
This is using Garage 0.8.4 on Debian.
I think I have run into this issue before, and restarting the garage daemon on the other nodes generally fixes it.
PR #724 probably fixes the issue; it will be published with v0.9.2 / v1.0. If the issue is still there, please reopen it.
lx referenced this issue 2024-03-01 14:14:56 +00:00