New node failing to sync after layout change #841
Reference: Deuxfleurs/garage#841
One node in my cluster failed. I added a fresh node to replace it, and the fresh node joined the cluster successfully via one of the existing nodes. But when I start the data sync, I see the following error in the logs on the new node:
This node also doesn't show the layout at all, but it does show the other nodes in the output of the status command.
All nodes are running the dxflrs/garage:v1.0.0 Docker image.
Does anyone have ideas about what can be done in this case?
Just realised that the following error very often appears before the layout version error:
Could be related to #809
This is a bug in the 1.0 release. When joining a new node to the cluster, the new node might take a while to get the layout history from other nodes. A quick solution is to copy the cluster_layout file in the data folder from an existing node into the new node, so it gets the layout history.
I don't know if this issue is a full duplicate of #809, or whether some of this information should be transferred to #809 (for example because it makes #809 more critical). For now, I am assigning the triage-required flag as a reminder to get back to it later.
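The copy step described above could be scripted roughly as follows. This is only a sketch under assumptions that are not stated in the thread: both nodes' folders are reachable from the machine running the script (for example as mounted Docker volumes), the new node is stopped while the file is replaced, and the helper name `copy_cluster_layout` and the example paths are hypothetical. Only the file name cluster_layout comes from the comment above.

```python
import shutil
from pathlib import Path


def copy_cluster_layout(src_dir, dst_dir, filename="cluster_layout"):
    """Copy the layout file from an existing node's folder to the new node's folder.

    src_dir and dst_dir are hypothetical paths to the folders that hold the
    layout file on the existing node and on the new node respectively.
    """
    src = Path(src_dir) / filename
    dst = Path(dst_dir) / filename

    if not src.is_file():
        raise FileNotFoundError(f"no {filename} file found in {src_dir}")

    # Create the destination folder if the new node has not created it yet.
    dst.parent.mkdir(parents=True, exist_ok=True)

    # Keep a backup of any layout file the new node already wrote,
    # so the change can be reverted by hand.
    if dst.exists():
        shutil.copy2(dst, dst.with_name(dst.name + ".bak"))

    shutil.copy2(src, dst)
    return dst


if __name__ == "__main__":
    # Example paths are assumptions; adjust them to the actual volume layout.
    copy_cluster_layout("/mnt/existing-node", "/mnt/new-node")
```

After the copy, the new node would need to be started again so that it reads the layout history from the copied file.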
Title changed from "New node failing to sync" to "New node failing to sync after layout change".
Getting the same error when adding nodes. It is more likely when I add a node and commit the layout each time I add a node:
old_versions
Referenced in #854: when merging layouts, don't remove old layout versions
I have created a potential fix for this issue here: #854
Testing latest code, not seeing the error anymore.