Cluster/layout migration status is wrong after removing nodes #916
Reference: Deuxfleurs/garage#916
Hi! I migrated my Garage cluster from one set of nodes to another using `layout assign`, `layout remove` and then `layout apply` (both add and remove operations at the same time). Then I watched the status until it went into this:
This seemed to be correct, so I started shutting down the old nodes - and what a surprise - data had still not been migrated from the old nodes, and the new nodes were referencing blocks on the old ones, making the data essentially unavailable.
I booted the old nodes back up, and I can see in the logs that data is still being transferred from the old nodes to the new ones, yet this is not reported anywhere. So now I'm waiting until... I don't know... the logs stop showing that blocks are being synced?
EDIT: it looks like after 2-3 hours all data was finally migrated, and I was able to shut down the old nodes.
Garage does not have a central coordinator, and hence doesn't have a sense of "progress" at a cluster level for things such as layout migrations. The best way to monitor it is to set up monitoring, and to watch the number of blocks in the resync queue drop on each of the nodes as they migrate to the new nodes.
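For anyone who wants to script this check rather than read graphs, here is a rough sketch of the idea, not an official tool. It assumes each node exposes Prometheus-format metrics over HTTP and that the resync queue gauge is named `block_resync_queue_length`; both the metric name and the example URLs are assumptions to verify against your own node's metrics output. The helper names (`parse_metric`, `fetch_metrics`, `migration_settled`) are made up for this example.

```python
import urllib.request

# Assumed metric name; confirm it against what your nodes actually export.
RESYNC_METRIC = "block_resync_queue_length"


def parse_metric(text, name):
    """Sum every sample of `name` in a Prometheus text-format payload."""
    total = 0.0
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            # Sample with labels: name{labels} value [timestamp]
            metric, _, rest = line.partition("{")
            _, _, rest = rest.rpartition("}")
        else:
            # Sample without labels: name value [timestamp]
            metric, _, rest = line.partition(" ")
        if metric.strip() != name:
            continue
        fields = rest.split()
        if fields:
            total += float(fields[0])
    return total


def fetch_metrics(url, token=None):
    """Fetch the raw metrics text from one node (URL and token are yours)."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", "Bearer " + token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def migration_settled(queue_lengths):
    """True only when every node reports an empty resync queue."""
    return all(length == 0 for length in queue_lengths.values())
```

Usage would look something like `queues = {url: parse_metric(fetch_metrics(url), RESYNC_METRIC) for url in node_urls}` followed by `migration_settled(queues)`, where `node_urls` is your own list of per-node metrics endpoints. An empty queue on every node is a reasonable signal that the transfer has caught up, but it is worth checking it more than once before shutting down the old nodes, since a queue can be briefly empty while items are still being scheduled.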