Added a disk to a single node and gone extremely slow #977
Reference: Deuxfleurs/garage#977
Last night I tested adding a data path on a 4 TB spinning-rust disk to an existing single node, but since doing this all read and write performance has become unusably slow. I used to get reads and writes via Cyberduck at 116 MB/s, but now it instantly drops to about 200 KB/s, and I find that most of my files are no longer readable, although strangely some still are. It's used for movie storage for Plex and Jellyfin.
I'm wondering if I restarted the Docker container during some kind of operation that may have been under way after applying the new layout.
How do I debug this, and in the worst case, is there a way I can safely revert to the last working setup? It's been 8 hours since the changes and everything is still unusable.
Thanks
Jon
part of the garage.toml that was modified
^^^ big list there spans a few pages
Hi, not sure what is going on here.
In all cases, before doing anything, make a snapshot of your metadata directory (using `garage meta snapshot`, then copy the file from `$metadata_dir/snapshots` to somewhere safe) and if possible also take a copy of all of your data directories so that you can come back to the current Garage state if needed.
To improve performance after adding a disk, you should invoke `garage repair rebalance`. Doing this, Garage will move files between your various data directories, which is why you should make a copy beforehand. This will help Garage know of a single unique location for each file, instead of having to look through multiple disks.
However, the fact that your r/w performance dropped so significantly seems to indicate that at least one of your disks is extremely slow compared to the others. You should try to benchmark the IO performance of your disks to see if one of them is problematic.
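One rough way to compare the disks is a sequential `dd` test run against each data directory in turn. The sketch below is only an illustration: the function name `bench_dir` and the example paths in the comments are made up, it assumes GNU `dd`, and it measures large sequential transfers only, which is not the same as Garage's real access pattern.

```shell
# Rough sequential-throughput check for one directory.
# Usage: bench_dir <directory> [size_in_MiB]
# Hypothetical example paths (replace with your own data_dir entries):
#   bench_dir /mnt/ssd1/garage-data
#   bench_dir /mnt/usb-hdd/garage-data
bench_dir() {
    local dir="$1"
    local size_mb="${2:-256}"
    local file="$dir/.garage_bench_$$"

    echo "== $dir =="
    # Write test: conv=fdatasync makes dd flush the data to the device
    # before it reports the elapsed time and throughput.
    dd if=/dev/zero of="$file" bs=1M count="$size_mb" conv=fdatasync 2>&1 | tail -n 1
    # Read test: this may be served from the page cache, so treat the
    # number as an upper bound unless the file is larger than free RAM.
    dd if="$file" of=/dev/null bs=1M 2>&1 | tail -n 1
    # Remove the temporary test file.
    rm -f "$file"
}
```

If one directory reports a write rate far below the others, that disk is a likely bottleneck. On filesystems with transparent compression, zero-filled test data can give unrealistically high numbers.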
Also, you should increase the `block_size` parameter if you haven't already. For media files, `10M` is a minimum; you can even go higher, especially in a single-node setup.
Also, if you just want to go back to the previous setup, you can:
1. set `read_only = true` for the disk you added (and remove `capacity = "xxx"`)
2. run `garage repair rebalance`
3. follow its progress with `garage worker list` / `garage worker info`
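The command sequence from this reply (back up the metadata, start the rebalance, then watch the workers) could be wrapped roughly as follows. This is a sketch under assumptions, not an official procedure: the function names and both default paths are hypothetical, `METADATA_DIR` must match the `metadata_dir` in your `garage.toml`, and the `--yes` flag reflects that recent Garage versions appear to require explicit confirmation before launching a repair. If Garage runs in Docker, the `garage` commands have to be run inside the container.

```shell
# Hypothetical locations; adjust both to your own setup.
METADATA_DIR="${METADATA_DIR:-/var/lib/garage/meta}"
BACKUP_DIR="${BACKUP_DIR:-/mnt/backup/garage-meta}"

# Take a metadata snapshot and copy every snapshot to a safe place.
backup_metadata() {
    garage meta snapshot
    mkdir -p "$BACKUP_DIR"
    cp -a "$METADATA_DIR/snapshots/." "$BACKUP_DIR/"
}

# Ask Garage to move blocks to where the current data_dir
# configuration says they belong.
# Assumption: --yes is needed to confirm the repair operation.
start_rebalance() {
    garage repair --yes rebalance
}

# List background workers; for details on one of them, run
# `garage worker info` with the id shown in this list.
show_progress() {
    garage worker list
}
```

A typical run would call `backup_metadata`, then `start_rebalance`, and then `show_progress` from time to time until the rebalance worker is no longer busy.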
OK will give it a shot!
The disk I added isn't that slow. It is slower, as it's a USB spinning-rust disk, but it's fast enough when I'm backing up to it with Proxmox. The others are standard SSDs.
How do I monitor the progress of a repair?
Sorry, just seen: I need to use worker list and worker info!
I tried restoring the meta DB from a snapshot and created a new layout, and now it seems worse: it doesn't even authenticate when I connect, and I don't see anything in the logs...
What do you mean by "fail to authenticate"?