DB Error: unable to start Garage #666

Closed
opened 2023-11-05 18:27:08 +00:00 by Ghost · 1 comment

Hello

I run Garage as a Kubernetes deployment, with its storage on an NFS share.

This afternoon Garage crashed for no apparent reason, and I have been unable to restart it, even after restoring the DB directory.

The deployed version is 0.8.4, with sled:

metadata_dir = "/mnt/meta"
data_dir = "/mnt/data"

db_engine = "sled"

block_size = 1048576
sled_cache_capacity = 134217728
sled_flush_every_ms = 2000

replication_mode = "1"

compression_level = 19

rpc_bind_addr = "[::]:3901"
# rpc_secret will be populated by the init container from a k8s secret object
rpc_secret = "__RPC_SECRET_REPLACE__"

bootstrap_peers = []

[kubernetes_discovery]
namespace = "garage"
service_name = "garage-cslow"
skip_crd = false

[s3_api]
s3_region = "eu-west-1"
api_bind_addr = "[::]:3900"
root_domain = ".cslow.domain.local"

[s3_web]
bind_addr = "[::]:3902"
root_domain = ".garage.domain.local"
index = "index.html"

[admin]
api_bind_addr = "[::]:3903"

When Garage starts, it stays stuck at this point indefinitely:

2023-11-05T18:17:48.393180Z  INFO garage::server: Loading configuration...
2023-11-05T18:17:48.393408Z  INFO garage::server: Initializing Garage main data store...
2023-11-05T18:17:48.393442Z  INFO garage_model::garage: Opening database...
2023-11-05T18:17:48.393446Z  INFO garage_model::garage: Opening Sled database at: /mnt/meta/db

I have another Garage instance running with the same configuration, and it has never crashed.

Regards

**UPDATE:**

Everything seems to be working again; one more mystery of NFS.

In case it is useful, here is what I did:

  • Benchmarked the disks and the NFS share where Garage stores its data:
    dd if=/dev/zero of=/mnt/meta/test1.img bs=1G count=1 oflag=dsync
  • On OpenMediaVault: restarted the nfs-* and rpc* services
  • On the cluster: attempted an offline repair of Garage from my own busybox pod, with the NFS share mounted
  • Restarted the k3s cluster
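
The single 1 GiB write above mostly measures sequential throughput. A possible complement, as a rough sketch rather than a definitive benchmark, is to issue many small synchronous writes, since per-write latency on the share may matter more for a metadata database. The function name `sync_write_bench`, the 4 KiB block size, and the test file name are assumptions made for illustration; `oflag=dsync` requires GNU dd, as in the command above.

```shell
#!/bin/sh
# Sketch: measure small synchronous writes on a directory (e.g. the NFS mount).
# sync_write_bench and garage-sync-test.img are hypothetical names.

sync_write_bench() {
    dir="$1"                      # directory to test, e.g. /mnt/meta
    count="${2:-1000}"            # number of 4 KiB blocks to write
    file="$dir/garage-sync-test.img"

    # Each block is written with O_DSYNC, so dd's reported rate reflects
    # per-write latency rather than bulk throughput.
    dd if=/dev/zero of="$file" bs=4k count="$count" oflag=dsync
    status=$?

    # Always remove the temporary file, then report dd's exit status.
    rm -f "$file"
    return "$status"
}

# Example: sync_write_bench /mnt/meta 1000
```

dd prints its timing summary on stderr; comparing that figure between the NFS mount and a local disk gives a rough idea of how much the share slows down synchronous writes.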
Owner

Sled can be very slow when opening an existing database, especially if it is stored on slow storage. This is one of the multiple performance issues with Sled. Sled is now deprecated as a storage engine and will be removed in Garage v1.0. Please migrate to LMDB or sqlite.
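
For readers following this advice, a minimal sketch of an offline conversion, assuming the Garage build in use ships the `convert-db` subcommand and that Garage is stopped while it runs. The input path comes from `metadata_dir = "/mnt/meta"` and the log line in the report above; the output name `db.lmdb` and the wrapper function `convert_sled_to_lmdb` are assumptions to check against the documentation for your version.

```shell
#!/bin/sh
# Sketch only: convert a Sled metadata DB to LMDB while Garage is stopped.
# convert_sled_to_lmdb is a hypothetical helper; the db.lmdb output name
# is an assumption and should be verified for your Garage version.

convert_sled_to_lmdb() {
    meta_dir="$1"                 # metadata_dir from the Garage config

    # Refuse to overwrite the result of an earlier conversion.
    if [ -e "$meta_dir/db.lmdb" ]; then
        echo "refusing to overwrite $meta_dir/db.lmdb" >&2
        return 1
    fi

    # -a/-i: input engine and path, -b/-o: output engine and path.
    garage convert-db -a sled -i "$meta_dir/db" -b lmdb -o "$meta_dir/db.lmdb"
}

# Example: convert_sled_to_lmdb /mnt/meta
```

After a successful conversion, `db_engine` in the configuration would be changed from "sled" to "lmdb" before restarting Garage. Keeping the original Sled directory until the new engine is confirmed working leaves a way back.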

lx closed this issue 2023-11-06 09:43:25 +00:00
Reference: Deuxfleurs/garage#666