Slow GC #839

Closed
opened 2024-07-10 08:13:35 +00:00 by leelists · 3 comments

On a 3-node Garage cluster, GC is very slow: it processes around 100K GC items per day (metadata is on NVMe).

Is there any tunable?

2024-07-10T07:53:44.104292Z  INFO garage_table::gc: (version) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:44.469917Z  INFO garage_table::gc: (version) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:44.485764Z  INFO garage_table::gc: (block_ref) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:44.537349Z  INFO garage_table::gc: (object) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:44.765694Z  INFO garage_table::gc: (object) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:44.780973Z  INFO garage_table::gc: (version) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:45.025313Z  INFO garage_table::gc: (object) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:45.040551Z  INFO garage_table::gc: (version) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:45.335041Z  INFO garage_table::gc: (object) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:45.350743Z  INFO garage_table::gc: (version) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:45.655030Z  INFO garage_table::gc: (version) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:45.670346Z  INFO garage_table::gc: (object) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:45.941501Z  INFO garage_table::gc: (object) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:45.957314Z  INFO garage_table::gc: (version) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:46.370017Z  INFO garage_table::gc: (object) GC: 1 items successfully pushed, will try to delete.
2024-07-10T07:53:46.385274Z  INFO garage_table::gc: (block_ref) GC: 1 items successfully pushed, will try to delete.

Garage version: v1.0.0 [features: k2v, lmdb, sqlite, consul-discovery, kubernetes-discovery, metrics, telemetry-otlp, bundled-libs]
Rust compiler version: 1.73.0

Database engine: LMDB (using Heed crate)

Table stats:
  Table      Items   MklItems  MklTodo  GcTodo
  bucket_v2  1       1         0        0
  key        1       1         0        0
  object     401826  471377    0        89546
  version    237924  297497    0        90580
  block_ref  627759  705012    0        52812

Block manager stats:
  number of RC entries (~= number of blocks): 503790
  resync queue length: 0
  blocks with resync errors: 0
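
The 16 log lines above span roughly 2.3 seconds, which is too short an excerpt to confirm or refute a per-day figure. To measure the rate over a longer log, the timestamps can be parsed directly. The following is a minimal sketch, assuming the log format shown above; `gc_rate` and `LOG_LINE` are hypothetical names introduced for illustration and are not part of Garage.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines such as:
# 2024-07-10T07:53:44.104292Z  INFO garage_table::gc: (version) GC: 1 items successfully pushed, ...
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+INFO garage_table::gc: "
    r"\((?P<table>\w+)\) GC: (?P<n>\d+) items successfully pushed"
)


def gc_rate(lines):
    """Return (items pushed per table, overall items per second or None)."""
    per_table = Counter()
    stamps = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m:
            continue  # ignore lines that are not GC "pushed" messages
        per_table[m["table"]] += int(m["n"])
        # fromisoformat() does not accept a trailing "Z" on older Pythons
        stamps.append(datetime.fromisoformat(m["ts"].replace("Z", "+00:00")))
    if len(stamps) < 2:
        return per_table, None
    span = (max(stamps) - min(stamps)).total_seconds()
    total = sum(per_table.values())
    return per_table, (total / span if span > 0 else None)


if __name__ == "__main__":
    sample = [
        "2024-07-10T07:53:44.104292Z  INFO garage_table::gc: (version) GC: 1 items successfully pushed, will try to delete.",
        "2024-07-10T07:53:46.385274Z  INFO garage_table::gc: (block_ref) GC: 1 items successfully pushed, will try to delete.",
    ]
    tables, rate = gc_rate(sample)
    print(dict(tables), rate)
```

The rate is only a rough estimate: it divides the number of items by the time between the first and last matching line, so it is meaningful only over a long, continuous stretch of log.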
quentin added the kind: performance, action: more-info-needed, scope: background-healing labels 2024-08-07 09:47:52 +00:00
Owner

Is the GC too slow, or is your Garage cluster too slow?
Do you want to increase the speed of the GC, or decrease it?
What kind of workload do you have?
Can you describe your deployment in more depth? Especially CPU, RAM, virtualization, shared environment, etc.
What makes you think it's a Garage issue and not your servers being too slow? (This is not a way to dismiss your issue, but it helps to understand where the strange things are.)

Author

My underlying filesystem is ZFS; perhaps this makes it slow. After my vacation, the GC had finished, and it no longer seems to be an issue.

Thanks,

Owner

Items in the GC queue are processed after a 24h delay, so it is normal that the queue is never zero.
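
To illustrate why a delayed queue rarely reads zero on an active cluster, here is a minimal sketch of a queue whose items only become eligible 24 hours after they are enqueued. It is not Garage's implementation: `DelayedGcQueue`, `push`, and `pop_due` are hypothetical names, and only the 24h delay comes from the comment above.

```python
import heapq
from datetime import datetime, timedelta, timezone

GC_DELAY = timedelta(hours=24)  # the delay mentioned in the comment above


class DelayedGcQueue:
    """Toy queue: an item can be collected only after `delay` has elapsed."""

    def __init__(self, delay=GC_DELAY):
        self.delay = delay
        self._heap = []  # min-heap of (eligible_at, key)

    def push(self, key, now):
        # Record when the item becomes eligible, not when it was deleted.
        heapq.heappush(self._heap, (now + self.delay, key))

    def pop_due(self, now):
        # Remove and return every item whose delay has fully elapsed.
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[1])
        return due

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    t0 = datetime(2024, 7, 10, tzinfo=timezone.utc)
    q = DelayedGcQueue()
    q.push("object-a", t0)
    q.push("object-b", t0 + timedelta(hours=12))
    print(q.pop_due(t0 + timedelta(hours=23)), len(q))
    print(q.pop_due(t0 + timedelta(hours=24)), len(q))
```

In this model, anything deleted within the last 24 hours is still counted in the queue, so a cluster with ongoing deletions shows a non-zero backlog even when GC is keeping up.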

Reference: Deuxfleurs/garage#839