improvements to CLI and new debug features #448

Merged
lx merged 9 commits from cli-improvements into main 2023-01-02 12:42:24 +00:00
Owner

CLI UX:

  • [x] Remove info and warning messages printed before the CLI does its thing, including when rpc_public_addr is not set
  • [x] Worker list is an actual table now
  • [x] New command `worker info`
  • [x] Rename `resync-n-workers` into `resync-worker-count`
  • [ ] Improve status command with more info, including info from scrub/resync/etc - or put some of that in `garage stats`
  • [ ] Something about paused scrubs resuming automatically / or don't resume automatically but show a warning everywhere during duration of pause
  • [x] Prettier table in `garage stats`
  • [ ] More things in `garage stats`

CLI debug features:

  • [x] `garage block list-errors` -> table of resync errored blocks, with (hash, rc, number of errors, time of last error/time of retry)
  • [x] `garage block info <block>` -> info about any block, including all objects that use it
  • [x] `garage block retry-now [<hash> | --all]` -> retry resync of a block in errored status
  • [x] `garage block purge <block>...` -> delete all objects that reference a lost block

Non-CLI related:

  • [x] resync worker, use `error!` instead of `warn!` for "error when resyncing" block messages
  • [x] table lengths in metrics, if possible to have them fast (i.e. not with Sled)
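To illustrate the kind of tabular output described for `garage block list-errors` (hash, rc, number of errors, time of last error, time of retry), here is a minimal sketch of column-aligned formatting. It is not taken from this PR: the `BlockError` struct, its field names, the column headers and both functions are assumptions made up for the example.

```rust
/// Hypothetical record for one block in errored state.
/// Field names are illustrative assumptions, not Garage's actual types.
struct BlockError {
    hash: String,
    refcount: u64,
    error_count: u64,
    last_error_secs_ago: u64,
    next_retry_in_secs: u64,
}

/// Left-align every cell to the width of the widest cell in its column,
/// separating columns with two spaces. The last cell of a row is not padded.
fn format_table(rows: &[Vec<String>]) -> String {
    let ncols = rows.iter().map(|r| r.len()).max().unwrap_or(0);
    let mut widths = vec![0usize; ncols];
    for row in rows {
        for (i, cell) in row.iter().enumerate() {
            widths[i] = widths[i].max(cell.chars().count());
        }
    }

    let mut out = String::new();
    for row in rows {
        for (i, cell) in row.iter().enumerate() {
            if i + 1 < row.len() {
                out.push_str(&format!("{:<width$}  ", cell, width = widths[i]));
            } else {
                out.push_str(cell);
            }
        }
        out.push('\n');
    }
    out
}

/// Build a header row plus one row per errored block, then align them.
fn block_errors_table(errors: &[BlockError]) -> String {
    let mut rows = vec![vec![
        "Hash".to_string(),
        "RC".to_string(),
        "Errors".to_string(),
        "Last error".to_string(),
        "Next try".to_string(),
    ]];
    for e in errors {
        rows.push(vec![
            e.hash.clone(),
            e.refcount.to_string(),
            e.error_count.to_string(),
            format!("{}s ago", e.last_error_secs_ago),
            format!("in {}s", e.next_retry_in_secs),
        ]);
    }
    format_table(&rows)
}
```

Calling `block_errors_table` with a slice of records returns a string with one header line and one line per block, which a CLI could print as-is. Computing the widths in a first pass keeps the columns aligned regardless of how long the hashes or counters are.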
lx added 2 commits 2022-12-13 10:50:58 +00:00
Prettier worker list table; remove useless CLI log messages
All checks were successful
continuous-integration/drone/push Build is passing
de9d6cddf7
cli: rename resync-n-workers into resync-worker-count
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
a51e8d94c6
lx added 1 commit 2022-12-13 11:24:46 +00:00
cli: new worker info command
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
9d82196945
lx added 1 commit 2022-12-13 13:24:01 +00:00
Implement block list-errors and block info
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
687660b27f
lx added 1 commit 2022-12-13 14:02:58 +00:00
Implement block retry-now and block purge
Some checks reported errors
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
continuous-integration/drone Build was killed
d7f90cabb0
lx added 1 commit 2022-12-13 14:43:35 +00:00
cli: prettier table in garage stats
Some checks reported errors
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build was killed
d6040e32a6
lx added 1 commit 2022-12-13 14:46:24 +00:00
cli: more info displayed on error in garage stats
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
f8d5409894
lx added 1 commit 2022-12-13 14:54:19 +00:00
Add block.rc_size, table.size and table.merkle_tree_size metrics
Some checks reported errors
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
continuous-integration/drone Build was killed
041b60ed1d
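The checklist item behind this commit asks for table lengths in metrics only when they can be obtained fast. One way to express that constraint is to let a storage backend report a length only if it is cheap, and skip the metric otherwise. The sketch below is an assumption about how this could look, not the code from this commit: the `Tree` trait, `fast_len`, both backend structs and `collect_sizes` are hypothetical names.

```rust
/// Hypothetical storage abstraction; not Garage's actual trait.
trait Tree {
    /// Number of entries if the backend can report it cheaply,
    /// or `None` if counting would require a full scan.
    fn fast_len(&self) -> Option<usize>;
}

/// Stand-in for a backend that keeps track of its own entry count.
struct CountedTree {
    len: usize,
}

/// Stand-in for a backend that could only count by iterating everything.
struct ScanOnlyTree;

impl Tree for CountedTree {
    fn fast_len(&self) -> Option<usize> {
        Some(self.len)
    }
}

impl Tree for ScanOnlyTree {
    fn fast_len(&self) -> Option<usize> {
        None
    }
}

/// Collect (metric name, value) pairs, silently skipping every tree
/// whose length is not available cheaply.
fn collect_sizes(trees: &[(&str, &dyn Tree)]) -> Vec<(String, u64)> {
    trees
        .iter()
        .filter_map(|(name, tree)| tree.fast_len().map(|n| (name.to_string(), n as u64)))
        .collect()
}
```

With this shape, a metrics exporter never triggers a full scan just to report a size: a backend that cannot count cheaply simply produces no value for that metric.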
lx added 1 commit 2022-12-13 15:18:08 +00:00
Fix error messages
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
d1279e04f3
Contributor

Quick feedback on the debug features (`block list-errors` and `block info`) and the tabular display: I have been using them on a production cluster for debugging for the past week, with zero issues.
lx changed title from WIP: improvements to CLI and new debug features to improvements to CLI and new debug features 2023-01-02 12:42:13 +00:00
lx merged commit 7f7d53cfa9 into main 2023-01-02 12:42:24 +00:00
lx deleted branch cli-improvements 2023-01-02 12:42:25 +00:00
Reference: Deuxfleurs/garage#448