update grafana template #764
Reference: Deuxfleurs/garage#764
No description provided.
It's not ideal, but it is better than the current one, because the current one doesn't work at all.
Hello and thank you for your contribution.
I installed your dashboard and it looks pretty nice, and I have an idea why the existing dashboard was not working for you (obviously it works well for me, otherwise it wouldn't be in the repo). I saw that on some of your graphs, you were filtering metrics with job="garage_nodes", so I think you might be importing your metrics with that job name in your Prometheus server. Our dashboard works if you import metrics with job="garage", and all plots are filtered with that, which would explain why they didn't display anything for you.
In your dashboard, I saw that some plots are filtered with job="garage_nodes" but some other plots are not filtered according to the job tag, which risks mixing in data from other applications. Also, the fact that you are using job="garage_nodes" and not job="garage" means that if we merge this, we will break the dashboards of everyone who tries to upgrade.
I like that your dashboard has time plots of the RPC and API request duration, but it is also missing a few things from the other dashboard (like disk I/O and metrics for the web endpoint).
Overall, I'd say these issues make your dashboard slightly worse than the existing one, so I don't want to replace it just yet. I'd recommend starting from the existing dashboard, fixing your job tag so that it works on your cluster, and adding to that one the few plots that you have that are missing. If you come up with a new board that fixes all of these issues I'd be glad to merge it!
Pull request closed
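For anyone reworking the dashboard along the lines of the review, here is a rough sketch of how one might check that every query in a Grafana dashboard JSON is filtered on job="garage". It assumes the usual dashboard layout (a top-level "panels" list, row panels nesting their own "panels", and each query stored under "targets" as "expr"). The function names, the example dashboard, and the metric names are made up for illustration and are not taken from the Garage repository. The regex is only a heuristic: it does not parse PromQL, so an expression with several selectors passes as soon as one of them carries the expected filter.

```python
import re

# Matches a label matcher such as job="garage" inside a PromQL selector.
JOB_MATCHER = re.compile(r'\bjob\s*(=~|!~|!=|=)\s*"([^"]*)"')


def iter_panels(panels):
    """Yield every panel, descending into row panels that nest their children."""
    for panel in panels:
        yield panel
        yield from iter_panels(panel.get("panels", []))


def check_job_filters(dashboard, expected_job="garage"):
    """Return (panel title, expr, problem) for queries not pinned to expected_job."""
    problems = []
    for panel in iter_panels(dashboard.get("panels", [])):
        for target in panel.get("targets", []):
            expr = target.get("expr", "")
            if not expr:
                continue
            matchers = JOB_MATCHER.findall(expr)
            title = panel.get("title", "?")
            if not matchers:
                problems.append((title, expr, "no job filter"))
            elif any(op != "=" or value != expected_job for op, value in matchers):
                problems.append((title, expr, "unexpected job filter"))
    return problems


# Hypothetical dashboard fragment; the metric names are placeholders.
example = {
    "panels": [
        {"title": "API requests",
         "targets": [{"expr": 'rate(example_requests_total{job="garage"}[5m])'}]},
        {"title": "RPC duration",
         "targets": [{"expr": 'rate(example_rpc_seconds_sum{job="garage_nodes"}[5m])'}]},
        {"type": "row", "title": "Row", "panels": [
            {"title": "Unfiltered", "targets": [{"expr": "sum(up)"}]},
        ]},
    ]
}

for title, expr, problem in check_job_filters(example):
    print(f"{title}: {problem}: {expr}")
```

To run it against a real export, load the file with json.load and pass the resulting dict to check_job_filters. An empty result means every query found carries the expected job matcher, subject to the heuristic's limits.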