update grafana template #764

Closed
mr_tron wants to merge 1 commits from mr_tron/garage:main into main
First-time contributor

It's not ideal, but it's better than the current one, because the current one doesn't work at all.
mr_tron added 1 commit 2024-03-07 20:45:46 +00:00
ci/woodpecker/pr/debug Pipeline was successful Details
f7e39974ae
update grafana template
Owner

Hello and thank you for your contribution.

I installed your dashboard and it looks pretty nice, and I have an idea why the existing dashboard was not working for you (obviously it works well for me, otherwise it wouldn't be in the repo). I saw that some of your graphs filter metrics with job="garage_nodes", so I think you might be importing your metrics under that job name in your Prometheus server. Our dashboard works if you import metrics with job="garage", and all of its plots are filtered on that value, which would explain why they didn't display anything for you.

In your dashboard, I saw that some plots are filtered with job="garage_nodes" but others are not filtered on the job tag at all, which risks mixing in data from other applications. Also, because you are using job="garage_nodes" and not job="garage", merging this would break the dashboards of everyone who upgrades.
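
One way to spot the unfiltered plots is to scan the dashboard JSON for query expressions that have no matcher on the job label. The following is a minimal sketch, not a tool from this repository: it assumes the usual Grafana export layout where each panel has a `targets` list whose entries carry the PromQL in an `expr` field, and the function names and metric names are hypothetical.

```python
import re

# Matches a matcher on the `job` label, e.g. job="garage" or job=~"garage.*".
# The \b keeps labels such as `some_job` from being counted.
JOB_MATCHER = re.compile(r'\bjob\s*(=~|!~|!=|=)\s*"')


def iter_exprs(dashboard):
    """Yield (panel title, expr) pairs, including panels nested inside rows."""
    stack = list(dashboard.get("panels", []))
    while stack:
        panel = stack.pop()
        stack.extend(panel.get("panels", []))  # collapsed rows nest their panels
        for target in panel.get("targets", []):
            expr = target.get("expr")
            if expr:
                yield panel.get("title", "<untitled>"), expr


def unfiltered_panels(dashboard):
    """Return sorted titles of panels with at least one query lacking a job matcher."""
    return sorted(
        {title for title, expr in iter_exprs(dashboard) if not JOB_MATCHER.search(expr)}
    )


# Hypothetical dashboard fragment; the metric names are placeholders.
example = {
    "panels": [
        {"title": "API", "targets": [{"expr": 'rate(example_api_total{job="garage"}[5m])'}]},
        {"title": "Row", "panels": [
            {"title": "RPC", "targets": [{"expr": "rate(example_rpc_total[5m])"}]},
        ]},
    ]
}
print(unfiltered_panels(example))
```

This is a plain text check rather than a PromQL parser, so a query that only mentions `job` in a `by (job)` clause would still be reported as unfiltered, which is the intended behaviour here.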

I like that your dashboard has time plots of the RPC and API request durations, but it is also missing a few things that the existing dashboard has (like disk I/O and metrics for the web endpoint).

Overall, I'd say these issues make your dashboard slightly worse than the existing one, so I don't want to replace it just yet. I'd recommend starting from the existing dashboard, fixing the job tag so that it works on your cluster, and adding to it the few plots of yours that it is missing. If you come up with a new board that fixes all of these issues, I'd be glad to merge it!
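
For a local copy of the existing dashboard, the job tag fix could be scripted along these lines. This is a sketch under assumptions rather than a supported procedure: it assumes the Grafana export layout with `panels`, `targets` and `expr` fields, the helper name `retarget_job` is hypothetical, and it only rewrites exact `job="..."` equality matchers.

```python
import copy
import re


def retarget_job(dashboard, old_job, new_job):
    """Return a copy of the dashboard with job="old_job" rewritten to job="new_job"."""
    # Only exact equality matchers are touched; job=~"..." and job!="..." are left alone.
    pattern = re.compile(r'(\bjob\s*=\s*")' + re.escape(old_job) + r'(")')
    result = copy.deepcopy(dashboard)  # leave the caller's data untouched
    stack = list(result.get("panels", []))
    while stack:
        panel = stack.pop()
        stack.extend(panel.get("panels", []))  # collapsed rows nest their panels
        for target in panel.get("targets", []):
            if "expr" in target:
                target["expr"] = pattern.sub(
                    lambda m: m.group(1) + new_job + m.group(2), target["expr"]
                )
    return result


# Hypothetical dashboard fragment; the metric name is a placeholder.
original = {
    "panels": [
        {"title": "API", "targets": [{"expr": 'rate(example_api_total{job="garage"}[5m])'}]},
    ]
}
local = retarget_job(original, "garage", "garage_nodes")
print(local["panels"][0]["targets"][0]["expr"])
```

The alternative, which avoids editing the dashboard at all, is to scrape the cluster under the job name `garage` in the Prometheus configuration.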
lx closed this pull request 2024-03-08 10:31:33 +00:00