OOD and Reporting

datakid · August 16, 2020, 10:37pm

Now that we have an OOD PoC working successfully in house, there are questions about reporting. I note that there is Grafana support already built in, but I also see that XDMoD have a number of OOD-related changes in their 9.0.0 release over the weekend, and XDMoD gets a mention in the 2.0 roadmap.

Is there a preferred solution, or is it about offering flexibility? They do measure different things: Grafana shows job metrics collected by Prometheus on the nodes, while XDMoD does head-node log analysis out of the box, with SUPReMM as a per-worker plugin…

We currently don’t have any per-node metrics, so now is the time to be making decisions about Prometheus/SUPReMM on the nodes.

jeff.ohrstrom · August 17, 2020, 5:00pm

You may end up with both. I don’t think it’s an either/or; they are each useful in different areas, as you’ve already indicated.

I think Prometheus is a great operational tool. Our Ops team uses it for alerting, so it’s great in that regard. Plus it can monitor just about anything in your infrastructure (including OOD!).

The support we have for Grafana is the ability to look at a single job’s metrics. And that’s great for that one job, but XDMoD can give you information about your last 100 jobs. It gives you context about your performance over time because it’s an application built around HPC jobs. So it’s a great tool for your users to diagnose their job’s performance over time.
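To make the difference concrete, here is a rough sketch in plain Python. It does not call Grafana, Prometheus, or XDMoD; the `JobRecord` type, the `cpu_efficiency` field, and both helper functions are made up for illustration only. It just contrasts looking up one job with summarizing the most recent jobs.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class JobRecord:
    # Hypothetical record; real tools store far more per job.
    job_id: str
    cpu_efficiency: float  # fraction of requested CPU actually used, 0.0 to 1.0


def single_job_view(jobs, job_id):
    """Return the one record a per-job dashboard would show."""
    return next(j for j in jobs if j.job_id == job_id)


def history_view(jobs, last_n=100):
    """Summarize the most recent jobs (list assumed ordered oldest to newest)."""
    recent = jobs[-last_n:]
    effs = [j.cpu_efficiency for j in recent]
    return {
        "jobs": len(recent),
        "mean": mean(effs),
        "min": min(effs),
        "max": max(effs),
    }


# Illustrative values only, not measurements from any real cluster.
jobs = [JobRecord("101", 0.5), JobRecord("102", 0.75), JobRecord("103", 1.0)]
```

`single_job_view(jobs, "102")` tells you how that one job did, while `history_view(jobs)` tells you whether that job was typical for you, which is the kind of context a job-history tool is built to give.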

Hope that helps. I get that it’s kind of a non-answer, but yes, from our side it’s about flexibility. Really it comes down to what you need to provide to your staff and your users, and how much you want to invest in administering these tools.
