Open OnDemand job monitoring

I want to monitor and export the following:

  • Per-job and per-user resource utilization (CPU, memory, GPU, I/O) in real time.

  • OOD/Slurm job metadata (job ID, user, queue, partition, start/end times, exit status).

How can I do that? I want to display these on a Grafana dashboard.

We have Grafana support on the Active Jobs page when you expand a job's row. That said, I believe it relies on Prometheus and a couple of exporters as well.
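
For the job metadata part, one possible approach (not something OOD ships, as far as I know) is to pull it from Slurm accounting with `sacct` and hand it to whatever feeds your Grafana data source. Here is a minimal Python sketch, assuming Slurm accounting is enabled and the script runs as a user allowed to see all users' jobs. The function names `parse_sacct`, `fetch_jobs`, and `to_json_lines` are made up for illustration, and I'm treating "queue" as the Slurm partition.

```python
import json
import subprocess

# sacct fields covering job ID, user, partition (queue), start/end times,
# final state, and exit status.
FIELDS = ["JobID", "User", "Partition", "Start", "End", "State", "ExitCode"]


def parse_sacct(text, fields=FIELDS):
    """Parse `sacct --parsable2 --noheader` output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        values = line.split("|")
        if len(values) != len(fields):
            continue  # skip lines that do not match the requested fields
        rows.append(dict(zip(fields, values)))
    return rows


def fetch_jobs(since="now-1hours"):
    """Run sacct for all users and return one dict per job allocation.

    Assumption: `since` is a time string that sacct's --starttime accepts.
    """
    cmd = [
        "sacct",
        "--allusers",      # include jobs from every user
        "--allocations",   # one row per job, not per job step
        "--parsable2",     # pipe-delimited, no trailing delimiter
        "--noheader",
        f"--starttime={since}",
        "--format=" + ",".join(FIELDS),
    ]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_sacct(result.stdout)


def to_json_lines(rows):
    """Serialize job records as one JSON object per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)
```

Calling `to_json_lines(fetch_jobs())` on a schedule would give you records you could load into a store that Grafana can query. The real-time CPU, memory, GPU, and I/O utilization would still need to come from Prometheus exporters on the compute nodes.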