I want to monitor the following:
- Per-job and per-user resource utilization (CPU, memory, GPU, I/O) in real time.
- Exported OOD/Slurm job metadata (job ID, user, queue, partition, start/end times, exit status).
How can I do that? I want to display these on a Grafana dashboard.
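To make the metadata part concrete, here is a minimal Python sketch of the kind of export I have in mind. It assumes the job records come from pipe-delimited `sacct --parsable2` output with a header row. The `parse_sacct` helper and the sample data are hypothetical and only for illustration; this may not be the right approach.

```python
import csv
import io
import json

# Hypothetical sample of `sacct --parsable2` output (made-up jobs and users).
SAMPLE = """JobID|User|Partition|Start|End|State|ExitCode
1001|alice|gpu|2024-01-01T10:00:00|2024-01-01T11:00:00|COMPLETED|0:0
1001.batch|||2024-01-01T10:00:00|2024-01-01T11:00:00|COMPLETED|0:0
1002|bob|cpu|2024-01-01T10:05:00|Unknown|RUNNING|0:0
"""


def parse_sacct(text):
    """Parse pipe-delimited sacct output (with header) into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    jobs = []
    for row in reader:
        # Skip job steps such as "1001.batch"; keep only top-level jobs.
        if "." in row["JobID"]:
            continue
        jobs.append(row)
    return jobs


if __name__ == "__main__":
    # Emit one JSON document that a collector or exporter could pick up.
    print(json.dumps(parse_sacct(SAMPLE), indent=2))
```

On a real cluster the input would presumably come from something like `sacct -a -P --format=JobID,User,Partition,Start,End,State,ExitCode` instead of the sample string. How to get these records, plus the live CPU/memory/GPU/I/O usage, into Grafana is the part I am unsure about.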