SLURM job info in context

Good Afternoon,

Looking for a bit of feedback to see if anyone has attempted this before. Essentially, what I am trying to do is get information about the SLURM job (group ID and user ID) within the context of script.sh.erb for jupyter_lab.

I'm not sure if OOD already exposes the SLURM job info somewhere in the context.

Thanks in advance!

Juan

Good Morning All,

I was able to resolve this issue using SLURM environment variables; a rough sketch of the approach is below.
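In case it helps anyone else, this is roughly the kind of thing I mean. It's only a sketch: which SLURM_* variables get exported into the job environment depends on your SLURM version, so the `id` calls are there as a fallback for the user/group IDs, and the echo is purely illustrative.

```bash
# Inside templates/script.sh.erb -- this runs as the batch job itself,
# so sbatch's output environment variables are already set.
JOB_ID="${SLURM_JOB_ID}"              # set by sbatch for every job
JOB_ACCOUNT="${SLURM_JOB_ACCOUNT:-}"  # job account, if your SLURM version exports it

# The batch script runs as the submitting user, so plain `id` works
# for the user/group IDs even if a particular SLURM_* variable is missing.
JOB_UID="$(id -u)"
JOB_GID="$(id -g)"
JOB_GROUP="$(id -gn)"

echo "Job ${JOB_ID}: uid=${JOB_UID} gid=${JOB_GID} group=${JOB_GROUP} account=${JOB_ACCOUNT}"
```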

However, I am now encountering a different issue: SLURM seems to ignore the clean.sh script that I've added to the templates/ directory. The job appears to get scancelled before clean.sh has a chance to run. I think the appropriate solution here is to use a SLURM epilog to do our cleanup instead. I'm wondering where I should set that up so that it applies to the sbatch command (run by OOD to start the Jupyter process).

Thanks in advance!

Juan

Hey, sorry for the delay. I'm not super sure how to install SLURM epilogs, if that's your question.

If the question is how to determine in the epilog that a job was an OnDemand Jupyter job, then yes, I think an epilog is the right place. You can check SLURM_JOB_NAME to determine whether it's a Jupyter job (ours is ondemand/sys/dashboard/sys/bc_osc_jupyter) and take the necessary actions based on that, and check SLURM_SUBMIT_HOST to be doubly sure. Something like the sketch below.
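Treat this as a sketch only: whether SLURM_JOB_NAME and SLURM_SUBMIT_HOST show up in the epilog environment depends on your SLURM version, and the submit-host pattern and cleanup body are placeholders for whatever your site needs.

```bash
#!/bin/bash
# Site epilog script (whatever Epilog= in slurm.conf points at).
# Assumes SLURM_JOB_NAME and SLURM_SUBMIT_HOST are exported into the
# epilog environment -- check your SLURM version's docs.

case "${SLURM_JOB_NAME:-}" in
  ondemand/sys/dashboard/sys/bc_osc_jupyter*)
    # Optionally double-check the submit host; this pattern is a placeholder.
    if [[ "${SLURM_SUBMIT_HOST:-}" == *ondemand* ]]; then
      : # site-specific cleanup goes here
    fi
    ;;
esac

exit 0
```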

Jeff,

Thanks for your response here. As you can imagine, parsing through every submitted job isn't ideal for us. It seems that srun lets you supply a task-specific epilog (example below); unfortunately, the same is not true for sbatch.
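For reference, this is the srun option I mean; the paths and command here are just placeholders:

```bash
# srun supports a per-task epilog, run on the compute node as each task finishes;
# sbatch has no equivalent option for the batch script as a whole.
srun --task-epilog=/path/to/clean.sh some_command
```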

Do you know how other users handle this? My other thought would be to use trap to catch the signal, roughly as sketched below. That seems to work fine when a user quits the Jupyter notebook, but not when a user deletes the session (which issues an scancel from SLURM).
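Here is roughly what I mean by the trap approach (sketch only; the cleanup body and the Jupyter launch line are placeholders for what's actually in our template):

```bash
# Inside templates/script.sh.erb -- run cleanup when the batch script
# exits normally or receives SIGTERM/SIGINT.
cleanup() {
  : # same work clean.sh would do
}
trap cleanup TERM INT EXIT

# Launch Jupyter in the foreground as usual (real arguments come from the template).
jupyter lab
```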

Best,

Juan