When a user’s Slurm association is changed on our cluster (say, the QoS is modified for a certain partition, or a new account is added), the change is not displayed or refreshed within OOD, so the user can’t choose the right Slurm association combination to launch a Jupyter notebook. For reference, the user can submit jobs outside OOD without any issues using the correct Slurm parameters. In contrast, a new user on the cluster with a brand-new Slurm association has no problem launching an app within OOD. I tested this by changing my own Slurm config, and the same problem persists.
I have also tried restarting the OOD server, without any success.
Any advice or suggestions?
Hi - it seems like the users need to restart their Per User Nginx (PUN) at the top right. Folks should only have to restart their own PUN, not the entire server.
I don’t know how long it takes for a change to a QoS to propagate through Slurm’s systems, but it seems likely that it’s nearly immediate?
In any case, the QoS change happens on the Slurm side, so I’m not even sure how it would affect the PUN. We issue sbatch commands just like anything else - we don’t hold any state.
But if your forms are dynamically generated when the user logs in (i.e., they pull the QoS from squeue), then users will have to restart their PUN to regenerate those items. It sounds like this is your issue: the change is made on the Slurm side, but we need to bounce the PUN to pick up on those changes.
I’m not really sure why bouncing the PUN didn’t help the user. It should have.
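To illustrate the kind of dynamic lookup described above, here is a minimal sketch in Python. It assumes the form options come from `sacctmgr` association records; your site may query something else, and the function names `parse_associations` and `fetch_associations` are hypothetical, not part of OOD. The point is only that a list like this is computed at one moment in time, so whatever caches it has to be regenerated after an association changes.

```python
import subprocess


def parse_associations(text):
    """Parse `sacctmgr -nP` output with one account|partition|qos record per line."""
    combos = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Pad so that short records still unpack into three fields.
        account, partition, qos = (line.split("|") + ["", ""])[:3]
        combos.append({
            "account": account,
            # An empty partition field means the association is not partition-specific.
            "partition": partition or None,
            "qos": [q for q in qos.split(",") if q],
        })
    return combos


def fetch_associations(user):
    """Ask Slurm for the user's current associations (assumes sacctmgr is on PATH)."""
    result = subprocess.run(
        ["sacctmgr", "-nP", "show", "associations",
         f"user={user}", "format=account,partition,qos"],
        capture_output=True, text=True, check=True,
    )
    return parse_associations(result.stdout)
```

Running `fetch_associations` for the affected user outside OOD and comparing the result with what the form offers may help confirm whether the form is showing stale data.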