"My Interactive Sessions" page times out if PBS is slow

Hi,
I’m running Open OnDemand against a reasonably large cluster (3000 nodes), and every now and then the PBS scheduler gets really busy and takes up to a minute or two to respond. When this happens, the “My Interactive Sessions” page times out and fails to load, which understandably upsets our users.

To (possibly) complicate matters, we are using the cluster head nodes as a submit_host so there’s ssh in the middle as well.

Is there any way of decoupling the sessions page from the performance of the scheduler, so that it can at least load, even if it doesn’t show up-to-date information?

(Worst case I’m thinking it might be possible to write a wrapper around the PBS commands that just aborts if the command takes more than a few seconds to run, but I’d like to know if there’s something better I could do).

Cheers

David

Hello and thanks for posting.

Without getting too deep into changing how the view renders depending on the scheduler, have you looked at the bin_overrides for the PBS scheduler?
https://osc.github.io/ood-documentation/latest/installation/resource-manager/pbspro.html

I’m not sure whether that would let you abort slow commands to the PBS scheduler, but it does let you put your own wrappers around the commands used for submission. Would that give you what you need to check those lag times and abort if needed?
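
As a rough illustration of the wrapper idea raised in the question, here is a minimal sketch in Python that runs a scheduler command and gives up after a few seconds. Everything specific in it is an assumption rather than something from the documentation: the path `/opt/pbs/bin/qstat`, the 5 second limit, the exit code 124 (borrowed from the coreutils `timeout` convention), and the helper name `run_with_timeout` are all hypothetical. I also have not verified how the sessions page reacts when the wrapped command exits non-zero, or where the wrapper needs to live when a submit_host is configured, so treat this as a starting point to test rather than a known fix.

```python
#!/usr/bin/env python3
"""Hypothetical timeout wrapper for a slow scheduler command (sketch only)."""
import subprocess
import sys

# Assumed values, not taken from the thread or the OnDemand documentation.
REAL_QSTAT = "/opt/pbs/bin/qstat"  # assumed location of the real binary
TIMEOUT_SECONDS = 5.0              # assumed meaning of "a few seconds"
TIMEOUT_EXIT_CODE = 124            # same convention as coreutils `timeout`


def run_with_timeout(argv, timeout_s):
    """Run argv and return (exit_code, stdout, stderr).

    If the command has not finished after timeout_s seconds it is killed
    and TIMEOUT_EXIT_CODE is returned with empty stdout.
    """
    try:
        done = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        message = f"{argv[0]}: no response within {timeout_s}s\n"
        return TIMEOUT_EXIT_CODE, "", message
    return done.returncode, done.stdout, done.stderr


def main():
    """Forward all arguments to the real command, bounded by the timeout."""
    argv = [REAL_QSTAT, *sys.argv[1:]]
    code, out, err = run_with_timeout(argv, TIMEOUT_SECONDS)
    sys.stdout.write(out)
    sys.stderr.write(err)
    return code

# When installed as an executable script, finish the file with:
#     sys.exit(main())
```

The wrapper passes stdout, stderr, and the exit code through unchanged when the command answers in time, so the caller should only see a difference when the scheduler is slow. If something like this works in your environment, the wrapper path is what you would point the relevant bin_overrides entry at.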
