I am using Open OnDemand 3.1.10 on Ubuntu 22.04, and I have this block in form.yml.erb to check whether any GPU nodes are available in my cluster:
<%-
  require "open3"

  # The pipeline exits 0 only if the last grep finds a line, i.e. at least one idle node of that type.
  # "&>" is a bashism; Open3 runs this string via /bin/sh (dash on Ubuntu), so use "> /dev/null 2>&1".
  cmd_to_check_titanxp_gpu_available = "sinfo --Node | grep titanxp | grep idle > /dev/null 2>&1"
  _output, status = Open3.capture2e(cmd_to_check_titanxp_gpu_available)
  _titanxp_gpu_available = status.success?

  cmd_to_check_a100_gpu_available = "sinfo --Node | grep a100 | grep idle > /dev/null 2>&1"
  _output, status = Open3.capture2e(cmd_to_check_a100_gpu_available)
  _a100_gpu_available = status.success?
-%>
Then, based on this check, I let users select a GPU type in form.yml.erb:
gpu_type:
  label: "GPU type"
  widget: select
  cacheable: false
  options:
    <%- if _titanxp_gpu_available -%>
    - ["titanxp 12GB gpu memory (8 cores and 64GB RAM)", "titanxp"]
    <%- end -%>
    <%- if _a100_gpu_available -%>
    - ["a100 40GB gpu memory (64 cores and 256GB RAM)", "a100-40g"]
    <%- end -%>
  help: |
    <%- if !_titanxp_gpu_available -%>
    **no titanxp gpus available. Check the cluster status with the squeue command**
    <%- end -%>
    <%- if !_a100_gpu_available -%>
    **no A100 gpus available. Check the cluster status with the squeue command**
    <%- end -%>
This works well. The only issue is that when both conditions are false (no GPUs available because all of them are allocated), the gpu_type select widget is empty and users cannot pick an option, yet the Launch button at the bottom is still enabled and can be clicked, which I find a bit confusing for users.
Is there any way to disable the Launch button if both _titanxp_gpu_available and _a100_gpu_available are false?
I’m a bit unsure what you are after here. Is the idea that if both of those conditions are false, you want the launch button greyed out or not clickable? If so, you can write some JavaScript in form.js to handle that, though users might still find it odd that the button is greyed out with no explanation, so an alert or message letting them know the GPUs are maxed out may be helpful.
But yeah, I’d say selecting that gpu_type element, then selecting the input of type submit, checking the logic, and disabling the button if both are false is the way to go for this.
There’s also the option of making the submit fail and returning a message to the user, but I think that’s more reactive than you want.
My initial idea was that if a user selects “I want a GPU” but no GPUs are available, I would either grey out the launch button or show a message saying “no GPUs available” when they click it.
You also mention “There’s another option of using the submit to fail and provide a message back to the user”, and this sounds like a simpler approach that could be good enough for my needs. Could you point me to an example?
Here’s an example of throwing an error before job submission. You’ll see that we check for a particular Unix group to verify that the user has access to the license.
For any form.js on our (OSC’s) applications, you need to go back a few years. Here I randomly chose a tag on the same repository and found one.
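For anyone reading without the linked example handy, the error-before-submission pattern could look roughly like the sketch below. This is not OSC's actual code: the group name `licensed-users` and the helpers `current_group_names` and `ensure_group_member!` are made up for illustration, and the group list is passed in as an argument so the logic can be tried outside Open OnDemand. The idea is simply to raise from the Ruby code in submit.yml.erb before any job is submitted.

```ruby
require "etc"

# Names of the Unix groups the current process belongs to (stdlib only).
def current_group_names
  Process.groups.map do |gid|
    begin
      Etc.getgrgid(gid).name
    rescue ArgumentError
      nil # this gid has no entry in the group database
    end
  end.compact
end

# Hypothetical helper: raise before submission unless the user is in
# the required group. Returns nil when the check passes.
def ensure_group_member!(required_group, group_names = current_group_names)
  return if group_names.include?(required_group)

  raise StandardError, "You are not in the '#{required_group}' group, " \
                       "which is required for this app. Please contact support."
end

# Possible use near the top of submit.yml.erb (group name is an assumption):
#   ensure_group_member!("licensed-users")
```

Passing the group list explicitly keeps the check easy to test; in a real app you would probably rely on the default argument.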
Thank you very much for your help. I got it working the way I want. I decided to trigger an error on submit, as I find that simpler than writing JavaScript.
I'm pasting my implementation below in case it helps anyone else.
I added this to form.yml.erb:
<%-
  require "open3"

  # Each pipeline exits 0 only if the last grep finds a line,
  # i.e. at least one idle node of that GPU type exists.
  cmd_to_check_titanxp_gpu_available = "sinfo --Node | grep titanxp | grep idle"
  _output, status = Open3.capture2e(cmd_to_check_titanxp_gpu_available)
  _titanxp_gpu_available = status.success?

  cmd_to_check_a100_gpu_available = "sinfo --Node | grep a100 | grep idle"
  _output, status = Open3.capture2e(cmd_to_check_a100_gpu_available)
  _a100_gpu_available = status.success?
-%>
---
gpu_required:
  label: "Do you need a GPU?"
  widget: select
  cacheable: false
  options:
    - ['No', 'no', data-hide-gpu-type: true]
    - ['Yes', 'yes', data-hide-instance-size: true]
gpu_type:
  label: "GPU type"
  widget: select
  cacheable: false
  options:
    <%- if _titanxp_gpu_available -%>
    - ["titanxp 12GB gpu memory (8 cores and 64GB RAM)", "titanxp"]
    <%- end -%>
    <%- if _a100_gpu_available -%>
    - ["a100 40GB gpu memory (64 cores and 256GB RAM)", "a100-40g"]
    <%- end -%>
  help: |
    <%- if !_titanxp_gpu_available -%>
    **no titanxp gpus available. Check the cluster status with the squeue command**
    <%- end -%>
    <%- if !_a100_gpu_available -%>
    **no A100 gpus available. Check the cluster status with the squeue command**
    <%- end -%>
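As a side note, the two near-identical availability checks at the top of form.yml.erb could be folded into one `sinfo` call whose output is filtered in Ruby. The sketch below is only an illustration: `idle_node?` and `sinfo_node_listing` are names I made up, and the sample listing is invented (real node and partition names will differ). The matching rule is the same as `grep <type> | grep idle`, i.e. a line must contain both strings.

```ruby
require "open3"

# True if any line of `sinfo --Node` output contains both the GPU type
# and the word "idle" (same rule as `grep <type> | grep idle`).
def idle_node?(sinfo_output, gpu_type)
  sinfo_output.each_line.any? do |line|
    line.include?(gpu_type) && line.include?("idle")
  end
end

# Run sinfo once and reuse the text for every GPU type.
# Returns "" if sinfo is missing or fails, so nothing is reported as available.
def sinfo_node_listing
  output, status = Open3.capture2e("sinfo", "--Node")
  status.success? ? output : ""
rescue SystemCallError
  ""
end

# Invented sample listing, for trying the filter without a cluster.
SAMPLE_SINFO = <<~LISTING
  NODELIST        NODES  PARTITION  STATE
  titanxp-node-1      1  gpu-small  idle
  a100-node-1         1  gpu-large  alloc
LISTING

titanxp_available = idle_node?(SAMPLE_SINFO, "titanxp")
a100_available    = idle_node?(SAMPLE_SINFO, "a100")
puts "titanxp: #{titanxp_available}, a100: #{a100_available}"
```

In the form itself you would call `sinfo_node_listing` once and pass its result to `idle_node?` for each GPU type instead of using the sample text.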
And then this in submit.yml.erb, to trigger an error if a user still tries to start a GPU job when all the GPUs are allocated:
if gpu_required == 'yes'
  if !(gpu_type.eql? "titanxp") && !(gpu_type.eql? "a100-40g")
    err_msg = "You requested a GPU but no gpus are available. Check cluster status with squeue or sinfo or contact support"
    raise(StandardError, err_msg)
  end
  case gpu_type
  when "titanxp"
    slurm_args += ["--partition=dynamic-8cores-64g-1gpu-titanxp", "--cpus-per-task=8", "--mem=0"]
  when "a100-40g"
    slurm_args += ["--partition=dynamic-64cores-256g-1gpu-A100-40g", "--cpus-per-task=64", "--mem=0"]
  end
end
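If more GPU types get added later, the check and the case statement above could be merged into one lookup table. This is just a sketch of that idea under my own naming: `GPU_SLURM_ARGS` and `gpu_slurm_args` are hypothetical, while the partition strings are copied from the snippet above. An empty or unknown gpu_type (which is what the form sends when no GPU option was offered) raises the same kind of error.

```ruby
# Slurm arguments per GPU type, copied from the case statement above.
GPU_SLURM_ARGS = {
  "titanxp"  => ["--partition=dynamic-8cores-64g-1gpu-titanxp", "--cpus-per-task=8", "--mem=0"],
  "a100-40g" => ["--partition=dynamic-64cores-256g-1gpu-A100-40g", "--cpus-per-task=64", "--mem=0"]
}.freeze

# Return the extra Slurm arguments for the submitted form values.
# Raises if a GPU was requested but gpu_type is empty or not in the table.
def gpu_slurm_args(gpu_required, gpu_type)
  return [] unless gpu_required == "yes"

  GPU_SLURM_ARGS.fetch(gpu_type) do
    raise StandardError, "You requested a GPU but no GPUs are available. " \
                         "Check cluster status with squeue or sinfo, or contact support."
  end
end

# Possible use in submit.yml.erb, with slurm_args defined as in the original:
#   slurm_args += gpu_slurm_args(gpu_required, gpu_type)
```

`Hash#fetch` with a block only runs the block when the key is missing, so valid types never touch the error path.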