I have configured GPU sharding in SLURM to allow a GPU to be shared by multiple tasks. I would like to adopt this setup in the configuration file for Schrodinger’s OOD. In the script, I can use `#SBATCH --gres=shard:2`. However, when setting `base_slurm_args + ["--gres", "shard:2"]` in `submit.yml.erb`, it gives the error ‘no GPU device’. How can I resolve this issue?
For a gres like that, don’t you need the `--gpus` flag as well?
Don’t quite know why the directive works when the CLI flag doesn’t - maybe Slurm ignores it?
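One thing that might be worth trying is passing the request as a single `--gres=shard:2` token, so the argument handed to `sbatch` looks exactly like the working `#SBATCH` directive. The fragment below is only a sketch of what that could look like in `submit.yml.erb`. It assumes `base_slurm_args` is a Ruby array defined earlier in the template (the value shown is a placeholder, not the real one) and that the arguments end up under `script.native`. Whether this changes the ‘no GPU device’ result is untested.

```yaml
<%-
  # Placeholder for illustration only: the real base_slurm_args is whatever
  # the app's template already defines.
  base_slurm_args = ["--nodes", "1"]

  # Single-token form, mirroring the "#SBATCH --gres=shard:2" directive.
  slurm_args = base_slurm_args + ["--gres=shard:2"]
-%>
---
script:
  native:
  <%- slurm_args.each do |arg| -%>
    - "<%= arg %>"
  <%- end -%>
```

Comparing the `sbatch` command that OOD actually runs (it should appear in the app's logs) with the job submitted by hand would show whether the shard request reaches Slurm unchanged.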