I have configured GPU sharding in SLURM so that a single GPU can be shared by multiple tasks, and I would like to use this setup in Schrodinger’s OOD configuration file. In a batch script, #SBATCH --gres=shard:2 works. However, when I set base_slurm_args + ["--gres", "shard:2"] in submit.yml.erb, the job fails with the error ‘no GPU device’. How can I resolve this issue?
For a gres like that, don’t you need the --gpus flag as well?
I don’t quite know why the directive works when the CLI flag doesn’t. Maybe Slurm ignores it?
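For reference, here is a minimal sketch of how the working directive might be mirrored in submit.yml.erb. It is not a confirmed fix. It assumes the file follows the common OOD pattern of building an argument list in ERB and emitting it under script.native; the contents of base_slurm_args shown here are hypothetical placeholders, and the only deliberate change is passing the gres as a single token so it matches the #SBATCH line exactly.

```yaml
<%-
  # Hypothetical base arguments; the real list lives in the site's own submit.yml.erb.
  base_slurm_args = ["--nodes", "1"]

  # Single-token form, mirroring "#SBATCH --gres=shard:2" from the batch script.
  slurm_args = base_slurm_args + ["--gres=shard:2"]
-%>
---
script:
  native:
  <%- slurm_args.each do |arg| %>
    - "<%= arg %>"
  <%- end %>
```

If the error persists, comparing the output of scontrol show job <jobid> for a job submitted from the batch script and one submitted through OOD should show whether the shard request actually reaches Slurm in both cases.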