Off topic: python threading in Slurm (and number of CPUs per job)

This doesn’t really have anything to do with OnDemand operationally; it’s more of a general cluster question for the folks who manage them.

We’ve got users running Python jobs on the cluster that use modules which do threading (datatable, for example). When they submit a job (we use OnDemand for this, Jupyter specifically), they request a number of CPUs, but once the job lands on the node the Python module happily looks at /proc/cpuinfo to determine how many CPUs the machine has and sets its thread count to that, instead of to what Slurm has assigned in the cgroup.

The user can do the right thing by modifying their code, for example: dt.options.nthreads = int(os.environ['SLURM_CPUS_ON_NODE']), which works… but I’m wondering if there’s an environment variable or some other setting that would make all processes see only the number of CPUs actually available in the cgroup, and that could be set globally.
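
Spelled out as a self-contained snippet (assuming the datatable example above and that the job is running under Slurm), that per-script workaround looks something like:

import os
import datatable as dt

# Use the CPU count Slurm assigned to this job, not the whole-node count
# that datatable would otherwise read from /proc/cpuinfo.
dt.options.nthreads = int(os.environ['SLURM_CPUS_ON_NODE'])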

Thanks for any insight.

If this is too off topic, feel free to delete the post.

No need to delete - it’s perfectly fine to have here and we can help. Though I will say that https://ask.cyberinfrastructure.org/ may be a better fit.

In any case, I know RStudio has a similar issue where R libraries don’t really recognize the cgroup they’re in. I’ve found that nproc reliably returns how many cores you actually have available:

import subprocess
# nproc honours the cgroup's cpuset, so it reports the CPUs Slurm actually gave the job
ncpus = int(subprocess.run(['nproc'], capture_output=True, text=True).stdout)

A quick Google search turned up two things: there’s an open bug in the main Python repo for this same issue,

https://bugs.python.org/issue36054

and people end up writing their own detection for it (a rough sketch of that is below).

So it doesn’t look like there’s a silver bullet here, at least until the language itself is updated.
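
For the “write your own” route, here’s a sketch of the kind of helper people end up with. It assumes Linux, where os.sched_getaffinity reflects the cpuset the cgroup allows (unlike os.cpu_count, which reports the whole node):

import os

def available_cpus():
    # SLURM_CPUS_ON_NODE is set inside Slurm jobs; the affinity mask covers
    # other cgroup/cpuset setups; os.cpu_count() is the whole-node last resort.
    if 'SLURM_CPUS_ON_NODE' in os.environ:
        return int(os.environ['SLURM_CPUS_ON_NODE'])
    try:
        return len(os.sched_getaffinity(0))
    except AttributeError:
        return os.cpu_count() or 1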

Thanks. I know the developer isn’t too thrilled to hear it, but it is what it is. Maybe one magic day Python will just fix itself.