So I am having some difficulty at the moment. Users are requesting that my Jupyter app start setting SLURM’s Number of Cores & Max Memory traits. However, the default bc_* fields don’t really help with this (and on a different note, I cannot get any of the auto_* options working despite turning on the feature and updating the ood-portal).
The documentation is a tad confusing regarding how to set things up since it seems bc_num_slot is for some reason set to SLURM’s Number of Nodes (though it isn’t stated anywhere that it is). Crawling through these posts, it seems I need to set up my own cores parameter. However, when I do so, it doesn’t actually change anything despite the fields being present.
Do I need to rebuild things with some command that I am missing?
Thanks for that document. However, I am a bit unsure as to where this would live locally. Is this something that, once defined, I don’t have to define again? It seems odd that OnDemand doesn’t support this kind of thing natively when you select what kind of Job Manager you have in the main config file.
Are you talking about Interactive Applications that OnDemand uses or are you talking about the 3rd party application Open Composer? I can’t speak to the latter as I’m not the developer for that application.
I am talking about Interactive Applications. Apologies, it’s been quite a while since I did any active work with our OnDemand instance; after the initial configuration was done, it has been nearly a “set it, and forget it” instance.
I’m just trying to figure out exactly what files need to be edited to ensure that, when using SLURM as the job manager, users that start up an Interactive Jupyter Notebook (using the basic YAML) can set the number of CPUs per Node ($SLURM_TASKS_PER_NODE) so that we don’t have users accidentally stepping on each other’s toes when they need to run more advanced Notebook workloads (such as MCMC runs which benefit from >2 threads).
Right, though I am still on Open OnDemand 3.1 as we haven’t migrated yet to 4.0. I think what is confusing me is where it all lives. The /etc/ood/config/apps directory is not where applications were placed originally, except for bc_desktop. I assume the directory that is being used is /var/www/ood/apps/sys/jupyter/ which looks like:
I am unfamiliar with the syntax used in the .erb section of the file you linked above. What language is this and is there documentation regarding this?
As an additional note, I cannot get any of the auto_* options to work in my forms.yml despite turning on the feature. Is there some sort of a rebuild command I am not running?
ERB is Embedded Ruby. Basically, <%- make_some_computation -%> runs a computation without emitting anything, and <%= return_some_string %> inserts the resulting string. I.e., they’re dynamic YAML files: the ERB is processed first, and what comes out is plain YAML.
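If it helps to see the two tag styles side by side, here is a minimal sketch you can run with plain Ruby, outside of OnDemand. It is not your actual app: the template is a stand-in for a submit.yml.erb, `num_cores` is a hypothetical custom form field, the cap of 16 is an arbitrary assumed limit, and it assumes the entries under `native` are passed through to sbatch unchanged.

```ruby
require "erb"
require "yaml"

# Stand-in for a submit.yml.erb template. `num_cores` is an assumed custom
# form field and 16 is an assumed per-node cap; adjust both for your site.
TEMPLATE = <<~'TEMPLATE_TEXT'
  <%- cores = num_cores.to_i.clamp(1, 16) -%>
  ---
  script:
    native:
      - "--cpus-per-task=<%= cores %>"
TEMPLATE_TEXT

# Run the embedded Ruby first; the returned text is ordinary YAML.
# trim_mode "-" makes <%- ... -%> lines vanish from the output.
def render(template, num_cores:)
  ERB.new(template, trim_mode: "-").result_with_hash(num_cores: num_cores)
end

rendered = render(TEMPLATE, num_cores: "4")
puts rendered

# Parse the rendered text to confirm it is valid YAML.
config = YAML.safe_load(rendered)
```

The `<%- ... -%>` line only computes `cores` and leaves nothing behind, while `<%= cores %>` is what actually lands in the YAML. Form values arrive as strings, hence the `to_i` before clamping.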
The /etc vs. /var thing is sub-apps, i.e. the ability to reconfigure existing applications. I’d suggest you forget about /etc and focus on your app in /var.
Ummmm maybe inspect the HTML to see if there are hidden options as well. It may be the case that they’re there just hidden for some reason or another.
You can also check your /var/log/ondemand-nginx/$USER/error.log to see the actual commands you’re issuing. IIRC it’s sacct or sacctmgr or scontrol?
That was indeed the issue (regarding the s). Those are now working correctly AND I have gotten the hang of ERB enough to get the cores set up, thanks specifically to your examples. Much appreciated.
On a similar note, is it possible to add a field for users to rename their Jupyter session that is displayed at the top of their Interactive Jobs list? For example, I have some users that are doing workflows that best work if they can have multiple SLURM interactive jobs across the cluster for timing specific pulsars all at the same time.