Greetings!
I have just finished installing a POC of OnDemand in our LSF10-based development cluster. Your documentation was impressively thorough, so thank you for that! The File, Editor, Active Jobs, and Shell apps are performing exactly as expected.
However, the Job Composer has a bit of a flaw in its default configuration that I’m hoping to get some advice on. Our cluster is designed around all submitted jobs running within a Docker container on the exec nodes, and we do this via the LSF “App” configuration service on the LSF master side. For the Job Composer to be functional in our cluster, I’ll need a way for the template to give the user an option to add the Docker image parameter. Other submission parameters would be advantageous to us as well, but the image is required for our cluster. Are there configuration options or known points in the Ruby code base that could facilitate this?
Thanks much!
Perhaps a screenshot would help. This is just an attempt to run the default “Hello world” job. The error as written is expected when “bsub” is executed without a parameter the cluster requires (in this case, -a).
I’m hoping there’s a way in configuration or code to give an extra option to the user to supply their “-a” parameter.
Hi, and welcome! Sorry for the delay.
I believe you need to add this functionality in your scripts with #BSUB directives.
The Job Composer, as it stands, is fairly limited in the options you can change in the UI form. It exposes very few of the options you’d want and need (file output location, cores, memory, and so on), so most job configuration goes in the scripts through the scheduler’s directives.
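For example, a minimal sketch of what such a script could look like. The -a option is the one your cluster requires; the image name, queue, job name, and output file below are only illustrative placeholders, so swap in your site's values. As far as I know, LSF only reads embedded #BSUB lines when the script is fed to bsub on standard input (bsub < main_job.sh), so that is worth verifying for your setup.

```shell
#!/bin/bash
# Illustrative job script: every value below is a placeholder, not a site default.
#BSUB -a "docker(ubuntu)"
#BSUB -q general
#BSUB -J hello_world
#BSUB -o hello_world.%J.out

# The job body runs inside the container selected by the -a directive above.
greeting="Hello world"
echo "${greeting} from ${HOSTNAME}"
```

Anything you would otherwise pass on the bsub command line can generally be written as a #BSUB line this way, so each template can carry its own defaults.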
Hope that helps, and again, sorry for the delay.
Thank you for the reply, though I must be misunderstanding something. From my experiments with the composer, the script in question (main_job.sh for the default) is sent into the job scheduler as the target to be run. It uses the configured attributes in the cluster.d/my_cluster.yml to know which ruby packages to use to build the scheduler command and runs it on main_job.sh. So in my case:
bsub -a "docker(ubuntu)" main_job.sh
Maybe I can frame this a different way. I used the cluster test rake to make sure OOD and my cluster are configured correctly. Thanks for including this by the way. I needed to modify the test.rake slightly for LSF10.1 and our esub to play nice, but once done this:
sudo su USER -c 'scl enable ondemand -- bin/rake test:jobs:compute1-lsf RAILS_ENV=production SUBMIT_ARGS="-g /shawn.m.leonard/default -G compute-ris -q general -a 'docker(ubuntu)'"'
works perfectly, and all of the expected parameters from both SUBMIT_ARGS and the my_cluster.yml are used in the bsub call. Where would the Job Composer get the SUBMIT_ARGS equivalent at runtime (the bsub call)?
Thanks, and sorry if this is just my misunderstanding
Yep, it was definitely my misunderstanding. I have a prototype working. Thanks!
Glad to hear it worked out!
Quick follow-up on this: is there any way that you can think of to deal with runtime environment variables? One of the challenges of running a cluster where all jobs run in a Docker container is making sure the scheduler (LSF10) has the proper parameters at the proper time. #BSUB directives get parsed into matching LSB_* environment variables, but some other env vars are actually meant for the esub and can’t be embedded (they won’t be seen).
Maybe you can source some_cfg_file during the job’s execution? That way, the bsub script can source those variables and, within the script, use them to invoke esub.
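Something along these lines, as a rough sketch only. The load_job_env helper and the JOB_CFG_FILE variable are names made up for illustration, and some_cfg_file is just the placeholder from above. Note that this only changes the environment inside the running job; anything the esub needs at submission time would presumably still have to be set before bsub is called.

```shell
#!/bin/bash
#BSUB -a "docker(ubuntu)"

# Hypothetical helper: source a config file and export everything it assigns.
load_job_env() {
    local cfg_file="$1"
    [ -r "$cfg_file" ] || return 1   # nothing to load if the file is unreadable
    set -a                           # auto-export variables assigned while sourcing
    . "$cfg_file"
    set +a
}

# JOB_CFG_FILE is an assumed override; otherwise fall back to ./some_cfg_file.
load_job_env "${JOB_CFG_FILE:-./some_cfg_file}" \
    || echo "no config file found; continuing with the inherited environment" >&2
```

The config file itself would just be plain VAR=value lines, one per variable the job needs at runtime.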