My first question is more of an observation and comment for those installing the app in the future…
#1 – Is the “Build/install the updated Apache configuration file” step missing something?
Prior to deploying RStudio/other interactive apps, one needs to “check the boxes” in the “Setup Interactive Apps” section of the documentation, including enabling the reverse proxy using the update_ood_portal script. When you modify ood_portal.yml as recommended and execute that script, it’s very easy to miss this bit…
Generating Apache config using YAML config: '/etc/ood/config/ood_portal.yml'
Generating Apache config checksum file: '/etc/ood/config/ood_portal.sha256sum'
WARNING: Checksum of /opt/rh/httpd24/root/etc/httpd/conf.d/ood-portal.conf does not match previous value, not replacing.
Generating new Apache config at: '/opt/rh/httpd24/root/etc/httpd/conf.d/ood-portal.conf.new'
…which, unless I’m mistaken, means you haven’t actually updated Apache’s config after all. Right? Some searching turned up this probably unrelated post that scared me off proceeding for quite a while, but it has some valuable context suggesting (if I may paraphrase correctly) that the check is there to prevent future OOD updates from overwriting site-specific config modifications. Examining the contents of that script, I found and used the -f flag to force the update. Should a note about this be added to the tutorial?
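For future readers, here is a rough sketch of how one might check whether the generator left an unapplied `.new` file behind before reaching for `-f`. The helper name `portal_conf_pending` is made up for illustration, the config path is the one from the output above, and the location of `update_ood_portal` under `/opt/ood/ood-portal-generator/sbin/` is my assumption about a default install.

```shell
#!/bin/sh
# Hypothetical helper (not part of OnDemand): succeeds if a generated
# "<conf>.new" exists next to the live config and differs from it.
portal_conf_pending() {
    conf="$1"
    [ -f "${conf}.new" ] && ! cmp -s "$conf" "${conf}.new"
}

# Path taken from the update_ood_portal output above; override CONF to test.
CONF="${CONF:-/opt/rh/httpd24/root/etc/httpd/conf.d/ood-portal.conf}"

if portal_conf_pending "$CONF"; then
    echo "Pending changes were NOT applied. Review them with:"
    echo "  diff -u $CONF $CONF.new"
    echo "If you accept them, force the update (assumed default location):"
    echo "  sudo /opt/ood/ood-portal-generator/sbin/update_ood_portal -f"
else
    echo "No pending ood-portal.conf.new found."
fi
```

The script only prints the commands rather than running them, so the diff can be reviewed by hand first. Apache still needs a restart afterwards for the new config to take effect.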
#2 – What’s the ideal way to modify the submit-form for use with Slurm?
The modifications we needed to make were:
- replacing the bc_num_slots field in favor of one requesting cores instead of nodes (-c, --cpus-per-task=<ncpus>)
- including a field to set memory allocation (--mem=<size[units]>)
- invisibly setting Slurm’s qos parameter (-q, --qos=<qos>)
- setting a custom jobname (vs the default that uses the app path)
The final config is quoted below, but there were enough stumbling blocks that I wanted to document them for the future, as well as ask whether there is a better way.
The search result that looked the most promising actually turned out to be quite confusing in the long run: @jeff.ohrstrom, am I missing something, or is that suggested native leaf not actually syntactically correct for Slurm? It seemed to do the job for neranjan, but every variation I tried on it yielded a Ruby error. The fix was in this thread, which mentions that “any other adapter but Torque you should convert the value of the “native” key in the submit.yml.erb to an array”. (Also, neranjan used an id leaf under his custom field for cores, but I didn’t see that documented anywhere else. Is that just ignored?)
Along with the form.yml documentation, that about covered everything. I initially used the native leaf to set the job name too, before I found the pointer to “generic fields”, which I mention here for future forum searchers.
## /var/www/ood/apps/sys/RStudio/manifest.yml ##
---
name: RStudio Server
category: Interactive Apps
subcategory: Servers
role: batch_connect
description: |
  This app will launch an RStudio server.
## /var/www/ood/apps/sys/RStudio/form.yml ##
---
cluster: "monsoon"
form:
  - bc_num_hours
  - num_mem
  - num_cores
  - bc_email_on_started
attributes:
  num_mem:
    widget: "number_field"
    label: "Memory (in megabytes)"
    value: 500
    min: 1
    id: 'num_mem'
  num_cores:
    widget: "number_field"
    label: "Number of cores"
    value: 1
    required: true
    min: 1
    id: 'num_cores'
## /var/www/ood/apps/sys/RStudio/submit.yml.erb ##
---
batch_connect:
  template: "basic"
script:
  job_name: "od_rstudio"
  native: [ "--mem=<%= num_mem.to_i %>", "-c <%= num_cores.to_i %>", "--qos=ondemand" ]
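For future searchers trying to picture what that native array turns into: as far as I understand it (an assumption, I have not checked the adapter source), each array element is handed to sbatch as one separate argument. The sketch below only prints the three elements one per line, using the default values from the form above; `show_args` is a made-up helper, not anything from OnDemand or Slurm.

```shell
#!/bin/sh
# Hypothetical helper: print each argument on its own line, wrapped in
# brackets so that embedded spaces are visible.
show_args() {
    printf '[%s]\n' "$@"
}

# Default values from the form.yml above.
num_mem=500
num_cores=1

# Same three elements as the "native" array in submit.yml.erb.
show_args "--mem=${num_mem}" "-c ${num_cores}" "--qos=ondemand"
```

Note that `-c 1` travels as a single argument with an embedded space. It has worked here, but spelling it `--cpus-per-task=<ncpus>`, in the same `--name=value` style as the other two entries, might be the less surprising form.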
The above has been working great, but I’d love it if someone could verify I’m not accidentally doing something brittle.
#3 – Singularity image
(Again, this is mostly for future forum searchers…) We’re still running CentOS 6 here, so as per the docs we built our own barebones Singularity container, which we did by simply changing the example’s 7 to 6, plus one more necessary mod. I’m not sure whether this was a site-specific thing or a Singularity container thing, but we needed to explicitly bind in the path to libuuid.so.1 (in /var/www/ood/apps/sys/RStudio/template/script.sh.erb). Without this, the following shows up in output.log after a launch attempt:
Script starting...
Waiting for RStudio Server to open port 24314...
+ echo 'Starting up rserver...'
Starting up rserver...
+ singularity run -B /tmp/jason/27026039/tmp.8xzKaqrtF2:/tmp /common/contrib/containers/rserver-launcher-centos6-custom.simg --www-port 24314 --auth-none 0 --auth-pam-helper-path /home/jason/ondemand/data/sys/dashboard/batch_connect/sys/RStudio/output/05eb9bfa-da61-4ec1-a5cc-2fb662454028/bin/auth --auth-encrypt-password 0 --rsession-path /home/jason/ondemand/data/sys/dashboard/batch_connect/sys/RStudio/output/05eb9bfa-da61-4ec1-a5cc-2fb662454028/rsession.sh
rserver: error while loading shared libraries: libuuid.so.1: cannot open shared object file: No such file or directory
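A minimal sketch of the kind of bind that gets past this error, placed in script.sh.erb before the `singularity run` line. The library location `/lib64/libuuid.so.1` is an assumption for illustration only; check where it actually lives on your hosts (for example with `ldconfig -p | grep libuuid`). `SINGULARITY_BINDPATH` is Singularity's environment variable for extra bind paths; adding another `-B` option to the `singularity run` command should be equivalent.

```shell
#!/bin/sh
# Assumption for illustration: the host library lives at this path.
LIBUUID="/lib64/libuuid.so.1"

# Append to any bind paths already exported, comma-separated, so existing
# binds set elsewhere in the template are preserved.
export SINGULARITY_BINDPATH="${SINGULARITY_BINDPATH:+${SINGULARITY_BINDPATH},}${LIBUUID}"

echo "SINGULARITY_BINDPATH=${SINGULARITY_BINDPATH}"
```

The `${VAR:+...}` expansion only adds the comma when the variable already has a value, which avoids a leading comma on a fresh environment.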
Apologies for this huge post. I hadn’t foreseen it being nearly so long, but I wanted to be as verbose as possible in case it ended up helping anyone.
–Jason Buechler / NAU Monsoon