Getting started - non RedHat shop - Docker?

Hi,

We are an Ubuntu shop and I’m wondering about the simplest way to get started. I notice there is an Ansible role that can be used to build from source without using RPMs, but I don’t see how to actually run it (I’m not an Ansible user).

I also see a Dockerfile and wonder if that might be the simplest way to go - can Open OnDemand be run from within a Docker container? I don’t see any documentation for the Dockerfile. I tried to build it and it failed with:

== Restarting application server ==

rake aborted!
Command failed with status (1): [PASSENGER_APP_ENV=production /opt/ood/apps...]
/opt/ood/lib/tasks/build.rb:40:in `block (3 levels) in <top (required)>'
Tasks: TOP => build => build:all => build:dashboard
(See full trace by running task with --trace)
The command '/bin/sh -c source /opt/rh/ondemand/enable &&     rake -f /opt/ood/Rakefile -mj$CONCURRENCY build &&     mv /opt/ood/apps/* /var/www/ood/apps/sys/ &&     rm -rf /opt/ood/Rakefile /opt/ood/apps /opt/ood/lib' returned a non-zero code: 1

Does the Docker image already exist in Docker Hub? I tried a basic search and could not find it.

At the moment I’m not looking to create a production deployment of OOD, I just want to stand it up and poke around, so I’m looking for the easiest way to do that.

We would probably be ok with setting up a single RedHat box if that is the easiest way to go, but if that box also needs to be able to submit jobs to our (Slurm) cluster then that could be tricky.

Thanks

Hello and welcome!

Sorry it was a bit difficult finding some docs to run a container.

Here’s the link to the Development readme, which will walk you through how to set up a Docker container very quickly for some poking around: ondemand/DEVELOPMENT.md at master · OSC/ondemand · GitHub

It also lets you mount code into various places in the container if you’d like, and it provides a place to mount a local env so you can apply many settings.

If you need any further help with those instructions please ask away!

Thanks. I was able to build and launch the container.

Unfortunately the machine I built it on does not have a graphical desktop on it so there is no way to use a GUI browser to hit localhost. When I try and go to http://machine-name:8080 it redirects to http://localhost:8080. Is there a way to prevent that or at least get it to use the correct name of the container host?

I tried changing servername in lib/tasks/development.rb and rebuilding the container, but that didn’t work.

Thanks.

2.1 will support 20.04 if that’s the flavor you’re on. That’ll probably be much easier.

wget -O /tmp/ondemand-release.deb https://apt.osc.edu/ondemand/latest/ondemand-release-web-latest_2_all.deb
apt install -y /tmp/ondemand-release.deb
apt install -y apt-transport-https
echo "deb [arch=amd64] https://apt.osc.edu/ondemand/nightly/web/apt focal main" >> /etc/apt/sources.list.d/ondemand-web.list
apt update
apt install ondemand

If not, you’re likely going to have to bind hostnames into the container with --add-host and configure the servername in ood_portal.yml, which you’d have to mount in.

Thanks, it looks like /home/myusername/ondemand is hardcoded as the location that is mounted into the container. Once I figured that out and edited ~/.config/ondemand/container/config/ood_portal.yml to change the servername, I was able to get in.
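For anyone following along, the servername edit described above could be scripted roughly like this. It is only a sketch: the path is the one mentioned in this post, while the helper name set_servername and the hostname machine-name are placeholders, and it assumes the file either has a top-level servername: line or none at all.

```shell
#!/bin/sh
# Hypothetical helper (not part of OnDemand): set or add the top-level
# `servername:` key in an ood_portal.yml file.
set_servername() {
  file="$1"
  name="$2"
  # Create the directory and file if they do not exist yet.
  mkdir -p "$(dirname "$file")"
  touch "$file"
  if grep -q '^servername:' "$file"; then
    # Replace the existing line. A temp file is used instead of `sed -i`
    # so the same command works with both GNU and BSD sed.
    sed "s|^servername:.*|servername: $name|" "$file" > "$file.tmp" &&
      mv "$file.tmp" "$file"
  else
    # No uncommented servername line yet: append one.
    printf 'servername: %s\n' "$name" >> "$file"
  fi
}

# Example call (placeholder hostname, path taken from the post above):
# set_servername "$HOME/.config/ondemand/container/config/ood_portal.yml" machine-name
```

The container most likely has to be restarted afterwards so the mounted file is read again.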

We are still stuck on Ubuntu 18.04 at the moment.

Do these .deb files have everything you need to run OnDemand on Ubuntu? Is there a version that will run on 18.04? If not I will just stick with Docker I guess.

Is the procedure for installing with the .deb files documented somewhere?
Thanks.

Yes and we may have 18.04 support by the time we launch 2.1, though I can’t promise it. 20.04 is a definite.

It seems like I have 2.1 already? The bottom right side of the OnDemand browser window says: OnDemand version: v2.1.0-0.start.3-5c00c08. Is it still an unreleased beta?

You likely built from the master branch, which is the active development branch. You can check out the release_2.0 branch and build from a tag, but there’s little point for a proof of concept. We haven’t released 2.1 but did have to make some tags to bootstrap the repositories.

Here’s information on the nightly version. There aren’t any known bugs at this time.

Thanks. I’m not sure if I should start a new thread for each new question, but I have my next question. :grinning:

I have set up a yml file for my cluster, following this example:

https://osc.github.io/ood-documentation/latest/installation/resource-manager/slurm.html

I can now open a shell in OOD but when I try and submit a job (using Job Composer, From Default Template) I got:

The change you wanted was rejected.
Maybe you tried to change something you didn't have access to.
If you are the application owner check the logs for more information.

I see nothing in the container’s logs. I am still using the default “fake” authentication where I picked a password and logged in as myusername@localhost. But, if I can open a shell as me within OOD and run commands as normal, I’m not sure why I can’t submit jobs. Also I tried to get a desktop and it just opened a blank window, and I don’t see any new jobs starting as me. (Oops, I see now that setting up desktops requires additional configuration).

TIA

Is there a tutorial about how to call the scheduler binaries (i.e. in my case (slurm) - sbatch, squeue, etc) from within the Docker container? I have tried a number of things and have not had success yet but I will spare you what I’ve tried for the moment if this is already documented somewhere.

Alternatively, is there any way to get OOD (either current or nightly version) to run on Ubuntu 18.04 without the Docker container? I tried this as well on a sandbox testing cluster without success.
We are likely to stay on 18.04 for the next few years…

Thanks.

Yeah, the container’s going to be really tricky because of subuids and what Slurm will see as your UID.

Ansible can build from source and install on 18.04. I’ll look into patching our nightly this week and next.

If you can get a single RedHat box, that’s also an option. But if you can get a RHEL box, why can’t you get a single 20.04 box?

If you’re unable to install Slurm on the web node, you can use bin_overrides in the cluster config to ssh into, say, a login node that does have Slurm on it.
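A rough sketch of what such a wrapper might look like, in case it helps. Everything named here is an assumption for illustration: the host login.example.org, the LOGIN_HOST and SSH_BIN variables, and the function name run_on_login_node. It also assumes the user the web node runs commands as can ssh to the login node without a password prompt.

```shell
#!/bin/sh
# Hypothetical wrapper: forward a scheduler command (sbatch, squeue, ...)
# to a login node over ssh instead of running it on the web node.
#
# The cluster YAML would then point at a script built around this, for
# example (path is a placeholder):
#   bin_overrides:
#     sbatch: "/usr/local/bin/sbatch_ssh_wrapper"
run_on_login_node() {
  remote_cmd="$1"
  shift
  # stdin is passed through by ssh, so a job script piped to sbatch still
  # reaches the remote side. Caveat: ssh joins the arguments with spaces
  # and the remote shell re-parses them, so arguments that themselves
  # contain spaces or quotes would need extra escaping.
  "${SSH_BIN:-ssh}" "${LOGIN_HOST:-login.example.org}" "$remote_cmd" "$@"
}

# A wrapper file for sbatch would end with a call like:
# run_on_login_node sbatch "$@"
```

You would need one such wrapper per binary you override, each ending with its own command name.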

Hope that helps!

Also if you’re looking for just a proof of concept to see how it all works - you can try this docker compose set that has slurm and ondemand built in.

Thank you. It’s a very slick and impressive demo.

BTW, XDMoD does not come up. It may be due, at least in part, to this (from the docker-compose logs):

xdmod        | ---> Starting SSSD on xdmod ...
xdmod        | ---> Starting sshd on xdmod...
xdmod        | ---> Starting the MUNGE Authentication service (munged) on xdmod ...
xdmod        | ---> Starting sshd on xdmod...
xdmod        | ---> Open XDMoD Setup: SSO...
xdmod        | ---> Open XDMoD Setup: start
xdmod        | spawn xdmod-setup
xdmod        | You are currently using Open XDMoD 9.5.0, but a newer version
xdmod        | (10.0.0) is available.
xdmod        |
xdmod        | Do you want to continue (yes, no)? [no] 1
xdmod        |
xdmod        | '1' is not a valid option.
xdmod        |
xdmod        | Do you want to continue (yes, no)? [no]
xdmod        | Failed to get prompt
xdmod        | ---> Open XDMoD Setup: hpc resource
xdmod        | spawn xdmod-setup
xdmod        | You are currently using Open XDMoD 9.5.0, but a newer version
xdmod        | (10.0.0) is available.
xdmod        |
xdmod        | Do you want to continue (yes, no)? [no] 4
xdmod        |
xdmod        | '4' is not a valid option.
xdmod        |
xdmod        | Do you want to continue (yes, no)? [no] 1
xdmod        |
xdmod        | '1' is not a valid option.
xdmod        |
xdmod        | Do you want to continue (yes, no)? [no]
xdmod        | Failed to get prompt
xdmod        | ---> Open XDMoD Setup: finish
xdmod        | spawn xdmod-setup
xdmod        | You are currently using Open XDMoD 9.5.0, but a newer version
xdmod        | (10.0.0) is available.
xdmod        |
xdmod        | Do you want to continue (yes, no)? [no] 5
xdmod        |
xdmod        | '5' is not a valid option.
xdmod        |
xdmod        | Do you want to continue (yes, no)? [no]
xdmod        | Failed to get prompt
xdmod        | Open XDMoD Import: Hierarchy
xdmod        | (2022-05-06 16:37:48): [be[default]] [sysdb_get_real_name] (0x0040): Cannot find user [xdmod@default] in cache
xdmod        | (2022-05-06 16:37:48): [be[default]] [sysdb_get_real_name] (0x0040): Cannot find user [xdmod@default] in cache
xdmod        | (2022-05-06 16:37:48): [be[default]] [sysdb_get_real_name] (0x0040): Cannot find user [xdmod@default] in cache
xdmod        | (2022-05-06 16:37:48): [be[default]] [sysdb_get_real_name] (0x0040): Cannot find user [xdmod@default] in cache
xdmod        | No entry for terminal type "unknown";
xdmod        | using dumb terminal settings.
xdmod        | SQLSTATE[HY000] [2002] Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
xdmod        | #0 /usr/share/xdmod/classes/CCR/DB/PDODB.php(88): PDO->__construct('mysql:host=loca...', 'xdmod', '')
xdmod        | #1 /usr/share/xdmod/classes/CCR/DB.php(111): CCR\DB\PDODB->connect()
xdmod        | #2 /usr/share/xdmod/classes/CCR/CCRDBHandler.php(55): CCR\DB::factory('logger')
xdmod        | #3 /usr/share/xdmod/classes/CCR/Log.php(288): CCR\CCRDBHandler->__construct(NULL, NULL, NULL, 200)
xdmod        | #4 [internal function]: CCR\Log::getDbHandler('xdmod-import-cs...', Array)
xdmod        | #5 /usr/share/xdmod/classes/CCR/Log.php(192): call_user_func(Array, 'xdmod-import-cs...', Array)
xdmod        | #6 /usr/share/xdmod/classes/CCR/Log.php(113): CCR\Log::getLogger('xdmod-import-cs...', Array)
xdmod        | #7 /usr/bin/xdmod-import-csv(133): CCR\Log::factory('xdmod-import-cs...', Array)
xdmod        | #8 /usr/bin/xdmod-import-csv(27): main()
xdmod        | #9 {main}

Building from source might be just the ticket. I will give it a try.
Thanks.

Cool. We update them just about yearly for conferences. There’s the additional bonus of having tutorial material in every sub-directory. If you look at the ondemand/README.md you’ll find the material we go over at PEARC or Gateways.
