Bc_desktop in sandbox

I am building a development workflow for all apps on our OOD sites using the directions here: Enabling App Development — Open OnDemand 3.1.0 documentation.

We’ve implemented Jupyter and VS Code apps successfully. However, the bc_desktop app appears to have problems running in the sandbox. What is the recommended way to modify bc_desktop in a sandbox?

Hello and welcome!

What are the issues you are running into when you go to launch? Could you share the error messages you are seeing?

So right now I have a bc_desktop repo in my sandbox environment and the production one at these paths:

  • Sandbox: ~/ondemand/dev/bc_desktop
  • Production: /etc/ood/config/apps/bc_desktop

My first issue was that when I viewed the bc_desktop app in My Sandbox Apps, it complained that I did not have a manifest.yml file. I didn’t, because I assumed it was using the manifest.yml file from the core repo. I added a manifest.yml file to my Production bc_desktop, and that made two Desktop apps appear in the apps list. My solution was to remove the manifest.yml file from Production and add it to Sandbox with a .gitignore entry, so that when I push Sandbox and then pull it to Production, it won’t pull a manifest.yml file.

My next, and current, issue is that I have a working Production bc_desktop, but the same code in my Sandbox bc_desktop fails with the following error.

Failed to stage the template with the following error:

sending incremental file list
rsync: change_dir "/var/www/ood/apps/dev/redmonp/gateway/bc_desktop/template" failed: No such file or directory (2)

sent 20 bytes  received 12 bytes  64.00 bytes/sec
total size is 0  speedup is 0.00
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1189) [sender=3.1.3]

The sandbox and production apps have nothing to do with each other, so you shouldn’t assume one of them is pulling from the other. They are entirely separate. OOD showed 2 apps because manifest.yml is what is used when rendering those cards, so however many of those manifest files you have, OOD will list each one as if it were an app, with the metadata you provide.

Why exclude that file in the .gitignore if the prod is a mirror of the sandbox? That way you can update all the relevant files and have the mapping work. What you’re doing is fine, but these apps are in different locations so you shouldn’t worry if they have the same content in the manifest.yml file.

For the current error I haven’t been able to replicate with 3.1 yet. Is that template directory there? If so, what are the permissions on it? What happens if you copy your template directory from the working system app into the sandbox’s path?
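
As a rough sketch of those checks, assuming a POSIX shell on the web host: the hypothetical helper below (ensure_template is not part of OOD) shows the permissions of the sandbox app's template directory if it exists, and otherwise copies the one from the system app. The paths in the comments are the ones mentioned in this thread; adjust them for your site.

```shell
#!/bin/sh
# Hypothetical helper, not part of Open OnDemand.
# Makes sure a sandbox app has a template directory, copying it from the
# system app when it is missing.
ensure_template() {
    sys_app="$1"   # e.g. /var/www/ood/apps/sys/bc_desktop
    dev_app="$2"   # e.g. "$HOME/ondemand/dev/bc_desktop"

    if [ -d "$dev_app/template" ]; then
        # Already present: show owner and permissions so they can be inspected.
        ls -ld "$dev_app/template"
        return 0
    fi

    if [ ! -d "$sys_app/template" ]; then
        echo "no template directory in $sys_app" >&2
        return 1
    fi

    # -R copies recursively; -p preserves file modes and timestamps.
    cp -Rp "$sys_app/template" "$dev_app/template"
}

# Example call (commented out so that sourcing this file changes nothing):
# ensure_template /var/www/ood/apps/sys/bc_desktop "$HOME/ondemand/dev/bc_desktop"
```

This only addresses the rsync staging error; it says nothing about whether the job itself will then start a desktop.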

The pull from the sandbox to production is done by a GitLab CI/CD pipeline that I set up. When the sandbox is pushed to the GitLab repo, gitlab-runner pulls the changes into production.

The issue I was seeing was that the sandbox app requires a manifest.yml file and the production app appears to get its manifest.yml file from the system app, so the production and sandbox apps can’t be exactly the same.

The template directory isn’t in the sandbox app. I was following the directions in the docs, which don’t include creating a template directory.

If I’m understanding bc_desktop correctly, in the production location, the files there override the system app, and then the system app provides whatever isn’t overridden. In the sandbox location, it seems like bc_desktop doesn’t override the system app but is instead intended to be the full app, i.e. production+system.

This is what confused me. Typically the system app is the production app, but it sounds like you have 2 separate literal system and production desktop apps? Is that correct? What locations do you have them in?

Yeah, every app will require a manifest.yml file. They could have the same exact content, as it’s just metadata that is used for the display card. The docs may help clear this config file up:
https://osc.github.io/ood-documentation/latest/how-tos/app-development/interactive/manifest.html?highlight=manifest

The sandbox doesn’t interact with system apps, so any configuration you’ve set for system apps is not picked up there. For example I needed to set a cluster attr in my sandbox but my system app has that already set using a clusters.d/<cluster>.yml file which system apps will default to if nothing is set for an attr.

You could still mimic this behavior of having a core file that sets defaults if not set in the app. You would just need the correct attrs and have that file sitting in a location you hard code in the apps:
https://osc.github.io/ood-documentation/latest/enable-desktops/custom-job-submission.html#custom-job-submission

I’m probably not explaining this very well. I’ll try to define what I’m calling things better. Note that this issue is only with bc_desktop.

  • bc_desktop system app. This gets overwritten when OOD is upgraded.
    • /var/www/ood/apps/sys/bc_desktop
  • bc_desktop production. This customizes the bc_desktop system app.
  • bc_desktop sandbox. This is the sandbox location.
    • ~/ondemand/dev/class/$APP

With other apps, the production app is located in the system app location, so there are only two directories, sandbox and production, and they can be identical. With bc_desktop, I’ve created a third location for customizing the system app. The bc_desktop system app and the bc_desktop production app can’t have the same contents, or I’d get duplicate manifest.yml files and probably other stuff. The bc_desktop sandbox and the bc_desktop production app can’t be identical because the sandbox won’t work properly. I could probably remove the bc_desktop production app and just use the system app, but I believe it would get overwritten if we update OOD and there are changes to the app.

What is the recommended way to customize bc_desktop and have a sandbox for it? Is it just different than other apps, or am I missing something?

Thank you for being patient with me in the explanation. Hopefully I can clear up some confusion.

First, bc_desktop is no different than any other app. It will still have all the same files and structure. In fact, any app you develop for OOD must have that specific structure with those specific file names, as they are known to OOD and will be picked up for their various functions: manifest.yml for app metadata, form.yml for the form, the template dir for the scripts to launch your app on the backend, etc.
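
To make that concrete, here is a minimal sketch of a layout check for the three pieces named above. The function name check_app_layout is made up for illustration, and accepting form.yml.erb as an alternative to form.yml is my addition rather than something stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper, not part of Open OnDemand.
# Reports which of the expected files are missing from an app directory.
# Returns 0 when everything is present, 1 otherwise.
check_app_layout() {
    app_dir="$1"   # e.g. "$HOME/ondemand/dev/bc_desktop"
    missing=0

    # App metadata used for the display card.
    [ -f "$app_dir/manifest.yml" ] || { echo "missing: manifest.yml"; missing=1; }

    # The form definition (an ERB variant is assumed to be acceptable too).
    if [ ! -f "$app_dir/form.yml" ] && [ ! -f "$app_dir/form.yml.erb" ]; then
        echo "missing: form.yml"
        missing=1
    fi

    # Scripts that get staged and run on the backend.
    [ -d "$app_dir/template" ] || { echo "missing: template/"; missing=1; }

    return "$missing"
}
```

Run against the sandbox listing posted earlier in this thread, which has no template directory, this would report the template as missing, matching the rsync staging error.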

I think the simple thing here is to just tell this sandbox app to look in the same location your system app is defaulting to with the submit attr in the doc I shared for Custom Job Submissions.

In my sandbox, I’ve placed the submit.yml.erb into a submit directory and modified the form.yml to point to it via submit: "submit/submit.yml.erb". My sandbox bc_desktop now looks like this:

  • bc_desktop
    • submit
      • submit.yml.erb
    • .gitignore
    • .gitlab-ci.yml
    • form.yml
    • manifest.yml
    • README.md

I’m still getting the template error. I tried copying the template directory from the bc_desktop system app into the bc_desktop sandbox app. That gets rid of the error, but the job fails to load a desktop.

The production app works when there is no manifest.yml, if the submit.yml.erb file is in the bc_desktop dir, and if the template directory exists. My bc_desktop production directory looks like this:

  • bc_desktop
    • .gitignore
    • .gitlab-ci.yml
    • form.yml
    • manifest.yml
    • README.md
    • submit.yml.erb

What’s the output.log say when it fails? Can you post its contents?

Script starting...
Generating connection YAML file...
The system default contains no modules
  (env var: LMOD_SYSTEM_DEFAULT_MODULES is empty)
  No changes in loaded modules

Launching desktop 'xfce'...
Unable to init server: Could not connect: Connection refused
xfce4-session: Cannot open display: .
Type 'xfce4-session --help' for usage.
Desktop 'xfce' ended...
Cleaning up...

That’s odd. I don’t see anything about the VNC server starting for you. I’m inferring that it does start for the system app, given that the system bc_desktop runs.

Ok, at this point I will need to see your form as well for the dev version to see what is being passed back.

# /etc/ood/config/apps/bc_desktop/form.yml
---
title: "Hellbender Desktop"
cluster: "hellbender"
submit: "submit/submit.yml.erb"

attributes:
  desktop: "xfce"
  bc_vnc_idle: 0
  bc_vnc_resolution:
    required: true
  bc_num_hours:
    label: "Number of Hours"
    widget: 'number_field'
    value: 1
    min: 1
    max: 4
  bc_num_slots:
    label: "Number of Nodes"
    widget: "number_field"
    value: 1
    min: 1
    max: 2
  num_cpus:
    label: "Number of CPUs"
    widget: "number_field"
    value: 1
    min: 1
    max: 8
  memory:
    label: "Memory (MB)"
    widget: "number_field"
    max: 512000
    min: 2000
    step: 2000
    value: 2000
    help: "Enter a value in MB between 2000 and 512000"
  partition: rss-class
  qos: p-class
form:
  - bc_vnc_idle
  - bc_vnc_resolution
  - desktop
  - auto_accounts
  - bc_num_hours
  - bc_num_slots
  - num_cpus
  - memory
  - partition
  - qos

This looks to be the form from your system app, based on the location in the comment. I was looking to see the dev form you are using. Or is that a typo?

That’s a typo, or at least, that’s the production location. This is the file used in the sandbox app, but it’s also identical to the production app.

Ok, again thank you so much for the patience on this.

First, since you have both cluster and submit set in the form.yml, you essentially have the same thing set twice. You only need one.

Now, I tested with both those set and it will go with whatever is in the submit. What’s strange is that this looks to have worked because, as the docs state, that path is relative to /etc/ood/config/apps/bc_desktop/, which means if you had a submit folder in there with a submit.yml.erb, then that’s what is being used. Otherwise you would have been shown an error at launch.

So, assuming that /etc/ood/config/apps/bc_desktop/submit/submit.yml.erb is indeed what you were hoping to use, the next question I’d have is: have you installed the VNC software on the cluster you are submitting to? Can you confirm it is there?
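
One quick way to check, sketched under the assumption that the tools are expected on PATH on the compute node: the hypothetical helper below (missing_commands is not an OOD tool) reports any command name that the shell cannot find. Which names to pass depends on how VNC was installed at your site, and some installs put their binaries in a directory that is not on PATH by default.

```shell
#!/bin/sh
# Hypothetical helper, not part of Open OnDemand.
# Prints one line for every given command that is not found on PATH.
# Returns 0 when all commands are found, 1 otherwise.
missing_commands() {
    status=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "not on PATH: $cmd"
            status=1
        fi
    done
    return "$status"
}

# Example call, to be run on a compute node of the target cluster
# (the command names here are assumptions, substitute your own):
# missing_commands vncserver websockify
```

Running it inside a job on the cluster that the sandbox app submits to is more telling than running it on the web host.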

I’d like the sandbox to use the sandbox submit.yml.erb, which is ~/ondemand/dev/bc_desktop/submit/submit.yml.erb.

Yes, the software listed here: 1. Software Requirements — Open OnDemand 3.1.0 documentation and xfce are installed on the cluster.

The production app at /etc/ood/config/apps/bc_desktop works and is able to open up a desktop when using the same configs as the sandbox.

In that case you need to use the full path when you set that submit, because it is relative to /etc/ood/config/apps/bc_desktop/. It’s stated in the docs but kinda muddled in a paragraph, so it’s easy to miss. It’s right below the first example:
https://osc.github.io/ood-documentation/latest/enable-desktops/custom-job-submission.html#custom-job-submission

I’m hesitant to say more until we ensure you are hitting the expected cluster and have this config setup right because looking at that previous output.log it looked to fail due to missing software, but we also may have been hitting the wrong cluster.

I’ve updated the sandbox form.yml so that submit uses the absolute path to submit.yml.erb. I’m getting the same result.

Script starting...
Generating connection YAML file...
The system default contains no modules
  (env var: LMOD_SYSTEM_DEFAULT_MODULES is empty)
  No changes in loaded modules

Launching desktop 'xfce'...
Unable to init server: Could not connect: Connection refused
xfce4-session: Cannot open display: .
Type 'xfce4-session --help' for usage.
Desktop 'xfce' ended...
Cleaning up...

Ok, let’s just remove that submit directory in the local dev app. I think I’m wrong on that for these.

Just create a submit.yml.erb in the root of the app and make sure to set:

---
cluster: <your cluster>
batch_connect:
  template: vnc

And see if this finally submits to the correct cluster and fires the vnc adapter correctly.