Need help, tried everything - OnDemand - Interactive Apps Schedule Desktop fails auth

Warning: Permanently added 'reactor.arc.kent.edu' (ED25519) to the list of known hosts.
plthomas@reactor.arc.kent.edu: Permission denied (publickey,password).

SSH works from Shell Access; I just have to enter my password.

Scheduling a Desktop from Interactive Apps fails authentication, and it never asks for a password.

/etc/ood/config/clusters.d/arc.yml


v2:
  metadata:
    title: "arc"
    url: "https://ondemand-1.arc.kent.edu"
    hidden: false
  login:
    host: "reactor.arc.kent.edu"
  job:
    adapter: "slurm"
    cluster: "arc"
    bin: "/usr/bin"
    conf: "/etc/slurm/slurm.conf"
    ssh_hosts: "reactor.arc.kent.edu"
    submit_host: "reactor.arc.kent.edu"
    strict_host_checking: false
    bin_overrides:
      sbatch: "/usr/local/bin/sbatch_wrapper"
      # squeue: ""
      # scontrol: ""
      # scancel: ""
    copy_environment: false

  batch_connect:
    basic:
      script_wrapper: |
        module purge
        %s
      set_host: "host=$(hostname -A | awk '{print $1}')"
    vnc:
      script_wrapper: |
        module purge
        export PATH="/opt/TurboVNC/bin:$PATH"
        export WEBSOCKIFY_CMD="/usr/local/bin/websockify"
        %s
      set_host: "host=$(hostname -A | awk '{print $1}')"

Logfile info:
In /tmp/sbatch.log the final lines are

INFO:sh.command:<Command '/usr/bin/ssh myusername@head -oBatchMode=yes /opt/slurm/bin/sbatch -D /home/myusername/ondemand/data/sys/dashboard/batch_connect/sys/bc_desktop/arc/output/60cd0c36-db12-4ac3-8b95-111c5def3e17 -J sys/dashboard/sys/bc_desktop/arc -o /home/myusername/ondemand/data/sys/dashboard/batch_connect/sys/bc_desktop/arc/output/60cd0c36-db12-4ac3-8b95-111c5def3e17/output.log -t 01:00:00 --export NONE -N 1 --parsable -M arc --export=THE_ANSWER=42', pid 128862>: process started
INFO:root:ssh: Could not resolve hostname head: Temporary failure in name resolution

Hello and welcome.

Looking at the log entry, I see that head is not the correct hostname, yet it is trying to log in to that host to run the command.
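One way to confirm this from the log itself: the rough sketch below pulls the host out of the logged ssh command and checks whether that name resolves. The function names ssh_host_from_log and host_resolves are made up for illustration, and it assumes getent is available on the web node.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Open OnDemand) for inspecting the
# ssh target that shows up in /tmp/sbatch.log.

# Print the host part of the first user@host token read from stdin.
ssh_host_from_log() {
  grep -oE '[A-Za-z0-9._-]+@[A-Za-z0-9._-]+' | head -n 1 | cut -d@ -f2
}

# Succeed only if the given host name resolves on this machine.
host_resolves() {
  getent hosts "$1" >/dev/null
}

# Example usage (commented out so that sourcing this file has no side effects):
#   host="$(tail -n 2 /tmp/sbatch.log | ssh_host_from_log)"
#   host_resolves "$host" || echo "cannot resolve: $host"
```

If the host it prints is not your submit host (reactor.arc.kent.edu in your config), a likely place to look is the /usr/local/bin/sbatch_wrapper script set under bin_overrides, since that appears to be what issues the ssh command. That the host name is set inside that script is my assumption, so check it before changing anything.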

Do you have a submit.yml for the app you can share? 4. Custom Job Submission — Open OnDemand 4.0.0 documentation

Thank you so much for responding.
Where would I find the submit.yml?

Phil

I found the doc that talks about submit.yml.

The document does not say where to find the file, so I updated
/etc/ood/config/clusters.d/

and added this:
basic:
  header: "#!/bin/bash"
vnc:
  header: "#!/bin/bash"

I still get the error. The changes are listed below.

/etc/ood/config/clusters.d/arc.yml


v2:
  metadata:
    title: "arc"
    url: "https://ondemand-1.arc.kent.edu"
    hidden: false
  login:
    host: "reactor.arc.kent.edu"
  job:
    adapter: "slurm"
    cluster: "arc"
    bin: "/usr/bin"
    conf: "/etc/slurm/slurm.conf"
    ssh_hosts: "reactor.arc.kent.edu"
    submit_host: "reactor.arc.kent.edu"
    strict_host_checking: false
    bin_overrides:
      sbatch: "/usr/local/bin/sbatch_wrapper"
      # squeue: ""
      # scontrol: ""
      # scancel: ""
    copy_environment: false

  batch_connect:
    basic:
      script_wrapper: |
        module purge
        %s
      set_host: "host=$(hostname -A | awk '{print $1}')"
      header: "#!/bin/bash"
    vnc:
      script_wrapper: |
        module purge
        export PATH="/opt/TurboVNC/bin:$PATH"
        export WEBSOCKIFY_CMD="/usr/local/bin/websockify"
        %s
      set_host: "host=$(hostname -A | awk '{print $1}')"
      header: "#!/bin/bash"

Let’s start from the beginning, as there are a few things here we need to address to get you all set up. I’ve got some steps below to work through, with relevant doc sections to guide you where needed.

Also, just an FYI for the future: formatting your configs in Discourse using markdown helps us a ton in debugging :)

Ok so first, you need to add a cluster.yml file to the app itself so that it knows what cluster to use:

Make sure to follow that warning and set the cluster attribute to arc as that looks to be the name of the cluster in your /etc/ood/config/clusters.d/ directory.

Then, within that app specific cluster config file, you need to set the submit option for the submit.yml.erb file you intend to use for the desktops: 4. Custom Job Submission — Open OnDemand 4.0.0 documentation

Lastly, the submit.yml you shared looks to have some typos such as basic:. I’d recommend parsing that
file more closely and ensuring you don’t have any little hiccups like that by using the config reference for the submit.yml: submit.yml.erb — Open OnDemand 4.0.0 documentation

In summary, you need to add a cluster file to the app itself letting the app know which cluster you are submitting to, add a submit option to that app cluster file with the path to the submit.yml you are using for the desktops, ensure the typos in the submit.yml are addressed, and that should be enough for the moment to iterate on.
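To make those steps concrete, here is a rough sketch that writes both files. Treat it as an illustration under assumptions rather than your final config: the function name write_desktop_config and the file name my_submit.yml.erb are placeholders I picked, the native options simply mirror the -N 1 visible in your logged sbatch command, and APP_DIR is only a variable so you can try it in a scratch directory first.

```shell
#!/usr/bin/env bash
# Sketch: create an app-level cluster file plus a submit file for bc_desktop.
# APP_DIR defaults to the bc_desktop config directory; override it to test safely.
APP_DIR="${APP_DIR:-/etc/ood/config/apps/bc_desktop}"

# write_desktop_config is a hypothetical helper name, not an OOD command.
write_desktop_config() {
  local cluster="$1"
  mkdir -p "${APP_DIR}/submit"

  # App-level cluster file: names the cluster and points at the submit file.
  cat > "${APP_DIR}/${cluster}.yml" <<EOF
---
title: "${cluster} Desktop"
cluster: "${cluster}"
submit: "submit/my_submit.yml.erb"
EOF

  # Submit file: selects the vnc template and passes native sbatch options.
  cat > "${APP_DIR}/submit/my_submit.yml.erb" <<'EOF'
---
batch_connect:
  template: "vnc"
script:
  native:
    - "-N"
    - "1"
EOF
}
```

For example, APP_DIR=/tmp/bc_desktop_test followed by write_desktop_config arc lets you inspect the generated files before copying anything into /etc/ood/config/apps/bc_desktop as root.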

I did also just notice that in the original post you mentioned that your ssh works but you “just have to add password”. I am starting to wonder if this will be another problem, as you will need to allow passwordless ssh for OOD to work; otherwise it can’t log in to run the commands that submit jobs, and it will fail silently.
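A quick way to test this outside of OOD is to run ssh with BatchMode=yes, which makes it fail instead of prompting, the same flag that appears in the logged command. The sketch below is only a suggestion: check_passwordless_ssh and the SSH_BIN variable are names I invented, and the key setup in the comments assumes your site allows key-based logins on the submit host.

```shell
#!/usr/bin/env bash
# Sketch: verify that a user can reach the submit host without a password prompt.
# SSH_BIN is an illustrative override (handy for dry runs), not an OOD setting.
SSH_BIN="${SSH_BIN:-ssh}"

# check_passwordless_ssh is a hypothetical helper name.
check_passwordless_ssh() {
  local user="$1" host="$2"
  # BatchMode=yes disables password prompts, so a missing key shows up as a failure.
  if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "${user}@${host}" true; then
    echo "OK: ${user}@${host} accepts non-interactive login"
  else
    echo "FAIL: ${user}@${host} needs a password or the key is not authorized"
    return 1
  fi
}

# Typical key setup, run as the user on the OnDemand web node
# (assumes key authentication is permitted on the submit host):
#   ssh-keygen -t ed25519
#   ssh-copy-id plthomas@reactor.arc.kent.edu
```

Running check_passwordless_ssh plthomas reactor.arc.kent.edu as that user on the web node should print OK once the key is in place; if it prints FAIL, OOD will hit the same wall when it tries to submit.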

Thank you for your advice.
Working on getting Discourse installed; I don’t do much programming.

I was under the impression that you would not need the submit.yml if you were going to do the global version in clusters.d/arc.yml.

I created the submit.yml.erb file

/etc/ood/config/apps/bc_desktop/submit
# a simple script.yml.erb file
script:
  native:
    - "-n"
batch_connect:
  template: "vnc"

Your comment on the password issue is still true.
Not sure why it’s trying to use my plthomas account with no password.
I installed the munge stuff on both servers.
We just tried to set up SSH key auth between the servers.

I log into the web portal using campus AD, yea.
When I start a Desktop job, shouldn’t it use the server-to-server auth connection to start the job on the cluster node, and not my own auth?

It just seems to want to connect as plthomas with no password.

You can set the submit options globally in the cluster file; I didn’t realize that was what you were going for, sorry about that.

Later on, though, if you add more apps, you are going to want to follow the previous steps to make your configs modular. I’d strongly recommend a modular setup rather than global options, even when starting out.

OOD is running your job on the cluster, so it uses your system account to do this. Munge is not a factor here. When you submit a job, your system account on the OS first SSHes to the cluster to issue the commands, which is why there can’t be a password in the way: OOD simply fails when the SSH password prompt is issued. This is done for many reasons, the most obvious being resource accounting, along with things like file permissions and ownership. When the job is ready, the URL is passed back and you can simply click the Connect button to log in to your interactive app.

Hopefully this is making more sense and helping with progress. We do have a page that goes over the architecture of this from a very high level that might help with understanding: Architecture — Open OnDemand 4.0.0 documentation

This is probably a stupid question; I seem to have a lot of them.

How do you get the OnDemand web portal to pass the user name and password to the server node?

@plthomas1 if you’re still having issues getting the desktop to run, please post the output.log from the job for us to inspect. It should have some output on why it’s failing.

We have fixed this issue, thanks so much.
More and better issues in another post.