Programmatic host name for bc_desktop app

We have several Slurm clusters running from a single OOD server. One issue is that a couple of these clusters share node names (i.e. the output of "hostname -s"). Our OOD server has /etc/resolv.conf configured to search through our various search domains (e.g. cluster1.org.com and cluster2.org.com). If I change the search order in resolv.conf, bc_desktop can connect from the OOD portal to the node in whichever domain comes first. But I can never reach the other cluster's nodes unless I change the order back. In other words, if cluster1.org.com is the first search domain, the OOD server won't try to VNC to cpu01 on cluster2.org.com even if that is the cluster the job was submitted to. The Slurm side looks fine; it's just the VNC part. The connection.yml file for a given job starts with something like "host: cpu01 port: 5901". Is there a way to make OOD send "cpu01" to the appropriate Slurm cluster but use something like "cpu01.cluster1.org.com" for the remote desktop part?
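To illustrate why the search order decides which node is reached, here is a toy model of search-domain resolution. It is only a sketch: the known_hosts list stands in for DNS, and resolve_short is a hypothetical helper, not part of OOD or the system resolver.

```shell
#!/bin/sh
# Toy model: the first search domain that yields a known name wins.
# search_domains mirrors the "search" line in /etc/resolv.conf;
# known_hosts is a stand-in for what DNS would actually answer.
search_domains="cluster1.org.com cluster2.org.com"
known_hosts="cpu01.cluster1.org.com cpu01.cluster2.org.com"

resolve_short() {
    for domain in $search_domains; do
        candidate="$1.$domain"
        for known in $known_hosts; do
            if [ "$candidate" = "$known" ]; then
                echo "$candidate"
                return 0
            fi
        done
    done
    return 1
}

resolve_short cpu01   # prints cpu01.cluster1.org.com
```

With this order, cpu01.cluster2.org.com can never be the answer for the short name cpu01, which matches the behaviour described above.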

Hello and welcome!

You can get a fully qualified domain name with hostname -f, so that is likely the command you need to build the host name you are looking for.
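What hostname -f returns depends on how each compute node is configured, so an alternative is to build the name explicitly from the short host name and a per-cluster domain. A minimal sketch, assuming the domains from the question; node_fqdn is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: append a cluster's DNS domain to a short node name.
node_fqdn() {
    short_name=$1   # e.g. the output of `hostname -s`
    domain=$2       # e.g. cluster1.org.com
    # Drop anything after the first dot in case a longer name is passed in.
    printf '%s.%s\n' "${short_name%%.*}" "$domain"
}

node_fqdn cpu01 cluster1.org.com   # prints cpu01.cluster1.org.com
node_fqdn cpu01 cluster2.org.com   # prints cpu01.cluster2.org.com
```

On a compute node the first argument would be "$(hostname -s)".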

You may also need to tweak the host_regex for the reverse proxy in your ood_portal.yml, then regenerate the Apache config with update_ood_portal and restart Apache, so that those FQDNs still match the regex. There may be some other tweaks needed, but that should move things along.
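A quick way to sanity-check a candidate host_regex before restarting Apache is to try it against the names you expect to proxy to. This is a rough sketch: the pattern below is hypothetical (use the value from your own ood_portal.yml), and grep -E only approximates Apache's regex engine, which is fine for simple patterns like this one.

```shell
#!/bin/sh
# Hypothetical pattern; replace with the host_regex from your ood_portal.yml.
host_regex='[A-Za-z0-9.-]+\.org\.com'

# Succeed if the whole host name matches the pattern.
matches_host_regex() {
    printf '%s\n' "$1" | grep -Eq "^(${host_regex})$"
}

for h in cpu01.cluster1.org.com cpu01.cluster2.org.com cpu01; do
    if matches_host_regex "$h"; then
        echo "$h: allowed"
    else
        echo "$h: rejected"
    fi
done
```

With this pattern both FQDNs are allowed and the bare short name cpu01 is rejected.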

Thanks Travis.

I am testing something like this:

[root@ood clusters.d]# cat cluster3.yml

v2:
  metadata:
    title: "Cluster3 HPC Cluster"
  login:
    host: "cluster3.org.com"
    submit_host: "cluster3.org.com"
    default: true
  job:
    adapter: "slurm"
    cluster: "cluster3"
    bin: "/cm/shared/apps/slurm/24.05.8/bin"
    conf: "/somepath/slurm.conf"
  batch_connect:
    vnc:
      script_wrapper: |
        export PATH="/opt/TurboVNC/bin:$PATH"
        export WEBSOCKIFY_CMD="/usr/bin/websockify -6"
        %s
      set_host: "host=$(hostname -s)'.cluster3.org.com'"

For some reason, the host in connection.yml is still the regular hostname without the appended text. Any ideas?
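To check whether a given session picked up the new set_host, you can inspect the host recorded in that session's connection.yml. A minimal sketch, assuming the file has an unquoted "host: value" line as in the example quoted earlier; connection_host and is_qualified are hypothetical helper names, and the demo uses a throwaway file rather than a real session directory.

```shell
#!/bin/sh
# Print the value of the first "host:" line in a connection.yml file.
# Assumes an unquoted "host: value" line.
connection_host() {
    sed -n 's/^host:[[:space:]]*//p' "$1" | head -n 1
}

# Succeed only if the name contains a dot, i.e. it is not a bare short name.
is_qualified() {
    case $1 in
        *.*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Demo against a throwaway file; point this at a real session's connection.yml instead.
sample=$(mktemp)
printf 'host: cpu01\nport: 5901\n' > "$sample"
host=$(connection_host "$sample")
if is_qualified "$host"; then
    echo "$host is qualified"
else
    echo "$host is still a short name; set_host may not have taken effect"
fi
rm -f "$sample"
```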

Actually it seems to be working now. Maybe I needed to restart the slurm controller and database daemons. Thanks. I will report back after more testing.