My development apps randomly can no longer find our "cluster"

So, we have been working on some updates in our dev environment to make certain queues available globally and hide others. Before making any changes we took backups of our cluster.yml. However, seemingly at random, OOD no longer sees our cluster. All of the apps say “This app requires a cluster that does not exist.” I have confirmed that, other than the name of the cluster and its cluster file, the setup is exactly the same as our production. Any help with this?

title: "elonia"
host: "dev-hostname.domain"
adapter: "pbspro"
host: "dev-hostname.domain"
exec: "/apps/pbspro/current"
qsub: "/apps/pbspro/current/bin/qsub"
script_wrapper: |
  module purge
set_host: "host=$(hostname -A | awk '{print $1}')"
script_wrapper: |
  module purge
  export PATH="/opt/TurboVNC/bin:$PATH"
  export WEBSOCKIFY_CMD="/usr/bin/websockify"
  %s
set_host: "host=$(hostname -A | awk '{print $1}')"

The cluster ID that apps search for is taken from the name of the cluster file. So with your example cluster configuration, your apps should be looking for the cluster "elonia", purely because the file is named elonia.yml.
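
As a rough way to see which cluster IDs the file names imply, something like the sketch below could be run on the web node. The function name list_cluster_ids is made up for illustration, and the default directory is simply the path mentioned in this thread; adjust it if your install differs.

```shell
# Print the cluster ID implied by each *.yml file name in a clusters.d directory.
# list_cluster_ids is a hypothetical helper, not part of OnDemand.
list_cluster_ids() {
  dir="${1:-/etc/ood/config/clusters.d}"
  for f in "$dir"/*.yml; do
    # Skip the unexpanded pattern when the directory has no .yml files.
    [ -e "$f" ] || continue
    basename "$f" .yml
  done
}

list_cluster_ids
```

If "elonia" is not in the output, or is spelled differently from what the apps request, the name mismatch would be the first thing to fix.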

Right, that is the name of the file and what the apps are looking for. But it's as if OOD isn't seeing the file at all. Are there specific permissions that need to be set there?


Oh, I see. The file just needs to be readable by an unprivileged user. If you can cat /etc/ood/config/clusters.d/elonia.yml as a regular user, then you should be OK (note that the user also needs to be able to traverse into that directory).
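
A minimal sketch of that permission check, assuming it is run as a regular user rather than root (root can read almost anything, so the result would be misleading). The helper name check_access is hypothetical.

```shell
# Report whether the current user can read a file and traverse every
# parent directory leading to it. check_access is a hypothetical helper.
check_access() {
  path="$1"
  rc=0
  if [ ! -r "$path" ]; then
    echo "not readable: $path"
    rc=1
  fi
  dir="$(dirname "$path")"
  while :; do
    # Execute permission on a directory is what allows traversal.
    if [ ! -x "$dir" ]; then
      echo "not traversable: $dir"
      rc=1
    fi
    # Stop at the filesystem root, or at "." for relative paths.
    case "$dir" in /|.) break ;; esac
    dir="$(dirname "$dir")"
  done
  return "$rc"
}

check_access /etc/ood/config/clusters.d/elonia.yml || true
```

No output means the file and every directory above it are accessible to the user running the check.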

If you can cat this file, then I'd look for YAML indentation problems next. You can also check /var/log/ondemand-nginx/$USER/error.log for anything that may indicate an issue.
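
Two paste-related problems are easy to scan for before reading the log: tab characters used for indentation (YAML does not allow them) and typographic quotes, which YAML treats as ordinary characters so they end up inside the value. The sketch below only flags those two cases and is not a full YAML validator; find_yaml_paste_problems is a hypothetical helper name.

```shell
# Flag tab indentation and curly quotes in a YAML file, with line numbers.
# find_yaml_paste_problems is a hypothetical helper, not a YAML parser.
find_yaml_paste_problems() {
  file="$1"
  tab="$(printf '\t')"
  grep -n "^ *$tab" "$file" | sed 's/^/tab indent: /'
  grep -n -e '“' -e '”' -e '‘' -e '’' "$file" | sed 's/^/curly quote: /'
  return 0
}

cfg=/etc/ood/config/clusters.d/elonia.yml
if [ -r "$cfg" ]; then
  find_yaml_paste_problems "$cfg"
fi

# Then look at the most recent entries in the per-user log.
log="/var/log/ondemand-nginx/$USER/error.log"
if [ -r "$log" ]; then
  tail -n 50 "$log"
fi
```

A clean result here does not prove the nesting is right, so comparing the file's indentation against the working production file is still worthwhile.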