I am trying to follow the documentation to create a “Global Static List” of queues in the cluster configuration.
We use LSF 10.2.0.9 and Open OnDemand 2.0.29, with a single cluster.
Since we have only one cluster, there is only one file: /etc/ood/config/clusters.d/rivm_hpc.yml
It is my understanding that the name of this file defines a cluster “rivm_hpc”, so it should not be necessary to include the lines
job:
  cluster: "something"
However, when I omit it, I can't access the cluster configuration in form.yml.erb: queues = OodAppkit.clusters[:rivm_hpc].custom_config[:queues]
The result is nil (and consequently exceptions are raised when iterating over it).
When I do include a cluster name in the cluster config, I see the defined queues.
But then job submission does not work, because a parameter is added to the bsub command: bsub -m rivm_hpc
This leads to: rivm_hpc: Bad host name, host group name or cluster name. Job not submitted.
When I remove the cluster name from the config and define the queues locally in form.yml.erb, I can submit (without the -m parameter).
Shouldn't OodAppkit.clusters[:rivm_hpc] always give me access to the configuration, regardless of the cluster name in the cluster config?
I cannot replicate this. Your understanding of the issue is correct. One thing (the job.cluster configuration) shouldn’t have anything to do with the other (the name of the cluster).
I tried to replicate with this file:
---
v2:
  metadata:
    title: "LSF Test"
    url: "https://www.osc.edu/supercomputing/computing/owens"
    hidden: false
  login:
    host: "owens.osc.edu"
  job:
    adapter: "lsf"
  custom:
    queues:
      - a
      - b
      - c
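If you want to rule out an indentation problem in your own file, a quick check outside of Open OnDemand can help. The following is only a sketch: it parses an inline copy of the test file with Ruby's standard YAML library (this is not how OnDemand itself loads the file) and assumes the usual nesting, with custom and job both sitting under v2.

```ruby
require "yaml"

# Inline copy of the test cluster file, assuming `custom` and `job`
# are both nested under `v2`. Replace with File.read(path) for a real file.
cluster_yaml = <<~YAML
  ---
  v2:
    metadata:
      title: "LSF Test"
    job:
      adapter: "lsf"
    custom:
      queues:
        - a
        - b
        - c
YAML

config = YAML.safe_load(cluster_yaml)

# Plain YAML parsing yields string keys, so dig with strings here.
queues      = config.dig("v2", "custom", "queues")
job_cluster = config.dig("v2", "job", "cluster")

puts "queues: #{queues.inspect}"
puts "job.cluster: #{job_cluster.inspect}"
```

If queues comes back as nil here, the custom block is probably not nested where you think it is, independent of whether job.cluster is set.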
There’s something else going on here. I’m quite sure that adding cluster to that rivm_hpc.yml will not impact this functionality.
One thing that could be affecting you here is caching. Be sure to restart your webserver every time you edit your clusters.d file, otherwise it’ll continue to use the cached version.
When I debug, I like to raise a StandardError to inspect what’s going on:
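<%-
# see what this object looks like (.inspect so that nil shows up as "nil")
raise StandardError, OodAppkit.clusters[:rivm_hpc].custom_config[:queues].inspect
# see what this cluster object looks like
# (only the first raise runs; comment it out to reach this one)
raise StandardError, OodAppkit.clusters[:rivm_hpc].inspect
-%>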
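Once you know what the object looks like, you may also want the form to survive a missing queues entry instead of raising while iterating over nil. Here is a minimal sketch of a nil-safe lookup. queues_for is a hypothetical helper, not part of OodAppkit, and FakeCluster is only a stand-in so the snippet runs outside of Open OnDemand.

```ruby
# Hypothetical helper (not an OodAppkit method): returns the configured
# queues, or an empty array when the cluster or the :queues key is missing.
def queues_for(cluster)
  custom = cluster&.custom_config
  Array(custom && custom[:queues])
end

# Stand-in for a cluster object; it only responds to custom_config.
FakeCluster = Struct.new(:custom_config)

with_queues    = FakeCluster.new({ queues: %w[a b c] })
without_queues = FakeCluster.new({})

p queues_for(with_queues)     # prints ["a", "b", "c"]
p queues_for(without_queues)  # prints []
p queues_for(nil)             # prints []
```

In form.yml.erb this would presumably be called as queues_for(OodAppkit.clusters[:rivm_hpc]), so that an unexpected nil produces an empty list rather than an exception. It does not explain why the value is nil in the first place, so the raise-and-inspect approach is still the way to find the cause.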
---