How to enable shell/terminal access to compute nodes but not head/admin node

Hello,

Is it possible to configure OOD so that users have shell/terminal/SSH access to compute nodes where they have active jobs and/or interactive sessions, but not to the cluster admin/head node? Right now we’re running OOD 3.0.3 in a test environment with a Slurm/Warewulf/OpenHPC 3 cluster, with the OOD server separate from the cluster admin node.

I know it’s possible to disable the shell app entirely, or to do essentially the opposite with ood_bc_ssh_to_compute_node=false, but there doesn’t seem to be a way to configure something like “ssh to your allocated compute nodes only”. We’d like to use OOD as the primary front end to the cluster, with minimal or no user activity on the admin node.

Thanks!

Chris: I’m not exactly sure what you are trying to accomplish here or what the issue is. I’m also not sure what you mean by ‘cluster admin/head node’. In a typical setup like we have at OSC, our clusters have a handful of ‘login’ nodes and a whole bunch of ‘compute’ nodes. Clients can generally SSH into the login nodes at will (which is the whole purpose of them). Our OnDemand instances run in VMs, and we have those VMs configured so that clients generally cannot SSH into them (they can obviously connect to them via web browser). The shell apps in OnDemand are configured to connect clients to the previously mentioned login nodes (because, again, that is the whole purpose of having them).

In general, OOD relies upon the underlying operating system and software for almost everything. So if you want to prevent or allow somebody from being able to SSH into a particular node, you’ll need to configure that outside of OOD via things like trust relationships and ssh daemon configurations.
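To illustrate the “configure it outside of OOD” point: on a Slurm cluster, the usual way to get “users can only SSH into nodes where they have a running job” is Slurm’s pam_slurm_adopt PAM module on the compute-node image. A minimal sketch (the exact file, module paths, and stack order depend on your distro and existing PAM configuration, and pam_slurm_adopt also requires `PrologFlags=contain` in slurm.conf):

```
# /etc/pam.d/sshd on compute nodes -- illustrative sketch only
# ... existing auth/account lines above ...
account    sufficient   pam_access.so        # optional: still allow admins per access.conf
account    required     pam_slurm_adopt.so   # deny SSH unless the user has a job on this node
```

With that in place, an SSH attempt to a compute node is rejected unless the connecting user has a job allocated there, and the session is adopted into the job’s cgroup.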

Hope this helps!

Thanks for the quick response. By the cluster admin node I mean the server doing the provisioning and resource management, basically warewulf and slurmctld, the machine that’s “submit_host” in a clusters.d/clustername.yml. I’d like that node to not be a login node, so the setup would be like:
A. OnDemand server – users log in via the OOD portal
        |
B. Cluster admin node / slurm+warewulf server – users don’t access directly
        |
C1–C999. Compute nodes – users can ssh to nodes allocated to their jobs

In typical setups, is what I’m calling B always a login node, or is there a way to keep the user activity on A even though users submit jobs to the cluster managed by B?

Thanks again!

OK so

  1. Your OOD_SSHHOST_ALLOWLIST can help you here, especially if your admin node’s hostname is sufficiently different from your compute node names. That is, all the compute nodes are on the allowlist, but the admin node isn’t. This will ensure that your users can’t use the shell app to reach that host, because it won’t be on your allowlist. That’s how our clusters work: our admin node hostnames start with the string slurm, whereas our compute node hostnames don’t.

  2. You’re better off setting up real network and ssh security here. Sure, maybe folks can’t ssh directly from the OOD node → admin node. But what about OOD → compute → admin? All they need to do is ssh to somewhere else from OOD, then try to ssh into the admin node from that middle point. OOD can’t help you in this case because the connection comes from the machine in the middle, so you’re better off configuring this admin node with real networking and ssh policies that block such access from anywhere.
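As a concrete (purely illustrative) example of point 2, the admin node’s sshd can refuse logins from everyone except an admin group, regardless of which host the connection comes from. The group name here is a placeholder; adapt to your site:

```
# /etc/ssh/sshd_config on the admin node ("B") -- sketch, group name is hypothetical
AllowGroups hpc-admins   # only members of hpc-admins may SSH in at all
```

Combined with firewall rules limiting which subnets can even reach port 22 on that node, this closes the OOD → compute → admin hop as well.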

Thanks guys, that’s very helpful. I had OOD_SSHHOST_ALLOWLIST set to only my compute nodes, but I didn’t have v2.metadata.hidden set, so the cluster admin host “B” was getting added to the allowlist as well.

Now that I understand that the cluster login host doesn’t have to be the same as the cluster submit host, I believe I can get the configuration I want by setting both the cluster login host and the shell app’s default ssh host to localhost (host “A” above, my OOD server). This is what I have now:

/etc/ood/config/apps/shell/env:

   OOD_SSHHOST_ALLOWLIST="node-[1-9]: ... (compute nodes only)"
   OOD_DEFAULT_SSHHOST=localhost

/etc/ood/config/clusters.d/clustername.yml:

   v2:
      login:
         host: localhost
         default: true

This seems to do what I want. “B” is still my admin node for Slurm, but user shells go to “A” or a compute node, and I can lock down “B”.
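For anyone finding this thread later, the full cluster file might look roughly like the sketch below, with “B” as the Slurm submit host and shell sessions kept on “A”. The hostnames and title are placeholders; the key names follow the OOD v2 cluster config schema, but check the OOD docs for your version:

```yaml
# /etc/ood/config/clusters.d/clustername.yml -- sketch, hostnames are placeholders
---
v2:
  metadata:
    title: "My Cluster"
  login:
    host: "localhost"           # shell sessions land on the OOD server "A"
    default: true
  job:
    adapter: "slurm"
    submit_host: "admin-node"   # "B": the slurmctld/warewulf host, no user logins
```

Jobs are submitted through “B” via the `submit_host` setting, while nothing in the shell app ever points users at it.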