Hello - I’m looking into an easier way for users to SSH into a compute node using something like PuTTY. My thought is that when a Desktop is launched, the user is offered the option to download a PEM key. Is there a way to create the key on launch and present it to the user in the OoD console/job so that the user can download it?
I would suggest that you create a reference document on how to create SSH keys, and have the users create their own. You can provide different steps depending on whether they use plain SSH (Mac, Linux, WSL) or PuTTY.
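For the plain-SSH half of such a reference document, something along these lines might work. This is only a sketch: the function name `make_client_key` and the key paths are made up for illustration, and the PuTTY conversion step assumes the command-line `puttygen` is installed (Windows users would more typically use the PuTTYgen GUI).

```shell
# Hypothetical helper for a "create your own key" guide.
# $1 = path of the private key to create; any further arguments are
# passed straight to ssh-keygen. With no extra arguments ssh-keygen
# prompts for a passphrase, which is what you want for a long-lived key.
make_client_key() {
    mk_key=$1
    shift
    mkdir -p "$(dirname "$mk_key")" || return 1
    ssh-keygen -t ed25519 -f "$mk_key" "$@" || return 1

    # Optional: also write a PuTTY-format (.ppk) copy if puttygen exists.
    if command -v puttygen >/dev/null 2>&1; then
        puttygen "$mk_key" -O private -o "$mk_key.ppk" ||
            echo "puttygen conversion failed; convert the key manually" >&2
    fi
}

# Example (run on the user's own machine, not on the cluster):
#   make_client_key ~/.ssh/cluster_ed25519
#   ssh-copy-id -i ~/.ssh/cluster_ed25519.pub user@login-node   # host name is a placeholder
```

The private key never leaves the user's machine this way; only the `.pub` file is copied to the cluster.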
Another possibility is that the keys are created as part of user onboarding: the first time they log into the head node (or similar host), they are asked to create their key and given instructions on where to put it on their client. This would be a terminal-based script, not necessarily in OOD.
You can easily create an SSH key and sync the public key to some destination as part of your compute node script when you create the pcluster. It is better to keep that feature as part of pcluster rather than OOD.
That makes sense. So in the compute node boot script in pcluster, are you thinking that I copy the key to their home directory as part of that script? Seems like that could work.
What I like about all of this is that the keys are temporary and only valid for the user’s compute node, which terminates after the job is complete.
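A rough sketch of what that per-job key handling could look like is below. The function names, the `job_<tag>_ed25519` file naming, and the idea of tagging the public key comment with the job ID are all my own assumptions, not something pcluster or OOD provides. It also assumes the code runs as the user (for example from the desktop job's script) and that the home directory is shared with the compute node; a root boot script would additionally need to fix ownership.

```shell
# Create a throwaway key pair for one job and authorize it.
# $1 = user's home directory, $2 = unique job tag (e.g. the scheduler job ID)
# Prints the path of the private key the user would download.
create_job_key() {
    jk_home=$1
    jk_tag=$2
    jk_key="$jk_home/.ssh/job_${jk_tag}_ed25519"
    mkdir -p "$jk_home/.ssh" && chmod 700 "$jk_home/.ssh" || return 1
    ssh-keygen -q -t ed25519 -N "" -C "job-$jk_tag" -f "$jk_key" || return 1
    cat "$jk_key.pub" >> "$jk_home/.ssh/authorized_keys"
    chmod 600 "$jk_home/.ssh/authorized_keys" "$jk_key"
    echo "$jk_key"
}

# Revoke the key again when the job ends.
# Drops every authorized_keys line whose last field is "job-<tag>".
remove_job_key() {
    jk_home=$1
    jk_tag=$2
    jk_auth="$jk_home/.ssh/authorized_keys"
    if [ -f "$jk_auth" ]; then
        awk -v c="job-$jk_tag" '$NF != c' "$jk_auth" > "$jk_auth.tmp" &&
            mv "$jk_auth.tmp" "$jk_auth" &&
            chmod 600 "$jk_auth"
    fi
    rm -f "$jk_home/.ssh/job_${jk_tag}_ed25519" \
          "$jk_home/.ssh/job_${jk_tag}_ed25519.pub"
}
```

Calling `create_job_key "$HOME" "$SLURM_JOB_ID"` at job start and `remove_job_key` at job end would be one way to wire it in, assuming a Slurm scheduler; the private key has no passphrase, so the cleanup step matters.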
This is how OpenHPC creates user keys on first login. You can probably modify it to also provide instructions on how to download that key to their client. These were in the /etc/profile.d directory on the nodes. You probably just need them on login nodes or maybe the OOD node too if they onboard via the built-in terminal there.
[root@node001 profile.d]# cat ssh-keygen.sh
if [ "$(id -u 2>/dev/null)" != "0" ]; then
    if [ ! -f ~/.ssh/id_ecdsa ]; then
        if [ -w ~ ]; then
            echo Creating ECDSA key for ssh
            ssh-keygen -t ecdsa -f ~/.ssh/id_ecdsa -q -N ""
            cat ~/.ssh/id_ecdsa.pub >> ~/.ssh/authorized_keys
            chmod 600 ~/.ssh/authorized_keys ~/.ssh/id_ecdsa ~/.ssh/id_ecdsa.pub
        fi
    fi
fi
[root@node001 profile.d]# cat ssh-keygen.csh
if ( "`id -u`" != "0" ) then
    if ( ! -e ~/.ssh/id_ecdsa ) then
        if ( -w ~/ ) then
            echo Creating ECDSA key for ssh
            ssh-keygen -t ecdsa -f ~/.ssh/id_ecdsa -q -N ""
            cat ~/.ssh/id_ecdsa.pub >> ~/.ssh/authorized_keys
            chmod 600 ~/.ssh/authorized_keys ~/.ssh/id_ecdsa ~/.ssh/id_ecdsa.pub
        endif
    endif
endif
While it may be an established practice, please avoid creating unprotected, long-lived SSH keys to be given to users. If the same key is used over and over, it will inevitably get misused for general-purpose authentication in your CI or someone else’s, and the unprotected private key will end up in the wrong hands. As one might imagine, this type of security intrusion has the potential to span organizations and incur substantial costs.
Going back to the original problem, SSH certificates might be a way forward. They address two challenges: limiting the lifetime of the keypair, and authorizing the keypair on a host that may not already have the user’s authorized_keys file. It could work something like this (some assembly required):
1. OOD has a signing key that only root can use.
2. The compute node SSH configuration trusts this signing key for non-system/non-privileged accounts.
3. When a user requests a direct-SSH keypair, they initiate a sequence of commands on the OOD host (see: sudo and SUDO_USER environment variable for inspiration) that will generate a fresh key pair and an SSH certificate signed by (1); the user’s account name must be baked into the certificate’s principal list, along with a reasonably short but usable lifetime (few hours, few days?)
4. This keypair is saved where the user’s PUN can pick it up, both in OpenSSH and PuTTY flavor. (mind the umask and permissions)
5. The user downloads their key and loads it into their SSH client.
6. The user attempts to SSH in to the compute node, which will accept the authentication as long as: the user’s client provides the certificate and evidence of having the private key, the certificate is valid/not expired, and the set of allowed principals contains the account name that is being logged in to.
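The signing part of that sequence can be sketched with plain `ssh-keygen`. Everything named here is hypothetical (the function name `issue_user_cert`, the CA path `/etc/ssh/ood_user_ca`, the default 8-hour lifetime); the sudo wrapper, handing the files over to the user's PUN, the PuTTY-format copy, and restricting trust to non-system accounts are deliberately left out.

```shell
# One-time setup, shown as comments because it needs root (paths are assumptions):
#   on the OOD host:       ssh-keygen -t ed25519 -f /etc/ssh/ood_user_ca -C "ood-user-ca"
#   on each compute node:  add "TrustedUserCAKeys /etc/ssh/ood_user_ca.pub" to sshd_config

# Generate a fresh key pair and a short-lived certificate for one account.
# $1 = CA private key, $2 = account name, $3 = output directory,
# $4 = validity interval for ssh-keygen -V (default: +8h)
# Prints the private key path; the certificate is written next to it
# as <key>-cert.pub.
issue_user_cert() {
    ic_ca=$1
    ic_user=$2
    ic_dir=$3
    ic_valid=${4:-+8h}
    ic_key="$ic_dir/${ic_user}_ed25519"
    (
        umask 077                                   # keep key material private
        mkdir -p "$ic_dir" &&
        rm -f "$ic_key" "$ic_key.pub" "$ic_key-cert.pub" &&
        ssh-keygen -q -t ed25519 -N "" -C "$ic_user-temp" -f "$ic_key" &&
        # -n bakes the account name into the principal list,
        # -V limits how long the certificate is accepted.
        ssh-keygen -q -s "$ic_ca" -I "ood-$ic_user" -n "$ic_user" \
                   -V "$ic_valid" "$ic_key.pub"
    ) || return 1
    echo "$ic_key"
}
```

Running `ssh-keygen -L -f <key>-cert.pub` on the result shows the principals and validity window, which is a handy check before handing the files to the user.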
Alternatively, if an intermediate jump-host isn’t a problem for the users’ workflows, they can log in to the jump-host using the same credentials as OOD (unless it’s OIDC… sorry!) and use host-based authentication from the jump-host to the compute node.
In any case, some mechanism needs to prevent users from logging in to a compute node unless they’re supposed to be there (think: running one of the user’s jobs vs someone else’s). That, however, is a different issue.