Hi,
We are a small branch campus and have a couple of GPU servers. We recently implemented Slurm and are working on fine-tuning it. Our biggest challenge has been to restrict users from ssh’ing into the compute nodes directly, but that’s not an OOD issue.
Our users typically run Python code directly from pip virtual environments or use Jupyter notebooks (at least these are the two most common use cases I’ve seen in the past year). We realized that launching Jupyter requires SSH tunnel hopping, and sadly the user experience is inconsistent: we end up troubleshooting it for almost every other user.
I don’t know much about OOD, but I glanced through the install docs and read up a bit on what it does. It seems like it would be a pretty good solution for running jobs directly, as well as for running Jupyter notebooks without the SSH tunneling hassle. Is this a correct understanding, or am I oversimplifying it? Secondly, we use Shibboleth and came across a bunch of posts here which suggest OOD and Shibboleth are not necessarily best friends (Jupyter interactive app with Shibboleth SSO). Hence I wanted to ask whether auth is required only for sysadmin purposes or for users as well. If it’s only sysadmins, then we are just a couple of folks here and can set up Dex or whatever is easier, but if all users submitting jobs need to auth, then we would need to figure it out and make it work. I would also like to know whether running OOD requires pretty involved sysadmin work once it is set up, so that we know what to expect.
Just thought to post first to get some feedback before we dive in.
Welcome! Yes, the Jupyter use case you describe is exactly what a large majority of the 2,100+ sites that use Open OnDemand set it up for.
Authentication is generally required for all users. You really want to require it for all users anyways since without it you have no control processes in place around who does what on your system (plus it’s just general best practice for any shared IT resource). I’ll let others chime in with more knowledgeable details regarding Shibboleth in particular.
Running Open OnDemand typically doesn’t require much sysadmin work. We regularly hear stories of skilled sysadmins getting it set up and running as a production service with less than half a day’s work, and then it requiring very minimal maintenance and attention. (This is actually a huge issue for us: many sites don’t upgrade to newer versions regularly because they don’t see the need to invest any significant time in maintaining something that ‘just works’!)
“Our biggest challenge has been to restrict users from ssh’ing into the compute nodes directly, but that’s not an OOD issue.” If it is a Slurm cluster, perhaps pam_slurm_adopt could help?
And yes, OOD is just the thing for us to save users from SSH tunneling, as it proxies things like JupyterLab notebooks automatically.
Thanks so much @gshamov and @alanc, and my apologies for coming back to this late. Same old story, got pulled into other production stuff.
It looks like we will certainly give it a try, it seems very promising and the fact that so many institutions are using it speaks for itself.
I didn’t know about pam_slurm_adopt, but it might be just the thing we need, thanks for that.
One other question I wanted to ask before I install and configure is regarding OOD integration with Slurm for SSH access. Currently we have a login node where the users ssh in and run their code with srun or sbatch. Their home directories live on the login node and are exported to the compute nodes. I’m trying to understand how that would play out with OOD. From the docs I understand that OOD provides a browser-based shell, but if some of our users prefer an actual SSH shell and want to continue using the login node, would that mean OOD would not have information on those jobs? Secondly, since OOD will be on a different VM, I was thinking of exporting the users’ home directories from the login node to the OOD machine as well. Does that make sense?
Thanks once again
Thanks @jeff.ohrstrom for your prompt reply. It certainly clarifies things a bit more; I had been wondering whether I should install OOD on the login node itself and was not sure of the right approach. I’m planning to do the basic install/configure by tomorrow and see how Shibboleth auth works out.
Hi,
So far things haven’t been too bad. I was able to install and configure OOD with Shibboleth auth. After auth I get redirected to the dashboard at “https://ood.mydomain/pun/sys/dashboard/”. I only see ‘Desktop’ under ‘Interactive Apps’, but I guess that’s because I haven’t configured a cluster and/or any apps yet.
I will go through the docs to understand what else needs to be configured, but I also wanted to shamelessly ask if anyone can point me to just the relevant docs, as our use case is pretty minimal at this point. We are currently looking to just have Jupyter notebooks available through OOD. This was the main reason we came to OOD: to avoid SSH tunneling and its headaches. For direct SSH access for users to run their code, we have a login node which is working fine. So if anyone can point to or give a TL;DR version of the Jupyter notebooks config, it would be awesome.
Another thing I noticed is that the install/config docs for Ubuntu seem to leave some things out (maybe that’s on purpose). If anyone wants, I can write up a quick and dirty how-to to fill in the gaps for OOD installation and Apache config on Ubuntu 24, as well as integrating with Shibboleth auth.
Thanks much
Thanks @jeff.ohrstrom for the Jupyter app docs. It doesn’t seem too complicated, but before I start on it, I thought of something that I had not done yet. The authentication works through our Shibboleth IdP, but I don’t see how or where to define admin users vs. regular users. I looked through the docs and came across “user_map_match” under Setup User Mapping, but I haven’t been able to find any separation of admin users vs. regular users. When I ran “/opt/ood/ood-portal-generator/sbin/update_ood_portal”, it supposedly updated ood_portal.yml and the Apache site file ood-portal.conf, which have a reference to user_map_match. As far as I understand, it just removes the email part and extracts the username from REMOTE_USER. Some other applications/platforms I have worked with which integrate with SAML/Shibboleth typically have a users section where we can select certain users to have admin access, but I don’t see something like that in OOD. Am I missing something obvious?
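To make the mapping step concrete, here is a minimal Python sketch of the idea as I understand it: a regex captures the part of REMOTE_USER before the “@”, and that capture becomes the local username. This only illustrates the regex behaviour and is not OOD’s actual code; the pattern, the `map_remote_user` helper, and the example addresses are assumptions made up for illustration.

```python
import re
from typing import Optional

# Hypothetical pattern in the style of a user_map_match setting:
# capture everything before the "@" of the authenticated identity.
USER_MAP_MATCH = r"^([^@]+)@.*$"


def map_remote_user(remote_user: str, pattern: str = USER_MAP_MATCH) -> Optional[str]:
    """Return the username captured by the pattern, or None if it does not match."""
    match = re.match(pattern, remote_user)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Example identities are invented for this sketch.
    print(map_remote_user("alice@mydomain"))  # alice
    print(map_remote_user("alice"))           # None (no "@", so no mapping)
```

Note that nothing in this step assigns a role: it only decides which local account the session belongs to.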
There is no such thing as an ‘admin user’, but there are ‘app developers’: certain users can develop apps that only they can see. This is mostly what you’re thinking about. Here you’ll be able to develop applications in your own sandbox in your own HOME. Then when you’re ready to publish them to your users you can move them to a /var/www directory.
Thanks for the super fast reply @jeff.ohrstrom. I was just about to head out and saw your response. I’ll take a look at App Development that you linked, but just so I understand correctly, there is no concept of admin users as far as OOD is concerned? Is that mainly because at the browser/dashboard level, there are no admin tasks to be carried out? I was mainly thinking along the lines of an admin user being able to view and/or cancel/modify other users’ submitted jobs from OOD, but maybe I’m misunderstanding OOD conceptually? (Of course I can do those things from Slurm or directly from the compute nodes.)
Yeah, OnDemand runs as your regular/unprivileged user. Generally, buttons to cancel a job only apply to your own jobs, as we don’t assume any user can cancel other users’ jobs. So even if your unprivileged/regular user can cancel other users’ jobs, the buttons to do so won’t appear in the application.
But in sum, OOD is running as you - your UID/GID - not root or anything like that. So no, there’s no such thing as ‘admin’ user or similar. The app runs as your unprivileged/regular user and interacts with the scheduler as that user without privilege escalation.
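As a rough illustration of the rule described above, the check amounts to comparing a job’s owner with the logged-in user, with no separate admin role involved. The Python below is a hypothetical sketch, not OnDemand’s actual code; the `Job` class, `can_show_cancel`, and the sample data are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    """Hypothetical, minimal view of a scheduler job."""
    job_id: str
    owner: str


def can_show_cancel(job: Job, current_user: str) -> bool:
    """Show the cancel button only for jobs owned by the logged-in user."""
    return job.owner == current_user


def cancellable_jobs(jobs: List[Job], current_user: str) -> List[str]:
    """Return the IDs of the jobs that would get a cancel button."""
    return [job.job_id for job in jobs if can_show_cancel(job, current_user)]


if __name__ == "__main__":
    # Sample jobs and usernames are made up for this sketch.
    jobs = [Job("101", "alice"), Job("102", "bob"), Job("103", "alice")]
    print(cancellable_jobs(jobs, "alice"))  # ['101', '103']
```

Whatever the scheduler itself would allow that account to do from the command line does not enter into this check.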