We are looking to deploy OOD and have a question regarding the reverse proxy setup. Is it possible to configure the reverse proxy on a separate host from our main OOD server? For security reasons, our OOD server is isolated from the rest of our environment and currently cannot talk to compute nodes directly, which I believe is needed for the reverse proxy to work. Can we shift this part of the workflow to another host that does have the necessary connectivity to the compute nodes?
Sorry in advance if this question seems confusing and thanks in advance for the support!
Hi,
Yes, you can. I had this setup at our site, where I had two OOD servers running, each within the network of a different cluster, but wanted users to be able to submit and launch jobs across the clusters, so I had to pass off the proxy from one OOD server to the other. Once the job was launched, the other OOD server was just handling the proxy. IIRC, you can, for instance, add an ONDEMAND_HOST.
Eric gave me this suggestion when I was doing this: change the submit.yml.erb file, for instance:
@rsettlag One quick and maybe silly question … does the host that is running the reverse proxy (in your case inside each cluster) need to be exposed publicly for the proxy to work correctly? I suspect the answer is yes, and when users open something like a Jupyter notebook they are handed over to the host running the reverse proxy.
Sorry @rsettlag, I should have been more clear … does the user need to access the reverse proxy directly via port 443 (behind VPN is fine)?
For example does the following occur:
Assumptions:
Main OOD Server (192.168.100.1)
Proxy Server running on Cluster (10.0.0.1)
Compute Node (10.0.0.2)
When a user wants to access a Jupyter Notebook, does the following workflow occur:
User launches a Jupyter Notebook on 192.168.100.1
Scheduler spins up a Jupyter Notebook on a compute node
User selects “Connect to Jupyter” on 192.168.100.1
User is redirected to 10.0.0.1 which proxies out to 10.0.0.2
Hopefully that makes more sense? But it still might be as clear as mud?
Basically, what I am getting at is: if the Proxy Server is firewalled off so external users can’t access it directly, does the proxy process still work? I assume it does not.
I think the workflow can be stated a little differently for clarity:
User uses the Jupyter Notebook APP on 192.168.100.1 which submits a job to scheduler
Scheduler spins up a Jupyter Notebook on a compute node
User selects “Connect to Jupyter” on 192.168.100.1
User is redirected to 10.0.0.1 which proxies out to 10.0.0.2
So, now the question is what do users have access to.
If your users don’t have access to the internal proxy, they need a proxy from 192.168.100.1 to 10.0.0.1, which in turn proxies to 10.0.0.2. But it sounds like you can redirect.
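To make the two cases concrete, here is a minimal sketch that works out which hops a connection has to take, given who can reach whom. The reachability maps are hypothetical and only mirror the example addresses above; this is not anything OOD itself computes.

```python
from collections import deque

def proxy_chain(reachable, source, target):
    """Return the shortest list of hops from source to target, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in reachable.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Assumed case 1: users can only reach the main OOD server.
firewalled = {
    "user": ["192.168.100.1"],
    "192.168.100.1": ["10.0.0.1"],
    "10.0.0.1": ["10.0.0.2"],
}
# Assumed case 2: users can also reach the cluster proxy directly.
direct = {**firewalled, "user": ["192.168.100.1", "10.0.0.1"]}

print(proxy_chain(firewalled, "user", "10.0.0.2"))
print(proxy_chain(direct, "user", "10.0.0.2"))
```

In the first case the main OOD server has to proxy to the cluster proxy itself; in the second a plain redirect to 10.0.0.1 is enough.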
As Nate mentioned, we are working to deploy OOD using the reverse proxy on a separate host because of security reasons. Our main OOD server is isolated from the compute nodes.
However, so far I have not been successful in enabling the Interactive Desktop using this configuration. I’m looking for advice regarding which OOD or BC Desktop files I need to change to make this happen.
Example:
Main OOD Server: ood.example.org
Proxy Server: apache.example.org (only Apache is installed here, but it has network access to the compute node).
Compute Node: cn01
I followed the advice of @rsettlag and generated the variable ONDEMAND_HOST in the connection.yml file. However, apparently the only effect this has is creating the following variables in the compute node environment:
Thus, of course, the reverse proxy fails because the OOD server cannot communicate with cn01.
How can I generate the right connection file so that the generated link goes through the host running the Apache server (to perform the OOD reverse proxy) and then comes back to the OOD host to display the Desktop session through noVNC?
Is this even possible using the latest OOD release?
@HPCworks noVNC-1.1.0/vnc.html is all client side; noVNC’s RFB.js on the client is used to directly authenticate the connection from the query parameters. Try setting the path to the FQDN of the remote WebSockify server that’s running instead of a relative one.
Mario, am I correct that he needs WebSockify on the server he has Apache on, if that is ultimately the proxy server? The OOD server will simply redirect to that server via the view button. Perhaps I am totally off on the scenario.
@HPCworks @rsettlag Not quite. If the ultimate goal is to run WebSockify on another server, then WebSockify is the only thing that needs to be reachable. OOD will redirect to wherever novnc_link references:
Take this for example: path is set to rnode/cn01/61518/websockify. When the client is redirected to noVNC-1.1.0/vnc.html, the client’s browser takes over completely from there and makes a WebSocket connection to rnode/cn01/61518/websockify from the client itself.
rnode/cn01/61518/websockify is a relative path so if our domain was localhost the FQDN is https://localhost/rnode/cn01/61518/websockify
The client then starts a WebSocket connection to wss://localhost/rnode/cn01/61518/websockify, but that path can be anything, as long as it’s a valid WS endpoint running WebSockify.
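As a rough illustration of that resolution step, here is a small sketch that turns a path value into the WebSocket URL the browser would end up connecting to. It only models the behaviour described above (relative paths resolve against the page origin, https becomes wss); it is not noVNC’s actual code, and the function name and example page URL are made up.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

def websocket_url(page_url: str, path: str) -> str:
    """Resolve `path` against the origin of `page_url` and switch to ws/wss."""
    page = urlsplit(page_url)
    origin = f"{page.scheme}://{page.netloc}/"
    # urljoin leaves an already absolute URL untouched and
    # resolves a relative one against the origin.
    resolved = urlsplit(urljoin(origin, path))
    scheme = "wss" if resolved.scheme == "https" else "ws"
    return urlunsplit((scheme, resolved.netloc, resolved.path, resolved.query, ""))

page = "https://localhost/noVNC-1.1.0/vnc.html"
print(websocket_url(page, "rnode/cn01/61518/websockify"))
print(websocket_url(page, "https://apache.example.org/rnode/cn01/61518/websockify"))
```

The second call shows the point of using a fully qualified value: the connection goes to whatever host is named there rather than back to the host that served vnc.html.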
Thanks for your support and thanks to the OSC team for OOD, a fantastic HPC solution.
Our main roadblock to enable OOD interactive apps is that our OOD server is isolated from the compute nodes.
I followed your advice, and even if I use the FQDN instead of a relative path, the Desktop won’t display in the client browser because there is NO network connection between the OOD server and the compute node cn01. Thus, the WebSockify server cannot be reached (error: Failed to connect to server):
However, we have configured an Apache server on a host that can access cn01. We configured this Apache server using the ood-portal.conf file from our OOD server, which contains the OOD reverse proxy instructions. However, this host doesn’t have OOD because that would defeat the purpose of our security requirements.
To try your advice, we installed noVNC on the host that runs Apache, and after the Desktop Slurm job started on cn01, I tried to connect using a link like this:
This time noVNC connects but instead of displaying the Mate Desktop running on the compute node, it presents a login shell for the OOD server:
CentOS Linux 7 (Core) Kernel 3.10.0-1160.31.1.el7.centos.plus.x86_64 on an x86_64
ood login:
Again, we’re looking for advice on which OOD core or batch connect YML files we need to customize to enable this setup, if it is possible at all. For example, I know that the Dashboard parses the connection.yml file to generate the Dashboard link to connect to the running Desktop session on the compute node. I also know that the parameters in the connection.yml file can be set in the cluster.yml.erb “embedded Ruby” batch connect file.
Does your team at OSC have any cluster.yml.erb example files for similarly complex reverse proxy setups?
(2) Then you’ll need to pass back the real host through conn_params. Here’s how that works. You should also set these to be cluster wide (as opposed to setting them for every single app).
(3) Now at this point, your apps will know all the information they need, you need to use them in the form. I’d suggest starting with an app like Jupyter instead of desktop, just to defer the whole noVNC situation, because that may need additional stuff that I don’t have offhand.
Here’s a view.html.erb I took from our RStudio app (changing rnode to node). You can see that the URL we’re redirecting to has host, which you’ve set as a static string in #1, and the_real_host, which you’ve configured in #2.
I believe you’ll always have to use /node as opposed to /rnode. Our proxy will remove the /rnode/host/port portion of the URL when sending to rnode (because it’s relative). You’ll need to keep these intact so your intermediary can then determine what to do/remove.
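To illustrate the difference between the two prefixes, here is a minimal sketch of the rewriting described above: /rnode drops the /rnode/host/port prefix before forwarding, while /node forwards the path intact. It only models the behaviour in Python; the real rules live in the Apache configuration, and the function name is made up.

```python
import re

# Matches /node/<host>/<port>/... and /rnode/<host>/<port>/...
_PROXY_RE = re.compile(
    r"^/(?P<kind>r?node)/(?P<host>[^/]+)/(?P<port>\d+)(?P<rest>/.*)?$"
)

def backend_url(request_path: str) -> str:
    """Return the URL the proxy would forward this request path to."""
    match = _PROXY_RE.match(request_path)
    if match is None:
        raise ValueError(f"not a /node or /rnode path: {request_path}")
    host, port = match["host"], match["port"]
    if match["kind"] == "rnode":
        # Relative: the /rnode/<host>/<port> prefix is stripped.
        path = match["rest"] or "/"
    else:
        # Absolute: the backend sees the full original path.
        path = request_path
    return f"http://{host}:{port}{path}"

print(backend_url("/rnode/cn01/61518/websockify"))
print(backend_url("/node/cn01/8888/login"))
```

With /node the host and port survive in the path, which is what lets a second hop still work out where the request is ultimately headed.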
(4) At this point we’re talking to your intermediary, we’re proxying to it. Now comes the hard bit because you likely have to modify the URL to fit what the app is expecting and do the actual proxying. I don’t believe you can just use another OOD installation as the URL has an extra host in it so it may throw off our parsing expectations.
We’re still working on trying to make this intermediary proxy set up functional for Jupyter Notebooks.
While working on this, I found another way to execute Jupyter Notebooks on our compute nodes (cn01) but it involves a couple of “manual” steps that I would like to eliminate, if possible.
I access one of our cluster login nodes and run Firefox to access our OOD server. The login nodes have network access to our compute nodes. After Slurm starts the Jupyter job, (1) I use the Dashboard Files App to open the output.log and connection.yml files to get the web link and password for the Jupyter server on the compute node. (2) I copy and paste the link into Firefox to access Jupyter and execute notebooks.
The links in output.log are similar to this, but the port varies (between 2000 and 65535) for each Slurm job:
All,
I just solved my own question. The view.html.erb for this case looks like this:
<form action="http://<%= host %>:<%= port %>/node/<%= host %>/<%= port %>/login" method="post" target="_blank">
<input type="hidden" name="password" value="<%= password %>">
<button class="btn btn-primary" type="submit">
<i class="fa fa-eye"></i> Connect to Jupyter
</button>
</form>
In addition, I need to secure the Jupyter Server using SSL/HTTPS so that the password is not sent unencrypted by the browser to the web server. Right now, I’m running on a private network but when running the notebook on the public internet this is absolutely required.
Hi all - just to give an update to this thread - We’re aware of the security issues around the current proxy. Given time and resources, I’d love to replace it.
We’re tracking the work in this project here - and as you may see there’s not much movement on it.
Here’s our thinking. First, we need to document the current problem space.
Then we need to explore some proxy solutions. Another Apache on another instance may be part of this.
We’re thinking about writing an Apache module to give that layer a bit of authorization, where only I can get proxied to my sessions and I can’t even see others’. This is the main problem we need to solve. Then we can start to solve SSL termination to the compute nodes and more.
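Just to sketch the kind of check that layer would need to make (this is an illustration of the idea, not the planned module or any existing OOD code; the session table and names are hypothetical):

```python
from typing import Dict, Tuple

# Hypothetical registry: (host, port) of a running session -> owning user.
SessionOwners = Dict[Tuple[str, int], str]

def may_proxy(user: str, host: str, port: int, owners: SessionOwners) -> bool:
    """Allow proxying only to sessions the requesting user owns.

    Unknown host/port pairs are denied, so a user cannot probe
    arbitrary ports on the compute nodes.
    """
    return owners.get((host, port)) == user

owners = {("cn01", 8888): "alice", ("cn01", 61518): "bob"}
print(may_proxy("alice", "cn01", 8888, owners))   # alice's own session
print(may_proxy("alice", "cn01", 61518, owners))  # bob's session
```

The hard part in practice is keeping such a table accurate and available to the proxy, which this sketch does not address.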
We’re open to looking at existing proxies but the administrative overhead of an Envoy or similar seems like a lot, perhaps too much especially for small sites.
Hope that helps! Let me know what sort of progress you make on this.
Glad to hear you all set up Jupyter, at least a demonstration of what/how this works.
I know we have issues with /rnode URLs that I don’t have a solution for off the top, but I’ll keep thinking on it and hacking. I’ll let you know what I come up with.
Re-reading that comment I had said this. So we’ll see how to hack /node to get this to work if it can.