OOD app that uses websockets

Has anyone made a pass at building a streamlit OOD app? We are not sure what to do about the websocket connections that it seems to require. It looks like something like the following would be needed, in apache parlance:

RewriteEngine On
ProxyPreserveHost Off

 <Proxy *>
         Order deny,allow
         Allow from all
 </Proxy>

RewriteCond %{HTTP:Upgrade} =websocket
RewriteRule /(.*) ws://localhost:8501/$1 [P]
RewriteCond %{HTTP:Upgrade} !=websocket
RewriteRule /(.*) http://localhost:8501/$1 [P]
ProxyPassReverse / http://localhost:8501

Here 8501 would be replaced by the open port on the compute host, which can be found by similar means as in the jupyter app. Is there an example that uses websockets and is less complicated than the shell app?
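For readers following along, here is a minimal sketch of the "find an open port, then start the server on it" step described above. It is not how the jupyter app does it (that is done in the job script); it only illustrates the idea. The function names and `app.py` are hypothetical; `--server.port` and `--server.headless` are streamlit's own options.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # The OS picked the port; read it back before the socket closes.
        return sock.getsockname()[1]


def streamlit_command(port: int, script: str = "app.py") -> list:
    """Build the argv for starting streamlit on the given port.

    `script` is a placeholder for the real application file.
    """
    return [
        "streamlit", "run", script,
        "--server.port", str(port),
        "--server.headless", "true",
    ]
```

Note that the port is released before streamlit binds it, so another process could grab it in between. On a shared compute node that race is small but real.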

I want to say we support this out of the box. I mean that apps can have websockets and we proxy to them just fine.

Not sure what prompted this, but I’d imagine you’ve found a dropped connection somewhere from the client → apache → origin.

Again - I don’t believe you’d need extra apache configs to make this work. What’s the behaviour with the default apache conf?

Thanks, I think you’re probably right. Some form of overly aggressive chrome caching plus some other weird behavior which I’ll describe below was confusing the situation a bit. In view.html.erb I have tried several things but it seems like something similar to what’s in our jupyter app is close to working. The app essentially searches for an open port, fires up the server using that port and then should proxy back using the rule defined in view.html.erb for the button, etc.

“Should” … what we are seeing is a 405 in the browser until the page is refreshed a couple of times, and then the app seems to proxy through ok and work. It doesn’t seem to be timeout related at the start of the job: I can wait for several minutes after the app starts and the error still occurs. I’ve cranked up the log level on the app side and it doesn’t seem to be getting any requests, so it seems to be stuck somewhere in the plumbing. Any ideas?

Here’s our view.html.erb

<form action="/rnode/<%= host %>/<%= port %>/" method="post" target="_blank">
  <button class="btn btn-primary" type="submit">
    <i class="fa fa-eye"></i> Connect
  </button>
</form>

We don’t see any 405s in the user nginx logs but we do see 405s in the apache ssl error log, e.g.,

[Mon Aug 02 20:13:32.463102 2021] [lua:info] [pid 19978:tid 139943170987776] [client redacted:52842] req_protocol="HTTP/1.1" req_handler="proxy-server" req_method="POST" req_accept="text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9" req_user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36" res_content_length="87" req_content_type="application/x-www-form-urlencoded" res_content_encoding="" req_status="405" req_origin="https://redacted" time_user_map="55.888" local_user="redacted" req_referer="https://redacted/pun/sys/dashboard/batch_connect/sessions" res_content_language="" req_port="443" req_is_websocket="false" req_server_name="redacted" log_hook="ood" req_accept_charset="" req_hostname="redacted" log_id="YQiKLJg6N7MXT0HwCqaJBQAABhc" res_content_location="" res_location="" log_time="2021-08-03T00:13:32.463011Z" remote_user="redacted" req_user_ip="redacted" res_content_disp="" req_is_https="true" req_filename="proxy:http://gpu-006.storage:47157/" req_uri="/rnode/gpu-006.storage/47157/" time_proxy="2.359" res_content_type="text/html; charset=utf-8" req_accept_language="en-us,en;q=0.9" req_cache_control="max-age=0" req_accept_encoding="gzip, deflate, br", referer: https://redacted/pun/sys/dashboard/batch_connect/sessions

A couple things here:

You’re using /rnode, so the request seen by the origin (streamlit in this case) is for just /. That is, /rnode/some.host.edu/3005 is turned into just / by the time it reaches the app. The app doesn’t see the rest of that path.

The 405 is Method Not Allowed. You’re sending a POST request to / and the app doesn’t like that. You’ll notice that for Jupyter we send it to /login, meaning we POST credentials to /login, which is what Jupyter expects.

You’ll have to find what path your app (streamlit) wants you to log in through. And you’ll definitely want some sort of authentication here. I don’t see a username or password in the form there, so I’m guessing you redacted it. That’s fine, but I cannot stress this enough: you need to set the origin app (streamlit) up to authenticate, otherwise anyone will be able to get into the app.
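The /rnode behaviour described above can be sketched as follows. This is only an illustration of the prefix stripping, not OnDemand's actual proxy code; the `Target` type and `rnode_target` function are made-up names.

```python
import re
from typing import NamedTuple, Optional


class Target(NamedTuple):
    host: str
    port: int
    path: str  # the path the origin app actually receives


# /rnode/<host>/<port>[/<rest>]
_RNODE = re.compile(r"^/rnode/([^/]+)/(\d+)(/.*)?$")


def rnode_target(uri: str) -> Optional[Target]:
    """Map a browser-facing /rnode URI to the backend host, port and path.

    The /rnode/<host>/<port> prefix is stripped, so the app only sees
    whatever follows it (or "/" when nothing follows).
    """
    match = _RNODE.match(uri)
    if match is None:
        return None
    host, port, rest = match.groups()
    return Target(host, int(port), rest or "/")
```

With the URI from the log line above, `rnode_target("/rnode/gpu-006.storage/47157/")` gives host `gpu-006.storage`, port `47157` and path `/`, which is why the POST lands on the app's root.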

Yeah, I don’t think there’s any real form of auth built in to streamlit that I can enforce easily. There seem to be ways to have the individual streamlit app request a password, but our developers want control of that code via filepicker.

I’m not sure where to redirect other than /. I don’t think there’s a notion of a /login in streamlit (I tried that). We are doing that POST primarily so it would open in a new tab; I’m not sure if there’s another way to get that. As for auth, there does seem to be some cookie functionality available, but an actual login to streamlit does not seem to be available.

I set up a standalone VM with the apache config from the first post of this thread and I don’t see any of these 405 issues; the VM instance is reachable by browsing to just the hostname with no subpath. With that instance we are using basic auth against our LDAP. That level of control is already in OnDemand of course. Our compute nodes are on a private network not accessible from the corporate/public internal network (the OnDemand host obviously has access to that network), so I’d figure that’s less of a concern? Maybe I’m missing something in terms of a security hole, though.

OK well you may be able to change this to GET method instead of POST.

<form action="/rnode/<%= host %>/<%= port %>/" method="get" target="_blank"> 
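To check which methods the app accepts before touching the form, one option is to probe it directly on the compute node. A minimal sketch, assuming the app is reachable over plain HTTP; `probe_method` is a hypothetical helper and the URL in the usage note is a placeholder.

```python
import urllib.error
import urllib.request

# Skip any configured HTTP proxy so the request goes straight to the app.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe_method(url: str, method: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code the server gives for `method` on `url`."""
    # POST needs a body; an empty one mimics the form with no fields.
    data = b"" if method == "POST" else None
    request = urllib.request.Request(url, data=data, method=method)
    try:
        with _OPENER.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is what we want.
        return err.code
```

For example, calling `probe_method("http://localhost:8501/", "POST")` and then the same with `"GET"` would show whether the 405 comes from the app itself rather than from the proxy.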

To the security flaw - I could find your instance by brute forcing requests to /rnode/<host>/<port>. It may take me some time, but eventually I’d get apache to redirect to your instance of the app. Now if that app has authentication, it’s going to prompt me (or reject my request based on cookies/headers). If the app doesn’t prompt or reject me, well, then I’m being redirected directly to your app instance, same as you, and that’s that.

I’ll try the GET and report back.

With regard to the security vulnerability, maybe I am still missing something, but the user would have to already have authenticated against OnDemand to do that brute force walk, right?

Do you trust everyone who’s authenticated? Yes, this attack requires authentication, but if you’re like OSC, getting an account is fairly easy for students at OSU or any other Ohio university (so lots of folks logging in).

Note that the GET method worked. I’m pretty sure I tried that earlier and was once again misled by overly aggressive caching in chrome. In any case, thanks again for the nudge!

Our users are trusted corporate citizens, so yeah, I guess the risk is getting access to someone else’s data primarily, it’s not really a risk of elevating to admin privileges. I’m much more concerned about someone compromising apache, say, but probably should think about this a bit more. Would be nice if streamlit would support this out of the box.