Blank Screen for Interactive Jupyter Session

Hello,

I’ve installed the OSC bc_jupyter example to use as an interactive session, but it only opens to a blank screen. When inspecting the console, I see a lot of failed requests for JS bundles, all 503s.

The output.log looks fine, and says the user gets logged in successfully. I am able to use an interactive desktop session successfully, so I think the web proxying works.

You’ll have to open one of those requests up to see the URL path being requested.

What’s happening is we’re consuming that path to find the host and port to proxy to.

So if a path starts like this: /node/mycoolhost/1234, apache on the OOD server needs connectivity to mycoolhost:1234. Given you had to edit the set_host, I’d guess you have something going on where it should be mycoolhost.some.domain, or it has the domain and doesn’t need it. Or there’s no connectivity to port 1234.
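To make that concrete, here is a minimal Python sketch of the idea. It is not OOD's actual implementation (that lives in the Apache configuration); split_node_path and can_connect are hypothetical helper names used only for illustration.

```python
import socket
from typing import Tuple


def split_node_path(path: str) -> Tuple[str, int, str]:
    """Split '/node/<host>/<port>/<rest>' into (host, port, rest)."""
    parts = path.split("/", 4)  # ['', 'node', host, port, rest]
    if len(parts) < 4 or parts[0] != "" or parts[1] not in ("node", "rnode"):
        raise ValueError(f"not a /node or /rnode path: {path!r}")
    host, port = parts[2], int(parts[3])
    rest = "/" + parts[4] if len(parts) > 4 else "/"
    return host, port, rest


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False


host, port, rest = split_node_path("/node/mycoolhost/1234/tree")
print(host, port, rest)
```

Running can_connect(host, port) on the OOD server itself would then tell you whether the name resolves and the port is reachable from where apache runs.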

I’m able to curl the host and port (desktop-dy-desktop-cr-1.imaging-poc.pcluster:32634) in the URL path and get a good response from OOD. It’s using /node and not /rnode, does that matter?

/node/desktop-dy-desktop-cr-1.imaging-poc.pcluster/32634/tree

:man_facepalming: yes it’s /node sorry I updated the comment.

Also - I’m now noticing the status codes are 503, not 404 as I’d expected. We need to find where these 503s are coming from - either apache or the origin server (jupyter in this case).

Open one of those requests up to see the response headers. The Server header should indicate if it’s coming from apache or Jupyter. Whichever case it is - check the logs of that application.
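As a rough sketch of how to read that header: the mapping below rests on assumptions, namely that Jupyter (which runs on Tornado) answers with a Server value starting with TornadoServer, that apache answers with one starting with Apache, and that responses generated by an AWS load balancer itself start with awselb. The function name and the sample values are illustrative only.

```python
def classify_server_header(value):
    """Guess which layer produced a response from its Server header."""
    lowered = (value or "").lower()
    if lowered.startswith("apache"):
        return "apache"
    if "tornado" in lowered:
        return "jupyter"  # assumption: Jupyter identifies itself via Tornado
    if lowered.startswith("awselb"):
        return "alb"  # assumption: response generated by the load balancer
    return "unknown"


# Illustrative header values, not taken from a real capture.
for header in ("Apache/2.4", "TornadoServer/6.3", None):
    print(repr(header), "->", classify_server_header(header))
```

A missing or unfamiliar value falls through to "unknown", in which case the other response headers are the next thing to look at.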

Hmm, I don’t see any of these requests for bundle.js in the httpd error or access logs. I wonder what’s returning the 503 then, and how it gets a Server header of Apache/2.4.37 (CentOS Stream) OpenSSL/1.1.1k

Is it from the Amazon load balancer?

The ALB logs show it forwarded the request to httpd (forward and one of the 503s is the response from httpd), so kinda weird it’s not showing any logs. Guess I’ll keep digging.

h2 2023-10-10T14:34:33.889935Z app/OpenOnD-ALB-1CUIV2YFP9X61/639b79ab36bae8e6 163.116.252.74:60042 10.0.3.181:443 0.000 0.008 0.000 503 503 404 772 "GET https://iwd.aws-research-7225140000-d3b-sandbox01-dev.aws.cloud.chop.edu:443/node/desktop-dy-desktop-cr-1.imaging-poc.pcluster/40837/static/notebook/6774.bundle.js HTTP/2.0" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 arn:aws:elasticloadbalancing:us-east-1:452734729332:targetgroup/OpenOn-OpenO-7CCXQKI9NNDT/bf5d185635b0f57f "Root=1-652560f9-69c45a11546f810a2795b3a9" "iwd.aws-research-7225140000-d3b-sandbox01-dev.aws.cloud.chop.edu" "arn:aws:acm:us-east-1:452734729332:certificate/93ade204-5913-4bd8-9918-96d83a51a995" 0 2023-10-10T14:34:33.881000Z "forward" "-" "-" "10.0.3.181:443" "503" "-" "-"
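For anyone reading along, here is a small Python sketch that pulls the leading fields out of that entry. The field order follows my reading of the AWS ALB access log format; ALB_FIELDS and parse_alb_line are names I made up, and the sample is the entry above trimmed after a shortened user agent.

```python
import shlex

# Leading fields of an ALB access log entry, in order.
ALB_FIELDS = [
    "type", "time", "elb", "client", "target",
    "request_processing_time", "target_processing_time",
    "response_processing_time", "elb_status_code", "target_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
]


def parse_alb_line(line):
    """Map the leading space-separated (optionally quoted) fields to names."""
    return dict(zip(ALB_FIELDS, shlex.split(line)))


sample = (
    "h2 2023-10-10T14:34:33.889935Z "
    "app/OpenOnD-ALB-1CUIV2YFP9X61/639b79ab36bae8e6 "
    "163.116.252.74:60042 10.0.3.181:443 0.000 0.008 0.000 503 503 404 772 "
    '"GET https://iwd.aws-research-7225140000-d3b-sandbox01-dev.aws.cloud.chop.edu:443'
    "/node/desktop-dy-desktop-cr-1.imaging-poc.pcluster/40837"
    '/static/notebook/6774.bundle.js HTTP/2.0" '
    '"Mozilla/5.0"'
)

entry = parse_alb_line(sample)
print(entry["target"], entry["elb_status_code"], entry["target_status_code"])
```

Both status fields are 503 and the target is 10.0.3.181:443. As far as I understand the format, target_status_code is only filled in when the target actually sent a response, which would mean the 503 came from whatever answers on that address rather than from the ALB itself.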

Something I noticed is that curl with https:// to the notebook JS bundle hangs, while http:// returns successfully immediately. Unsure if this helps.
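Roughly what I compared, as a Python sketch rather than curl, in case someone wants to reproduce it. probe and compare_schemes are hypothetical helpers; proxies from the environment are deliberately bypassed so the request goes straight to the host, and the timeout turns a hang into a reported failure.

```python
import http.client
import time
import urllib.error
import urllib.request

# An opener with an empty ProxyHandler ignores http_proxy/https_proxy.
_DIRECT = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=5.0):
    """Fetch url once; return (outcome, seconds) without raising."""
    start = time.monotonic()
    try:
        with _DIRECT.open(url, timeout=timeout) as resp:
            outcome = f"{resp.status} Server={resp.headers.get('Server', '-')}"
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a 2xx.
        outcome = f"{exc.code} Server={exc.headers.get('Server', '-')}"
    except (OSError, http.client.HTTPException) as exc:
        # Refused, timed out, TLS failure, malformed response, ...
        outcome = f"failed: {exc!r}"
    return outcome, time.monotonic() - start


def compare_schemes(host, port, path, timeout=5.0):
    """Probe the same host, port and path over http and https."""
    return {
        scheme: probe(f"{scheme}://{host}:{port}{path}", timeout)
        for scheme in ("http", "https")
    }
```

Calling compare_schemes with the compute node's host, port and the bundle path gives one outcome and elapsed time per scheme, so a hang shows up as a failure whose time is close to the timeout.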

Fwiw, this is just icing on the cake for the consuming team so it’s no biggie if this doesn’t work. I suppose I can always just run Jupyter inside the interactive desktop.

Maybe it’s something to do with SSL re-encryption/offloading.

I still can’t believe you don’t see the requests in httpd. Did you check all the logs? There’s an _error.log but there’s also just an error.log too. If httpd doesn’t see it/log it then I doubt the ALB is actually forwarding the request. Or at least - you should confirm, maybe with netstat, that the TCP connection is being made.

They haven’t tried to copy and paste in a desktop yet! My guess is that’s how they feel now; give it time and they’ll ask for it again.

Sadly, grep -rnw /var/log/ -e 'bundle.js' returned nothing :frowning:

It’s pretty weird. I see that a 503 is the default error for an OOD outage, ErrorDocument 503 /public/maintenance/index.html in /etc/httpd/conf.d/ood-portal, but at this point I’m grasping at straws.

Looks like the TCP connection is indeed being made on both OOD and the compute node. Not sure why all these ports are getting hit though, when Jupyter is listening on port 13405. Apparently TIME_WAIT means the connection was closed locally. Not sure what to make of this.

OOD Netstat:

tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:33134 TIME_WAIT
tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:33552 TIME_WAIT
tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:33196 TIME_WAIT
tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-161.ec2:45324 TIME_WAIT
tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:32098 TIME_WAIT
tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:32614 TIME_WAIT
tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:34016 TIME_WAIT
tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:34050 TIME_WAIT
tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:34264 TIME_WAIT
tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:34418 TIME_WAIT
tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:32152 TIME_WAIT
tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:32064 TIME_WAIT

Compute netstat:

tcp        0      0 desktop-dy-deskto:13405 ip-10-0-3-181.ec2:49098 TIME_WAIT
tcp        0      0 desktop-dy-deskto:13405 ip-10-0-3-181.ec2:41270 TIME_WAIT
tcp        0      0 desktop-dy-deskto:13405 ip-10-0-3-181.ec2:49156 TIME_WAIT
tcp        0      0 desktop-dy-deskto:13405 ip-10-0-3-181.ec2:41324 TIME_WAIT
tcp        0      0 desktop-dy-deskto:13405 ip-10-0-3-181.ec2:49070 TIME_WAIT
...

TIME_WAIT status is fine and expected to see in this scenario. The good news is that at least we know we’re proxying back to the origin server (Jupyter).
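On the question of why so many ports show up: in netstat output the listening port is in the local address column, and the changing numbers in the foreign address column are the ephemeral source ports the client side picked for each connection. A short Python sketch that groups rows that way, using a few rows from the output posted above (summarize_netstat is a hypothetical helper):

```python
from collections import Counter


def summarize_netstat(lines):
    """Count TCP rows by (local port, remote host, state)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        local_port = local.rpartition(":")[2]
        remote_host = foreign.rpartition(":")[0]
        counts[(local_port, remote_host, state)] += 1
    return counts


rows = [
    "tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:33134 TIME_WAIT",
    "tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-47.ec2.:33552 TIME_WAIT",
    "tcp6       0      0 ip-10-0-3-181.ec2:https ip-10-0-0-161.ec2:45324 TIME_WAIT",
    "tcp        0      0 desktop-dy-deskto:13405 ip-10-0-3-181.ec2:49098 TIME_WAIT",
    "tcp        0      0 desktop-dy-deskto:13405 ip-10-0-3-181.ec2:41270 TIME_WAIT",
]

for key, count in sorted(summarize_netstat(rows).items()):
    print(key, count)
```

Grouped like this, every row on the OOD host is a connection to its https port, and every row on the compute node is a connection from the OOD host to 13405.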

If you turn debug logs on for Jupyter does it have any new insight into what’s going on?

I’d also wonder, if you got a Jupyter session running and were able to connect to it through some other mechanism (maybe a desktop + browser, connecting to desktop-dy-deskto:13405 directly), what the behavior would be. I guess I’m trying to eliminate OOD from the request route and verify that the app is actually running correctly, so you connect directly to the app without routing through an ALB or through OOD’s httpd.

Well, I can at least get a webpage back from Jupyter when running on an interactive desktop on the same node.

Here are the debug logs. The line Path bundle.js served from /usr/local/lib/python3.8/site-packages/notebook/static/bundle.js seems interesting. Is OOD looking for the bundle paths on its own filesystem? But then how is the favicon getting served? I would assume the *_bundle.js requests get proxied too.

Discovered Jupyter Notebook server listening on port 33504!
TIMING - Wait ended at: Tue Oct 10 19:23:06 UTC 2023
Generating connection YAML file...
[D 2023-10-10 19:24:18.712 ServerApp] Generating new user for token-authenticated request: 8a5fee7697884051933bb3c2a4660761
[I 2023-10-10 19:24:18.714 ServerApp] User 8a5fee7697884051933bb3c2a4660761 logged in.
[I 2023-10-10 19:24:18.715 ServerApp] 302 POST /node/desktop-dy-desktop-cr-1.imaging-poc.pcluster/33504/login (8a5fee7697884051933bb3c2a4660761@10.0.3.181) 4.16ms
[I 2023-10-10 19:24:18.786 ServerApp] 302 GET /node/desktop-dy-desktop-cr-1.imaging-poc.pcluster/33504/ (@10.0.3.181) 0.42ms
[D 2023-10-10 19:24:18.852 ServerApp] Paths used for configuration of page_config: 
    	/etc/jupyter/labconfig/page_config.json
[D 2023-10-10 19:24:18.853 ServerApp] Paths used for configuration of page_config: 
    	/usr/local/etc/jupyter/labconfig/page_config.json
[D 2023-10-10 19:24:18.853 ServerApp] Paths used for configuration of page_config: 
    	/usr/etc/jupyter/labconfig/page_config.json
[D 2023-10-10 19:24:18.853 ServerApp] Paths used for configuration of page_config: 
    	/shared/home/Admin/.local/etc/jupyter/labconfig/page_config.json
[D 2023-10-10 19:24:18.854 ServerApp] Paths used for configuration of page_config: 
    	/shared/home/Admin/.jupyter/labconfig/page_config.json
[D 2023-10-10 19:24:18.855 JupyterNotebookApp] Using contents: services/contents
[D 2023-10-10 19:24:18.861 JupyterNotebookApp] 200 GET /node/desktop-dy-desktop-cr-1.imaging-poc.pcluster/33504/tree? (8a5fee7697884051933bb3c2a4660761@10.0.3.181) 10.38ms
[D 2023-10-10 19:24:19.012 ServerApp] Paths used for configuration of page_config: 
    	/etc/jupyter/labconfig/page_config.json
[D 2023-10-10 19:24:19.012 ServerApp] Paths used for configuration of page_config: 
    	/usr/local/etc/jupyter/labconfig/page_config.json
[D 2023-10-10 19:24:19.013 ServerApp] Paths used for configuration of page_config: 
    	/usr/etc/jupyter/labconfig/page_config.json
[D 2023-10-10 19:24:19.013 ServerApp] Paths used for configuration of page_config: 
    	/shared/home/Admin/.local/etc/jupyter/labconfig/page_config.json
[D 2023-10-10 19:24:19.013 ServerApp] Paths used for configuration of page_config: 
    	/shared/home/Admin/.jupyter/labconfig/page_config.json
[D 2023-10-10 19:24:19.014 JupyterNotebookApp] 200 GET /node/desktop-dy-desktop-cr-1.imaging-poc.pcluster/33504/custom/custom.css (8a5fee7697884051933bb3c2a4660761@10.0.3.181) 3.20ms
[D 2023-10-10 19:24:19.015 ServerApp] Path bundle.js served from /usr/local/lib/python3.8/site-packages/notebook/static/bundle.js
[D 2023-10-10 19:24:19.016 ServerApp] 200 GET /node/desktop-dy-desktop-cr-1.imaging-poc.pcluster/33504/static/notebook/bundle.js (8a5fee7697884051933bb3c2a4660761@10.0.3.181) 1.01ms
[D 2023-10-10 19:24:20.215 ServerApp] Path favicons/favicon.ico served from /usr/local/lib/python3.8/site-packages/jupyter_server/static/favicons/favicon.ico
[D 2023-10-10 19:24:20.216 ServerApp] 200 GET /node/desktop-dy-desktop-cr-1.imaging-poc.pcluster/33504/static/favicons/favicon.ico (8a5fee7697884051933bb3c2a4660761@10.0.3.181) 1.16ms
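To list which requests actually reached Jupyter, here is a small Python sketch that pulls the status, method and path out of lines like the ones above. The regular expression is my guess at the log layout based only on this excerpt, and requests_seen is a hypothetical helper.

```python
import re

# Matches e.g. "[D 2023-10-10 19:24:19.016 ServerApp] 200 GET /path (...) 1.01ms"
REQUEST_RE = re.compile(
    r"^\[(?P<level>[A-Z]) (?P<time>\S+ \S+) (?P<app>\w+)\] "
    r"(?P<status>\d{3}) (?P<method>[A-Z]+) (?P<path>\S+)"
)


def requests_seen(log_lines):
    """Return (status, method, path) for every request line in the log."""
    seen = []
    for line in log_lines:
        match = REQUEST_RE.match(line)
        if match:
            seen.append((int(match["status"]), match["method"], match["path"]))
    return seen


excerpt = [
    "[I 2023-10-10 19:24:18.714 ServerApp] User 8a5fee7697884051933bb3c2a4660761 logged in.",
    "[D 2023-10-10 19:24:19.015 ServerApp] Path bundle.js served from "
    "/usr/local/lib/python3.8/site-packages/notebook/static/bundle.js",
    "[D 2023-10-10 19:24:19.016 ServerApp] 200 GET "
    "/node/desktop-dy-desktop-cr-1.imaging-poc.pcluster/33504/static/notebook/bundle.js "
    "(8a5fee7697884051933bb3c2a4660761@10.0.3.181) 1.01ms",
]

for status, method, path in requests_seen(excerpt):
    print(status, method, path)
```

In the excerpt above, static/notebook/bundle.js comes back 200 from Jupyter, and no numbered chunk requests (like 6774.bundle.js) appear at all.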

Installed the same version of Python on OOD and installed notebook to have the same files at /usr/local/lib/python3.8/site-packages/notebook/static/*.bundle.js with no luck :frowning:

Could this be related? c.NotebookApp.base_url = '/node/desktop-dy-desktop-cr-1.imaging-poc.pcluster/57415/', but I see in this doc that it should be /pun....
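For what it's worth, the request paths in the debug log all start with the same /node/<host>/<port>/ form as that setting (the port differs only because it was a different session). A tiny Python sketch of that comparison, with hypothetical helper names:

```python
def expected_base_url(host, port, prefix="/node"):
    """Build the base URL in the same shape as the configured value."""
    return f"{prefix}/{host}/{port}/"


def under_base_url(request_path, base_url):
    """True if a request path falls under the configured base URL."""
    return request_path.startswith(base_url)


base = expected_base_url("desktop-dy-desktop-cr-1.imaging-poc.pcluster", 33504)
path = "/node/desktop-dy-desktop-cr-1.imaging-poc.pcluster/33504/static/notebook/bundle.js"
print(base, under_base_url(path, base))
```

This only shows that the paths Jupyter logged are consistent with a /node base URL; it does not settle what the doc means by /pun.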

Granted, this doc is for OOD 3.0.0, but I’m using OOD 2.0.28.
