Websocket Connections Error

I have a new build of Open OnDemand, and everything seems to be working as it should, such as Files, Jobs, and Job Monitor. The one thing that is not working is the ‘Clusters’ terminal access. We built this up in a lab environment and all was working fine, but after installing on the corporate network, the terminal access gives an error of “Failed to establish a websocket connection. Be sure you are using a browser that supports websocket connections.” I suspect this might be a setting in the web browser, but I have been unable to pinpoint it. This happens with Chrome and Microsoft Edge. Any thoughts as to what to look at? I did look at /var/log/ondemand-nginx/$USER/error.log, but nothing stuck out to me as being an error.

Thanks in advance, Kyle & Dom

There is a similar question here in which I identify troubleshooting steps. In fact, Firefox now ships in its dev tools the ability to see websocket messages (if there are any). Your suspicion about where the error lies is correct, or at least the problem is somewhere between you (the client browser) and the OOD server (Apache).

You mention /var/log/ondemand-nginx; did you check the httpd24 logs as well? (The other user said there was nothing there, but maybe there is for you.)

Then, just to be clear and for documentation purposes, here’s what a successful connection looks like. Note that the response code is 101 Switching Protocols.
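For readers without the screenshot, here is a rough Python sketch of the handshake that should produce that 101. It builds the upgrade request, computes the `Sec-WebSocket-Accept` value a compliant server would return, and reads the status code from the reply. The function names and the `probe` helper are made up for illustration. A real OOD shell endpoint also expects an authenticated session and a CSRF token, so a bare probe like this will probably not get a 101 from a production server. It is only meant to show what has to survive the trip through any proxy in between.

```python
import base64
import hashlib
import os
import socket

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def make_key():
    """Random 16-byte nonce, base64 encoded, as the client sends it."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def expected_accept(key):
    """Value a compliant server returns in Sec-WebSocket-Accept."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_handshake(host, path, key):
    """Raw bytes of a minimal websocket upgrade request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status(raw_response):
    """Extract the numeric status code from the first response line."""
    status_line = raw_response.split(b"\r\n", 1)[0].decode("latin-1")
    return int(status_line.split(" ", 2)[1])


def probe(host, port, path, timeout=5.0):
    """Hypothetical helper: send the handshake and return the status code."""
    key = make_key()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_handshake(host, path, key))
        raw = sock.recv(4096)
    return parse_status(raw)
```

Calling something like `probe("ood.example.org", 80, "/pun/sys/shell/ssh/default")` (hostname hypothetical) and getting anything other than 101 tells you the upgrade did not happen. The status code is then the first clue about which hop answered instead.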

Thanks for the reply. Kyle who is cc’d here has access to the server so he will check this and let us both know.

Thanks

Dom

Hi Jeff, I’ll attempt to get access to the server tomorrow and let you know what we see. Thanks, Kyle

Hi Jeff, I did get access today. In the httpd24 error logs I just see a mention of req_is_websocket="false"; I’m not sure how to read this.

In the access log I didn’t see any mention of websocket.

I’ve attached a snippet of these two log files if you’d like to look. Thanks, Kyle

(Attachment error_log.txt is missing)

(Attachment access_log.txt is missing)

It says those txt files are missing. Also, can you provide information on what your client browser sees (like I have in the devtools pane)? I get a 101 status upgrade. What happens in your client? You may have to look in the console as well as the network tab.

I see the txt file was blocked. Here is a bit of the error log just so you can see. Kyle

[Tue Feb 04 07:47:20.659013 2020] [lua:info] [pid 4003] [client 10.48.17.42:55275] req_protocol=“HTTP/1.1” req_handler=“proxy-server” req_method=“GET” req_accept="/"
req_user_agent=“Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36” res_content_length=“354781” req_content_type="" res_content_encoding="" req_status=“200” req_origin="" time_user_map=“366.972”
local_user=“nortech” req_referer=“http://10.48.195.9/pun/sys/file-editor/edit/home/nortech/fluent-13-error.log” res_content_language="" req_port=“80” log_time=“2020-02-04T13:47:20.658887Z” req_server_name=“10.48.195.9” log_hook=“ood” req_accept_charset=""
req_hostname=“10.48.195.9” res_content_location="" res_content_disp="" req_is_websocket=“false” remote_user=“nortech” res_location="" req_user_ip=“10.48.17.42” req_is_https=“false” req_filename=“proxy:http://localhost/pun/sys/file-editor/ace/1.2.6/ace.js
req_uri="/pun/sys/file-editor/ace/1.2.6/ace.js" time_proxy=“117.446” res_content_type=“application/javascript” req_accept_language=“en-us,en;q=0.9” req_cache_control="" req_accept_encoding=“gzip, deflate”, referer: http://10.48.195.9/pun/sys/file-editor/edit/home/nortech/fluent-13-error.log

[Tue Feb 04 07:47:20.849867 2020] [lua:info] [pid 4003] [client 10.48.17.42:55275] req_protocol=“HTTP/1.1” req_handler=“proxy-server” req_method=“GET” req_accept="/"
req_user_agent=“Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36” res_content_length=“18028” req_content_type="" res_content_encoding="" req_status=“200” req_origin=“http://10.48.195.9” time_user_map=“95.167”
local_user=“nortech” req_referer=“http://10.48.195.9/pun/sys/file-editor/assets/application-ce8832b5e187a2e1c108b3ce5e808a12ee8c9a806c42eb27a160ca9724e7d974.css” res_content_language="" req_port=“80” log_time=“2020-02-04T13:47:20.849730Z” req_server_name=“10.48.195.9”
log_hook=“ood” req_accept_charset="" req_hostname=“10.48.195.9” res_content_location="" res_content_disp="" req_is_websocket=“false” remote_user=“nortech” res_location="" req_user_ip=“10.48.17.42” req_is_https=“false” req_filename=“proxy:http://localhost/pun/sys/file-editor/assets/bootstrap/glyphicons-halflings-regular-fe185d11a49676890d47bb783312a0cda5a44c4039214094e7957b4c040ef11c.woff2
req_uri="/pun/sys/file-editor/assets/bootstrap/glyphicons-halflings-regular-fe185d11a49676890d47bb783312a0cda5a44c4039214094e7957b4c040ef11c.woff2" time_proxy=“1.603” res_content_type=“application/octet-stream” req_accept_language=“en-us,en;q=0.9” req_cache_control=""
req_accept_encoding=“gzip, deflate”, referer: http://10.48.195.9/pun/sys/file-editor/assets/application-ce8832b5e187a2e1c108b3ce5e808a12ee8c9a806c42eb27a160ca9724e7d974.css

[Tue Feb 04 07:47:21.039163 2020] [lua:info] [pid 4003] [client 10.48.17.42:55275] req_protocol=“HTTP/1.1” req_handler=“proxy-server” req_method=“GET” req_accept="/"
req_user_agent=“Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36” res_content_length=“2360” req_content_type="" res_content_encoding="" req_status=“200” req_origin="" time_user_map=“174.478”
local_user=“nortech” req_referer=“http://10.48.195.9/pun/sys/file-editor/edit/home/nortech/fluent-13-error.log” res_content_language="" req_port=“80” log_time=“2020-02-04T13:47:21.39045Z” req_server_name=“10.48.195.9” log_hook=“ood” req_accept_charset="" req_hostname=“10.48.195.9”
res_content_location="" res_content_disp="" req_is_websocket=“false” remote_user=“nortech” res_location="" req_user_ip=“10.48.17.42” req_is_https=“false” req_filename=“proxy:http://localhost/pun/sys/file-editor/ace/1.2.6/theme-solarized_light.js” req_uri="/pun/sys/file-editor/ace/1.2.6/theme-solarized_light.js"
time_proxy=“1.201” res_content_type=“application/javascript” req_accept_language=“en-us,en;q=0.9” req_cache_control="" req_accept_encoding=“gzip, deflate”, referer: http://10.48.195.9/pun/sys/file-editor/edit/home/nortech/fluent-13-error.log

[Tue Feb 04 07:47:21.050863 2020] [lua:info] [pid 1561] [client 10.48.17.42:55274] req_protocol=“HTTP/1.1” req_handler=“proxy-server” req_method=“GET” req_accept=“text/plain,
/; q=0.01” req_user_agent=“Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36” res_content_length=“3419” req_content_type="" res_content_encoding="" req_status=“200” req_origin="" time_user_map=“175.297”
local_user=“nortech” req_referer=“http://10.48.195.9/pun/sys/file-editor/edit/home/nortech/fluent-13-error.log” res_content_language="" req_port=“80” log_time=“2020-02-04T13:47:21.50728Z” req_server_name=“10.48.195.9” log_hook=“ood” req_accept_charset="" req_hostname=“10.48.195.9”
res_content_location="" res_content_disp="" req_is_websocket=“false” remote_user=“nortech” res_location="" req_user_ip=“10.48.17.42” req_is_https=“false” req_filename=“proxy:http://localhost/pun/sys/files/api/v1/fs/home/nortech/fluent-13-error.log” req_uri="/pun/sys/files/api/v1/fs/home/nortech/fluent-13-error.log"
time_proxy=“7.403” res_content_type=“text/plain; charset=utf-8” req_accept_language=“en-us,en;q=0.9” req_cache_control="" req_accept_encoding=“gzip, deflate”, referer: http://10.48.195.9/pun/sys/file-editor/edit/home/nortech/fluent-13-error.log

[Tue Feb 04 07:47:40.738097 2020] [lua:info] [pid 4266] [client 10.48.17.42:55278] req_protocol=“HTTP/1.1” req_handler=“proxy-server” req_method=“GET” req_accept=“text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,/;q=0.8,application/signed-exchange;v=b3”
req_user_agent=“Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36” res_content_length=“0” req_content_type="" res_content_encoding="" req_status=“304” req_origin="" time_user_map=“139.499” local_user=“nortech”
req_referer=“http://10.48.195.9/pun/sys/dashboard/” res_content_language="" req_port=“80” log_time=“2020-02-04T13:47:40.737941Z” req_server_name=“10.48.195.9” log_hook=“ood” req_accept_charset="" req_hostname=“10.48.195.9” res_content_location="" res_content_disp=""
req_is_websocket=“false” remote_user=“nortech” res_location="" req_user_ip=“10.48.17.42” req_is_https=“false” req_filename=“proxy:http://localhost/pun/sys/shell/ssh/grvhpc01.na.corp.owenscorning.com” req_uri="/pun/sys/shell/ssh/grvhpc01.na.corp.owenscorning.com"
time_proxy=“12.313” res_content_type="" req_accept_language=“en-us,en;q=0.9” req_cache_control="" req_accept_encoding=“gzip, deflate”, referer: http://10.48.195.9/pun/sys/dashboard/
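Each of these entries is a flat list of key="value" pairs, so the interesting fields (`req_uri`, `req_status`, `req_is_websocket`) can be pulled out mechanically. Here is a small Python sketch of that. It assumes each entry is a single line in the real log file (the pasted copies above are wrapped) and that a websocket request would be logged as `req_is_websocket="true"`, which none of the excerpts here actually show. The function names are made up for illustration.

```python
import re

# Matches key="value" pairs; accepts straight or typographic double quotes,
# since logs pasted through a forum often have the quotes converted.
PAIR_RE = re.compile(r'(\w+)=["“”]([^"“”]*)["“”]')


def parse_ood_log_line(line):
    """Return the key/value fields of one log entry as a dict."""
    return dict(PAIR_RE.findall(line))


def shell_requests(lines):
    """Yield (uri, status, is_websocket) for requests to the shell app."""
    for line in lines:
        fields = parse_ood_log_line(line)
        uri = fields.get("req_uri", "")
        if uri.startswith("/pun/sys/shell"):
            is_ws = fields.get("req_is_websocket") == "true"
            yield uri, fields.get("req_status"), is_ws
```

Run over the last entry above, this would report the shell request with status 304 and `is_websocket` false. That is the combination worth chasing.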

OK… 304 Not Modified huh? Are you being cached somewhere in between yourself and the OOD server?

Looks like you’re hitting this function. The response is not cacheable, so the only other bit is that the request is not fresh. For ‘freshness’ we look at the ETag and Last-Modified headers with this package. If you look at my request in the image above, I use Cache-Control: no-cache, which bypasses this whole check. I did not set that myself; it must be set somewhere/somehow.

    if (this.isCachable() && this.isFresh()) {
      this.notModified()
      return
    }
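To make the freshness idea concrete, here is a simplified Python sketch of a conditional-request check. It is not the package's actual code, only an approximation of the usual rules: the client's validators (`If-None-Match`, `If-Modified-Since`) are compared against the response's `ETag` and `Last-Modified`, and a client `Cache-Control: no-cache` forces a full response. The function name and the lower-cased header dictionaries are assumptions for illustration.

```python
from email.utils import parsedate_to_datetime


def _strip_weak(tag):
    """Drop the weak-validator prefix so W/"x" compares equal to "x"."""
    return tag[2:] if tag.startswith("W/") else tag


def is_fresh(req_headers, res_headers):
    """Return True when the client's cached copy may be reused (a 304)."""
    none_match = req_headers.get("if-none-match")
    modified_since = req_headers.get("if-modified-since")

    # A request without validators is unconditional, so never fresh.
    if not none_match and not modified_since:
        return False

    # Cache-Control: no-cache from the client forces a full response.
    if "no-cache" in req_headers.get("cache-control", ""):
        return False

    # Entity-tag comparison; "*" matches any current representation.
    if none_match and none_match != "*":
        etag = res_headers.get("etag")
        if not etag:
            return False
        candidates = [_strip_weak(t.strip()) for t in none_match.split(",")]
        if _strip_weak(etag) not in candidates:
            return False

    # Date comparison; stale if the resource changed after the cached copy.
    if modified_since:
        last_modified = res_headers.get("last-modified")
        if not last_modified:
            return False
        if parsedate_to_datetime(last_modified) > parsedate_to_datetime(modified_since):
            return False

    return True
```

Under these rules a 304 is only possible when the request carries validators and no `no-cache`. That is why a browser or intermediary adding or removing those headers changes what the server does.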

Do you have any updates? Are you being cached somewhere?

Hi Jeff, unfortunately we are finding that the client and the OOD server are on separate VLANs which are very tightly controlled. We are working on a separate issue to open the firewall from the client to where the OOD server resides. Once this is complete we will retest and see if we still see the issues with the terminal. Kyle


@kgross were you able to resolve the issue?

Hi Jeff, sorry for the late response. We ended up facing other non-“Open OnDemand” issues that have taken our time away from chasing this one. As of now this issue still exists. Kyle

No problem, I was just checking in. Just let us know when you get back around to it. Again, your requests seem to be cached somewhere between you (the web browser) and the OOD server. Is there something doing L7 caching or header modification in between (or maybe even a browser plugin)? This is what I would look into when you find time to get back to the issue.

I am seeing the same issue. I have read through the discussion and I find myself with the same error:
Failed to establish a websocket connection. Be sure you are using a browser that supports websocket connections.
I have made sure they are enabled in the browser, etc… If you have any further suggestions, many thanks.

@georges Hi! Please open the web browser’s dev tools and see what the response code is (I’ve shared an image of what it should be above). That should help us narrow down the issue.

We are getting a 500 response code.
Request URL: ws://<hostname_here>/pun/sys/shell/ssh/default?csrf=qsPB5dp5-VkhlSAACDY0pzbesZoIFjXqVv5w
Request Method: GET
Remote Address: 10.150.18.68:80
Status Code: 500
Version: HTTP/1.1

Oh boy. Can you look into /var/log/ondemand-nginx/$USER/error.log on the OOD web server and see if there are errors in that log? (Here $USER is your username, of course.)

App 31089 output: Rails Error: Unable to access log file. Please ensure that /var/www/ood/apps/sys/dashboard/log/production.log exists and is writable (ie, make it writable for user and group: chmod 0664 /var/www/ood/apps/sys/dashboard/log/production.log). The log level has been raised to WARN and the output directed to STDERR until the problem is fixed.
App 31089 output: [2020-04-10 18:46:16 +0000 ] INFO "method=GET path=/pun/sys/dashboard/ format=html controller=DashboardController action=index status=200 duration=48.28 view=20.32"
App 31284 output: Listening on 3000

We’d see a stack trace or similar if there was something wrong. Is that everything you can glean from that log file? What about /var/log/httpd24/error_log or /var/log/httpd24/<servername>.net_error_ssl.log?

[Fri Apr 10 19:16:15.022028 2020] [lua:info] [pid 37785:tid 140719536715520] [client 10.149.10.130:37552] req_server_name="hostname_here" req_origin="" req_accept_language="en-us,en;q=0.5" req_protocol="HTTP/1.1" req_is_https="false" res_content_length="" req_referer="http://hostname_here/pun/sys/dashboard/batch_connect/sessions" req_method="GET" res_content_encoding="" log_time="2020-04-10T19:16:15.21820Z" req_content_type="" req_port="80" req_is_websocket="false" res_location="" req_user_agent="Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:75.0) Gecko/20100101 Firefox/75.0" log_hook="ood" time_user_map="96.937" req_accept="text/javascript, application/javascript, application/ecmascript, application/x-ecmascript, /; q=0.01" req_user_ip="10.149.10.130" req_cache_control="" res_content_disp="" req_hostname="hostname_here" req_filename="proxy:http://localhost/pun/sys/dashboard/batch_connect/sessions.js?_=1586545886749" req_uri="/pun/sys/dashboard/batch_connect/sessions.js" res_content_location="" time_proxy="10.497" local_user="spock" res_content_language="" req_handler="proxy-server" req_status="200" res_content_type="text/javascript; charset=utf-8" req_accept_charset="" req_accept_encoding="gzip, deflate" remote_user="spock", referer: http://hostname_here/pun/sys/dashboard/batch_connect/sessions