Desktop app fails to connect

Hello, I am working on a fresh install of OOD on Ubuntu 22.04 (migrating from CentOS 7). All dependencies seem to be installed, but I am running into the following issue:

Setting VNC password...
Starting VNC server...

WARNING: n006.cluster.pssclabs.com:1 is taken because of /tmp/.X1-lock
Remove this file if there is no X server n006.cluster.pssclabs.com:1
Killing Xvnc process ID 754372
Xvnc process ID 754372 already killed
Xvnc did not appear to shut down cleanly. Removing /tmp/.X11-unix/X1
Xvnc did not appear to shut down cleanly. Removing /tmp/.X1-lock

Desktop 'TurboVNC: n006.cluster.pssclabs.com:1 (faizanbadami)' started on display n006.cluster.pssclabs.com:1

Log file is vnc.log
Successfully started VNC server on n006.cluster.pssclabs.com:5901...
Script starting...
Starting websocket server...
ERROR: Collection default cannot be found
Launching desktop 'xfce'...
WebSocket server settings:
  - Listen on :45376
  - No SSL/TLS support (no cert file)
  - Backgrounding (daemon)
Scanning VNC log file for user authentications...
Generating connection YAML file...

(xfwm4:754842): xfwm4-WARNING **: 12:44:37.223: Unsupported GL renderer (llvmpipe (LLVM 15.0.7, 256 bits)).

** (xiccd:754891): WARNING **: 12:44:38.010: EDID is empty

** (xiccd:754891): CRITICAL **: 12:44:38.028: failed to create colord device: failed to obtain org.freedesktop.color-manager.create-device auth

(polkit-gnome-authentication-agent-1:754884): polkit-gnome-1-WARNING **: 12:44:38.040: Unable to determine the session we are in: No session for pid 754884

** (xfce4-screensaver:754877): WARNING **: 12:44:38.064: screensaver already running in this session

Those errors are hard to diagnose because I’m never quite sure if they matter or not and I don’t see anything there that’s really fatal.

I’d look for these 2 things:

  • What is the behavior you see from the client?
  • When you start this job up, get a shell session on the compute node if you can. Run ps and so on to see if everything, like vncserver and xfwm4, is in fact up and running.

Hi Jeff,

Thank you for the quick response.

  1. I see a “Failed to connect to server” message on the noVNC page
  2. Here is the output I get on the node where this job is running:
root@n006:~# ps -aux |grep xfce
faizanb+  755542  0.0  0.0 447784 79124 ?        Sl   12:53   0:00 xfce4-session
faizanb+  755559  0.0  0.0 231004  6448 ?        Sl   12:53   0:00 /usr/lib/x86_64-linux-gnu/xfce4/xfconf/xfconfd
faizanb+  755567  0.0  0.0 297120 27196 ?        Sl   12:53   0:00 /usr/bin/xfce4-screensaver
faizanb+  755593  0.0  0.0 262124 24928 ?        Sl   12:53   0:00 xfce4-panel
faizanb+  755618  0.0  0.0 256536 17516 ?        Sl   12:53   0:00 /usr/lib/x86_64-linux-gnu/xfce4/notifyd/xfce4-notifyd
root      756238  0.0  0.0   4024  2132 pts/0    S+   13:05   0:00 grep --color=auto xfce
root@n006:~# ps -aux |grep xfwm4
faizanb+  755581  0.0  0.0 488348 96860 ?        Sl   12:53   0:00 xfwm4
root      756246  0.0  0.0   4024  2168 pts/0    S+   13:05   0:00 grep --color=auto xfwm4
root@n006:~# ps -aux |grep vncserver
root      756255  0.0  0.0   4024  2092 pts/0    S+   13:05   0:00 grep --color=auto vncserver

OK - before you do what I’ve laid out below, I want to do this quick spot check.

  • Have you enabled node_uri and rnode_uri in ood_portal.yml? You need to if you haven’t.
  • Be sure you’ve got the correct host_regex in ood_portal.yml too.

Once you’ve done those, if it still isn’t working, we have to debug more in Chrome, because you need to set this feature that only Chrome has:

  • Open up the developer tools in Chrome
  • Navigate to the options (the cog near the top of the developer tools)
  • check this box.

[image: screenshot of the checkbox in the Chrome developer tools settings]

With this enabled

  • open your dev tools on the interactive sessions page
  • when you click the ‘connect to desktop’ button - dev tools should still be open on the new tab
  • on the new tab (that presumably says ‘failed to connect to server’) look at the network tab for anything that’s failed.

Hi Jeff,

here is what I have in the ood_portal.yml

host_regex: '[\w.-]+\.cluster\.com'

#ONLY FOR TESTING
auth:
  - 'AuthType Basic'
  - 'AuthName "Open OnDemand"'
  - 'AuthBasicProvider PAM'
  - 'AuthPAMService ood'
  - 'Require valid-user'
user_map_cmd: "/opt/ood/ood_auth_map/bin/basic.mapfile"


node_uri: '/node'

rnode_uri: '/rnode'

After changing the Chrome setting, I see the following errors in the “console” tab and nothing in the “network” tab:

websock.js:231 WebSocket connection to 'ws://clusterhn.cluster.pssclabs.com/rnode/n006.cluster.pssclabs.com/50812/websockify' failed: 
open @ websock.js:231
rfb.js:607 WebSocket on-error event
_socketError @ rfb.js:607
rfb.js:831 Failed when connecting: Connection closed (code: 1006)
_fail @ rfb.js:831
vnc.html:1 The resource http://clusterhn.cluster.pssclabs.com/pun/sys/dashboard/noVNC-1.3.0/app/images/warning.svg was preloaded using link preload but not used within a few seconds from the window's load event. Please make sure it has an appropriate `as` value and it is preloaded intentionally.
vnc.html:1 The resource http://clusterhn.cluster.pssclabs.com/pun/sys/dashboard/noVNC-1.3.0/app/images/info.svg was preloaded using link preload but not used within a few seconds from the window's load event. Please make sure it has an appropriate `as` value and it is preloaded intentionally.

n006.cluster.pssclabs.com is the hostname, but your host_regex ('[\w.-]+\.cluster\.com') doesn’t account for the pssclabs portion.
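
A quick way to sanity-check a host_regex before touching the portal config is to try it against the compute node's hostname. This is only a rough sketch: it assumes the proxy matches the pattern against the whole host segment of the /rnode URL, and it uses Python's re module as a stand-in for Apache's regex engine. The helper name host_allowed is made up for illustration.

```python
import re

# Hostname taken from the failing WebSocket URL in this thread.
HOST = "n006.cluster.pssclabs.com"

# The two host_regex values posted in this thread.
OLD_REGEX = r"[\w.-]+\.cluster\.com"
NEW_REGEX = r"[\w.-]+\.cluster\.pssclabs\.com"


def host_allowed(host_regex: str, host: str) -> bool:
    """Hypothetical helper: True if the whole hostname matches the pattern.

    Assumption: the proxy anchors host_regex to the full host segment,
    so fullmatch is used here rather than search.
    """
    return re.fullmatch(host_regex, host) is not None


if __name__ == "__main__":
    print("old regex allows host:", host_allowed(OLD_REGEX, HOST))
    print("new regex allows host:", host_allowed(NEW_REGEX, HOST))
```

With the old pattern the hostname is rejected because it ends in .pssclabs.com rather than .cluster.com; the updated pattern accepts it.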

I changed the regex to:
host_regex: '[\w.-]+\.cluster\.pssclabs\.com'

But I am still seeing the same error.

Be sure to bounce Apache for it to take effect. Plus, in the network tab you should see a 403 Forbidden for this request specifically (if it’s still the same issue).

Ok great. That seems to have done the trick.

Thank you so much for your help.
