Where’s the firewall? The network path in question is between the OOD server (the machine OOD is installed on) and the compute node where the job is running. Your client browser, for example, doesn’t need network connectivity to computeNode:5901; the OOD server does.
When you connect to the desktop, we proxy you through OOD to it. So the full path is:
you -> OOD on port 443 -> the compute node on port 5901
So if there’s a firewall between you and OOD, all you need to open is 443, the HTTPS port.
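To check the second hop of that path, a quick TCP probe run from the OOD server can help. This is only a minimal sketch and not part of OOD: `can_connect` is a hypothetical helper name, and `computeNode` stands in for whatever compute node your job landed on.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example (run on the OOD server, with a real hostname in place of the placeholder):
#   can_connect("computeNode", 5901)
```

If this returns False from the OOD server while the VNC server is running on the node, a firewall between the two machines is a likely suspect.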
We have two different kinds of applications: those that need a desktop GUI, which use noVNC (a technology we use to pass VNC data over the web), and those that run plain HTTP servers themselves, like Jupyter.
For HTTP applications like Jupyter, you can specify the port range. Apparently that’s not the case for GUI applications like desktops.
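How OOD applies a port range isn’t shown in this thread, so the following is only an illustration of the general idea: pick the first free port inside a range that your firewall rules allow. `find_free_port` and its parameters are hypothetical names, not an OOD API.

```python
import socket


def find_free_port(min_port: int, max_port: int, host: str = "127.0.0.1") -> int:
    """Return the first port in [min_port, max_port] that can be bound on host."""
    for port in range(min_port, max_port + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # Port is in use or not permitted; try the next one.
            return port
    raise RuntimeError(f"no free port between {min_port} and {max_port}")
```

The point of a bounded range is that the firewall between the OOD server and the compute nodes only needs to allow that range, rather than every high port.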
Thank you very much. I’m still stuck, but I think my issue is something else, so I will open another ticket if anything. I really appreciate your help.
I’m trying to start an interactive desktop, but I’m getting the wonderful “Failed to connect to server” error message.
My output.log:
Setting VNC password...
Starting VNC server...
WARNING: node805.host.edu:1 is taken because of /tmp/.X1-lock
Remove this file if there is no X server node805.host.edu:1
Killing Xvnc process ID 3534720
Xvnc process ID 3534720 already killed
Xvnc did not appear to shut down cleanly. Removing /tmp/.X11-unix/X1
Xvnc did not appear to shut down cleanly. Removing /tmp/.X1-lock
Desktop 'TurboVNC: node805.host.edu:1 (aadvorki)' started on display node805.host.edu:1
Log file is vnc.log
Successfully started VNC server on node805.ionic.cs.princeton.edu:5901...
Script starting...
Starting websocket server...
Launching desktop 'xfce'...
Failed to init libxfconf: Error spawning command line “dbus-launch --autolaunch=e785394556274b3fb9c1573b6d7e44f1 --binary-syntax --close-stderr”: Child process exited with code 1.
Failed to init libxfconf: Error spawning command line “dbus-launch --autolaunch=e785394556274b3fb9c1573b6d7e44f1 --binary-syntax --close-stderr”: Child process exited with code 1.
_IceTransmkdir: Owner of /tmp/.ICE-unix should be set to root
[websockify]: pid: 3615206 (proxying 14220 ==> localhost:5901)
[websockify]: log file: ./websockify.log
[websockify]: waiting ...
(xfwm4:3615204): xfwm4-WARNING **: 15:11:11.346: Could not create GLX context.
(xfwm4:3615204): Gdk-WARNING **: 15:11:11.347: The program 'xfwm4' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadValue (integer parameter out of range for operation)'.
(Details: serial 2764 error_code 2 request_code 150 (GLX) minor_code 24)
(Note to programmers: normally, X errors are reported asynchronously;
that is, you will receive the error a while after causing it.
To debug your program, run it with the GDK_SYNCHRONIZE environment
variable to change this behavior. You can then get a meaningful
backtrace from your debugger if you break on the gdk_x_error() function.)
[websockify]: started successfully (proxying 14220 ==> localhost:5901)
Scanning VNC log file for user authentications...
Generating connection YAML file...
xfsettingsd: No window manager registered on screen 0.
root@ondemand /etc/ood/config/apps/bc_desktop/submit $ cat slurm.yml.erb
---
batch_connect:
  before_script: |
    # Export the module function if it exists
    [[ $(type -t module) == "function" ]] && export -f module
    # MATE acts strange in pitzer-exp and doesn't like /var/run/$(id -u)
    export XDG_RUNTIME_DIR="$TMPDIR/xdg_runtime"
    # reset SLURM_EXPORT_ENV so that things like srun & sbatch work out of the box
    export SLURM_EXPORT_ENV=ALL
$ cat ionic.yml
---
title: "HPC Desktop"
cluster: "ionic"
attributes:
  desktop: "xfce"
  bc_account:
    help: "You can leave this blank to use default SLURM account."
  bc_queue: "compute"
root@node805 /etc/slurm $ ls -la /tmp/.X1*
-r--r--r--. 1 aadvorki guest 11 Nov 12 15:11 /tmp/.X1-lock
/tmp/.X11-unix:
total 304
drwxrwxrwx. 2 aadvorki guest 4096 Nov 12 15:11 .
drwxrwxrwt. 2408 root root 303104 Nov 12 15:11 ..
srwxrwxrwx. 1 aadvorki guest 0 Nov 12 15:11 X1
Is there anything else I can provide you with? Thank you very much!
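The log above warns that display :1 is taken because of /tmp/.X1-lock. One way to tell whether such a lock is stale is to read the PID stored in it and see if that process still exists. The sketch below assumes the lock file holds the X server’s PID as plain text, which is the usual X server convention; `lock_owner_alive` is a hypothetical helper name.

```python
import os
from pathlib import Path


def lock_owner_alive(lock_path: str) -> bool:
    """Return True if the PID recorded in an X lock file is a running process."""
    pid = int(Path(lock_path).read_text().strip())
    try:
        os.kill(pid, 0)  # Signal 0 checks existence without sending anything.
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # Process exists but belongs to another user.
    return True


# Example (on the compute node):
#   lock_owner_alive("/tmp/.X1-lock")
```

A False result suggests a leftover lock from an earlier session; a True result means something is really holding that display.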
This seems to be the source of your issues. From what I’ve seen (or can recall off the top of my head), this looks like an issue with your video card drivers (possibly the NVIDIA drivers specifically?).
If you ran it as root, it would write files to /etc/X11. It’s odd you don’t have anything there at all…
You can try nvidia-xconfig -o ~/some-tmp-dir or similar to write files to your own HOME to see what files could be generated before you actually run it as root.
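To keep track of which errors persist between attempts, it can help to scan output.log for the messages seen in this thread. This is a rough sketch: the patterns are copied from the log posted here, while the labels and the `scan_log` name are illustrative only.

```python
import re

# Patterns copied from the output.log in this thread; the labels are illustrative.
PATTERNS = {
    "glx": re.compile(r"Could not create GLX context"),
    "dbus": re.compile(r"Error spawning command line .dbus-launch"),
    "no_wm": re.compile(r"No window manager registered on screen"),
}


def scan_log(lines):
    """Return {label: [line numbers]} for each pattern found in the log lines."""
    hits = {}
    for number, line in enumerate(lines, start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.setdefault(label, []).append(number)
    return hits


# Example:
#   with open("output.log") as handle:
#       print(scan_log(handle))
```

If the "glx" entries disappear after a driver or X config change but "dbus" or "no_wm" remain, the remaining problem is probably somewhere other than the video driver.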
Yes, sorry, I realized I had to run it as root :)
Same error message.
$ more vnc.log
TurboVNC Server (Xvnc) 64-bit v3.1.1 (build 20240127.sdl9)
Copyright (C) 1999-2024 The VirtualGL Project and many others (see README.md)
Visit http://www.TurboVNC.org for more information on TurboVNC
12/11/2024 15:50:23 Using security configuration file /etc/turbovncserver-security.conf
12/11/2024 15:50:23 Enabled security type 'tlsvnc'
12/11/2024 15:50:23 Enabled security type 'tlsotp'
12/11/2024 15:50:23 Enabled security type 'tlsplain'
12/11/2024 15:50:23 Enabled security type 'x509vnc'
12/11/2024 15:50:23 Enabled security type 'x509otp'
12/11/2024 15:50:23 Enabled security type 'x509plain'
12/11/2024 15:50:23 Enabled security type 'vnc'
12/11/2024 15:50:23 Enabled security type 'otp'
12/11/2024 15:50:23 Enabled security type 'unixlogin'
12/11/2024 15:50:23 Enabled security type 'plain'
_XSERVTransmkdir: Owner of /tmp/.X11-unix should be set to root
12/11/2024 15:50:23 Desktop name 'TurboVNC: node805.host.edu:1 (aadvorki)' (node805.ionic.cs.princeton.edu:1)
12/11/2024 15:50:23 Protocol versions supported: 3.3, 3.7, 3.8, 3.7t, 3.8t
12/11/2024 15:50:23 Listening for VNC connections on TCP port 5901
12/11/2024 15:50:23 Interface 0.0.0.0
12/11/2024 15:50:23 Framebuffer: BGRX 8/8/8/8
12/11/2024 15:50:23 New desktop size: 800 x 600
12/11/2024 15:50:23 New screen layout:
12/11/2024 15:50:23 0x00000040 (output 0x00000040): 800x600+0+0
12/11/2024 15:50:23 Maximum clipboard transfer size: 1048576 bytes
12/11/2024 15:50:23 VNC extension running!
$ more output.log
Setting VNC password...
Starting VNC server...
WARNING: node805.ionic.cs.princeton.edu:1 is taken because of /tmp/.X1-lock
Remove this file if there is no X server node805.host.edu:1
Killing Xvnc process ID 3615141
Xvnc process ID 3615141 already killed
Xvnc did not appear to shut down cleanly. Removing /tmp/.X11-unix/X1
Xvnc did not appear to shut down cleanly. Removing /tmp/.X1-lock
Desktop 'TurboVNC: node805.host.edu:1 (aadvorki)' started on display node805.ionic.cs.princeton.edu:1
Log file is vnc.log
Successfully started VNC server on node805.host.edu:5901...
Script starting...
Starting websocket server...
Launching desktop 'xfce'...
Failed to init libxfconf: Error spawning command line “dbus-launch --autolaunch=e785394556274b3fb9c1573b6d7e44f1 --binary-syntax --close-stderr”: Child process exited with code 1.
Failed to init libxfconf: Error spawning command line “dbus-launch --autolaunch=e785394556274b3fb9c1573b6d7e44f1 --binary-syntax --close-stderr”: Child process exited with code 1.
_IceTransmkdir: Owner of /tmp/.ICE-unix should be set to root
[websockify]: pid: 3624325 (proxying 13937 ==> localhost:5901)
[websockify]: log file: ./websockify.log
[websockify]: waiting ...
(xfwm4:3624323): xfwm4-WARNING **: 15:50:25.538: Could not create GLX context.
(xfwm4:3624323): Gdk-WARNING **: 15:50:25.540: The program 'xfwm4' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadValue (integer parameter out of range for operation)'.
(Details: serial 2764 error_code 2 request_code 150 (GLX) minor_code 24)
(Note to programmers: normally, X errors are reported asynchronously;
that is, you will receive the error a while after causing it.
To debug your program, run it with the GDK_SYNCHRONIZE environment
variable to change this behavior. You can then get a meaningful
backtrace from your debugger if you break on the gdk_x_error() function.)
[websockify]: started successfully (proxying 13937 ==> localhost:5901)
Scanning VNC log file for user authentications...
Generating connection YAML file...
xfsettingsd: No window manager registered on screen 0.
xfce4-panel: No window manager registered on screen 0. To start the panel without this check, run with --disable-wm-check.
** (wrapper-2.0:3624377): WARNING **: 15:50:39.391: No outputs have backlight property
(wrapper-2.0:3624376): libnotify-WARNING **: 15:50:39.439: Failed to connect to proxy
(wrapper-2.0:3624377): Gtk-CRITICAL **: 15:50:39.508: gtk_icon_theme_has_icon: assertion 'icon_name != NULL' failed
(wrapper-2.0:3624377): Gtk-CRITICAL **: 15:50:39.510: gtk_icon_theme_has_icon: assertion 'icon_name != NULL' failed
(wrapper-2.0:3624377): Gtk-CRITICAL **: 15:50:39.510: gtk_icon_theme_has_icon: assertion 'icon_name != NULL' failed
(wrapper-2.0:3624377): Gtk-CRITICAL **: 15:50:39.554: gtk_icon_theme_has_icon: assertion 'icon_name != NULL' failed
No window manager registered on screen 0. To start the xfdesktop without this check, run with --disable-wm-check.
ERROR: A supplied argument is invalid
(nm-applet:3624627): libnotify-WARNING **: 15:50:58.027: Failed to connect to proxy
(nm-applet:3624627): nm-applet-WARNING **: 15:50:58.029: Failed to show notification: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.Notifications was not provided by any .service files
(tracker-miner-fs-3:3624624): Tracker-CRITICAL **: 15:50:58.053: Could not create store/endpoint: no such table: nrl:Ontology
Xlib: extension "DPMS" missing on display ":1.0".
Xlib: extension "DPMS" missing on display ":1.0".
Xlib: extension "DPMS" missing on display ":1.0".
Xlib: extension "DPMS" missing on display ":1.0".
Xlib: extension "DPMS" missing on display ":1.0".
root@node805 /etc/X11 $ ls -la
total 52
drwxr-xr-x. 7 root root 4096 Nov 12 15:49 .
drwxr-xr-x. 164 root root 12288 Nov 12 15:09 ..
drwxr-xr-x. 2 root root 4096 Aug 9 2021 applnk
drwxr-xr-x. 2 root root 4096 Jul 2 09:44 fontpath.d
drwxr-xr-x. 2 root root 4096 Jul 2 09:39 mwm
drwxr-xr-x. 5 root root 4096 Feb 16 2024 xinit
-rw-r--r--. 1 root root 547 Aug 10 2021 Xmodmap
-rw-r--r--. 1 root root 1235 Nov 12 15:49 xorg.conf
-rw-r--r--. 1 root root 1228 Nov 12 15:49 xorg.conf.backup
drwxr-xr-x. 2 root root 4096 Oct 17 09:45 xorg.conf.d
-rw-r--r--. 1 root root 0 Nov 12 15:34 xorg.conf.nvidia-xconfig-original
-rw-r--r--. 1 root root 493 Aug 10 2021 Xresources
Now I get a quick connection, but the job fails within seconds. When I can connect, I am still getting the “Failed to connect to server” error message. output.log has changed, though.
Setting VNC password...
Starting VNC server...
Desktop 'TurboVNC: node805.host:1 (aadvorki)' started on display node805.host.edu:1
Log file is vnc.log
Successfully started VNC server on node805.host.edu:5901...
Script starting...
Starting websocket server...
+ sleep 5
+ xfwm4 --sm-client-disable
(xfwm4:3721310): xfwm4-CRITICAL **: 10:36:57.213: Xfconf could not be initialized
(xfwm4:3721310): xfwm4-WARNING **: 10:36:57.213: Missing data from default files
[websockify]: pid: 3721340 (proxying 47792 ==> localhost:5901)
[websockify]: log file: ./websockify.log
[websockify]: waiting ...
[websockify]: started successfully (proxying 47792 ==> localhost:5901)
Scanning VNC log file for user authentications...
Generating connection YAML file...
Cleaning up...
Killing Xvnc process ID 3721295
Not quite sure what this could be, but there may be logs at ~/.xfce4-session.verbose-log* or similar. A Google search seems to indicate there’s some issue with your configuration?
You should have ~/.config/xfce4/ (and maybe -session too). Maybe try removing those directories to see if there’s some wrong config in your HOME?
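If you’d rather not delete anything outright, renaming the directories has the same effect and is easy to undo. The sketch below assumes “-session” refers to ~/.config/xfce4-session (that is my reading, not something confirmed in the thread); `set_aside_xfce_config` is a hypothetical helper name.

```python
import time
from pathlib import Path


def set_aside_xfce_config(home: str) -> list:
    """Rename xfce4 config directories under home/.config; return the new paths."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    moved = []
    for name in ("xfce4", "xfce4-session"):
        source = Path(home) / ".config" / name
        if source.is_dir():
            # Keep the old settings next to the original location.
            target = source.with_name(f"{name}.bak-{stamp}")
            source.rename(target)
            moved.append(str(target))
    return moved


# Example:
#   print(set_aside_xfce_config(str(Path.home())))
```

Xfce should recreate default settings on the next session start; if that doesn’t help, you can rename the backup directories back.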