Problems with mate desktop launch on compute nodes

Is there any reason why, when I put the FQDN in ood_portal.yml, it wouldn’t show up in the URL?
Worth noting:
Prod = 1.7
Dev = 1.6.22

Ref:
Prod = /rnode/hpc1gn002/38805/websockify
Dev = /rnode/hpc1cn001.interna.cluster/35810

After some digging I found the log file. I’m not sure why it isn’t being written back, but here is what it looks like:
Setting VNC password…

Starting VNC server…

Desktop ‘TurboVNC: pplhpc1gn002:1 (bpette)’ started on display pplhpc1gn002:1

Log file is vnc.log

Successfully started VNC server on pplhpc1gn002:5901…

Script starting…

Starting websocket server…

WebSocket server settings:

  • Listen on :38805

  • No SSL/TLS support (no cert file)

  • Backgrounding (daemon)

Scanning VNC log file for user authentications…

Generating connection YAML file…

cmdTrace.c(713):ERROR:104: ‘restore’ is an unrecognized subcommand

cmdModule.c(411):ERROR:104: ‘restore’ is an unrecognized subcommand

Launching desktop ‘mate’…

cat: /etc/xdg/autostart/gnome-keyring-gpg.desktop: No such file or directory

cat: /etc/xdg/autostart/pulseaudio.desktop: No such file or directory

cat: /etc/xdg/autostart/rhsm-icon.desktop: No such file or directory

cat: /etc/xdg/autostart/spice-vdagent.desktop: No such file or directory

cat: /etc/xdg/autostart/xfce4-power-manager.desktop: No such file or directory

generating cookie with syscall

generating cookie with syscall

generating cookie with syscall

generating cookie with syscall

mate-session[140587]: WARNING: Could not parse desktop file /home/bpette/.config/autostart/spice-vdagent.desktop: Key file does not start with a group

mate-session[140587]: GLib-GObject-CRITICAL: object GsmAutostartApp 0x72f320 finalized while still in-construction

mate-session[140587]: GLib-GObject-CRITICAL: Custom constructor for class GsmAutostartApp returned NULL (which is invalid). Please use GInitable instead.

mate-session[140587]: WARNING: could not read /home/bpette/.config/autostart/spice-vdagent.desktop

mate-session[140587]: WARNING: Could not parse desktop file /home/bpette/.config/autostart/pulseaudio.desktop: Key file does not start with a group

mate-session[140587]: GLib-GObject-CRITICAL: object GsmAutostartApp 0x72f250 finalized while still in-construction

mate-session[140587]: GLib-GObject-CRITICAL: Custom constructor for class GsmAutostartApp returned NULL (which is invalid). Please use GInitable instead.

mate-session[140587]: WARNING: could not read /home/bpette/.config/autostart/pulseaudio.desktop

mate-session[140587]: WARNING: Could not parse desktop file /home/bpette/.config/autostart/gnome-keyring-gpg.desktop: Key file does not start with a group

mate-session[140587]: GLib-GObject-CRITICAL: object GsmAutostartApp 0x72f4c0 finalized while still in-construction

mate-session[140587]: GLib-GObject-CRITICAL: Custom constructor for class GsmAutostartApp returned NULL (which is invalid). Please use GInitable instead.

mate-session[140587]: WARNING: could not read /home/bpette/.config/autostart/gnome-keyring-gpg.desktop

mate-session[140587]: WARNING: Could not parse desktop file /home/bpette/.config/autostart/xfce4-power-manager.desktop: Key file does not start with a group

mate-session[140587]: GLib-GObject-CRITICAL: object GsmAutostartApp 0x72f0b0 finalized while still in-construction

mate-session[140587]: GLib-GObject-CRITICAL: Custom constructor for class GsmAutostartApp returned NULL (which is invalid). Please use GInitable instead.

mate-session[140587]: WARNING: could not read /home/bpette/.config/autostart/xfce4-power-manager.desktop

mate-session[140587]: WARNING: Could not parse desktop file /home/bpette/.config/autostart/rhsm-icon.desktop: Key file does not start with a group

mate-session[140587]: GLib-GObject-CRITICAL: object GsmAutostartApp 0x72f0b0 finalized while still in-construction

mate-session[140587]: GLib-GObject-CRITICAL: Custom constructor for class GsmAutostartApp returned NULL (which is invalid). Please use GInitable instead.

mate-session[140587]: WARNING: could not read /home/bpette/.config/autostart/rhsm-icon.desktop

SELinux Troubleshooter: Applet requires SELinux be enabled to run.

(nm-applet:140703): nm-applet-WARNING **: 16:36:09.055: NetworkManager is not running

/usr/share/system-config-printer/applet.py:44: PyGIWarning: Notify was imported without specifying a version first. Use gi.require_version(‘Notify’, ‘0.7’) before import to ensure that the right version gets loaded.

from gi.repository import Notify

system-config-printer-applet: failed to start NewPrinterNotification service

system-config-printer-applet: failed to start PrinterDriversInstaller service: org.freedesktop.DBus.Error.AccessDenied: Connection “:1.6082” is not allowed to own the service “com.redhat.PrinterDriversInstaller” due to security policies in the configuration file

Initializing caja-image-converter extension

Initializing caja-open-terminal extension

*** ERROR ***

TI:16:36:09 TH:0x6cda60 FI:gpm-manager.c FN:gpm_manager_systemd_inhibit,1784

OK! We can’t see the stack there, but I’m guessing it’s the same as this topic. The mate-power-manager package is bad news: it just doesn’t work on multi-user systems (it won’t let users see the power buttons on that machine). We don’t have it on our systems, and when I build a MATE Singularity image for desktops I don’t include it. So I would get rid of it. Users shouldn’t need that panel anyway; it only holds the stop/restart buttons, which are useless to them.

The very last line of your output is exactly the same as in the other topic. For whatever reason, the stderr of your job isn’t combined with the stdout. What kind of scheduler do you have? I want to be sure there’s not something amiss on our side.

We use PBSPro 13.X at the moment. I will try removing mate-power-manager and see if that resolves the issue. Also, I found that “output.log” gets populated after the job finishes only when it fails…

After some fiddling with the system pathing I am now getting the correct output.log after each job. Please find below:

Setting VNC password…
Starting VNC server…

Desktop ‘TurboVNC: pplhpc1ood01:1 (bpette)’ started on display pplhpc1ood01:1

Log file is vnc.log
Successfully started VNC server on pplhpc1ood01:5901…
Script starting…
Starting websocket server…
WebSocket server settings:

  • Listen on :61830
  • No SSL/TLS support (no cert file)
  • Backgrounding (daemon)
Scanning VNC log file for user authentications…
Generating connection YAML file…
cmdTrace.c(713):ERROR:104: ‘restore’ is an unrecognized subcommand
cmdModule.c(411):ERROR:104: ‘restore’ is an unrecognized subcommand
Launching desktop ‘mate’…
cat: /etc/xdg/autostart/gnome-keyring-gpg.desktop: No such file or directory
cat: /etc/xdg/autostart/rhsm-icon.desktop: No such file or directory
generating cookie with syscall
generating cookie with syscall
generating cookie with syscall
generating cookie with syscall
mate-session[119775]: WARNING: Could not parse desktop file /home/bpette/.config/autostart/gnome-keyring-gpg.desktop: Key file does not start with a group
mate-session[119775]: GLib-GObject-CRITICAL: object GsmAutostartApp 0x767410 finalized while still in-construction
mate-session[119775]: GLib-GObject-CRITICAL: Custom constructor for class GsmAutostartApp returned NULL (which is invalid). Please use GInitable instead.
mate-session[119775]: WARNING: could not read /home/bpette/.config/autostart/gnome-keyring-gpg.desktop
mate-session[119775]: WARNING: Could not parse desktop file /home/bpette/.config/autostart/rhsm-icon.desktop: Key file does not start with a group
mate-session[119775]: GLib-GObject-CRITICAL: object GsmAutostartApp 0x767410 finalized while still in-construction
mate-session[119775]: GLib-GObject-CRITICAL: Custom constructor for class GsmAutostartApp returned NULL (which is invalid). Please use GInitable instead.
mate-session[119775]: WARNING: could not read /home/bpette/.config/autostart/rhsm-icon.desktop
vmware-user: could not open /proc/fs/vmblock/dev
/usr/bin/vmtoolsd: symbol lookup error: /lib64/libvmtools.so.0: undefined symbol: intf_close
SELinux Troubleshooter: Applet requires SELinux be enabled to run.
/usr/share/system-config-printer/applet.py:44: PyGIWarning: Notify was imported without specifying a version first. Use gi.require_version(‘Notify’, ‘0.7’) before import to ensure that the right version gets loaded.
from gi.repository import Notify
system-config-printer-applet: failed to start NewPrinterNotification service
system-config-printer-applet: failed to start PrinterDriversInstaller service: org.freedesktop.DBus.Error.AccessDenied: Connection “:1.2140” is not allowed to own the service “com.redhat.PrinterDriversInstaller” due to security policies in the configuration file

(nm-applet:119889): nm-applet-WARNING **: 13:59:27.630: NetworkManager is not running
Initializing caja-image-converter extension
Initializing caja-open-terminal extension

(caja:119852): GLib-CRITICAL **: 13:59:28.127: g_hash_table_foreach: assertion ‘version == hash_table->version’ failed
mate-session[119775]: WARNING: Detected that screensaver has left the bus
Window manager warning: Fatal IO error 11 (Resource temporarily unavailable) on display ‘:1’.
mate-settings-daemon: Fatal IO error 11 (Resource temporarily unavailable) on X server :1.
[1592254767,000,xklavier.c:xkl_engine_start_listen/] The backend does not require manual layout management - but it is provided by the application
Gdk-Message: 13:59:45.746: abrt: Fatal IO error 11 (Resource temporarily unavailable) on X server :1.

Gdk-Message: 13:59:45.748: mate-session: Fatal IO error 104 (Connection reset by peer) on X server :1.

Okay, so I am running out of things to test here, but here goes:
I recently did a fresh install of a cluster node with the MATE desktop to try to determine where this was failing. No matter how I configure the node, I am still not able to launch VNC sessions to MATE (or any other desktop) for the interactive desktop. After some deep diving into the web page, I am seeing:

WebSocket connection to ‘wss://pdlhpc1ln1.childrens.sea.kids/rnode/pdlhpc1cn001/13431/websockify’ failed: Error during WebSocket handshake: Unexpected response code: 404

Does this help point to a problem? Or am I still not able to pinpoint the reason why I can’t launch a remote desktop?
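For anyone hitting the same 404, one quick thing to look at is which host name actually ended up in the failing URL. The snippet below is only a rough sketch: the /rnode/<host>/<port>/... layout is assumed from the URL in the error above, and the function names are made up for illustration. It pulls the host and port out of the URL and flags a host that has no domain part.

```python
import re
from urllib.parse import urlparse

# Assumed layout of the reverse-proxy path, based on the URL in the error above:
#   /rnode/<host>/<port>/<rest>
RNODE_PATH = re.compile(r"^/rnode/(?P<host>[^/]+)/(?P<port>\d+)(?:/.*)?$")


def split_rnode_url(url):
    """Return (host, port) from an /rnode/... URL, or None if it does not fit."""
    match = RNODE_PATH.match(urlparse(url).path)
    if match is None:
        return None
    return match.group("host"), int(match.group("port"))


def looks_fully_qualified(host):
    """Heuristic only: a name containing a dot is treated as fully qualified."""
    return "." in host


url = "wss://pdlhpc1ln1.childrens.sea.kids/rnode/pdlhpc1cn001/13431/websockify"
host, port = split_rnode_url(url)
print(host, port, looks_fully_qualified(host))  # prints: pdlhpc1cn001 13431 False
```

Here the host in the URL is the short name, with no domain, even though the login node is addressed by its full name.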

Alright, figured out the issue. The ‘rnode/$hostname’ is not resolving to ‘rnode/$hostname.example.com’; if I manually put the full name in there, it works. I have the “host_regex” set up to use the FQDN, but it isn’t working.

Awesome! Sorry for the delay.

I see earlier we set the set_host parameter. It must be choosing the short name instead of the long name? I would fix the FQDN issue there because that’s how that URL is being populated to begin with.

The regex is there so sites can limit what folks can proxy to. I don’t believe it can change the URL, only block it.

Though the nodes are set to use the FQDN, apparently it wants the base hostname to be the FQDN as well. I ran "hostnamectl set-hostname $hostname.fqdn.com" and that fixed it… I spent WAAAAAAAYYYY too long troubleshooting something so simple.

Note: This has to be run on each compute node.
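Since the fix has to be repeated per node, a small check can help spot nodes that still report a short name. This is a minimal sketch, not part of Open OnDemand: the helper names are hypothetical, "example.com" is a placeholder domain, and the script only prints the hostnamectl command quoted above rather than running it.

```python
import socket


def is_short_name(name):
    """Treat any host name without a dot as a short (unqualified) name."""
    return "." not in name


def fix_command(short_name, domain):
    """Build the hostnamectl command from the post above for one node."""
    return f"hostnamectl set-hostname {short_name}.{domain}"


if __name__ == "__main__":
    # socket.gethostname() reports the same name that plain `hostname` prints.
    name = socket.gethostname()
    if is_short_name(name):
        # "example.com" is a placeholder; substitute the site's real domain.
        print(f"{name}: short name, candidate fix: {fix_command(name, 'example.com')}")
    else:
        print(f"{name}: already fully qualified")
```

Run on a compute node, it reports whether that node would still hand out a short name.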

@charles8ronson in retrospect, is there something we could have provided that would have either identified or helped identify the problem and made it easier to solve quickly?

I think the biggest issue I had was with section 3.1 of “Setup Interactive Apps”. The way it is described makes it sound as if the first part (the proxy) will handle the FQDN, and the second part doesn’t convey how important it is. When looking at the submit script, the value [hostname -A | awk '{print $1}'] returns the expected value, so I made a poor assumption that it would be used for the connection as well. I would stress how important it is that plain ‘hostname’ returns the FQDN.
Ref:

3.1. Requirements

  • a regular expression that best describes all the hosts that you would want a user to connect to through the proxy (e.g., [\w.-]+\.osc\.edu )
  • confirm that if you run the command hostname from a compute node it will return a string that matches the above regular expression
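The second requirement can be checked mechanically. The sketch below assumes the regular expression has to match the whole host name (how the proxy actually anchors it is not shown in this thread) and uses the example pattern from the list above; swap in the site's own host_regex. The function name is hypothetical.

```python
import re
import socket

# Example pattern quoted in the requirements above; replace with your own host_regex.
HOST_REGEX = r"[\w.-]+\.osc\.edu"


def matches_host_regex(host, host_regex=HOST_REGEX):
    """Return True if the entire host string matches host_regex."""
    return re.fullmatch(host_regex, host) is not None


if __name__ == "__main__":
    # Run on a compute node: this is the name plain `hostname` would report.
    name = socket.gethostname()
    verdict = "matches" if matches_host_regex(name) else "does NOT match"
    print(f"{name} {verdict} {HOST_REGEX}")
```

With this pattern, a short name such as "node01" fails the check while "node01.osc.edu" passes, which is the same mismatch as a compute node that reports only its base hostname.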