Session briefly starts then immediately crashes; user authentication problem?

We are trying to set up remote desktop with OOD 2.0.23. By all appearances OOD itself works just dandy, and successful integration with both LDAP (on the ondemand-dex side) and SLURM is confirmed.

I have no problem logging in to OOD, requesting an interactive desktop, and watching the job start in ‘squeue’.

However, it appears that unless I am already logged in via ssh to the compute node the job lands on, the session crashes as soon as it starts (the job dies in approximately 12 seconds). I say “appears” because this is the only correlate I have been able to find; the logs GNOME generates in systemctl are useless… there is no error or explanation at the point where the failed sessions diverge from successful ones.

Is there some additional authentication that is required on the compute nodes? OOD has no problem creating jobs on them as me, and I have no problem submitting a helloworld.sh script. I could not find anything about this in the interactive software configuration guide.

Hoping to finally have this working soon,
– Erik

Hi Erik.

Thank you for your post. I know you mentioned the logs from systemctl. However, in the box where the job launches, there is a link to the directory holding the files for that job and its current launch. In that directory is a file called output.log. Can you please paste the contents of that file? If it contains any information that you do not want the whole world to see, please remove it first.
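If the link is awkward to reach, the file can also be located from a shell. The following is only a sketch: the function name `newest_output_log` is made up for illustration, and the root directory `~/ondemand/data/sys/dashboard/batch_connect` is an assumption inferred from the session path that shows up in this thread's job log, so adjust it if your site stores OOD data elsewhere.

```python
from pathlib import Path


def newest_output_log(root):
    """Return the most recently modified output.log under root, or None.

    `root` is assumed to be the batch_connect data directory of one user.
    """
    root = Path(root).expanduser()
    if not root.is_dir():
        return None
    # Each session gets its own directory, so search recursively.
    logs = list(root.rglob("output.log"))
    if not logs:
        return None
    return max(logs, key=lambda p: p.stat().st_mtime)
```

Calling `newest_output_log("~/ondemand/data/sys/dashboard/batch_connect")` and printing the returned path's `read_text()` should show the log of the latest session, provided the assumed layout matches your installation.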

Thanks,
-gerald

Unfortunately, nothing useful afaict:

Setting VNC password...
Starting VNC server...

Desktop 'TurboVNC: axis1:3 (erik-k)' started on display axis1:3

Log file is vnc.log
Successfully started VNC server on axis1.stor:5903...
Script starting...
Starting websocket server...
ERROR: Collection default cannot be found
Launching desktop 'gnome'...
cat: /etc/xdg/autostart/gdu-notification-daemon.desktop: No such file or directory
WebSocket server settings:
  - Listen on :23775
  - No SSL/TLS support (no cert file)
  - Backgrounding (daemon)
Scanning VNC log file for user authentications...
Generating connection YAML file...
/home/users/erik-k/ondemand/data/sys/dashboard/batch_connect/sys/bc_desktop/axis/output/cdcf0eec-fb7b-4f4d-8618-184da191f4ec/desktops/gnome.sh: line 17: 2200206 Terminated              /etc/X11/xinit/Xsession gnome-session
Desktop 'gnome' ended...
Cleaning up...
Killing Xvnc process ID 2200151

For reference, here is the log from a working session, started after I had separately logged in to the node:

Setting VNC password...
Starting VNC server...

Desktop 'TurboVNC: axis1:3 (erik-k)' started on display axis1:3

Log file is vnc.log
Successfully started VNC server on axis1.stor:5903...
Script starting...
Starting websocket server...
ERROR: Collection default cannot be found
Launching desktop 'gnome'...
WebSocket server settings:
  - Listen on :27851
  - No SSL/TLS support (no cert file)
  - Backgrounding (daemon)
cat: /etc/xdg/autostart/gdu-notification-daemon.desktop: No such file or directory
Scanning VNC log file for user authentications...
Generating connection YAML file...
Setting VNC password...
Generating connection YAML file...
Setting VNC password...
Generating connection YAML file...
Desktop 'gnome' ended...
Cleaning up...
Killing Xvnc process ID 2201160
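
Since the two logs are nearly identical, it can help to compare them mechanically instead of by eye. Below is a minimal sketch using Python's standard `difflib`. The helper names `normalize` and `diff_logs` are hypothetical, and the two masking patterns are assumptions based only on the lines visible in the logs above (the websocket port and the Xvnc process ID change on every run, so they are masked before comparing).

```python
import difflib
import re


def normalize(line):
    """Mask values that differ between any two runs (port, Xvnc PID)."""
    line = re.sub(r"Listen on :\d+", "Listen on :<port>", line)
    line = re.sub(r"process ID \d+", "process ID <pid>", line)
    return line.rstrip()


def diff_logs(failed_text, working_text):
    """Return lines only in the failed log ('- ') or only in the working log ('+ ')."""
    a = [normalize(l) for l in failed_text.splitlines()]
    b = [normalize(l) for l in working_text.splitlines()]
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        out += ["- " + l for l in a[i1:i2]]
        out += ["+ " + l for l in b[j1:j2]]
    return out
```

Feeding the contents of the failed and the working output.log to `diff_logs` and printing the result one line per row leaves only the lines that genuinely differ. Lines that merely moved, such as the `cat:` message, will still show up, because the comparison is order-sensitive.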