I’m stuck. I must be overlooking something silly, and I’m hoping to get some help.
I can submit a job through OOD, and the job is submitted and starts, but when it tries to start the remote desktop session, it fails with “failed to connect to server”.
OOD Version: 3.0.3
Host OS: Rocky 9.1
Output.log looks like this:
Setting VNC password...
Starting VNC server...
Desktop 'TurboVNC: hercules-devel-1.hpc.msstate.edu:1 (jhrogers)' started on display hercules-devel-1.hpc.msstate.edu:1
Log file is vnc.log
Successfully started VNC server on hercules-devel-1.hpc.msstate.edu:5901...
Script starting...
Starting websocket server...
The system default contains no modules
(env var: LMOD_SYSTEM_DEFAULT_MODULES is empty)
No changes in loaded modules
Launching desktop 'xfce'...
Failed to set property.
Failed to set property.
WebSocket server settings:
- Listen on :58000
- No SSL/TLS support (no cert file)
- Backgrounding (daemon)
Scanning VNC log file for user authentications...
Generating connection YAML file...
/apps/other/ood-depends//xfce-4.18.0/bin/startxfce4: X server already running on display :1
/usr/bin/iceauth: creating new authority file /run/user/7233/ICEauthority
Terminated
Desktop 'xfce' ended...
Cleaning up...
Killing Xvnc process ID 395436
Here is the top of our ood_portal.yml. The rest is just our SSL/auth configuration, and since I can log in, that part should be fine.
---
servername: Hercules-ood.hpc.msstate.edu
host_regex: '(hercules|Hercules)\-((devel|Devel)\-[12]|[01][0-9]\-[0-6][0-9])\.(hpc|HPC)\.(msstate|MsState)\.(edu|Edu)'
node_uri: '/node'
rnode_uri: '/rnode'
port: 443
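One quick sanity check is whether the compute node name from output.log actually matches the host_regex above. The sketch below is only an approximation: it uses Python's re module rather than the regex engine in the web server, and it assumes the whole hostname must match. The helper name host_allowed and the extra test hostnames are made up for illustration.

```python
import re

# host_regex copied verbatim from the ood_portal.yml excerpt above.
HOST_REGEX = (
    r"(hercules|Hercules)\-((devel|Devel)\-[12]|[01][0-9]\-[0-6][0-9])"
    r"\.(hpc|HPC)\.(msstate|MsState)\.(edu|Edu)"
)


def host_allowed(hostname: str) -> bool:
    """Return True if the entire hostname matches HOST_REGEX.

    Assumption: the proxy requires a full match, not a substring match.
    """
    return re.fullmatch(HOST_REGEX, hostname) is not None


if __name__ == "__main__":
    candidates = (
        "hercules-devel-1.hpc.msstate.edu",  # FQDN reported in output.log
        "hercules-devel-1",                  # short name (hypothetical input)
        "hercules-01-23.hpc.msstate.edu",    # hypothetical compute node name
    )
    for name in candidates:
        print(f"{name}: {host_allowed(name)}")
```

Under these assumptions the FQDN from output.log matches, while the bare short name does not, so the regex would only be a problem if the connection file recorded the short hostname.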
From the noVNC settings on the “failed to connect” screen:
Shared Mode: Checked
Advanced
>Websocket
>>Encrypt: Checked
>>Host: hercules-ood.hpc.msstate.edu
>>Port: 443
>>Path: rnode/hercules-devel-1.hpc.msstate.edu/58000/websockify
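To make the settings above easier to compare with output.log, here is a small sketch that assembles the WebSocket URL the browser would use and splits the rnode path back into target host and port. The function names are hypothetical, and the path layout rnode/<host>/<port>/<rest> is inferred from the value shown above rather than taken from OOD documentation.

```python
from urllib.parse import urlunsplit


def websocket_url(host: str, port: int, path: str, encrypt: bool = True) -> str:
    """Build the URL from the noVNC Host, Port, Path and Encrypt settings."""
    scheme = "wss" if encrypt else "ws"
    return urlunsplit((scheme, f"{host}:{port}", "/" + path.lstrip("/"), "", ""))


def split_rnode_path(path: str) -> tuple[str, int, str]:
    """Split 'rnode/<host>/<port>/<rest>' into (host, port, rest).

    Assumption: the layout is inferred from the Path setting in this post.
    """
    parts = path.strip("/").split("/")
    if len(parts) < 3 or parts[0] != "rnode":
        raise ValueError(f"unexpected rnode path: {path!r}")
    return parts[1], int(parts[2]), "/".join(parts[3:])


if __name__ == "__main__":
    path = "rnode/hercules-devel-1.hpc.msstate.edu/58000/websockify"
    print(websocket_url("hercules-ood.hpc.msstate.edu", 443, path))
    print(split_rnode_path(path))
```

The target port it extracts, 58000, is the same port the websocket server reports in output.log ("Listen on :58000"), so the path itself looks consistent with what the job started.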
/var/log/messages from hercules-devel-1:
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug2: Processing RPC: REQUEST_BATCH_JOB_LAUNCH
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: task/affinity: task_p_slurmd_batch_request: task_p_slurmd_batch_request: 471350
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug3: task/affinity: _get_avail_map: slurmctld s 2 c 40; hw s 2 c 40 t 1
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug3: task/affinity: _get_avail_map: StepId=471350.batch core mask from slurmctld: 0x00000000000000000001
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug3: task/affinity: _get_avail_map: StepId=471350.batch CPU final mask for local node: 0x00000000000000000001
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: task/affinity: task_p_slurmd_batch_request: task_p_slurmd_batch_request: 471350
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: task/affinity: batch_bind: job 471350 CPU input mask for node: 0x00000000000000000001
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug3: task/affinity: _lllp_map_abstract_masks: _lllp_map_abstract_masks
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: task/affinity: batch_bind: job 471350 CPU final HW mask for node: 0x00000000000000000001
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug: Waiting for job 471350's prolog to complete
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: task/affinity: batch_bind: job 471350 CPU input mask for node: 0x00000000000000000001
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: task/affinity: batch_bind: job 471350 CPU final HW mask for node: 0x00000000000000000001
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug2: prep/script: _run_subpath_command: prolog success rc:0 output:
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug3: _spawn_prolog_stepd: call to _forkexec_slurmstepd
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug3: slurmstepd rank 0 (hercules-devel-1), parent rank -1 (NONE), children 0, depth 0, max_depth 0
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395332]: select/cons_tres: common_init: select/cons_tres loaded
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395332]: [471350.extern] task/affinity: init: task affinity plugin loaded with CPU mask 0xffffffffffffffffffff
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395332]: [471350.extern] cred/munge: init: Munge credential signature plugin loaded
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395332]: [471350.extern] route/default: init: route default plugin loaded
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395332]: [471350.extern] topology/none: init: topology NONE plugin loaded
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395332]: [471350.extern] task/cgroup: _memcg_initialize: job: alloc=3150MB mem.limit=3150MB memsw.limit=unlimited
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395332]: [471350.extern] task/cgroup: _memcg_initialize: step: alloc=3150MB mem.limit=3150MB memsw.limit=unlimited
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug3: _spawn_prolog_stepd: return from _forkexec_slurmstepd 0
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug: Finished wait for job 471350's prolog to complete
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: _get_user_env: get env for user jhrogers here
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: _get_user_env: get env for user jhrogers here
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug2: Finish processing RPC: REQUEST_LAUNCH_PROLOG
Feb 12 09:46:40 hercules-devel-1 su[395337]: (to jhrogers) root on none
Feb 12 09:46:40 hercules-devel-1 systemd[1]: Created slice User Slice of UID 7233.
Feb 12 09:46:40 hercules-devel-1 systemd[1]: Starting User Runtime Directory /run/user/7233...
Feb 12 09:46:40 hercules-devel-1 systemd[1]: Finished User Runtime Directory /run/user/7233.
Feb 12 09:46:40 hercules-devel-1 systemd[1]: Starting User Manager for UID 7233...
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Queued start job for default target Main User Target.
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Created slice User Application Slice.
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Started Mark boot as successful after the user session has run 2 minutes.
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Started Daily Cleanup of User's Temporary Directories.
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Reached target Paths.
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Reached target Timers.
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Starting D-Bus User Message Bus Socket...
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Starting Create User's Volatile Files and Directories...
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Finished Create User's Volatile Files and Directories.
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Listening on D-Bus User Message Bus Socket.
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Reached target Sockets.
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Reached target Basic System.
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Reached target Main User Target.
Feb 12 09:46:40 hercules-devel-1 systemd[1]: Started User Manager for UID 7233.
Feb 12 09:46:40 hercules-devel-1 systemd[395339]: Startup finished in 61ms.
Feb 12 09:46:40 hercules-devel-1 systemd[1]: Started Session c27 of User jhrogers.
Feb 12 09:46:40 hercules-devel-1 systemd[1]: Starting Hostname Service...
Feb 12 09:46:40 hercules-devel-1 systemd[1]: Started Hostname Service.
Feb 12 09:46:40 hercules-devel-1 systemd[1]: session-c27.scope: Deactivated successfully.
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: Launching batch job 471350 for UID 7233
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: Launching batch job 471350 for UID 7233
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug3: _rpc_batch_job: call to _forkexec_slurmstepd
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug3: slurmstepd rank -1 (hercules-devel-1), parent rank -1 (NONE), children 0, depth 0, max_depth 0
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395387]: select/cons_tres: common_init: select/cons_tres loaded
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395387]: [471350.batch] task/affinity: init: task affinity plugin loaded with CPU mask 0xffffffffffffffffffff
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395387]: [471350.batch] cred/munge: init: Munge credential signature plugin loaded
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395387]: [471350.batch] route/default: init: route default plugin loaded
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395387]: [471350.batch] topology/none: init: topology NONE plugin loaded
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug3: _rpc_batch_job: return from _forkexec_slurmstepd: 0
Feb 12 09:46:40 hercules-devel-1 slurmd[3082]: slurmd: debug2: Finish processing RPC: REQUEST_BATCH_JOB_LAUNCH
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395387]: [471350.batch] task/cgroup: _memcg_initialize: job: alloc=3150MB mem.limit=3150MB memsw.limit=unlimited
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395387]: [471350.batch] task/cgroup: _memcg_initialize: step: alloc=3150MB mem.limit=3150MB memsw.limit=unlimited
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395387]: [471350.batch] debug levels are stderr='error', logfile='debug4', syslog='debug'
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395387]: [471350.batch] starting 1 tasks
Feb 12 09:46:40 hercules-devel-1 slurmstepd[395387]: [471350.batch] task 0 (395391) started 2024-02-12T09:46:40
Feb 12 09:46:41 hercules-devel-1 systemd[395339]: Starting D-Bus User Message Bus...
Feb 12 09:46:41 hercules-devel-1 dbus-broker-launch[395466]: Policy to allow eavesdropping in /usr/share/dbus-1/session.conf +31: Eavesdropping is deprecated and ignored
Feb 12 09:46:41 hercules-devel-1 dbus-broker-launch[395466]: Policy to allow eavesdropping in /usr/share/dbus-1/session.conf +33: Eavesdropping is deprecated and ignored
Feb 12 09:46:41 hercules-devel-1 systemd[395339]: Started D-Bus User Message Bus.
Feb 12 09:46:41 hercules-devel-1 journal[395466]: Ready
Feb 12 09:46:42 hercules-devel-1 systemd[395339]: Starting Accessibility services bus...
Feb 12 09:46:42 hercules-devel-1 systemd[395339]: Started Accessibility services bus.
Feb 12 09:46:42 hercules-devel-1 at-spi-bus-launcher[395494]: Policy to allow eavesdropping in /usr/share/defaults/at-spi2/accessibility.conf +15: Eavesdropping is deprecated and ignored
Feb 12 09:46:42 hercules-devel-1 at-spi-bus-launcher[395494]: Policy to allow eavesdropping in /usr/share/defaults/at-spi2/accessibility.conf +17: Eavesdropping is deprecated and ignored
Feb 12 09:46:42 hercules-devel-1 journal[395494]: Ready
Feb 12 09:46:42 hercules-devel-1 systemd[395339]: Created slice Slice /app/dbus-:1.8-org.a11y.atspi.Registry.
Feb 12 09:46:42 hercules-devel-1 systemd[395339]: Started dbus-:1.8-org.a11y.atspi.Registry@0.service.
Feb 12 09:46:42 hercules-devel-1 at-spi2-registryd[395499]: SpiRegistry daemon is running with well-known name - org.a11y.atspi.Registry
Feb 12 09:46:50 hercules-devel-1 systemd[1]: Stopping User Manager for UID 7233...
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Activating special unit Exit the Session...
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopped target Main User Target.
Feb 12 09:46:50 hercules-devel-1 dbus-broker[395495]: Dispatched 19 messages @ 3(±4)μs / message.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopping Accessibility services bus...
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopping dbus-:1.8-org.a11y.atspi.Registry@0.service...
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopped dbus-:1.8-org.a11y.atspi.Registry@0.service.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Removed slice Slice /app/dbus-:1.8-org.a11y.atspi.Registry.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopped Accessibility services bus.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopped target Basic System.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopped target Paths.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopped target Sockets.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopped target Timers.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopped Mark boot as successful after the user session has run 2 minutes.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopped Daily Cleanup of User's Temporary Directories.
Feb 12 09:46:50 hercules-devel-1 dbus-broker[395467]: Dispatched 291 messages @ 2(±4)μs / message.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopping D-Bus User Message Bus...
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopped Create User's Volatile Files and Directories.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Stopped D-Bus User Message Bus.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Closed D-Bus User Message Bus Socket.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Removed slice User Application Slice.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Reached target Shutdown.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Finished Exit the Session.
Feb 12 09:46:50 hercules-devel-1 systemd[395339]: Reached target Exit the Session.
Feb 12 09:46:50 hercules-devel-1 systemd[1]: user@7233.service: Deactivated successfully.
Feb 12 09:46:50 hercules-devel-1 systemd[1]: Stopped User Manager for UID 7233.
Feb 12 09:46:50 hercules-devel-1 systemd[1]: Stopping User Runtime Directory /run/user/7233...
Feb 12 09:46:50 hercules-devel-1 systemd[1]: run-user-7233.mount: Deactivated successfully.
Feb 12 09:46:50 hercules-devel-1 systemd[1]: user-runtime-dir@7233.service: Deactivated successfully.
Feb 12 09:46:50 hercules-devel-1 systemd[1]: Stopped User Runtime Directory /run/user/7233.
Feb 12 09:46:50 hercules-devel-1 systemd[1]: Removed slice User Slice of UID 7233.
Feb 12 09:46:51 hercules-devel-1 slurmstepd[395387]: [471350.batch] task 0 (395391) exited with exit code 0.
Feb 12 09:46:52 hercules-devel-1 slurmstepd[395387]: [471350.batch] job 471350 completed with slurm_rc = 0, job_rc = 0
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug3: in the service_connection
Feb 12 09:46:52 hercules-devel-1 slurmstepd[395387]: [471350.batch] done with job
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug2: Start processing RPC: REQUEST_TERMINATE_JOB
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug2: Processing RPC: REQUEST_TERMINATE_JOB
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug: _rpc_terminate_job: uid = 903 JobId=471350
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug: credential for job 471350 revoked
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug4: found StepId=471350.extern
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug2: container signal 18 to StepId=471350.extern
Feb 12 09:46:52 hercules-devel-1 slurmstepd[395332]: [471350.extern] Sent signal 18 to StepId=471350.extern
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug4: found StepId=471350.extern
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug2: container signal 15 to StepId=471350.extern
Feb 12 09:46:52 hercules-devel-1 slurmstepd[395332]: [471350.extern] Sent signal 15 to StepId=471350.extern
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug4: sent SUCCESS
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug4: found StepId=471350.extern
Feb 12 09:46:52 hercules-devel-1 slurmstepd[395332]: [471350.extern] done with job
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug2: set revoke expiration for jobid 471350 to 1707752932 UTS
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug: Waiting for job 471350's prolog to complete
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug: Finished wait for job 471350's prolog to complete
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug2: prep/script: _run_subpath_command: epilog success rc:0 output:
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug: completed epilog for jobid 471350
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug: JobId=471350: sent epilog complete msg: rc = 0
Feb 12 09:46:52 hercules-devel-1 slurmd[3082]: slurmd: debug2: Finish processing RPC: REQUEST_TERMINATE_JOB
Feb 12 09:47:10 hercules-devel-1 systemd[1]: systemd-hostnamed.service: Deactivated successfully.
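Since the log above is long, a rough way to see how long the batch step actually lived is to pull out the slurmstepd "started" and "exited" lines and subtract the timestamps. This is only a sketch: the regular expression is tailored to the line format in this paste, the function name batch_lifetime is made up, and the year is passed in by hand because syslog timestamps omit it (2024 is taken from the "started 2024-02-12T09:46:40" line).

```python
import re
from datetime import datetime

# Matches lines such as:
# Feb 12 09:46:40 host slurmstepd[395387]: [471350.batch] task 0 (395391) started ...
TASK_LINE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ slurmstepd\[\d+\]: "
    r"\[\d+\.batch\] task \d+ \(\d+\) (started|exited)"
)


def batch_lifetime(lines, year=2024):
    """Return seconds between the batch task's 'started' and 'exited' lines."""
    stamps = {}
    for line in lines:
        match = TASK_LINE.match(line)
        if match:
            stamp, event = match.groups()
            # Syslog omits the year, so prepend the one supplied by the caller.
            stamps[event] = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return (stamps["exited"] - stamps["started"]).total_seconds()


if __name__ == "__main__":
    sample = [
        "Feb 12 09:46:40 hercules-devel-1 slurmstepd[395387]: [471350.batch] task 0 (395391) started 2024-02-12T09:46:40",
        "Feb 12 09:46:51 hercules-devel-1 slurmstepd[395387]: [471350.batch] task 0 (395391) exited with exit code 0.",
    ]
    print(batch_lifetime(sample), "seconds")
```

For the two lines in this log that works out to about 11 seconds with exit code 0, which suggests the batch script finished on its own shortly after the desktop was launched rather than being killed by Slurm.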