Desktop not leveraging VirtualGL - NVIDIA A30 GPU on worker

Correction: we have made some small progress. The fix to submit.yml now passes the GPU parameters through to the Slurm job, but I am still getting a permissions issue.

I am in the vglusers group:

(base) [chris.welsh@gpu002 ~]$ groups
stuff deleted... jupyterhub_users vglusers

/dev/dri/*

[root@gpu002 dri]# ls -l
total 0
drwxr-xr-x. 2 root root          100 Sep 11 15:40 by-path
crw-rw----. 1 root vglusers 226,   0 Sep 11 15:40 card0
crw-rw----. 1 root vglusers 226,   1 Sep 11 15:40 card1
crw-rw----. 1 root vglusers 226, 128 Sep 11 15:40 renderD128
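
The listing above was taken as root. A rough sketch for checking the same device nodes as the job user, from a terminal inside the desktop session, is below. The helper name check_dev_access is made up for illustration, and the device paths are simply the ones from the listing. It assumes the running session picked up its group list when it started, so a session launched before being added to vglusers may not show that group.

```shell
#!/usr/bin/env bash
# Report whether the current user can read and write a device node.
# check_dev_access is a hypothetical helper name, not part of VirtualGL or Slurm.
check_dev_access() {
  local dev="$1"
  if [ ! -e "$dev" ]; then
    echo "MISSING $dev"
  elif [ -r "$dev" ] && [ -w "$dev" ]; then
    echo "OK $dev"
  else
    echo "DENIED $dev"
  fi
}

# Groups of the current process (vglusers should appear here).
id -nG

# Device paths taken from the ls -l listing on gpu002.
for dev in /dev/dri/card0 /dev/dri/card1 /dev/dri/renderD128; do
  check_dev_access "$dev"
done
```

If id -nG lists vglusers but a device still reports DENIED, the problem is probably somewhere other than plain file permissions.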

gres.conf

NodeName=gpu002 Name=gpu Type=a30 File=/dev/nvidia0

/etc/slurm/slurm.conf

GresTypes=gpu,gpu:H100:2,gpu:a30:1,l40s:1
AccountingStorageTRES=gres/gpu,gres/gpu:H100,gres/gpu:a30,gres/gpu:l40s
NodeName=gpu002 NodeAddr=10.deleted CPUs=64 Feature=NUMA,AMD,GPU Boards=1 SocketsPerBoard=2 CoresPerSocket=32 ThreadsPerCore=1 RealMemory=515024 Gres=gpu:a30:1 Weight=60





[root@gpu002 ~]# nvidia-smi
Sat Sep 13 10:44:38 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A30                     Off |   00000000:21:00.0 Off |                    0 |
| N/A   31C    P0             30W /  165W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

submit.yml.erb

---
batch_connect:
  template: vnc
script:
  native:
    - "--gpus-per-node=1"
    - "--gres=gpu"
    - "--ntasks=1"

The job now shows the GPU allocation (see below), but I still get the permissions error.

[root@login001 ~]# scontrol show job 593703
JobId=593703 JobName=sys/dashboard/dev/bc_desktop
   UserId=chris.welsh(37738) GroupId=chris.welsh.dg(41023) MCS_label=N/A
   Priority=1 Nice=0 Account=itec1 QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:04:10 TimeLimit=01:00:00 TimeMin=N/A
   SubmitTime=2025-09-13T10:53:04 EligibleTime=2025-09-13T10:53:04
   AccrueTime=2025-09-13T10:53:04
   StartTime=2025-09-13T10:53:06 EndTime=2025-09-13T11:53:06 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-09-13T10:53:06 Scheduler=Main
   Partition=GPU_SHORT AllocNode:Sid=172.16.12.16:251917
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=gpu002
   BatchHost=gpu002
   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   ReqTRES=cpu=1,mem=8G,node=1,billing=1,gres/gpu=1
   AllocTRES=cpu=1,mem=8G,node=1,billing=1,gres/gpu=1,gres/gpu:a30=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryCPU=8G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=(null)
   WorkDir=/home/chris.welsh/ondemand/data/sys/dashboard/batch_connect/dev/bc_desktop/output/1107da7e-6730-4779-a026-cd69d5cf6a7c
   StdErr=/home/chris.welsh/ondemand/data/sys/dashboard/batch_connect/dev/bc_desktop/output/1107da7e-6730-4779-a026-cd69d5cf6a7c/output.log
   StdIn=/dev/null
   StdOut=/home/chris.welsh/ondemand/data/sys/dashboard/batch_connect/dev/bc_desktop/output/1107da7e-6730-4779-a026-cd69d5cf6a7c/output.log
   TresPerNode=gres/gpu:1,gres/gpu
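
The line that matters in the output above is AllocTRES, which now includes gres/gpu:a30=1. A small sketch for pulling that out without reading the whole record is below. The function names alloc_tres and has_gpu_alloc are hypothetical, and the usage part assumes it is run inside a Slurm job where scontrol is on the PATH.

```shell
#!/usr/bin/env bash
# Print the value of AllocTRES from `scontrol show job` output read on stdin.
# alloc_tres and has_gpu_alloc are hypothetical helper names.
alloc_tres() {
  grep -o 'AllocTRES=[^ ]*' | cut -d= -f2-
}

# Succeed only if the allocation contains a gres/gpu entry.
has_gpu_alloc() {
  alloc_tres | tr ',' '\n' | grep -q '^gres/gpu'
}

# Usage inside a job (SLURM_JOB_ID is set by Slurm); skipped when not in a job.
if [ -n "${SLURM_JOB_ID:-}" ] && command -v scontrol >/dev/null 2>&1; then
  if scontrol show job "$SLURM_JOB_ID" | has_gpu_alloc; then
    echo "GPU allocated"
  else
    echo "no GPU in AllocTRES"
  fi
  # Slurm normally exports this for jobs that were given a GPU gres.
  echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-unset}"
fi
```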

output.log

Desktop 'TurboVNC: gpu002.meerkat.mcri.edu.au:1 (chris.welsh)' started on display gpu002.meerkat.mcri.edu.au:1

Log file is vnc.log
Successfully started VNC server on gpu002.meerkat.mcri.edu.au:5901...
Script starting...
Starting websocket server...
Launching desktop 'xfce'...
[websockify]: pid: 301508 (proxying 30159 ==> localhost:5901)
[websockify]: log file: ./websockify.log
[websockify]: waiting ...
[VGL] Shared memory segment ID for vglconfig: 262205
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
/usr/bin/iceauth:  creating new authority file /run/user/37738/ICEauthority
[VGL] Shared memory segment ID for vglconfig: 262206
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.

(xfwm4:301559): xfwm4-WARNING **: 10:53:10.634: GLX extension missing, GLX support disabled.
[VGL] Shared memory segment ID for vglconfig: 294914
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
[VGL] Shared memory segment ID for vglconfig: 294915
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
[VGL] Shared memory segment ID for vglconfig: 294916
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
[VGL] Shared memory segment ID for vglconfig: 294917
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
[VGL] Shared memory segment ID for vglconfig: 294925
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
[VGL] Shared memory segment ID for vglconfig: 294926
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] Shared memory segment ID for vglconfig: 294927
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
[VGL] Shared memory segment ID for vglconfig: 294928
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
[VGL] Shared memory segment ID for vglconfig: 294930
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] Shared memory segment ID for vglconfig: 294931
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
[VGL] Shared memory segment ID for vglconfig: 294932
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
[VGL] Shared memory segment ID for vglconfig: 294933
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)

ERROR: The current user does not have permission for operation

[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.
[VGL] Shared memory segment ID for vglconfig: 294934
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[websockify]: started successfully (proxying 30159 ==> localhost:5901)
Scanning VNC log file for user authentications...
Generating connection YAML file...
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.

** (wrapper-2.0:301607): WARNING **: 10:53:11.582: No outputs have backlight property
[VGL] Shared memory segment ID for vglconfig: 294940
[VGL] VirtualGL v3.1.3 64-bit (Build 20250409)
[VGL] NOTICE: Replacing dlopen("libGLX.so.1") with dlopen("libvglfaker.so")
[VGL] WARNING: The EGL back end requires a 2D X server with a GLX extension.

(wrapper-2.0:301606): libnotify-WARNING **: 10:53:11.771: Failed to connect to proxy

(wrapper-2.0:301607): Gtk-CRITICAL **: 10:53:11.793: gtk_icon_theme_has_icon: assertion 'icon_name != NULL' failed

(wrapper-2.0:301607): Gtk-CRITICAL **: 10:53:11.802: gtk_icon_theme_has_icon: assertion 'icon_name != NULL' failed

(wrapper-2.0:301607): Gtk-CRITICAL **: 10:53:11.802: gtk_icon_theme_has_icon: assertion 'icon_name != NULL' failed

(wrapper-2.0:301607): Gtk-CRITICAL **: 10:53:11.829: gtk_icon_theme_has_icon: assertion 'icon_name != NULL' failed

(nm-applet:301640): libnotify-WARNING **: 10:53:11.840: Failed to connect to proxy

(nm-applet:301640): nm-applet-WARNING **: 10:53:11.841: Failed to show notification: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.Notifications was not provided by any .service files

(nm-applet:301640): nm-applet-WARNING **: 10:53:11.848: Failed to show notification: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.Notifications was not provided by any .service files

** (xfdesktop:301587): WARNING **: 10:53:13.026: Failed to register the newly set background with AccountsService '/usr/share/backgrounds/xfce/xfce-leaves.svg': GDBus.Error:org.freedesktop.DBus.Error.InvalidArgs: No such interface “org.freedesktop.DisplayManager.AccountsService”

(wrapper-2.0:301606): pulseaudio-plugin-WARNING **: 10:53:17.982: Disconnected from the PulseAudio server. Attempting to reconnect in 5 seconds...
Failed to create secure directory (/run/user/37738/pulse): No such file or directory
Failed to create secure directory (/run/user/37738/pulse): No such file or directory
Xlib:  extension "DPMS" missing on display ":1.0".
Xlib:  extension "DPMS" missing on display ":1.0".
Xlib:  extension "DPMS" missing on display ":1.0".
Xlib:  extension "DPMS" missing on display ":1.0".

But as you can see in the log above, the permissions error is still there.
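
The error is easy to miss among the repeated [VGL] lines. A rough filter that hides those lines and the blank lines, while keeping the original line numbers, is sketched below. The function name filter_vgl_noise is made up, and the usage assumes it is run from the session's output directory. The [VGL] WARNING about the EGL back end is worth reading once before filtering it away.

```shell
#!/usr/bin/env bash
# Number every line of the log on stdin, then drop the [VGL] lines and blank lines.
# filter_vgl_noise is a hypothetical helper name.
filter_vgl_noise() {
  grep -n '' | grep -v -e '^[0-9]*:\[VGL\]' -e '^[0-9]*:$'
}

# Usage: show the first remaining lines of output.log, if it exists here.
if [ -f output.log ]; then
  filter_vgl_noise < output.log | head -n 20
fi
```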

Also evident below.

Hi all, thanks so much for your help. I have this working now. TBH, I'm not sure exactly what fixed it since the last message, but I installed mesa-libGL-devel (“dnf install mesa-libGL-devel”) and rebooted. Jeff’s observation about the bad formatting in the submit.yml was a key factor, though. Thanks, all.
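
For anyone wanting to confirm that a desktop session really renders on the GPU, one rough check is to look at the OpenGL renderer string from a terminal inside the session. This is a sketch that assumes glxinfo and vglrun are installed on the node. The helper name renderer_string is made up, and treating "NVIDIA" in the string as success is an assumption, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Extract the value of the "OpenGL renderer string" line from glxinfo output on stdin.
# renderer_string is a hypothetical helper name.
renderer_string() {
  sed -n 's/^OpenGL renderer string: *//p' | head -n 1
}

# Usage inside the desktop session; skipped if the tools or a display are missing.
if command -v glxinfo >/dev/null 2>&1 && command -v vglrun >/dev/null 2>&1 \
   && [ -n "${DISPLAY:-}" ]; then
  r=$(vglrun glxinfo -B 2>/dev/null | renderer_string)
  case "$r" in
    *NVIDIA*) echo "GPU rendering: $r" ;;
    *)        echo "not on the NVIDIA GPU: ${r:-no renderer reported}" ;;
  esac
fi
```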
