First, thanks to @mcuma for raising the question in this post, and to everyone else for providing additional information.
We were able to test direct GPU access through the EGL backend using @Micket’s method. However, for one of the applications we tested, only the latest version rendered smooth images; all older versions showed corrupted images. The VirtualGL 3.0 user guide specifically mentions that “As of this writing, the EGL back end does not yet support all of the GLX extensions and esoteric OpenGL features that the GLX back end supports.” That may be why the older versions of the software failed.
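For reference, selecting the EGL backend comes down to pointing VirtualGL at a DRI device instead of a 3D X server. This is a minimal sketch, not our exact setup; the device path `/dev/dri/card0` is an assumption and should be replaced with the card Slurm actually assigned to the job:

```shell
# EGL backend: VGL_DISPLAY names a DRI device node rather than an X display.
# /dev/dri/card0 is an assumed path; use the GPU assigned to your job.
export VGL_DISPLAY=/dev/dri/card0

if command -v vglrun >/dev/null 2>&1; then
    # +v makes VirtualGL print which back end it chose; glxinfo -B is a
    # quick sanity check of the renderer string.
    vglrun +v glxinfo -B
else
    echo "vglrun not found; would run: vglrun +v glxinfo -B"
fi
```

The same thing can be done per invocation with `vglrun -d /dev/dri/card0 <app>` instead of exporting `VGL_DISPLAY`.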
We then tested the GLX backend. With this approach, all versions of the software worked fine. However, we ran into a problem when two jobs shared the same node with multiple GPUs. Each X server was started on the correct GPU device, but whenever a second vglrun started later, the first vglrun terminated with the following error:
[VGL] ERROR: OpenGL error 0x0502
[VGL] ERROR: in readPixels--
[VGL] 435: Could not read pixels
Any ideas are appreciated.
PS: I want to mention that instead of changing TaskProlog in Slurm, I wrote a wrapper script for vglrun that sets up the environment variables needed to use either the EGL backend or the GLX backend.
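In case it helps others, a wrapper of that kind can be sketched roughly as follows. This is a hypothetical reconstruction, not my actual script: the `VGL_BACKEND` variable name, the defaulting to GPU 0, and the assumption of one X server per GPU on displays `:0`, `:1`, … are all mine:

```shell
#!/bin/bash
# Hypothetical vglrun wrapper: pick the VirtualGL backend per job.
#   VGL_BACKEND=egl -> EGL backend (direct DRI device, no 3D X server)
#   VGL_BACKEND=glx -> GLX backend (one 3D X server per GPU, assumed)

# First GPU index Slurm assigned to this job; default to 0 outside Slurm.
gpu="${CUDA_VISIBLE_DEVICES%%,*}"
gpu="${gpu:-0}"

if [ "${VGL_BACKEND:-glx}" = "egl" ]; then
    # EGL backend: VGL_DISPLAY names the DRI device node.
    export VGL_DISPLAY="/dev/dri/card${gpu}"
else
    # GLX backend: VGL_DISPLAY names the 3D X server's display, assuming
    # the X server for GPU N was started on display :N.
    export VGL_DISPLAY=":${gpu}"
fi

if command -v vglrun >/dev/null 2>&1; then
    exec vglrun "$@"
else
    # vglrun not on PATH (e.g. when dry-testing the wrapper): show the setup.
    echo "VGL_DISPLAY=${VGL_DISPLAY}"
fi
```

Mapping the X display or DRI device off the job's first assigned GPU is what keeps two jobs on the same node from stepping on each other's VGL_DISPLAY.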
Thanks Jeff. I’ll take a look at the discussion at the link you posted and see if anything is useful.
To answer your question:
Yes, it only pops up when multiple GPUs are in use and each GPU runs a different X server.
I am sure I attached the right $DISPLAY, since the second vglrun always works while it crashes the first one. In other words, the first vglrun works fine until the second vglrun starts running.