App doesn't have desktop background

I’ve got two OnDemand systems, an old one and a new one. The old one is version 1.6.25-1.el7, the new one is 2.0.29-1.el8. I’m trying to get the new one set up and running. I copied over my RStudio app that lives in /var/www/ood/apps/sys/RStudio and adjusted names so that it works.

But when I launch it on the new version, I don’t get a resizable window on a desktop.


Does anyone know how to get the desktop on the new version? I’ve got the bc_desktop running by itself.

Hi, sorry for the trouble.

Looking at the screenshots, I’m not sure the window manager is running.

Could you share the submit.yml.erb and the output.log from the session logs when you launch? It will help to know what is set and if any errors or warnings are in that output.log.

submit.yml.erb

---
batch_connect:
  template: "vnc"
script:
  account_name: "bch"
  queue_name: "dev"
  native:
    - "-n"
    - "<%= num_cores %>"
    - "--mem"
    - "<%= bc_mem %>"

output.log

Setting VNC password...
Starting VNC server...

Desktop 'TurboVNC: compute-0.internal:1 (rbryant)' started on display compute-0.internal:1

Log file is vnc.log
Successfully started VNC server on compute-0:5901...
Script starting...
Starting websocket server...
The system default contains no modules
  (env var: LMOD_SYSTEM_DEFAULT_MODULES is empty)
  No changes in loaded modules

+ xfwm4 --compositor=off --daemon --sm-client-disable
xfwm4: Unknown option --daemon.
Type "xfwm4 --help" for usage.
+ xsetroot -solid '#D3D3D3'
+ xfsettingsd --sm-client-disable
WebSocket server settings:
  - Listen on :6095
  - No SSL/TLS support (no cert file)
  - Backgrounding (daemon)
Scanning VNC log file for user authentications...
Generating connection YAML file...
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-rbryant'
xfsettingsd: No window manager registered on screen 0.

(xfsettingsd:241709): xfsettingsd-WARNING **: 15:28:59.789: Failed to get the _NET_NUMBER_OF_DESKTOPS property.
xfsettingsd: Another instance took over. Leaving...
+ xfce4-panel --sm-client-disable

(xfce4-panel:242075): xfce4-panel-WARNING **: 15:29:02.624: Failed to connect to the D-BUS session bus: Could not connect: No such file or directory

(xfce4-panel:242075): xfce4-panel-CRITICAL **: 15:29:02.625: Name org.xfce.Panel lost on the message dbus, exiting.
xfce4-panel: There is already a running instance

Setting VNC password...
Generating connection YAML file...
Setting VNC password...
Generating connection YAML file...
slurmstepd: error: *** JOB 39 ON compute-0 CANCELLED AT 2023-02-21T21:28:56 DUE TO TIME LIMIT ***

I’d first make sure the window manager is installed on the compute node. You can insert a query to check this in the script.sh.erb.

If it is installed, try adjusting the script.sh.erb by removing the --daemon option (the one xfwm4 rejects in your output.log) and see what happens at launch.
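A minimal sketch of such a query, assuming a POSIX shell on the compute node. The helper name check_cmds is hypothetical; the command list is taken from the commands that appear in the output.log above. It only reports whether each binary is on PATH, not whether it works.

```shell
# Hypothetical helper: report whether each named command is on PATH.
# Returns 0 if all were found, 1 if at least one is missing.
check_cmds() {
  missing=0
  for cmd in "$@"; do
    if loc=$(command -v "$cmd" 2>/dev/null); then
      echo "found: $cmd -> $loc"
    else
      echo "MISSING: $cmd"
      missing=1
    fi
  done
  return "$missing"
}

# The desktop components the script in this thread invokes.
check_cmds xfwm4 xsetroot xfsettingsd xfce4-panel \
  || echo "one or more desktop components are missing on $(hostname)"
```

Placed near the top of script.sh.erb, the result ends up in output.log alongside the other launch messages.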

xfwm4 is installed

I removed the --daemon flag and this is what I got:

Setting VNC password...
Starting VNC server...

Desktop 'TurboVNC: compute-0.internal:1 (rbryant)' started on display compute-0.internal:1

Log file is vnc.log
Successfully started VNC server on compute-0:5901...
Script starting...
Starting websocket server...
The system default contains no modules
  (env var: LMOD_SYSTEM_DEFAULT_MODULES is empty)
  No changes in loaded modules

+ xfwm4 --compositor=off --sm-client-disable
WebSocket server settings:
  - Listen on :5908
  - No SSL/TLS support (no cert file)
  - Backgrounding (daemon)
Scanning VNC log file for user authentications...
Generating connection YAML file...
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-rbryant'
Setting VNC password...
Generating connection YAML file...
+ xsetroot -solid '#D3D3D3'
+ xfsettingsd --sm-client-disable
xfsettingsd: Could not connect: No such file or directory.

(xfsettingsd:265112): xfsettingsd-ERROR **: 20:42:18.999: Failed to connect to the dbus session bus.
/users/rbryant/ondemand/data/sys/dashboard/batch_connect/sys/RStudio/output/9208e3d4-c73a-4c10-9ca5-c06a5b5594cf/script.sh: line 26: 265112 Trace/breakpoint trap   (core dumped) xfsettingsd --sm-client-disable
+ xfce4-panel --sm-client-disable

(xfce4-panel:265123): xfce4-panel-WARNING **: 20:42:19.623: Failed to connect to the D-BUS session bus: Could not connect: No such file or directory

(xfce4-panel:265123): xfce4-panel-CRITICAL **: 20:42:19.623: Name org.xfce.Panel lost on the message dbus, exiting.
xfce4-panel: There is already a running instance

It doesn’t look like the connection is made to the dbus but the app is running. Can you insert a command in the script.sh.erb to check if dbus is in fact running? Something like ps | grep dbus-daemon may work.
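A slightly fuller check could look like the sketch below, assuming a Linux node with the usual ps and grep. The helper name report_session_bus is hypothetical. Note that a dbus-daemon --system process is the system bus, which is not the same thing as the per-user session bus that xfsettingsd and xfce4-panel complain about; clients normally locate the session bus through DBUS_SESSION_BUS_ADDRESS.

```shell
# Hypothetical helper: print what this job can see of a D-Bus session bus.
report_session_bus() {
  echo "DBUS_SESSION_BUS_ADDRESS=${DBUS_SESSION_BUS_ADDRESS:-<unset>}"
  echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR:-<unset>}"
  # List dbus-daemon processes owned by the current user.
  # The [d] in the pattern keeps grep from matching its own command line.
  if ps -u "$(id -u)" -o pid,args 2>/dev/null | grep '[d]bus-daemon'; then
    :
  else
    echo "no dbus-daemon owned by $(id -un)"
  fi
}

report_session_bus
```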

output.log

Setting VNC password...
Starting VNC server...

Desktop 'TurboVNC: compute-0.internal:1 (rbryant)' started on display compute-0.internal:1

Log file is vnc.log
Successfully started VNC server on compute-0:5901...
Script starting...
Starting websocket server...
The system default contains no modules
  (env var: LMOD_SYSTEM_DEFAULT_MODULES is empty)
  No changes in loaded modules

+ whoami
rbryant
+ sh -c 'ps -ef | grep dbus-daemon'
dbus         855       1  0 Feb13 ?        00:00:03 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
rbryant   279311  279290  0 15:27 ?        00:00:00 sh -c ps -ef | grep dbus-daemon
rbryant   279315  279311  0 15:27 ?        00:00:00 grep dbus-daemon
+ xfwm4 --compositor=off --sm-client-disable
WebSocket server settings:
  - Listen on :6050
  - No SSL/TLS support (no cert file)
  - Backgrounding (daemon)
Scanning VNC log file for user authentications...
Generating connection YAML file...
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-rbryant'
Setting VNC password...
Generating connection YAML file...
+ xsetroot -solid '#D3D3D3'
+ xfsettingsd --sm-client-disable
xfsettingsd: Could not connect: No such file or directory.

(xfsettingsd:279777): xfsettingsd-ERROR **: 15:27:23.629: Failed to connect to the dbus session bus.
/users/rbryant/ondemand/data/sys/dashboard/batch_connect/sys/RStudio/output/423acdec-8ec5-4a29-9973-cfae1ef07d61/script.sh: line 28: 279777 Trace/breakpoint trap   (core dumped) xfsettingsd --sm-client-disable
+ xfce4-panel --sm-client-disable

(xfce4-panel:279781): xfce4-panel-WARNING **: 15:27:23.844: Failed to connect to the D-BUS session bus: Could not connect: No such file or directory

(xfce4-panel:279781): xfce4-panel-CRITICAL **: 15:27:23.844: Name org.xfce.Panel lost on the message dbus, exiting.
xfce4-panel: There is already a running instance

Is there anything in the syslog messages around dbus?

When you did the upgrade for OOD, did anything get changed on the compute nodes as well? Any changes in OS or just anything at all?

What happens if you also try this command in the script.sh.erb:

rm -R ~/.cache/sessions/*

There is nothing in /var/log/messages from today or yesterday when I grep dbus.

This is not an upgrade, but a rebuild… So the existing cluster is CentOS7 and the new cluster is on Rocky8. I use the same Ansible code to deploy new compute nodes in the existing cluster as I do in this rebuild cluster, but I do have switches for things that are OS dependent so that it picks the older version or the newer one.

There was nothing in the ~/.cache/sessions directory to delete.

We’re finding that the XFCE scripts we use do not work well on RHEL/8.

We can easily replicate the same core dumps on our RHEL/8 systems.

(xfsettingsd:279777): xfsettingsd-ERROR **: 15:27:23.629: Failed to connect to the dbus session bus.
/users/rbryant/ondemand/data/sys/dashboard/batch_connect/sys/RStudio/output/423acdec-8ec5-4a29-9973-cfae1ef07d61/script.sh: line 28: 279777 Trace/breakpoint trap   (core dumped) xfsettingsd --sm-client-disable
+ xfce4-panel --sm-client-disable

We haven’t migrated any of our applications to our new system, so we haven’t started working on this yet.

That said, we’ll continue to look into it on our side as we will need to migrate our apps to our new system anyhow, so we may as well do it now while other folks have the same issues.

So for now I should stick with CentOS7?

Yes. I’ll work on RHEL/8 to see what’s what and update this ticket if I have any fixes available.

But yes as of right now, I would say there’s nothing wrong with your system, these scripts just don’t work for RHEL/8 and we need to figure out what will.

Any idea of a timeline for getting this working? I’m hoping to migrate everything to RHEL8/Rocky8 this summer.

Soon. I’ve found that with XFCE 14, some things are daemons and some are not. Previously we had everything in a backgrounded block, () &. I’ve found that you have to background xfwm4 and xfce4-panel. I’ve also had to add a few sleep commands, just to be sure everything has time to do its business before the next command is issued.

Instead of using the () & block in your script.sh.erb, try this:

export SEND_256_COLORS_TO_REMOTE=1
export XDG_CONFIG_HOME="<%= session.staged_root.join("config") %>"
export XDG_DATA_HOME="<%= session.staged_root.join("share") %>"
export XDG_CACHE_HOME="$(mktemp -d)"
module restore
set -x
xfwm4 --sm-client-disable &
sleep 5
xsetroot -solid "#D3D3D3"
xfsettingsd --daemon --sm-client-disable
xfce4-panel --sm-client-disable &

sleep 5

# instead of booting firefox here, boot the program you want to start.
firefox
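The fixed sleep 5 calls above are a guess at how long startup takes. One possible alternative, sketched here and not tested against this cluster, is to poll until the window manager has registered itself. The helper names wait_until and wm_is_up are hypothetical, and the check assumes xprop is installed and that the window manager sets the EWMH root property _NET_SUPPORTING_WM_CHECK.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the timeout (in seconds) runs out. Returns 0 on success, 1 on timeout.
wait_until() {
  timeout_s=$1
  shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Assumed check: an EWMH-compliant window manager sets this root property,
# and xprop prints it as "window id # 0x...".
wm_is_up() {
  xprop -root _NET_SUPPORTING_WM_CHECK | grep -q 'window id'
}

# Usage inside script.sh.erb, in place of the first "sleep 5":
#   xfwm4 --sm-client-disable &
#   wait_until 30 wm_is_up || echo "window manager did not come up" >&2
```

This does not address the D-Bus errors in the logs; it only makes the startup ordering less dependent on timing.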

Still not working with those commands.

I’m going to set this particular cluster up as CentOS7 and then migrate later. This is not our main cluster and this particular one would be an easy transition. Our main cluster has a whole order of things for the 8 transition where other pieces of tech debt must come before and others must come after so that’s why I was asking about timing.

Hey @jeff.ohrstrom did this ever get fixed?

Hey @jeff.ohrstrom any ideas with this? I actually just upgraded to OOD 3 and I’m still having this issue.

Hey sorry I didn’t see the other message. We (OSC) still run EL/7 systems so we never updated.

Though there is another topic here

But basically there’s some work you need to do on the script so that the app boots right. That firefox hack I’ve got above is as far as I’ve gotten. We haven’t deployed any production updates yet so I’m a bit unaware beyond the updates I’ve provided here in this and the other thread I’ve linked.