I was just wondering if there’s any update on this.
We moved to OOD 1.8 but we still have this issue with LSF.
VNC is properly killed but websockify keeps running and the job doesn’t get killed.
After starting the bc_desktop app (columns are PPID, PID, command):
367818 115279 /lsf/10.1/linux3.10-glibc2.17-x86_64/etc/res -d /lsf/conf -m rkanc004is02 /home/maffiaa/.lsbatch/1607523224.412572
115279 115368 /bin/sh /home/maffiaa/.lsbatch/1607523224.412572
115368 115372 /bin/bash /home/maffiaa/.lsbatch/1607523224.412572.shell
1 115404 /opt/TurboVNC/bin/Xvnc :1 -desktop TurboVNC: rkanc003is01:1 (maffiaa) -auth /home/maffiaa/.Xauthority -geometry 800x600 -depth 24 -rfbwait 120000 -rfbauth vnc.passwd -x509cert /home/maffiaa/.vnc/x509_cert.pem -x509key /home/maffiaa/.vnc/x509_private.pem -rfbport 5901 -fp catalogue:/etc/X11/fontpath.d -deferupdate 1 -dridir /usr/lib64/dri -registrydir /usr/lib64/xorg -idletimeout 0
115372 115430 bash /home/maffiaa/ec-hub/data/sys/dashboard/batch_connect/dev/bc_desktop/output/03b5de19-a899-4540-98cb-3fa582bf9328/script.sh
115372 115446 /bin/bash /home/maffiaa/.lsbatch/1607523224.412572.shell
115446 115448 /bin/bash /home/maffiaa/.lsbatch/1607523224.412572.shell
115448 115449 tail -f --pid=115430 vnc.log
1 115450 /usr/bin/python /bin/websockify -D 23292 localhost:5901
1 115458 dbus-launch --autolaunch cf648096f92e4bd689dca505bfde2ea6 --binary-syntax --close-stderr
1 115459 /usr/bin/dbus-daemon --fork --print-pid 6 --print-address 8 --session
1 115461 /usr/lib64/xfce4/xfconf/xfconfd
115430 115467 xfce4-session
1 115470 /bin/dbus-launch --sh-syntax --exit-with-session xfce4-session
1 115471 /usr/bin/dbus-daemon --fork --print-pid 6 --print-address 8 --session
1 115476 /usr/lib64/xfce4/xfconf/xfconfd
115467 115478 xfwm4 --display :1.0 --sm-client-id 211e4a471-7c5f-48d7-87d4-6d25a172c7eb
115467 115480 xfce4-panel --display :1.0 --sm-client-id 24ed787fb-e753-4e0f-93d7-d0b725e01cca
1 115481 xfsettingsd --display :1.0 --sm-client-id 25e79dc77-b78a-480c-a879-8d02501b59dc
115467 115485 xfdesktop --display :1.0 --sm-client-id 2a23afc15-7a4f-4cbb-ba9b-bb3160c5c004
1 115492 /usr/libexec/gvfsd
115467 115497 abrt-applet
1 115500 /usr/libexec/gvfsd-fuse /home/maffiaa/.gvfs -f -o big_writes
115467 115502 nm-applet
1 115512 /usr/libexec/imsettings-daemon
115467 115520 /usr/bin/python /usr/share/system-config-printer/applet.py
115480 115530 /usr/lib64/xfce4/panel/wrapper-1.0 /usr/lib64/xfce4/panel/plugins/libsystray.so 6 14680094 systray Notification Area Area where notification icons appear
115480 115533 /usr/lib64/xfce4/panel/wrapper-1.0 /usr/lib64/xfce4/panel/plugins/libactions.so 2 14680095 actions Action Buttons Log out, lock or other system actions
1 115536 /usr/libexec/gvfs-udisks2-volume-monitor
1 115553 /usr/libexec/gvfs-mtp-volume-monitor
1 115585 /usr/libexec/gvfs-gphoto2-volume-monitor
1 115605 /usr/libexec/gvfs-afc-volume-monitor
1 115658 /usr/libexec/at-spi-bus-launcher
115658 115675 /usr/bin/dbus-daemon --config-file=/usr/share/defaults/at-spi2/accessibility.conf --nofork --print-address 3
1 115679 /usr/libexec/at-spi2-registryd --use-gnome-session
115492 115682 /usr/libexec/gvfsd-trash --spawner :1.11 /org/gtk/gvfs/exec_spaw/0
1 115711 /usr/libexec/gvfsd-metadata
115512 115800 /usr/bin/ibus-daemon -r --xim
1 115802 /usr/libexec/dconf-service
115800 115808 /usr/libexec/ibus-dconf
115800 115809 /usr/libexec/ibus-ui-gtk3
1 115811 /usr/libexec/ibus-x11 --kill-daemon
1 115815 /usr/libexec/ibus-portal
115800 115827 /usr/libexec/ibus-engine-simple
After pressing on “launch desktop” we have the exact same processes plus:
115450 119722 /usr/bin/python /bin/websockify -D 23292 localhost:5901
So that is a second websockify process, started as a child of the first one (PID 115450).
After logout:
1 115450 /usr/bin/python /bin/websockify -D 23292 localhost:5901
So the first websockify process is still alive after the logout.
After killing the job the process gets killed as well.
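The parent PID of 1 on the first websockify is consistent with the “-D” flag, which makes websockify run as a daemon, so it detaches from the script that started it. A minimal sketch for spotting such detached processes in a PPID PID COMMAND listing like the ones above (orphans_matching is a made-up helper name, not something OOD or LSF provides):

```bash
# Hypothetical helper: read a "PPID PID COMMAND..." listing on stdin and
# print the PID of every process whose parent is PID 1 and whose command
# line contains the given string.
orphans_matching() {
    awk -v needle="$1" '$1 == 1 && index($0, needle) { print $2 }'
}

# Possible usage on the compute node:
#   ps -u "$(id -un)" -o ppid=,pid=,args= | orphans_matching websockify
```

Run after logout, this should print only the PID of the first websockify if the situation is the same as described here.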
I have also added a “set -x” to the script to trace the commands that get executed. The “clean_up” function from the vnc template:
clean_up () {
  echo "Cleaning up..."
  [[ -e "/home/maffiaa/ec-hub/data/sys/dashboard/batch_connect/dev/bc_desktop/output/6948fa9c-ee05-47c5-b2ed-bb05aebbb2f8/clean.sh" ]] && source "/home/maffiaa/ec-hub/data/sys/dashboard/batch_connect/dev/bc_desktop/output/6948fa9c-ee05-47c5-b2ed-bb05aebbb2f8/clean.sh"
  vncserver -list | awk '/^:/{system("kill -0 "$2" 2>/dev/null || vncserver -kill "$1)}'
  [[ -n ${display} ]] && vncserver -kill :${display}
  [[ ${SCRIPT_PID} ]] && pkill -P ${SCRIPT_PID} || :
  pkill -P $$
  exit ${1:-0}
}
generates this output:
+ clean_up
+ echo 'Cleaning up...'
Cleaning up...
+ [[ -e /home/maffiaa/ec-hub/data/sys/dashboard/batch_connect/dev/bc_desktop/output/03b5de19-a899-4540-98cb-3fa582bf9328/clean.sh ]]
+ source /home/maffiaa/ec-hub/data/sys/dashboard/batch_connect/dev/bc_desktop/output/03b5de19-a899-4540-98cb-3fa582bf9328/clean.sh
+ vncserver -list
+ awk '/^:/{system("kill -0 "$2" 2>/dev/null || vncserver -kill "$1)}'
+ [[ -n 1 ]]
+ vncserver -kill :1
Killing Xvnc process ID 115404
Gdk-Message: 14:56:51.046: nm-applet: Fatal IO error 11 (Resource temporarily unavailable) on X server :1.0.
+ [[ -n 115430 ]]
+ pkill -P 115430
+ :
+ pkill -P 115372
+ exit 0
So here you can also see the PIDs of the processes that get killed in the cleanup phase:
1 115404 /opt/TurboVNC/bin/Xvnc :1 -desktop TurboVNC: rkanc003is01:1 (maffiaa) -auth /home/maffiaa/.Xauthority -geometry 800x600 -depth 24 -rfbwait 120000 -rfbauth vnc.passwd -x509cert /home/maffiaa/.vnc/x509_cert.pem -x509key /home/maffiaa/.vnc/x509_private.pem -rfbport 5901 -fp catalogue:/etc/X11/fontpath.d -deferupdate 1 -dridir /usr/lib64/dri -registrydir /usr/lib64/xorg -idletimeout 0
115372 115430 bash /home/maffiaa/ec-hub/data/sys/dashboard/batch_connect/dev/bc_desktop/output/03b5de19-a899-4540-98cb-3fa582bf9328/script.sh
115368 115372 /bin/bash /home/maffiaa/.lsbatch/1607523224.412572.shell
- TurboVNC has “1” as parent but gets killed directly by the cleanup (PID 115404)
- script.sh gets killed (PID 115430)
- script.sh’s parent, that is the LSF shell, gets killed (PID 115372)
- The second websockify (PID 119722) gets cleaned up, but I have no idea when.
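Note that “pkill -P <pid>” only signals direct children of <pid>, so anything that has detached from the job's process tree is out of its reach. A small sketch to list everything still reachable from a given PID through parent links, given a PPID PID COMMAND listing on stdin (descendants_of is a made-up helper name):

```bash
# Hypothetical helper: print, in numeric order, every PID in the listing
# whose chain of parents leads back to the given root PID.
descendants_of() {
    awk -v root="$1" '
        { parent[$2] = $1 }                 # remember each PID -> PPID link
        END {
            for (pid in parent) {
                p = pid
                # climb towards the root; stop at an unknown parent (e.g. PID 1)
                while ((p in parent) && p != root) p = parent[p]
                if (p == root && pid != root) print pid
            }
        }' | sort -n
}

# Possible usage on the compute node:
#   ps -e -o ppid=,pid=,args= | descendants_of 115372
```

With the listing from this post, the daemonized websockify would not show up under the LSF job shell, because its parent chain ends at PID 1.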
First websockify process:
1 115450 /usr/bin/python /bin/websockify -D 23292 localhost:5901
with “1” as parent stays until the job is killed.
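Because this process has PID 1 as its parent, neither “pkill -P ${SCRIPT_PID}” nor “pkill -P $$” in clean_up can match it. One possible workaround, as a sketch only: a helper that could be called from clean.sh with the websocket port. The name kill_orphaned_websockify is hypothetical and not part of the OOD template, and the match assumes the command line looks like the one above, with “-D <port>” somewhere after “websockify”.

```bash
# Hypothetical helper, NOT part of the OOD vnc template.
# Terminates this user's websockify processes that were reparented to PID 1
# and were started with "-D <port>" for the given port.
kill_orphaned_websockify() {
    local port=$1 ppid pid args
    ps -u "$(id -un)" -o ppid=,pid=,args= |
    while read -r ppid pid args; do
        [ "$ppid" = 1 ] || continue          # only daemons detached to PID 1
        case $args in
            *websockify*" -D ${port} "*) ;;  # assumed command-line shape
            *) continue ;;
        esac
        echo "Killing websockify daemon ${pid}"
        kill "$pid"
    done
}

# Possible usage, with the port taken from the listing above:
#   kill_orphaned_websockify 23292
```

Matching on the port keeps the helper from touching websockify daemons that belong to other sessions of the same user on the same node.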
I hope that’s enough info to tell whether the issue can be solved just by adding this LSF config:
LSB_RESOURCE_ENFORCE="cpu gpu memory"
LSF_PROCESS_TRACKING=Y
LSF_LINUX_CGROUP_ACCT=Y
Or do we need to do anything else?
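One way to check whether such settings have any effect would be to compare the cgroup of the job shell with the cgroup of the daemonized websockify. The sketch below only parses the /proc/<pid>/cgroup format of cgroup v1 (hierarchy-ID:controllers:path); cgroup_path is a made-up helper name, and which controller LSF actually uses for process tracking (freezer is assumed here) should be confirmed against the LSF documentation.

```bash
# Hypothetical helper: print the cgroup path of one controller from a
# cgroup v1 style file such as /proc/<pid>/cgroup.
#   $1 = controller name (e.g. freezer), $2 = file to read
cgroup_path() {
    awk -F: -v c="$1" '
        {
            n = split($2, ctl, ",")          # controllers can be comma-joined
            for (i = 1; i <= n; i++)
                if (ctl[i] == c) {
                    sub(/^[^:]*:[^:]*:/, "") # keep only the path part
                    print
                    exit
                }
        }' "$2"
}

# Possible usage, with the PIDs from the listing above:
#   cgroup_path freezer /proc/115372/cgroup   # LSF job shell
#   cgroup_path freezer /proc/115450/cgroup   # first websockify
```

If both commands print the same path, the daemon is still inside the job's cgroup even though its parent is PID 1.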