Shell app disconnects after ~20 seconds on OOD 4.0.8 / Rocky 9 - works fine on OOD 3.1.14 / Ubuntu

We have two OOD instances and the shell app behaves very differently on each:

Working system:

  • OOD 3.1.14 on Ubuntu 22.04

  • Node.js 18.20.8

  • Shell stays connected for several minutes

Broken system:

  • OOD 4.0.8 on Rocky Linux 9.6

  • Node.js 20.19.5

  • Shell disconnects after ~20 seconds regardless of activity

Both systems have shell app version 1.1.2. The WebSocket closes with code 1006 (abnormal closure) with no reason provided. This affects all users, and happens even with a laptop plugged directly into the same VLAN switch as the server.

What we’ve ruled out:

  • SELinux (disabled)

  • Firewall (firewalld not running, iptables policy ACCEPT)

  • Apache mod_reqtimeout (set to header=0 body=0)

  • mod_proxy_wstunnel (verified loaded)

  • Shell app environment variables:

    • OOD_SHELL_PING_PONG=true

    • OOD_SHELL_INACTIVE_TIMEOUT_MS=300000

  • Passenger settings (passenger_pool_idle_time, passenger_abort_websockets_on_process_shutdown)

  • TCP kernel settings (identical on both systems)

  • PUN nginx config (compared side by side, nearly identical)

  • Network path (tested with direct connection to switch)
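Since the PUN nginx configs are only "nearly identical", it may be worth listing exactly which lines differ. A minimal sketch, assuming both configs have been copied to one machine as text; the function name `config_diff` and the labels are hypothetical, and nothing here is OOD-specific:

```python
import difflib


def config_diff(working_text, broken_text):
    """Return only the lines that differ between two config texts.

    Comments and blank lines are dropped and indentation is ignored,
    so cosmetic differences do not show up in the result.
    """
    def normalize(text):
        lines = []
        for line in text.splitlines():
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                lines.append(stripped)
        return lines

    diff = difflib.unified_diff(
        normalize(working_text),
        normalize(broken_text),
        "working",
        "broken",
        n=0,          # no context lines, only the changes
        lineterm="",
    )
    # Keep the +/- lines, skip the "--- working" / "+++ broken" headers
    # and the "@@ ... @@" hunk markers.
    return [
        line for line in diff
        if line.startswith(("+", "-"))
        and not line.startswith(("+++ ", "--- "))
    ]
```

Lines prefixed with `-` exist only in the working config and lines prefixed with `+` only in the broken one. Any timeout-related directive in that output would be the first thing to look at.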

PUN error log shows:

Closed terminal: XXXXX code=1006 reason=

The app.js files have minor differences (V2 logs close code/reason, V1 doesn’t) but core logic is identical.

Any ideas what could cause this on OOD 4.0.8 / Rocky 9 / Node 20 that doesn’t happen on OOD 3.1.14 / Ubuntu / Node 18?

Hi and welcome!

Hmm, not entirely sure. As a spot check, I'd look at the env file that holds the configuration and make sure it's world-readable, i.e., that the app can actually read it when it boots up as a regular user. That goes for every directory in the path as well, not just the file.

Thanks Jeff!! Permissions are all world-readable:

-rw-r--r-- 1 root root 188 Feb 12 12:03 /etc/ood/config/apps/shell/env
drwxr-xr-x for all parent directories
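For anyone who wants to script the same check, here is a rough sketch that walks every component of a path and reports the ones a non-owner, non-group user could not use. The function name `unreadable_components` is made up for illustration, and it only looks at the classic mode bits, so ACLs and SELinux contexts are not considered:

```python
import stat
from pathlib import Path


def unreadable_components(path):
    """Return the components of `path` that 'other' users cannot use.

    Directories need the search (execute) bit for 'other'; the final
    file needs the read bit for 'other'.
    """
    target = Path(path).resolve()
    problems = []
    # Walk from the filesystem root down to the target itself.
    for component in [*reversed(target.parents), target]:
        mode = component.stat().st_mode
        needed = stat.S_IXOTH if stat.S_ISDIR(mode) else stat.S_IROTH
        if not mode & needed:
            problems.append(str(component))
    return problems
```

Calling it on `/etc/ood/config/apps/shell/env` should return an empty list if the listing above is accurate.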

The app does seem to boot: we see Connection established and Opened terminal in the logs, but then it immediately closes with code 1006:

App 3578969 output: Connection established
App 3578969 output: Opened terminal: 4099293
App 3578969 output: Closed terminal: 4099293 code=1006 reason=
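To see whether every session ends the same way, the log lines above can be tallied with a small script. This is a sketch that assumes the PUN log lines look exactly like the three shown here; the function name `summarize` is hypothetical. The descriptions of the close codes come from RFC 6455, where 1006 is reserved for the case in which the connection dropped without any close frame, so it is never sent by the other side:

```python
import re

# Meanings of a few WebSocket close codes as defined in RFC 6455.
CLOSE_CODES = {
    1000: "normal closure",
    1001: "going away",
    1006: "abnormal closure (connection lost, no close frame received)",
    1011: "internal server error",
}

OPENED_RE = re.compile(r"Opened terminal: (\d+)")
CLOSED_RE = re.compile(r"Closed terminal: (\d+) code=(\d+) reason=(.*)$")


def summarize(lines):
    """Pair 'Opened terminal' and 'Closed terminal' log lines.

    Returns a dict with the terminals that were never closed and, for
    each closed terminal, its (code, reason) tuple.
    """
    opened = set()
    closed = {}
    for raw in lines:
        line = raw.rstrip("\n")
        match = CLOSED_RE.search(line)
        if match:
            closed[match.group(1)] = (int(match.group(2)), match.group(3).strip())
            continue
        match = OPENED_RE.search(line)
        if match:
            opened.add(match.group(1))
    return {"still_open": sorted(opened - set(closed)), "closed": closed}
```

If every closed terminal reports 1006 with an empty reason, the socket is being cut underneath the shell app instead of being closed by it, which would point at something in front of the app or at the process itself going away.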

We have a working OOD 3.1.14 on Ubuntu with the same shell app version (1.1.2), and the key difference seems to be Node.js: v18.20.8 (working) vs v20.19.5 (broken on OOD 4.0.8/Rocky 9).

Could there be a Node 20 compatibility issue with the shell app or node-pty?

I’d also entertain any other path of investigation to be honest

Hmm, OK. So what's the behavior on the broken system again? Does it not work at all, does it just hang and then disconnect after 20 seconds, or does it work for ~20 seconds and then disconnect?