OOD shell inactive timeout

OnDemand version: 4.0.6

[fangping@ondemand shell]$ pwd
/etc/ood/config/apps/shell
[fangping@ondemand shell]$ cat env

/etc/ood/config/apps/shell/env

OOD_SHELL_INACTIVE_TIMEOUT_MS=600000

The connection to the shell session is terminated in under a minute. How can I increase the OOD shell inactive timeout?

You have to enable ping pongs by setting the OOD_SHELL_PING_PONG environment variable. Otherwise, they’re disabled entirely.

We started to have this same issue recently. We have the timeouts and the ping pongs enabled, but it is still timing out rather quickly.

@DartCleek can you share your configurations? The timeout setting is in milliseconds not seconds - just an off the top guess as to what your issue may be.
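Since the values are in milliseconds, a quick conversion makes it easy to sanity-check a setting. This is only a rough sketch in Python; `ms_to_human` is a made-up helper name for illustration, not anything OnDemand ships.

```python
def ms_to_human(ms: int) -> str:
    """Render a millisecond count as hours, minutes and seconds."""
    seconds = ms // 1000
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h {minutes}m {secs}s"


# 600000 is the value from the original post in this thread.
for value in (600000, 3600000, 36000000):
    print(value, "->", ms_to_human(value))
```

So the 600000 in the original post works out to 10 minutes, which is already much longer than the sub-minute disconnects being reported.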

Hey Jeff,

This did work for a period of time but we are not sure exactly when it broke.

We originally had the file in /etc/ood/config/apps/shell
We have temporarily renamed that directory from shell to deleteme_shell, which is the path you see below.

[ ~]$ cat /etc/ood/config/apps/deleteme_shell/env
OOD_SSHHOST_ALLOWLIST="redacted"
OOD_DEFAULT_SSHHOST="redacted"
OOD_SSH_WRAPPER=/var/www/ood/apps/sys/shell/ssh-kerb

OOD_SHELL_INACTIVE_TIMEOUT_MS=3600000 # 1 hour
OOD_SHELL_MAX_DURATION_MS=36000000 # 10 hours
OOD_SHELL_PING_PONG=true

We currently have the shell app at /var/www/ood/apps/sys/shell, and its environment settings are stored in a .env.local file there, not an env. Should we put those environment variables into an env file?

[rci8@]$ cat .env.local
OOD_SSHHOST_ALLOWLIST="\redacted"
OOD_DEFAULT_SSHHOST="redacted"
OOD_SSH_WRAPPER=/var/www/ood/apps/sys/shell/ssh-kerb

OOD_SHELL_INACTIVE_TIMEOUT_MS=3600000 # 1 hour
OOD_SHELL_MAX_DURATION_MS=36000000 # 10 hours
OOD_SHELL_PING_PONG=true

We had things working originally. At some point in the last month or two it started timing out very quickly.
Initially the env file was in /etc/ood/config/apps/shell.
We recently moved each app into its own git repository. That change broke some things; we have since fixed parts of it.
After the change, we experimented and put the env file back in different places, but we didn’t notice the change that caused the timeout.

The OOD documentation directs us to use /etc, but for some reason we’re using /var. We’re not sure what the difference between the two is; we have all of our interactive apps under /var/www/ood/apps/sys/shell.

We’re not sure which location is authoritative for customizations. I wonder if the environment settings needed by the dashboard, shell, job composer, desktop, etc., should be placed in an env file under /etc/ood/config/apps/shell (system configuration), rather than inside /var/www/ood/apps/sys/shell/.env.local (app-local file).

The correct path is /etc/ood/config/apps/shell/env. You seem to have a deleteme_shell folder, which is incorrect.

Do not use .env.local in the /var/www path as this will be overwritten when you update OOD. Same goes for the shell wrapper - that will be deleted when you upgrade as well.

Thanks Jeff,

One of my colleagues did that because all of our apps are in the /var/www directory and we had moved them into git. We had renamed the shell directory to deleteme_shell in case we needed anything from it. Thank you! I’ll bring this back to my team.

Edited to add: we tried this again just to make sure, and we still time out within about 1 minute to 90 seconds.

If you have the correct settings in /etc/ood/config/apps/shell/env, then I would look at the permissions of the file and ensure they’re readable by regular users.
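One way to check that is sketched below. It is a simplification that only looks at the "other" permission bits on the file and on each directory above it, so it ignores group membership, ACLs and SELinux. The helper names are invented for this example; the path is the one from this thread.

```python
import os
import stat
from pathlib import Path


def other_can_read(mode: int) -> bool:
    """True if the 'other' read bit is set in an st_mode value."""
    return bool(mode & stat.S_IROTH)


def other_can_traverse(mode: int) -> bool:
    """True if the 'other' search (execute) bit is set on a directory mode."""
    return bool(mode & stat.S_IXOTH)


def readability_problems(path: str) -> list[str]:
    """List reasons an arbitrary user might be unable to read `path`."""
    target = Path(path).resolve()
    try:
        file_mode = os.stat(target).st_mode
    except OSError as err:
        return [f"cannot stat {target}: {err}"]
    problems = []
    if not other_can_read(file_mode):
        problems.append(f"{target} is not readable by others")
    # Every directory above the file needs the search (x) bit as well.
    for parent in target.parents:
        try:
            if not other_can_traverse(os.stat(parent).st_mode):
                problems.append(f"{parent} is not searchable by others")
        except OSError as err:
            problems.append(f"cannot stat {parent}: {err}")
    return problems


# Path taken from this thread; adjust if your install differs.
for problem in readability_problems("/etc/ood/config/apps/shell/env"):
    print(problem)
```

No output means the file and every parent directory passed this particular check; it does not prove the app actually loaded the file.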

We did that; the file is readable by all. This worked for a while, but we’re not sure exactly when it broke. It was sometime within the last 4-6 weeks.

OK so if permissions are correct then I guess I’d have to confirm the contents of the file.

Also note that the login server could be kicking you off. As in, the ssh settings on the login server itself preclude you from keeping the connection open.

Thanks. Wouldn’t that also happen when we ssh directly into the server without using OOD?
I normally stay logged in over ssh for several hours and do not get kicked off, so I don’t think it’s the login server. We have an idea, though: we recently moved a lot of our apps into git repositories (but not the shell), so we’re going to do a rollback. Could this be a version issue with OOD 4.0.6?

Yes, but I’m just tossing out guesses.

I don’t think so, I can’t replicate on our systems.

There’s something obvious that we’re missing. Something like it being a Windows-formatted file that isn’t read correctly. Are there any errors in /var/log/ondemand-nginx/$USER/error.log when it boots up? Or maybe it’s as simple as a typo in the environment variable names that we’re just missing.
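To rule those out mechanically, something like the following could be run against the file. It is a hedged sketch, not an official OnDemand tool: `lint_env` and `KNOWN_NAMES` are invented for this example, and the name list contains only the variables mentioned in this thread, so a legitimate variable that is not on it would also be flagged.

```python
# Variable names as they appear in this thread (an assumption, not a full list).
KNOWN_NAMES = {
    "OOD_SSHHOST_ALLOWLIST",
    "OOD_DEFAULT_SSHHOST",
    "OOD_SSH_WRAPPER",
    "OOD_SHELL_INACTIVE_TIMEOUT_MS",
    "OOD_SHELL_MAX_DURATION_MS",
    "OOD_SHELL_PING_PONG",
}


def lint_env(raw: bytes) -> list[str]:
    """Flag CRLF line endings, curly quotes and unrecognized variable names."""
    findings = []
    if b"\r\n" in raw:
        findings.append("file has Windows (CRLF) line endings")
    text = raw.decode("utf-8", errors="replace")
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or full-line comment
        if "\u201c" in line or "\u201d" in line:
            findings.append(f"line {number}: curly quotes instead of straight quotes")
        name, sep, _value = stripped.partition("=")
        if not sep:
            findings.append(f"line {number}: no '=' found")
            continue
        if name.strip() not in KNOWN_NAMES:
            findings.append(f"line {number}: unrecognized name {name.strip()!r}")
    return findings


# Example input with CRLF endings and a misspelled variable name.
sample = b"OOD_SHELL_PING_PONG=true\r\nOOD_SHELL_INACTIVE_TIMOUT_MS=3600000\r\n"
for finding in lint_env(sample):
    print(finding)
```

To check the real file, read it in binary mode so the line endings survive, e.g. `lint_env(open("/etc/ood/config/apps/shell/env", "rb").read())`.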

I was able to view the logs. There is a lot of noise in them, but nothing stands out to me.

Can you share the contents of this file /etc/ood/config/apps/shell/env?

Sorry, Jeff, I just noticed this
cat /etc/ood/config/apps/shell/env
OOD_SSHHOST_ALLOWLIST="\w+.redacted.edu:\w+.hpcc.redacted"
OOD_DEFAULT_SSHHOST="servername.redacted.edu"
OOD_SSH_WRAPPER=/usr/bin/ssh-kerb

OOD_SHELL_INACTIVE_TIMEOUT_MS=3600000 # 1 hour
OOD_SHELL_MAX_DURATION_MS=36000000 # 10 hours
OOD_SHELL_PING_PONG=true