Timeout summary for interactive apps and the bc_desktop app

emdrago · October 7, 2025, 9:01pm

Summary of timeouts: do I understand this correctly?

For an interactive app based on the vnc template, there is a ‘websockify_timeout_seconds’ setting that can be raised to 20 seconds. This is an upper limit due to websockify internals, IIRC.

For basic and template, anecdotally, one can apply another timeout in the ../template/after.sh.erb:

if wait_until_port_used "${host}:${port}" 160; then

The 160 is an arbitrary local choice. This can be much longer, and is (I think) independent of the websockify timeout.

I don’t understand what timeout this setting actually affects, but we’ve seen it be effective.

So finally, the desktop app, which does use the vnc app, does not have an open structure with ‘after.sh(.erb)’. Is there another option to implement a timeout for the desktop app? Is there another app template that is now superseding bc_desktop?

travert · October 9, 2025, 6:43pm

There’s a few things here that are a bit off and some I’m not sure of myself.

I’m unsure if there is a hard 20 second ceiling for websockify itself. I couldn’t find anything online saying this either, but I could be mistaken. I’m just not sure if that’s true.

OOD itself has 3 types of apps: basic, vnc, and an external contributor created vnc_container which exists but is much less common currently.

wait_until_port_used is just making the bash script wait before it writes the connection.yml file for OOD, so if you have network latency or heavy load or anything to cause the job to take a bit to spin up this can help ensure the app waits before timing out.
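A minimal sketch of what such a wait looks like in plain bash, for anyone who wants to see the mechanism. `wait_for_port` is a hypothetical stand-in written for illustration, not OOD's own `wait_until_port_used` helper, and it assumes a bash build with `/dev/tcp` support. It checks by attempting a TCP connection, which may differ from how the real helper detects a used port.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for illustration only; OOD provides its own
# wait_until_port_used helper to the job script.
# Polls host:port once per second until it accepts a TCP connection
# or the timeout (in seconds) runs out.
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-30}"
  local i
  for ((i = 0; i < timeout; i++)); do
    # bash's /dev/tcp pseudo-device opens a TCP connection; the subshell
    # closes the descriptor again when it exits.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}
```

In an after.sh.erb this would be called the same way as the line quoted earlier in the thread, for example `if wait_for_port "${host}" "${port}" 160; then`. The third argument only bounds how long the script keeps polling before giving up.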

I’m not sure what you mean here. You can use the after.sh.erb with a vnc app. You also have the option of using the script_wrapper in the submit to issue the wait_until_port_used I think as well. It’s all about just finding the ways OOD lets you inject bash, which is those scripts and the script_override for the most part.

Finally, bc_desktop is not a template. The templates are only basic, vnc and vnc_container. And by template I mean the thing you set in your submit.yml for the app.

I do see how it could be confusing, given that the doc is titled Basic Batch Connect, but the key thing to notice in that file is the template:

batch_connect:
  template: "basic"  # this is one of the 3 template types OOD understands 
  ...

This is what the OOD team means when they talk about the “types of scientific apps” OOD can run.

Also, you do have the option of moving some of this up into a clusters.d file and putting the batch_connect stuff there to set it across the cluster as well, I was just trying to show a more modular approach.

emdrago · October 9, 2025, 8:52pm

Hi, Travis – Thanks for providing this guidance. Appreciate you.

Thanks particularly for the comments about ‘bash injection’ as the high-level perspective to take. For the bc_desktop, I was too narrow in my perspective by looking at /etc/ood/config/apps/bc_desktop, and forgetting that the working desktop app can certainly accept the ../template/before.sh.erb and ./template/script.sh.erb, so why not introduce the after.sh.erb there as well? Thanks again for reorienting me.

And a colleague was showing me this morning that the clusters.d definitions also allow for the bash injection – so I think it’s coming together now.

OOD itself has 3 types of apps: basic, vnc, and an external contributor created vnc_container which exists but is much less common currently.

Yeah, I guess I had previously heard that the vnc_container emerged from the community. Maybe in the Appverse affinity group, or some other such, I’ll encounter people who have developed to leverage that. Do you recall the motivation that led to ‘vnc_container’ being developed? At the highest level, it must be to better support containerized apps ( : I will dig into the github repo to follow-up on this exchange.

I’m unsure if there is a hard 20 second ceiling for websockify itself. I couldn’t find anything online saying this either, but I could be mistaken. I’m just not sure if that’s true.

Good point. I’ve been trying to find this reference again. I quoted this to the team at Case, since we’ve been struggling with network latency. What I recall reading was essentially “if you need more than 20 s for the port communication, there’s a deeper problem that shouldn’t be papered over with a longer timeout.”

True or not, I’ll admit that might be a bit beside the point from an operational point of view.

Take care

emdrago · October 9, 2025, 10:27pm

Technical review question. To implement the websockify timeout in submit.yml.erb, I’ve previously just done the following. Is this actually an effective approach?

batch_connect:
  template: "vnc"
  set_host: "host=$(hostname)"
  websockify_timeout_seconds: 20

I’ve not yet confronted the syntax for a cluster definition. Would the timeout go alongside the script_wrapper, or within it? Going by submit.yml.erb, I’m thinking it’s at the same level…

batch_connect:
  basic:
    websockify_timeout_seconds: 20
    script_wrapper: |
      module purge
      %s
    set_host: "host=$(hostname)"
  vnc:
    websockify_timeout_seconds: 20
    script_wrapper: "module load ondemand-vnc/2.0\n%s"
    set_host: "host=$(hostname)"

travert · October 10, 2025, 12:08am

For websockify_timeout_seconds, I’m a bit unsure where it goes. We don’t seem to call it out explicitly in the docs that I could find (but maybe it’s in there somewhere), yet in the source code I see it being used in the backend ood_core code here:

github.com/OSC/ood_core, lib/ood_core/batch_connect/templates/vnc.rb (05de3fa86)

          # Run the script under the VNC server's display
          def run_script
            %(DISPLAY=:${display} #{super})
          end
          
          # After startup the main script, scan the VNC server log file for
          # successful connections so that the password can be reset
          def after_script
            websockify_cmd = context.fetch(:websockify_cmd, "${WEBSOCKIFY_CMD:-/opt/websockify/run}").to_s
            websockify_hb = context.fetch(:websockify_heartbeat_seconds, "${WEBSOCKIFY_HEARTBEAT_SECONDS:-30}").to_s
            websockify_timeout_seconds = context.fetch(:websockify_timeout_seconds, '${WEBSOCKIFY_TIMEOUT_SECONDS:-10}').to_s
          
            <<-EOT.gsub(/^ {14}/, "")
              #{super}
          
              # launches websockify in the background; waiting until the process
              # has started proxying successfully.
              start_websockify() {
                local log_file="./websockify.log"
                # launch websockify in background and redirect all output to a file.
                #{websockify_cmd} $1 --heartbeat=#{websockify_hb} $2 &> $log_file &

This makes me think it should be there in the submit.yml to get picked up and help set up the environment for the job.
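The fallback string in that excerpt, `${WEBSOCKIFY_TIMEOUT_SECONDS:-10}`, is ordinary shell parameter expansion: use the variable if it is set and non-empty, otherwise use 10. The sketch below demonstrates only that expansion; `effective_timeout` is a hypothetical helper for illustration, and how the rest of the template consumes the value is not shown in the excerpt.

```shell
#!/usr/bin/env bash
# Demonstrates the ${VAR:-default} expansion used as the fallback in the
# ood_core excerpt. effective_timeout is a hypothetical helper, not OOD code.
effective_timeout() {
  echo "${WEBSOCKIFY_TIMEOUT_SECONDS:-10}"
}

unset WEBSOCKIFY_TIMEOUT_SECONDS
unset_result="$(effective_timeout)"   # variable unset: falls back to 10

WEBSOCKIFY_TIMEOUT_SECONDS=20
set_result="$(effective_timeout)"     # variable set: its value is used

echo "unset: ${unset_result}, set: ${set_result}"
```

So if nothing supplies websockify_timeout_seconds through the app's context, the job script would apparently still honor a WEBSOCKIFY_TIMEOUT_SECONDS environment variable before dropping to the default of 10.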

For the indentation questions, the doc examples are likely the best bet to see where to place the indentation.

For the cluster files, you can see an example here with the batch_connect stanza and even a script_wrapper for each type of app launched on that cluster, so you see a different command run whether it’s basic or vnc on that cluster:

But this only makes sense if you want it to run for every job submitted to that cluster. For things like setting where the websockify command is located, or ensuring modules are purged on job launch, this feature can be quite helpful.
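A rough sketch of how a script_wrapper behaves, assuming `%s` marks the spot where the generated script body is substituted, as the examples in this thread suggest. It imitates the substitution with `printf`; the `wrapper` and `script_body` values are made up for illustration and this is not how OOD itself performs the substitution.

```shell
#!/usr/bin/env bash
# Imitates a script_wrapper with printf: %s is replaced by the script body.
# Both values below are illustrative assumptions, not taken from a real app.
wrapper='module purge
%s'
script_body='echo "launching app"'

# The wrapper acts as the format string, so the body lands where %s sits.
wrapped="$(printf "${wrapper}\n" "${script_body}")"
echo "${wrapped}"
```

Anything placed before `%s` in the wrapper runs ahead of the app's own script and anything after it runs later, which is why a cluster-wide wrapper is a convenient place for module setup.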

I hope that helps and clears some things up, but let me know if you have any more questions or if any of that is unclear.

emdrago · October 10, 2025, 2:12am

This is great, Travis – very helpful. Will share this locally, and update in the coming days.
