Osc-systemstatus app is not working

Hello,

I downloaded osc-systemstatus from GitHub (OSC/osc-systemstatus) and deployed it.

But when I try to run it in a browser I get:

Internal Server Error
We’re sorry, but something went wrong. If you are the application owner check the logs for more information.

Details:
/var/www/ood/apps/sys/osc-systemstatus/app.rb:115:in `block (2 levels) in '
/var/www/ood/apps/sys/osc-systemstatus/app.rb:115:in `map'
/var/www/ood/apps/sys/osc-systemstatus/app.rb:115:in `block in '
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1763:in `call'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1763:in `block in compile!'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1066:in `block (3 levels) in route!'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1084:in `route_eval'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1066:in `block (2 levels) in route!'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1115:in `block in process_route'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1113:in `catch'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1113:in `process_route'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1064:in `block in route!'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1061:in `each'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1061:in `route!'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1185:in `block in dispatch!'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1156:in `catch'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1156:in `invoke'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1180:in `dispatch!'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:996:in `block in call!'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1156:in `catch'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1156:in `invoke'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:996:in `call!'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:985:in `call'
/usr/share/gems/gems/rack-protection-3.1.0/lib/rack/protection/xss_header.rb:20:in `call'
/usr/share/gems/gems/rack-protection-3.1.0/lib/rack/protection/path_traversal.rb:18:in `call'
/usr/share/gems/gems/rack-protection-3.1.0/lib/rack/protection/json_csrf.rb:28:in `call'
/usr/share/gems/gems/rack-protection-3.1.0/lib/rack/protection/base.rb:53:in `call'
/usr/share/gems/gems/rack-protection-3.1.0/lib/rack/protection/base.rb:53:in `call'
/usr/share/gems/gems/rack-protection-3.1.0/lib/rack/protection/frame_options.rb:33:in `call'
/usr/share/gems/gems/rack-2.2.8/lib/rack/logger.rb:17:in `call'
/usr/share/gems/gems/rack-2.2.8/lib/rack/common_logger.rb:38:in `call'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:261:in `call'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:254:in `call'
/usr/share/gems/gems/rack-2.2.8/lib/rack/head.rb:12:in `call'
/usr/share/gems/gems/rack-2.2.8/lib/rack/method_override.rb:24:in `call'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:219:in `call'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:2074:in `call'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1633:in `block in call'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1849:in `synchronize'
/usr/share/gems/gems/sinatra-3.1.0/lib/sinatra/base.rb:1633:in `call'
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/rack/thread_handler_extension.rb:107:in `process_request'
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/request_handler/thread_handler.rb:149:in `accept_and_process_next_request'
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/request_handler/thread_handler.rb:110:in `main_loop'
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/request_handler.rb:419:in `block (3 levels) in start_threads'
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/utils.rb:113:in `block in create_thread_and_abort_on_exception'
undefined method `friendly_error_message' for nil:NilClass

My OnDemand version is 3.0.1, with ruby 3.0.4p208 (2022-04-12 revision 3fa771dded) [x86_64-linux].

Where can I find the app logs?

Any advice?
Thank you

What type of scheduler do you have? IIRC this app may only work for Slurm.

I have Slurm.

I have two cluster files, which look like this:


v2:
  metadata:
    title: "Cluster"
  login:
    host: "hostname"
  job:
    adapter: "slurm"
    # cluster: "viking"
    bin: "/usr/sbin/slurmd"
    conf: "/etc/slurm/slurm.conf"
    bin_overrides:
      sbatch: "/bin/sbatch"
      squeue: "/bin/squeue"
      scontrol: "/bin/scontrol"
      scancel: "/bin/scancel"

  batch_connect:
    basic:
      script_wrapper: |
        module purge
        %s
      set_host: "host=$(hostname -A | awk '{print $1}').domain.com"
    vnc:
      script_wrapper: |
        module purge
        export PATH="/opt/TurboVNC/bin:$PATH"
        export WEBSOCKIFY_CMD="/cluster/apps/websockify/run"
        %s
      set_host: "host=$(hostname -A | awk '{print $1}')"

And another one:


v2:
  metadata:
    title: "Phoenix"
  login:
    host: "hostname2"
  job:
    adapter: "slurm"
    bin: "/usr/sbin/slurmd"
    conf: "/etc/slurm/slurm_flamingo.conf"
    bin_overrides:
      sbatch: "/bin/sbatch"
      squeue: "/bin/squeue"
      scontrol: "/bin/scontrol"
      scancel: "/bin/scancel"

OK then it should work for you.

What’s the error message - you’ve given the stack trace but there should be an error message as well.

I tried on another machine (our OnDemand production server). It has CentOS 7, OnDemand version 2.0.32, and ruby 2.0.0p648 (2015-12-16) [x86_64-linux].

I built ruby 2.7.8p225 (2023-03-30 revision 1f4d455848) [x86_64-linux] and loaded it as a module:

module list
Currently Loaded Modulefiles:

  1. ruby/2.7.8

After that I deployed the application:

bin/bundle config --local --path vendor/bundle
bin/setup
cd /var/www/ood/apps/sys/osc-systemstatus

== Verify dependencies ==
bin/bundle check 1>/dev/null 2>&1 || bin/bundle install

== Restart App ==
touch tmp/restart.txt

cd -

The error is different on this server but much clearer:
[ E 2023-09-15 10:10:31.1579 16281/T1o age/Cor/App/Implementation.cpp:221 ]: Could not spawn process for application /var/www/ood/apps/sys/osc-systemstatus: The application encountered the following error: Could not find ood_core-0.23.5 in any of the sources (Bundler::GemNotFound)
Error ID: 3fc5af15
Error details saved to: /tmp/passenger-error-Yurytl.html

When I check gems for ruby 2.7.8 I see it there:
gem list |grep ood
ood_core (0.23.5)
ood_support (0.0.5)

I’m not sure why it cannot find it.

I found this issue:

And I followed the advice from:

efranz
Sep '20
The null object wasn’t added yet https://github.com/OSC/osc-systemstatus/blob/4ecde89261dca77b50473680174655a7b5677679/lib/slurm_squeue_client.rb#L186-L188 so instead, when an exception is raised in the setup method call, the return value is nil, and then we don’t compact the array to get rid of nil prior to calling friendly_error_message on each object.

I would just remove the line rescue => e from the setup call and let the exception be unhandled. Then you should see the error in the browser for easier debugging.
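For readers following along, the failure mode described in that quote can be reproduced in isolation. This is only a minimal sketch, not the app's code: `ClusterStatus`, `setup_swallowing_errors`, and `error_messages` are hypothetical names, and the `compact:` switch exists purely to show both behaviors.

```ruby
# Hypothetical stand-in for the per-cluster objects the app builds.
ClusterStatus = Struct.new(:name, :friendly_error_message)

# Mimics a setup call that swallows its exception. The rescue body ends
# with `warn`, which returns nil, so the method returns nil on failure.
def setup_swallowing_errors(name)
  raise IOError, "sinfo failed for #{name}" if name == "broken"
  ClusterStatus.new(name, nil)
rescue => e
  warn "setup failed: #{e.message}"
end

# Collects the error message of every cluster. Without the compact step,
# a nil left behind by a failed setup raises NoMethodError here.
def error_messages(names, compact: true)
  statuses = names.map { |n| setup_swallowing_errors(n) }
  statuses = statuses.compact if compact
  statuses.map(&:friendly_error_message)
end

begin
  error_messages(%w[ok broken], compact: false)
rescue NoMethodError => e
  puts "without compact: #{e.message}"
end

puts "with compact: #{error_messages(%w[ok broken]).inspect}"
```

Removing the rescue, as suggested above, makes the original exception surface instead of the misleading NoMethodError on nil.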

After that I got an error that /usr/sbin/slurmd/sinfo file cannot be found. I don’t know why it’s trying to look for sinfo there but I made a link. And after that I got my next error:

/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/utils.rb:113:in `block in create_thread_and_abort_on_exception'
uninitialized constant SlurmSqueueClient::CommandFailed

I think the command is supposed to be
sinfo -a -h --Node --Format='nodehost,gres,statelong'
and it actually works if I run it from the console.
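A likely explanation for the /usr/sbin/slurmd/sinfo lookup mentioned above: the cluster file sets bin to "/usr/sbin/slurmd", and bin is presumably treated as a directory that the command name is appended to unless bin_overrides names the command. The sketch below only illustrates that assumption; `resolve_command` is a hypothetical helper, not the app's or ood_core's actual API.

```ruby
require "pathname"

# Hypothetical helper: pick the override for a command if one exists,
# otherwise treat `bin` as a directory and append the command name.
def resolve_command(cmd, bin:, bin_overrides: {})
  override = bin_overrides[cmd]
  return override if override

  Pathname.new(bin.to_s).join(cmd).to_s
end

# Values taken from the cluster file above; sinfo has no override there.
overrides = { "sbatch" => "/bin/sbatch", "squeue" => "/bin/squeue" }

puts resolve_command("sinfo", bin: "/usr/sbin/slurmd", bin_overrides: overrides)
puts resolve_command("sinfo", bin: "/bin", bin_overrides: overrides)
puts resolve_command("sbatch", bin: "/usr/sbin/slurmd", bin_overrides: overrides)
```

If that assumption holds, pointing bin at the directory that contains the Slurm client commands, or adding a sinfo entry to bin_overrides, would avoid the need for a symlink.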

What version of Slurm are you running? There appears to be a bug around that CommandFailed class.

I’m using slurm version 22.05.8.

Is there any fix for this error on my older OnDemand server?

Could not spawn process for application /var/www/ood/apps/sys/osc-systemstatus: The application encountered the following error: Could not find ood_core-0.23.5 in any of the sources (Bundler::GemNotFound)

Should be just path not --path.

bin/bundle config --local path vendor/bundle
bin/setup

bin/bundle config --local path vendor/bundle

bin/setup

cd /var/www/ood/apps/sys/osc-systemstatus

== Verify dependencies ==
bin/bundle check 1>/dev/null 2>&1 || bin/bundle install
Don’t run Bundler as root. Bundler can ask for sudo if it is needed, and
installing your bundle as root will break this application for all non-root
users on this machine.
Fetching source index from https://art/artifactory/sx-rubygems/
Fetching rake 13.0.6
Installing rake 13.0.6
Using bundler 2.1.4
Fetching ffi 1.15.5
Installing ffi 1.15.5 with native extensions
Fetching minitest 5.20.0
Installing minitest 5.20.0
Fetching ruby2_keywords 0.0.5
Installing ruby2_keywords 0.0.5
Fetching mocha 2.1.0
Installing mocha 2.1.0
Fetching mustermann 3.0.0
Installing mustermann 3.0.0
Fetching ood_support 0.0.5
Installing ood_support 0.0.5
Fetching rexml 3.2.6
Installing rexml 3.2.6
Fetching ood_core 0.23.5
Installing ood_core 0.23.5
Fetching rack 2.2.8
Installing rack 2.2.8
Fetching rack-protection 3.1.0
Installing rack-protection 3.1.0
Fetching tilt 2.3.0
Installing tilt 2.3.0
Fetching sinatra 3.1.0
Installing sinatra 3.1.0
Bundle complete! 6 Gemfile dependencies, 14 gems now installed.
Bundled gems are installed into ./vendor/bundle

== Restart App ==
touch tmp/restart.txt

cd -

touch tmp/restart.txt

Now I see
[ E 2023-09-15 13:37:46.9696 8483/T2r age/Cor/App/Implementation.cpp:221 ]: Could not spawn process for application /var/www/ood/apps/sys/osc-systemstatus: The application encountered the following error: Could not find ffi-1.15.5 in any of the sources (Bundler::GemNotFound)
Error ID: b167258e
Error details saved to: /tmp/passenger-error-trClhD.html

But gem is installed:
ffi (1.15.5)

Fetching ffi 1.15.5
Installing ffi 1.15.5 with native extensions

When you installed the ffi gem - it built with native extensions. I.e., it compiled something that’s dynamically linked to a .so file. Try ldd on the files that it built (under vendor/bundle) to see if it’s correctly linked (run ldd on the OOD webserver itself, not on a login node).

It doesn’t look broken:

ldd vendor/bundle/ruby/2.7.0/gems/ffi-1.15.5/lib/ffi_c.so
linux-vdso.so.1 => (0x00007ffc280d6000)
libm.so.6 => /lib64/libm.so.6 (0x00007fec456fd000)
libc.so.6 => /lib64/libc.so.6 (0x00007fec4532f000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fec45113000)
/lib64/ld-linux-x86-64.so.2 (0x00007fec45c22000)

Hmmmm. Even so, I’d rebuild the app (bin/setup) in OnDemand’s environment.

IIRC you need to source /opt/ood/enable (the path may be slightly different - I’m recalling that off the top of my head).

source /opt/ood/enable
bin/setup

My guess is - the ruby module you build is almost like the ruby that ships on that OS. We had a similar issue with modules. You need to build the ruby module with --enable-shared - at least for EL operating systems.

I did:

source /opt/ood/ondemand/enable
bin/bundle config --local path vendor/bundle
bin/setup

without rebuilding ruby. Now it can find all the gems, but it gives me the same error as on my OnDemand 3.0 server:


/var/www/ood/apps/sys/osc-systemstatus/lib/slurm_squeue_client.rb:85:in `sinfo'
/var/www/ood/apps/sys/osc-systemstatus/lib/slurm_squeue_client.rb:176:in `cluster_info'
/var/www/ood/apps/sys/osc-systemstatus/lib/slurm_squeue_client.rb:200:in `setup'
/var/www/ood/apps/sys/osc-systemstatus/app.rb:105:in `block (2 levels) in '
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/ood_core-0.23.5/lib/ood_core/clusters.rb:123:in `each'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/ood_core-0.23.5/lib/ood_core/clusters.rb:123:in `each'
/var/www/ood/apps/sys/osc-systemstatus/app.rb:101:in `map'
/var/www/ood/apps/sys/osc-systemstatus/app.rb:101:in `block in '
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1763:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1763:in `block in compile!'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1066:in `block (3 levels) in route!'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1084:in `route_eval'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1066:in `block (2 levels) in route!'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1115:in `block in process_route'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1113:in `catch'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1113:in `process_route'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1064:in `block in route!'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1061:in `each'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1061:in `route!'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1185:in `block in dispatch!'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1156:in `catch'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1156:in `invoke'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1180:in `dispatch!'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:996:in `block in call!'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1156:in `catch'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1156:in `invoke'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:996:in `call!'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:985:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/rack-protection-3.1.0/lib/rack/protection/xss_header.rb:20:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/rack-protection-3.1.0/lib/rack/protection/path_traversal.rb:18:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/rack-protection-3.1.0/lib/rack/protection/json_csrf.rb:28:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/rack-protection-3.1.0/lib/rack/protection/base.rb:53:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/rack-protection-3.1.0/lib/rack/protection/base.rb:53:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/rack-protection-3.1.0/lib/rack/protection/frame_options.rb:33:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/rack-2.2.8/lib/rack/logger.rb:17:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/rack-2.2.8/lib/rack/common_logger.rb:38:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:261:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:254:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/rack-2.2.8/lib/rack/head.rb:12:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/rack-2.2.8/lib/rack/method_override.rb:24:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:219:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:2074:in `call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1633:in `block in call'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1849:in `synchronize'
/var/www/ood/apps/sys/osc-systemstatus/vendor/bundle/ruby/2.7.0/gems/sinatra-3.1.0/lib/sinatra/base.rb:1633:in `call'
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/rack/thread_handler_extension.rb:107:in `process_request'
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/request_handler/thread_handler.rb:149:in `accept_and_process_next_request'
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/request_handler/thread_handler.rb:110:in `main_loop'
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/request_handler.rb:419:in `block (3 levels) in start_threads'
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/utils.rb:113:in `block in create_thread_and_abort_on_exception'
uninitialized constant SlurmSqueueClient::CommandFailed

You mentioned a bug around the CommandFailed class in SlurmSqueueClient. Is there an open bug report? Is there any workaround?

I have to say first off - this app is not well maintained to be used outside of OSC. We just don’t keep it updated for other sites.

To debug I’d ask you these things. If you issue commands, please do so on the webhost itself, or on the submit_host if you use one. The important bit is replicating the environment of the web host as well: sinfo may work elsewhere, but what matters is that it works on the webhost itself.

  1. What’s the output of this command? We run the same version of Slurm and this is the command we’re issuing when it fails.
sinfo -a -h -o="%C/%A/%D"
  2. The cluster.d file you’re reading can actually specify a different bin directory, so check
/some/clusterd/defined/path/sinfo -a -h -o="%C/%A/%D"
  3. Do you use submit_host to issue commands on another server? This application does not appear to respond to submit_host, so it’s issuing these commands on the webserver itself - not the submit_host.

  4. As a last resort, edit this line so that it’s StandardError instead of CommandFailed and please report the error message. Stack traces are nice - but really I’m only interested in the last few frames (say the top 3) and the error message. The error message will be very important here as it should contain some hint as to what’s going on.
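Why swapping in StandardError helps: Ruby evaluates the class named in a rescue clause only after an exception has actually been raised, so a rescue that names an undefined constant replaces the real error with a NameError. A minimal sketch follows; `DemoClient` and its methods are hypothetical and only mimic the shape of the problem.

```ruby
# Hypothetical client class that, like the one in the trace, does not
# define a CommandFailed constant.
class DemoClient
  # Stand-in for whatever really goes wrong when the command is run.
  def run
    raise IOError, "real failure from sinfo"
  end

  # The rescue names an undefined constant. It is only looked up once
  # `run` raises, and the lookup itself then fails with NameError.
  def masked
    run
  rescue DemoClient::CommandFailed => e
    "handled: #{e.message}"
  end

  # Rescuing StandardError lets the real error and message through.
  def visible
    run
  rescue StandardError => e
    "handled: #{e.class}: #{e.message}"
  end
end

client = DemoClient.new

begin
  client.masked
rescue NameError => e
  puts "masked:  #{e.message}"
end

puts "visible: #{client.visible}"
```

In the sketch, the IOError message is what you actually want to read, and it only appears in the second case.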

I ran the commands on my OnDemand webhost console:

sinfo -a -h -o="%C/%A/%D"
=15193/9501/528/25222/370/133/523

I have two slurm clusters configured.
First:
job:
  adapter: "slurm"
  # cluster: "viking"
  bin: "/bin"
  conf: "/cluster/scheduler/current/etc/slurm.conf"
  bin_overrides:
    sbatch: "/bin/sbatch"
    squeue: "/bin/squeue"
    scontrol: "/bin/scontrol"
    scancel: "/bin/scancel"
    sinfo: "/bin/sinfo"

Second:
job:
  adapter: "slurm"
  bin: "/bin"
  conf: "/cluster/scheduler/current/etc_phx/slurm.conf"
  bin_overrides:
    sbatch: "/bin/sbatch"
    squeue: "/bin/squeue"
    scontrol: "/bin/scontrol"
    scancel: "/bin/scancel"
    sinfo: "/bin/sinfo"

My command with the full path works as well.
/bin/sinfo -a -h -o="%C/%A/%D"
=15124/9570/528/25222/369/134/523
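For anyone reading along, the seven numbers can be split into named fields. In sinfo's format specifiers, %C is CPUs as allocated/idle/other/total, %A is nodes as allocated/idle, and %D is the total node count; the leading "=" is presumably just part of the format string because of the -o= form. The sketch below is illustrative only: the field names and `parse_sinfo_summary` are hypothetical, not the app's parser.

```ruby
# Illustrative field names for the "%C/%A/%D" layout:
# %C = CPUs allocated/idle/other/total, %A = nodes allocated/idle, %D = nodes.
FIELDS = %i[cpus_allocated cpus_idle cpus_other cpus_total
            nodes_allocated nodes_idle nodes_total].freeze

# Hypothetical parser: strips the leading "=", splits on "/", and maps
# the integers onto the field names above.
def parse_sinfo_summary(line)
  values = line.strip.delete_prefix("=").split("/").map { |v| Integer(v) }
  unless values.size == FIELDS.size
    raise ArgumentError, "expected #{FIELDS.size} fields, got #{values.size}"
  end

  FIELDS.zip(values).to_h
end

summary = parse_sinfo_summary("=15193/9501/528/25222/370/133/523")
summary.each { |field, value| puts "#{field}: #{value}" }
```

As a sanity check on that reading, the first three CPU figures in the earlier output (15193 + 9501 + 528) add up to the fourth (25222).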

I do not use submit_host.

This is my standard error:


/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/request_handler/thread_handler.rb:110:in `main_loop'
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/request_handler.rb:419:in `block (3 levels) in start_threads'
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/utils.rb:113:in `block in create_thread_and_abort_on_exception'
undefined method `friendly_error_message' for nil:NilClass

Does it require grafana to be set up with ondemand instance?

I figured it out. I needed cluster names in the yml files because I have multiple clusters; with those added, it works fine.
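If the missing piece was the cluster key under job (it was commented out in the cluster files earlier in this thread, so that is only an assumption), a quick check along these lines can flag Slurm cluster files that lack it. `missing_cluster_name?` and the sample documents are hypothetical; adapt the key path to your own files.

```ruby
require "yaml"

# Hypothetical check: true when a cluster file uses the slurm adapter
# but has no `cluster` name under v2.job.
def missing_cluster_name?(yaml_text)
  job = YAML.safe_load(yaml_text).dig("v2", "job") || {}
  job["adapter"] == "slurm" && job["cluster"].to_s.empty?
end

# Sample documents modelled on the cluster files in this thread.
with_name = <<~CONFIG
  v2:
    job:
      adapter: "slurm"
      cluster: "viking"
      bin: "/bin"
CONFIG

without_name = <<~CONFIG
  v2:
    job:
      adapter: "slurm"
      # cluster: "viking"
      bin: "/bin"
CONFIG

puts "with name missing?    #{missing_cluster_name?(with_name)}"
puts "without name missing? #{missing_cluster_name?(without_name)}"
```

To run it against real files you could read each one with File.read and pass the text in; the directory they live in depends on your install.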

Thank you for all your help

I noticed that planned nodes are counted as free; I think they should be counted as active.
