I tried this on another machine (our OnDemand production server). It runs CentOS 7 with OnDemand version 2.0.32 and ruby 2.0.0p648 (2015-12-16) [x86_64-linux].
I built ruby 2.7.8p225 (2023-03-30 revision 1f4d455848) [x86_64-linux] and loaded it as a module:
module list
Currently Loaded Modulefiles:
ruby/2.7.8
After that I deployed the application:
bin/bundle config --local --path vendor/bundle
bin/setup
cd /var/www/ood/apps/sys/osc-systemstatus
The error is different on this server, but much clearer:
[ E 2023-09-15 10:10:31.1579 16281/T1o age/Cor/App/Implementation.cpp:221 ]: Could not spawn process for application /var/www/ood/apps/sys/osc-systemstatus: The application encountered the following error: Could not find ood_core-0.23.5 in any of the sources (Bundler::GemNotFound)
Error ID: 3fc5af15
Error details saved to: /tmp/passenger-error-Yurytl.html
When I check gems for ruby 2.7.8 I see it there:
gem list |grep ood
ood_core (0.23.5)
ood_support (0.0.5)
I would just remove the line rescue => e from the setup call and let the exception be unhandled. Then you should see the error in the browser for easier debugging.
After that I got an error that the file /usr/sbin/slurmd/sinfo cannot be found. I don’t know why it’s looking for sinfo there, but I made a link. After that I got my next error:
/opt/rh/ondemand/root/usr/share/ruby/vendor_ruby/phusion_passenger/utils.rb:113:in `block in create_thread_and_abort_on_exception': uninitialized constant SlurmSqueueClient::CommandFailed
I think the command is supposed to be
sinfo -a -h --Node --Format='nodehost,gres,statelong'
and it actually works if I run it from the console.
Is there any fix for
Could not spawn process for application /var/www/ood/apps/sys/osc-systemstatus: The application encountered the following error: Could not find ood_core-0.23.5 in any of the sources (Bundler::GemNotFound)
== Verify dependencies ==
bin/bundle check 1>/dev/null 2>&1 || bin/bundle install
Don’t run Bundler as root. Bundler can ask for sudo if it is needed, and
installing your bundle as root will break this application for all non-root
users on this machine.
Fetching source index from https://art/artifactory/sx-rubygems/
Fetching rake 13.0.6
Installing rake 13.0.6
Using bundler 2.1.4
Fetching ffi 1.15.5
Installing ffi 1.15.5 with native extensions
Fetching minitest 5.20.0
Installing minitest 5.20.0
Fetching ruby2_keywords 0.0.5
Installing ruby2_keywords 0.0.5
Fetching mocha 2.1.0
Installing mocha 2.1.0
Fetching mustermann 3.0.0
Installing mustermann 3.0.0
Fetching ood_support 0.0.5
Installing ood_support 0.0.5
Fetching rexml 3.2.6
Installing rexml 3.2.6
Fetching ood_core 0.23.5
Installing ood_core 0.23.5
Fetching rack 2.2.8
Installing rack 2.2.8
Fetching rack-protection 3.1.0
Installing rack-protection 3.1.0
Fetching tilt 2.3.0
Installing tilt 2.3.0
Fetching sinatra 3.1.0
Installing sinatra 3.1.0
Bundle complete! 6 Gemfile dependencies, 14 gems now installed.
Bundled gems are installed into ./vendor/bundle
== Restart App ==
touch tmp/restart.txt
cd -
touch tmp/restart.txt
Now I see
[ E 2023-09-15 13:37:46.9696 8483/T2r age/Cor/App/Implementation.cpp:221 ]: Could not spawn process for application /var/www/ood/apps/sys/osc-systemstatus: The application encountered the following error: Could not find ffi-1.15.5 in any of the sources (Bundler::GemNotFound)
Error ID: b167258e
Error details saved to: /tmp/passenger-error-trClhD.html
Fetching ffi 1.15.5
Installing ffi 1.15.5 with native extensions
When you installed the ffi gem - it built with native extensions. I.e., it compiled something that’s dynamically linked to a .so file. Try ldd on the files that it built (under vendor/bundle) to see if it’s correctly linked (run ldd on the OOD webserver itself, not on a login node).
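A minimal sketch of that ldd check, assuming a POSIX shell and that the gems were installed into vendor/bundle as in the bin/setup output above. The function name check_native_links is made up for illustration; the only behavior it relies on is that ldd prints "not found" next to any library it cannot resolve.

```shell
# Hypothetical helper: list native extensions whose shared libraries
# cannot be resolved on this host. Run it on the OOD webserver itself.
check_native_links() {
  bundle_dir="${1:-vendor/bundle}"
  if [ ! -d "$bundle_dir" ]; then
    echo "no such directory: $bundle_dir" >&2
    return 2
  fi
  # ldd prints "not found" beside every library it cannot resolve.
  missing=$(find "$bundle_dir" -type f -name '*.so' -exec sh -c '
    for so in "$@"; do
      ldd "$so" 2>/dev/null | grep "not found" | while IFS= read -r line; do
        printf "%s: %s\n" "$so" "$line"
      done
    done' sh {} +)
  if [ -n "$missing" ]; then
    printf '%s\n' "$missing"
    return 1
  fi
  echo "every .so under $bundle_dir resolves its libraries"
}
```

From the app directory this would be called as `check_native_links vendor/bundle`. A non-zero exit with lines mentioning something like libruby would point at the linking problem; note that a directory with no .so files at all also passes.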
Hmmmm. Even so, I’d rebuild the app (bin/setup) in OnDemand’s environment.
IIRC you need to source /opt/ood/enable (the path may be slightly different; I’m recalling that off the top of my head):
source /opt/ood/enable
bin/setup
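One way to see whether the bundle was built for the ruby that is currently active is to compare ruby's ABI version with the directory Bundler created. This is only a sketch: it assumes the usual layout for path-based installs, vendor/bundle/ruby/<ABI version>, and the function name bundle_abi_matches is hypothetical.

```shell
# Hypothetical helper: does vendor/bundle contain gems for the active ruby?
bundle_abi_matches() {
  ruby_bin="${1:-ruby}"
  bundle_dir="${2:-vendor/bundle}"
  # RbConfig reports the ABI version, e.g. 2.7.0 for ruby 2.7.8
  abi=$("$ruby_bin" -rrbconfig -e 'print RbConfig::CONFIG["ruby_version"]') || return 2
  if [ -d "$bundle_dir/ruby/$abi" ]; then
    echo "match: $bundle_dir/ruby/$abi exists"
  else
    echo "mismatch: this ruby expects $bundle_dir/ruby/$abi, found:"
    ls "$bundle_dir/ruby" 2>/dev/null
    return 1
  fi
}
```

Running `bundle_abi_matches` after sourcing OnDemand's environment, and again with the ruby module loaded, should show which of the two rubies the installed gems belong to.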
My guess is that the ruby module you built is almost like the ruby that ships on that OS. We had a similar issue with modules: you need to build the ruby module with --enable-shared, at least for EL operating systems.
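A rough sketch of such a build and a check of the result, assuming a standard ruby source tarball built with configure and make. The install prefix and both function names are hypothetical; --enable-shared is the configure flag mentioned above, and RbConfig::CONFIG["ENABLE_SHARED"] is where ruby records whether it was used.

```shell
# Hypothetical install prefix; adjust to wherever your module tree lives.
PREFIX="${PREFIX:-/opt/ruby/2.7.8}"

# Run from inside the unpacked ruby source tree.
build_shared_ruby() {
  ./configure --prefix="$PREFIX" --enable-shared &&
    make &&
    make install
}

# Succeeds only if the given ruby was built with --enable-shared.
ruby_is_shared() {
  ruby_bin="${1:-ruby}"
  shared=$("$ruby_bin" -rrbconfig -e 'print RbConfig::CONFIG["ENABLE_SHARED"]') || return 2
  [ "$shared" = "yes" ]
}
```

With the module loaded, `ruby_is_shared && echo shared || echo static` tells you whether the existing build needs to be redone.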
I have to say first off: this app is not well maintained for use outside of OSC. We just don’t keep it updated for other sites.
To debug, I’d ask you the following. If you issue commands, please do so on the webhost itself, or on the submit_host if you use one. The important bit in replicating the problem is replicating the environment of the web host as well: sinfo may work elsewhere, but what matters is that it works on the webhost itself.
What’s the output of this command? We run the same version of Slurm, and this is the command being issued when it fails.
sinfo -a -h -o="%C/%A/%D"
The cluster.d file you’re reading can actually specify a different bin directory, so also check:
/some/clusterd/defined/path/sinfo -a -h -o="%C/%A/%D"
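Both checks can be wrapped in one small helper so the exit status and any error text are captured together. This is a sketch, not something the app ships: try_sinfo is a made-up name, and the bin directory argument stands in for whatever path your cluster config defines. The sinfo arguments are copied from the command above.

```shell
# Hypothetical helper: run the app's sinfo call and report how it ended.
# $1 is an optional bin directory; with no argument, sinfo comes from PATH.
try_sinfo() {
  bin_dir="$1"
  sinfo_path="${bin_dir:+$bin_dir/}sinfo"
  if output=$("$sinfo_path" -a -h -o="%C/%A/%D" 2>&1); then
    echo "OK: $output"
  else
    rc=$?
    echo "FAILED (exit $rc): $output"
    return 1
  fi
}
```

Run it on the webhost as the same kind of user the app runs as, once as `try_sinfo` and once as `try_sinfo /some/clusterd/defined/path`, and compare the two results.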
Do you use submit_host to issue commands on another server? This application does not appear to respond to submit_host so it’s issuing these commands on the webserver itself - not the submit_host.
As a last resort, edit this line so that it rescues StandardError instead of CommandFailed, and please report the error message. Stack traces are nice, but really I’m only interested in the last few frames (say the top 3) and the error message. The error message will be very important here, as it should contain some hint as to what’s going on.