Hey All,
I am working on finalizing our OnDemand instance for our Slurm-based HPC cluster. Right now, every time I try to get a job or interactive app working, I receive the following error:
sbatch: error: slurm_persist_conn_open_without_init: failed to open persistent connection to host:localhost:6819: Permission denied
sbatch: error: Sending PersistInit msg: Permission denied
sbatch: error: Sending PersistInit msg: Permission denied
sbatch: error: DBD_GET_CLUSTERS failure: Permission denied
sbatch: error: Problem talking to database
sbatch: error: There is a problem talking to the database: Permission denied. Only local cluster communication is available, remove --cluster from your command line or contact your admin to resolve the problem.
I have everything configured based on the documentation. For example, here is the /etc/ood/config/clusters.d/link.yml:
    v2:
      metadata:
        title: "Link (Bowser v2.0)"
      login:
        host: "link.phys.wvu.edu"
        default: true
      job:
        adapter: "slurm"
        cluster: "Korok"
        bin: "/usr/bin"
        conf: "/etc/slurm/slurm.conf"
      batch_connect:
        basic:
          script_wrapper: |
            module purge
            %s
          set_host: "host=$(hostname -A | awk '{print $1}')"
        vnc:
          script_wrapper: |
            module purge
            export PATH="/opt/TurboVNC/bin/:$PATH"
            export WEBSOCKIFY_CMD="/opt/websockify-0.10.0/run"
            %s
          set_host: "host=$(hostname -A | awk '{print $1}')"
I have confirmed that the reverse proxy is set up correctly and works fine (starting nc -l 5432 on node05.korok allows a connection when going to http://link.phys.wvu.edu/node/node05.korok/5432).
sbatch is working just fine on the system for users, so it's not a Slurm issue per se.
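For comparison, here is a rough sketch of how the two cases could be checked from a shell on the web host. It assumes that OnDemand's Slurm adapter adds -M <cluster> to sbatch when cluster: is set in the cluster config, which would match the --cluster hint in the error above. The sbatch_args helper is hypothetical and only builds the command line; --test-only is used so that no job is actually queued.

```shell
#!/usr/bin/env bash
# Sketch: compare a plain sbatch call with one that names the cluster.
# Assumption: OnDemand passes -M <cluster> when `cluster:` is set in link.yml.

CLUSTER="Korok"   # value of job.cluster in link.yml

# Hypothetical helper: print the sbatch command line,
# with -M <cluster> only when a cluster name is given.
sbatch_args() {
  local cluster="$1"
  if [ -n "$cluster" ]; then
    printf '%s\n' "sbatch -M $cluster --test-only --wrap=hostname"
  else
    printf '%s\n' "sbatch --test-only --wrap=hostname"
  fi
}

# Only run the commands on a host where sbatch is installed.
if command -v sbatch >/dev/null 2>&1; then
  for c in "" "$CLUSTER"; do
    cmd="$(sbatch_args "$c")"
    echo "+ $cmd"
    # Word splitting is intentional here: no argument contains spaces.
    $cmd || echo "exit status: $?"
  done
fi
```

If only the -M variant fails with the same "Permission denied" messages, that would suggest the problem is in reaching the accounting database on port 6819 rather than in the OnDemand configuration itself.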
Any ideas on what might be going on here?
~ Joe G.