Question About Passing Environment Variables for a PBS Job

We have an OOD server attached to an OpenHPC cluster. We can submit a simple PBS hostname script, and PBS accepts the job but reports an error that it can't find mpirun, likely because the environment is changed when the job is submitted through the browser to OOD. What is the recommended process for submitting a job through OOD so that the environment for that job is passed along during the OOD job submission process?

We have verified that we can submit the hostname job from the command line on the OOD server (not using the OOD GUI) and get that job to run successfully on four compute nodes. Based on this, we know the issue is that OOD isn't passing the environment variables through to PBS.

Thanks
Dom

@domd1: two things come to mind. The first is that at OSC we keep similar /etc/profile.d scripts on our web nodes and our compute nodes, so the same environment is available in both places to begin with (a sketch of such a script follows the wrapper example below). The other is that for many scheduler clients we offer the ability to write wrappers without disturbing the primary installation. For example, to set an environment variable and ensure that the entire environment is passed to jobs, you could write the following wrapper:

#!/usr/bin/env bash
# Export arbitrary variables
export THE_ANSWER='42'
# Capture stdin, waiting only 0.1 seconds for input
read -t 0.1 -r -d '' STDIN_CONTENTS
# Point to the real qsub
QSUB=/opt/pbs/bin/qsub
# Switch on whether or not stdin was empty
# Pass -V to ensure that the entire environment is passed to the job
if [[ -z "$STDIN_CONTENTS" ]]; then
  exec "$QSUB" -V "$@"
else
  # printf is safer than echo here in case the captured script body
  # begins with something echo would treat as an option (e.g. -n)
  printf '%s\n' "$STDIN_CONTENTS" | exec "$QSUB" -V "$@"
fi
# Do not echo or print to STDOUT; the resource adapter parses qsub's
# stdout and changing that format is likely to break OnDemand.
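
On the first point, here is a minimal sketch of an /etc/profile.d script; the paths and module names are assumptions and should match wherever PBS and your MPI stack are actually installed on your cluster:

#!/bin/sh
# /etc/profile.d/hpc-paths.sh -- sketch only; adjust paths for your site
export PATH="/opt/pbs/bin:${PATH}"
# If MPI is provided via modules (as on OpenHPC), load it the same way
# the compute nodes do, e.g.:
#   module load gnu openmpi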

You would then make the wrapper executable and install it somewhere that makes sense, say /opt/pbs/bin_wrappers/qsub. For example (the file name qsub_wrapper below is just a placeholder for wherever you saved the script above):
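
# Paths are illustrative; "qsub_wrapper" is the script saved above
mkdir -p /opt/pbs/bin_wrappers
install -m 0755 qsub_wrapper /opt/pbs/bin_wrappers/qsub

Finally, you would update your cluster configuration like so: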

---
v2:
  # ...
  job:
    adapter: "pbspro"
    # ...
    bin_overrides:
      qsub: '/opt/pbs/bin_wrappers/qsub'
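
(On a default OnDemand installation, the cluster configuration files typically live under /etc/ood/config/clusters.d/.)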

Eric pointed out that because we're exec-ing, the wrapper does not need to handle stdin explicitly; the exec'd qsub simply inherits the wrapper's stdin:

#!/usr/bin/env bash
# Export arbitrary variables
export THE_ANSWER='42'
# Optionally source /etc/profile to pick up the same environment a
# login shell would have
source /etc/profile
# Point to the real qsub
QSUB=/opt/pbs/bin/qsub
exec "$QSUB" -V "$@"

Another option is to have the wrapper script execute the qsub command via ssh on a login node. That would ensure you have the same environment as you would when submitting from the login node. A minimal sketch follows; the hostname login1 and the qsub path are assumptions, and key- or host-based ssh from the web host must work non-interactively for the submitting user:
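
#!/usr/bin/env bash
# Hypothetical ssh-based qsub wrapper; LOGIN_HOST and QSUB below are
# assumptions and must match your cluster.
LOGIN_HOST=login1
QSUB=/opt/pbs/bin/qsub
# Re-quote each argument with %q so it survives the remote shell's word
# splitting; avoid passing a spurious empty argument when there are none.
ARGS=
(($#)) && ARGS=$(printf '%q ' "$@")
# ssh forwards our stdin, so a job script piped into this wrapper still
# reaches qsub on the login node unchanged.
exec ssh -o BatchMode=yes "$LOGIN_HOST" "$QSUB $ARGS"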

This issue describes a request for exactly this kind of ssh-based submission, and the individual who posted it provided a repo with a set of wrappers that appear to work for Slurm. I imagine these could be modified to work the same way with OpenHPC.

Thanks for getting back to me. I am still trying to find time to get back to this project; I will let you know the results once I complete some testing.

Dom