Looking for some debugging tips re: rstudio app on el8

Hiya, please see copypasta below, but long story short: I’m trying to debug exactly where my app scripting is hitting a wall which leads to logging…

Singularity as exited…
Timed out waiting for RStudio Server to open port <#>
Cleaning up…

…in the output.log file, but nothing else that looks like an error. No other *log file is created (e.g. rsession.log) so I’m hoping for some advice as to what I can add to the script(s) to try and pin down where the issue exists. Please and thanks!

(We’re jumping to this el8 instance from el6 and a much older version of Singularity, which is apparently too many variables for me to work through! Don’t judge the chaotic mess I made in setup_env; I know it’s fugly!)
Here’s a couple files from my last launch attempt…


Script starting...
Waiting for RStudio Server to open port 48640...

  [from script.sh.erb] Module to load: R/4.0.2   ##
  [from script.sh.erb] Loaded modules:           ##
Currently Loaded Modules:                        ##
  1) R/4.0.2   2) anaconda3/2020.07              ##
*** which rsession                               ## Debug reporting
/usr/lib/rstudio-server/bin/rsession             ##  added by me
*** PATH is                                      ##
/packages/anaconda3/2020.07/condabin             ##
/packages/anaconda3/2020.07/bin                  ##
/packages/singularity/3.6.2/bin                  ##
/packages/R/4.0.2/bin                            ##
/common/bin                                      ##
/usr/lib/rstudio-server/bin                      ##
/usr/local/bin                                   ##
/usr/local/sbin                                  ##
/usr/bin                                         ##
/usr/sbin                                        ##

+ echo 'Starting up rserver...'
Starting up rserver...
+ singularity run -B /tmp/jtb49/35070921/tmp.1c580rbc7Y:/tmp /common/contrib/containers/rserver-launcher-centos6-jb.simg --www-port 48640 --auth-none 0 --auth-pam-helper-path /home/jtb49/ondemand/data/sys/dashboard/batch_connect/dev/RStudio_dev/output/a148a999-9c63-446d-9345-220412d48a34/bin/auth --auth-encrypt-password 0 --rsession-path /home/jtb49/ondemand/data/sys/dashboard/batch_connect/dev/RStudio_dev/output/a148a999-9c63-446d-9345-220412d48a34/rsession.sh
+ echo 'Singularity as exited...'
Singularity as exited...
Timed out waiting for RStudio Server to open port 48640!
Cleaning up...


#!/usr/bin/env bash

# Log all output from this script
export RSESSION_LOG_FILE="/home/jtb49/ondemand/data/sys/dashboard/batch_connect/dev/RStudio_dev/output/a148a999-9c63-446d-9345-220412d48a34/rsession.log"

exec &>>"${RSESSION_LOG_FILE}"

# Launch the original command
echo "**Which rsession"
which rsession
echo "Launching rsession..."
set -x
exec rsession --r-libs-user "/home/jtb49/R/4.0.2:/home/jtb49/R/%v" "${@}"


#!/usr/bin/env bash

# Load the required environment
setup_env () {
  # Additional environment which could be moved into a module
  # Change these to suit
  export RSTUDIO_SERVER_IMAGE="/common/contrib/containers/rserver-launcher-centos6-jb.simg"
  export SINGULARITY_BINDPATH="/bin,/usr/bin,/scratch,/o-scratch,/common,/packages,/etc,/opt,/usr,/var,/lib,/lib64"
  export PATH="/packages/R/4.0.2/bin:/usr/lib/rstudio-server/bin:/packages/singularity/3.6.2/bin:$PATH"
  export R_LIBS_USER="~/R/%v"
  export R_LIBS="/packages/R/%v"
  # path needs for rgdal, sf
  export LD_LIBRARY_PATH="/packages/geos/3.8.1/lib:/packages/PROJ/6.1.0/lib:/packages/gdal/3.0.4/lib:$LD_LIBRARY_PATH"

  # In Singularity 3.5.x it became necessary to explicitly pass LD_LIBRARY_PATH
  # to the singularity process
  export SINGULARITYENV_APPEND_PATH="/usr/lib/rstudio-server/bin:/packages/singularity/3.6.2/bin"

  echo "  [from script.sh.erb] Module to load: R/4.0.2"
  module load R/4.0.2
  module load anaconda3
  echo "  [from script.sh.erb] Loaded modules:"
  module list
  echo "*** which rsession" && which rsession
  echo "*** PATH is" && echo "$PATH"

# Start RStudio Server

# PAM auth helper used by RStudio
export RSTUDIO_AUTH="${PWD}/bin/auth"

# Generate an `rsession` wrapper script
export RSESSION_WRAPPER_FILE="${PWD}/rsession.sh"
umask 077
sed 's/^ \{2\}//' > "${RSESSION_WRAPPER_FILE}" << EOL
  #!/usr/bin/env bash

  # Log all output from this script
  export RSESSION_LOG_FILE="${PWD}/rsession.log"

  exec &>>"\${RSESSION_LOG_FILE}"

  # Launch the original command
  echo "**Which rsession"
  which rsession
  echo "Launching rsession..."
  set -x
  exec rsession --r-libs-user "${R_LIBS_USER}" "\${@}"

# Set working directory to home directory
cd "${HOME}"

export TMPDIR="$(mktemp -d)"

mkdir -p "$TMPDIR/rstudio-server"
python -c 'from uuid import uuid4; print(uuid4())' > "$TMPDIR/rstudio-server/secure-cookie-key"
chmod 0600 "$TMPDIR/rstudio-server/secure-cookie-key"

set -x
# Launch the RStudio Server
echo "Starting up rserver..."

singularity run -B "$TMPDIR:/tmp" "$RSTUDIO_SERVER_IMAGE" \
 --www-port "${port}" \
 --auth-none 0 \
 --auth-pam-helper-path "${RSTUDIO_AUTH}" \
 --auth-encrypt-password 0 \
 --rsession-path "${RSESSION_WRAPPER_FILE}"

echo 'Singularity as exited...'

Is there anything in /var/log/messages on the compute node where the job ran? Perhaps its a Singularity image failure which would log there.

Also, have you considered running rstudio-server without Singularity?

My suggestion would be to get an interactive session going and try to start it interactively by shelling into a singularity image and trying to start it. It may fail immediately, but since you’re in an interactive singularity container, logs inside the container (like /var/lib/rstudio-server/monitor/log/rstudio-server.log) will still be there.

You only care about it starting up so you probably don’t even need to bother with mounting pam helpers or using those arguments. Focus on getting it to start-up in an interactive session and you’re likely to find the cause of your issues (which I’m going to take a wild guess is probably glibc given you’re mounting centos:8 /lib directory into a centos:6 image).

Is this Rstudio 1.3xxxx ? They fiddled with auth and broke what worked in 1.2xxx. I rolled back waiting for the fix in 1.4.

Or try the fix suggested here

Alright, I’ve done a couple things, and made a little progress.
However, I basically know nothing about R/rserver/rsession, and I think that is massively impacting my ability to help-me-help-myself. …but I’ll get back to that, shortly. But for now I apologize for missing things that are probably very obvious to anyone familiar with that software.

It wasn’t (yet) impacting anything, but I created an el8 container using the suggested definition just the same. (For anyone following behind me, to bootstrap Centos 8, I had to mod MirrorURL to be: http://mirror.centos.org/centos-%{OSVERSION}/%{OSVERSION}/BaseOS/$basearch/os since the repo is structured differently.)

For anyone’s reference, let’s start with this…


module load singularity anaconda3 R
export SINGULARITY_BINDPATH="/etc,/media,/mnt,/opt,/srv,/usr,/var,/scratch,/o-scratch,/common,/packages"
export RSTUDIO_SERVER_IMAGE="/common/contrib/containers/rserver-launcher-centos8-jb.simg"
## <snip!> 
singularity run -B "$TMPDIR:/tmp" "$RSTUDIO_SERVER_IMAGE" \
 --www-port "${port}" \
 --auth-none 0 \
 --auth-pam-helper-path "${RSTUDIO_AUTH}" \
 --auth-encrypt-password 0 \
 --server-data-dir "$PWD/rstudio-server" \     ## I'LL DISCUSS THESE
 --server-pid-file "$PWD/rstudio-server.pid" \ ## TWO LINES, BELOW!
 --rsession-path "${RSESSION_WRAPPER_FILE}" 

…but note that the --server-* lines were added after the first bit of testing. Exactly when, I’ll specifically mention below.

I figured since I didn’t feel confident in messing with the singularity/rserver parameters, the next best thing would be to start with the script hinted above (again, without those server lines)…

  • start an rstudio app session, letting it create the targeted files for me
  • (it’ll time-out and die, and that’s fine)
  • note the business-end of the singularity line from the output.log:
  • singularity run -B /tmp/jtb49/35145515/tmp.QIZSlsyJP7:/tmp /common/contrib/containers/rserver-launcher-centos8-jb.simg –www-port 23631 --auth-none 0 --auth-pam-helper-path /home/jtb49/ondemand/data/sys/dashboard/batch_connect/dev/RStudio_dev/output/6e17c3c2-ebee-4f74-82b9-832b8adeb40c/bin/auth --auth-encrypt-password 0 --rsession-path /home/jtb49/ondemand/data/sys/dashboard/batch_connect/dev/RStudio_dev/output/6e17c3c2-ebee-4f74-82b9-832b8adeb40c/rsession.sh

  • ssh into the compute-node
  • load the necessary modules & export variables
  • shell into the container and (mostly-) recycle that singularity line…

$ module load singularity anaconda3 R
$ export SINGULARITY_BINDPATH="/etc,/media,/mnt,/opt,/srv,/usr,/var,/scratch,/o-scratch,/common,/packages"
$ singularity shell /common/contrib/containers/rserver-launcher-centos8-jb.simg
Singularity> rserver --www-port 23631 --auth-none 0 --auth-pam-helper-path /home/jtb49/ondemand/data/sys/dashboard/batch_connect/dev/RStudio_dev/output/6e17c3c2-ebee-4f74-82b9-832b8adeb40c/bin/auth --auth-encrypt-password 0 --rsession-path /home/jtb49/ondemand/data/sys/dashboard/batch_connect/dev/RStudio_dev/output/6e17c3c2-ebee-4f74-82b9-832b8adeb40c/rsession.sh

24 Nov 2020 23:37:10 [rserver] ERROR system error 30 (Read-only file system) [path: /var/run/rstudio-server, target-dir: ]; OCCURRED AT rstudio::core::Error rstudio::core::FilePath::createDirectory(const string&) const src/cpp/shared_core/FilePath.cpp:648; LOGGED FROM: int main(int, char* const*) src/cpp/server/ServerMain.cpp:555

I guess it turns out rserver was angry that it wasn’t allowed to write to /var/run. I don’t recall my exact thought process but since I also am leveraging RStudio 1.3.1093, I ended up trying the “–server-data-dir” parameter mentioned in this post: RStudio when launched without Singularity is having strange troubles with authentication …and that allowed the server to start. Yay! I looked around for logs in a number of places and didn’t see them, can someone tell me where they are and if maybe some binding is screwing them up?

Long story short: I added those --server lines to the singularity line in the app’s script.sh.erb. (Only the data-dir made a difference, but both were mentioned in that other thread, so I figured I’d try both too.) The rstudio app session starts and I was greeted with the blue connect button – BUT after clicking that button, I’m sent to a URL that yields a 404 error:


Not Found
The requested URL /rnode/cn104/37628/auth-do-sign-in was not found on this server.

So I got some forward progress, and that’s cool. But, I’ve been trying to figure this one out for like [too many] hours now, and am getting nowhere. :confused: Anyone have additional suggestions, please and thank you?

UPDATE :slight_smile:

Apparently the hostname command performs slightly differently between el6/el8, because after I sat down to re-read through my post to make sure I didn’t booboo, I realized the “host-part” of the URL didn’t match that of my other instance, and was missing the host’s fqdn:




Boom, baby!

So now I am served a legit page, but with that familiar error from that earlier thread, “Error: Incorrect or invalid username/password”. So at least now I’m in a better position to proceed through that thread, but if anyone can point me to rserver’s logging, I’d appreciate it! Thanks!

Your auth script might need updating. It should match this:

#!/usr/bin/env bash

# Confirm username is supplied
if [[ $# -lt 1 ]]; then
  echo "Usage: auth USERNAME"
  exit 1

# Confirm password environment variable exists
if [[ -z ${RSTUDIO_PASSWORD} ]]; then
  echo "The environment variable RSTUDIO_PASSWORD is not set"
  exit 1

# Read in the password from user
read -s -p "Password: " PASSWORD
echo ""

if [[ ${USERNAME} == ${USER} && ${PASSWORD} == ${RSTUDIO_PASSWORD} ]]; then
  echo "Successful authentication"
  exit 0
  echo "Invalid authentication"
  exit 1

Note the change from -ne to -lt in the first function.

Thanks! That was actually the last thing I tried, real quick, on the way out the door; so it was and still is “less than” (rather than the previous “not equal to”) but apparently that move alone was not sufficient. Though I think perhaps this isn’t being run, ever? My evidence is that I can tuck an echo/touch/etc statements into the top of the auth script, and nothing ever gets echoed/touched/written…!?

#!/usr/bin/env bash
date > /home/jtb49/auth-touch.txt
# Confirm username is supplied
if [[ $# -lt 1 ]]; then
  echo "Usage: auth USERNAME"
  exit 1

Starting to throw spaghetti at the wall to see what sticks, I tried omitting the --server-data-dir parameter and “moved” its intention (for lack of a better phrase) to the binding during the singularity execution:

singularity run -B "$TMPDIR:/tmp,$TMPDIR/rstudio-server:/var/run/rstudio-server"

…but this didn’t work either. So I tried ejecting singularity from the equation and, instead, firing up rserver directly by copying charles8ronson’s rserver line:

rserver \
 --www-port "${port}" \
 --auth-none 0 \
 --auth-pam-helper-path "${RSTUDIO_AUTH}" \
 --auth-encrypt-password 0 \
 --rsession-path "${RSESSION_WRAPPER_FILE}" \
 --server-data-dir "${TMPDIR}" \
 --secure-cookie-key-file "${TMPDIR}/rstudio-server/secure-cookie-key"

…except for one change: his code was missing the trailing slash on the “–server-data-dir” line (which I assume was a typo and so I’ve added it in mine, above. All of these yielded the same “bad auth” web page. I only have that one clue, which is literally nothing: the absense of any proof that ‘auth’ is running, as mentioned above.

I also have to agree with the very last post currently in that thread: there’s a very frustrating lack of logging surrounding the auth/rsession stuff. My compute node has a literally empty /var/log/messages and none of my debug hooks get written to ~ or TMPDIR or anywhere. :frowning: Here’s a grab bag of places they might have gone as I’ve changed directory targets for this-or-that, and there’s no trace of what’s going on, so far.

$ find /var/log/rstudio-server /tmp/rstudio-* /tmp/jtb49 ~/rstudio-*

Thanks again for everyone who’s chipped in!

@jasonbuechler we ran into the same problem, with not being able to find the logs. When we run the OnDemand application (outside singularity) the TMPDIR is actually being spun up on the compute node, at the location of /var/tmp/$jobnum_tmp_dir/* so you should see some logs there. However, the most helpful logs we see is after the jobs finish and we get the “output.log” transferred back. This tells us if anything was wrong with the rserver/rsession launch.

Finally, in the line “–server-data-dir” i do believe there was a typo in there, if you are using vim you should be able to see the ‘/’ in the line --server-data-dir "${TMPDIR}" \ is the same color as the others in that section (i think yellow is the default but it was hard for me to catch) if you mess with the spacing you should get it no problem. But for the sake of being helpful:

rserver \
 --www-port "${port}" \
 --auth-none 0 \
 --auth-pam-helper-path "${RSTUDIO_AUTH}" \
 --auth-encrypt-password 0 \
 --rsession-path "${RSESSION_WRAPPER_FILE}" \
 --server-data-dir "${TMPDIR}" \
 --secure-cookie-key-file "${TMPDIR}/rstudio-server/secure-cookie-key"

The above should work swimmingly and this was the biggest issue for getting the “Auth” to work. Once you have that you should be all set, but again its hard to determine without any debug logs. Let us know if that worked!

Note: we also tried this with the preview of ‘rstudio 1.4.1056’ and they have a whole new “database scheme” which is causing a lot of headaches so hopefully some feedback with them can get a conversation started about how to better handle things like this in the future.

1 Like

I just wanted to check back in here and formally mark this as ‘solved’. (yay!)

Environment: CentOS Linux release 8.2.2004 (Core), rstudio-server-1.3.1093-1.x86_64, and dependencies from ondemand-release-compute-1.8-1.noarch.

More or less cloning @charles8ronson’s rserver command (rserver --server-data-dir ...) and updating the bin/auth script (changing “not equal” to “less than”) got me up and running with a new centos8 instance.

My “gotcha” that had me chasing my tail unnecessarily was due to two annoying things:

  • needing to appropriately scope CAS
  • needing to match the host_regex to el8’s hostname results

The host_regex is clearly stated in the Reverse Proxy requirements but I was skipping steps in cloning elements of our el6 instance, and missed the fact that was a bit different for our el8 nodes.

The CASScope thing is another story. It’s not really canonically documented (that I’m aware of) in the OOD/apps documentation or even here on OSC’s Discourse… and CAS/mod_auth_cas is something I have virtually no experience with outside of OnDemand. But because the “failed auth” screen is also what you’ll see if you fail to “fix” bin/auth for newer RStudio releases, I was chasing a red-herring for a while.