Issues with RStudio Server Integration in Open OnDemand using Singularity

Hello,

I am currently working on integrating RStudio Server as an interactive app within Open OnDemand on a CentOS 7.9 machine. The machine connects to my cluster via host-based authentication and I have successfully set up similar interactive apps like Jupyter Notebook.

I encountered challenges due to the default behavior of RStudio Server, which runs on a specific port and fails if a session is initiated on a different port. To address this, I attempted to follow various examples, particularly those involving building a custom Singularity image. Despite these efforts, the integration has not yet been successful.

Here is a brief overview of my progress and issues:

Setup Steps Completed:

  1. Copied RStudio App: Successfully copied and configured to send RStudio Server requests to the compute node.
  2. Software Requirements: Confirmed the presence of required software:
  • R: /usr/bin/R
  • rstudio-server: Successfully installed but currently inactive.
  • singularity: /usr/local/bin/singularity

Setup Steps Attempted:

The remaining steps were unclear to me, since rstudio-server by default runs on one specific port and the service fails if a session is started on a different port. I then tried to follow some examples, in particular those that build a custom image.

Current Challenge:

The primary challenge is running RStudio Server within a Singularity container on the compute node, which runs Ubuntu. I built a custom Singularity image successfully, but the integration fails when launching from the Open OnDemand UI.

Singularity.def File:

Bootstrap: docker
From: ubuntu:22.04

%labels
    Maintainer oodadmin
    Version 0.1

%help
This will run RStudio Server which must be mounted with dependencies into the container.

%post
    export DEBIAN_FRONTEND=noninteractive
    ...
    apt-get update && apt-get install -y \
    r-base \
    r-base-dev \
    r-cran-foreign \
    r-cran-ggplot2 \
    wget \
    libcurl4-openssl-dev \
    libssl-dev \
    libxml2-dev \
    tzdata \
    dpkg-sig \
    gnupg \
    netcat \
    sudo
    useradd -m -s /bin/bash rstudio
    mkdir -p /root/.gnupg
    # Import RStudio public key
    cat << 'EOF_KEY' | gpg --import
-----BEGIN PGP PUBLIC KEY BLOCK-----
[# (Key content)](https://cloud.rstudio.com/code-signing/)
-----END PGP PUBLIC KEY BLOCK-----
EOF_KEY
    wget http://security.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1f-1ubuntu2.22_amd64.deb
    dpkg -i libssl1.1_1.1.1f-1ubuntu2.22_amd64.deb
    RSTUDIO_VERSION="2024.04.1-748"
    RSTUDIO_DEB="rstudio-server-${RSTUDIO_VERSION}-amd64.deb"
    wget https://download2.rstudio.org/server/jammy/amd64/${RSTUDIO_DEB}
    apt install -y ./${RSTUDIO_DEB}
    mkdir -p /etc/rstudio
    cat << 'EOF_CONF' > /etc/rstudio/rserver.conf
server-user=rstudio
auth-none=1
server-project-sharing=0
auth-pam-sessions-enabled=0
EOF_CONF
    apt-get clean
    rm -rf /var/lib/apt/lists/*

%environment
    export PATH=/usr/lib/rstudio-server/bin:$PATH

%apprun rserver
    exec rserver "${@}"

%runscript
    exec rserver "${@}"

Script.sh.erb File Modifications:

#!/usr/bin/env bash

set -x

echo "Script starting..."
echo "Waiting for RStudio Server to open port ${port}..."
echo "TIMING - Starting wait at: $(date)"

setup_env () {
  export RSTUDIO_SERVER_IMAGE="/home/aymane/Scripts/RStudio/rstudio-server.sif"
  export SINGULARITY_BINDPATH="/etc,/media,/mnt,/opt,/srv,/usr,/var"
  export PATH="$PATH:/usr/lib/rstudio-server/bin"
}
setup_env

export TMPDIR="${PWD}/rstudio-server-tmp"
mkdir -p $TMPDIR/var/lib/rstudio-server
mkdir -p $TMPDIR/var/run/rstudio-server

cat /proc/sys/kernel/random/uuid > "$TMPDIR/var/run/rstudio-server/secure-cookie-key"
chmod 0600 "$TMPDIR/var/run/rstudio-server/secure-cookie-key"

export RSTUDIO_DB_FILE="$TMPDIR/var/lib/rstudio-server/rstudio-os.sqlite"
touch $RSTUDIO_DB_FILE
chmod 0600 $RSTUDIO_DB_FILE

echo "Starting up rserver..."

singularity exec --bind $TMPDIR/var/lib/rstudio-server:/var/lib/rstudio-server \
  --bind $TMPDIR/var/run/rstudio-server:/var/run/rstudio-server \
  --home /home/oodadmin:/home/oodadmin \
  $RSTUDIO_SERVER_IMAGE \
  /usr/lib/rstudio-server/bin/rserver \
  --auth-none 1 \
  --www-port=${port} \
  --secure-cookie-key-file=/var/run/rstudio-server/secure-cookie-key \
  --database-config-file=/var/lib/rstudio-server/rstudio-os.sqlite &

sleep 5

echo 'Singularity has exited...'

echo "Waiting for RStudio Server to open port ${port}..."
timeout 60 sh -c 'until nc -z localhost ${port}; do sleep 1; done'

if ! nc -z localhost ${port}; then
  echo "Timed out waiting for RStudio Server to open port ${port}!"
  
  echo "Capturing log files for debugging:"
  if [ -f "${TMPDIR}/rsession.log" ]; then
    echo "Content of rsession.log:"
    cat "${TMPDIR}/rsession.log"
  else
    echo "rsession.log not found."
  fi
  
  if [ -f "${HOME}/.local/share/rstudio/log/rserver.log" ]; then
    echo "Content of rserver.log:"
    cat "${HOME}/.local/share/rstudio/log/rserver.log"
  fi
  exit 1
else
  echo "RStudio Server is running on port ${port}"
fi

Issue Faced:

When launching RStudio Server from the UI, it fails with the following error:

/var/spool/slurm/d/job310697/slurm_script: line 3: module: command not found
Script starting...
Waiting for RStudio Server to open port 55677...
...
Timed out waiting for RStudio Server to open port 55677!
...
Content of rserver.log:
ERROR Attempt to run server as user 'rstudio-server' (uid 997) from account 'oodadmin' (uid 1406410999) without privilege...

Log File Content:

2024-06-05T07:18:12.108552Z [rserver] ERROR Attempt to run server as user 'rstudio-server' (uid 997) from account 'oodadmin' (uid 1406410999) without privilege, which is required to run as a different uid; LOGGED FROM: virtual rstudio::core::ProgramStatus rstudio::server::Options::read(int, char* const*, std::ostream&) src/cpp/server/ServerOptions.cpp:327
...
ERROR system error 1 (Operation not permitted) [path: /var/lib/rstudio-server/rstudio-os.sqlite]; OCCURRED AT rstudio::core::Error rstudio::core::{anonymous}::changeFileModeImpl(const string&, mode_t) src/cpp/shared_core/FilePath.cpp:318

Request for Help:

  1. How can I properly run the RStudio Server as a non-root user within the Singularity container?
  2. Are there additional configurations required to grant necessary privileges?
  3. Any insights on resolving the “Operation not permitted” error for the SQLite database file?

Thank you in advance for your help and suggestions!

OSC uses bubblewrap to contain RStudio, but the idea would be the same for Singularity. You can look at our production application for what we do.

For running as a non-root user, I believe that’s the --server-user CLI flag. We set this to $(whoami) so the server runs as the current user.

Use Singularity mount points to your advantage. We mount the job’s etc directory into /etc/rstudio, where we can reconfigure it. (Anything in the app’s template directory is templated and copied to the job’s staged root, which is that very long path ~/ondemand/data/sys/dashboard/batch_connect/.../<some uuid>.) We mount in a database.conf that tells RStudio to use a tmp location for its sqlite3 database. So we don’t get permission denied on /var/lib because we’re using /tmp instead.

So again, we’re using bubblewrap but the principle is the same for Singularity: mount things into the container to reconfigure RStudio.
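
For anyone following along, here is a minimal sketch of that mount-and-reconfigure idea for Singularity. It is not OSC's actual configuration: the staged_root, etc_dir and db_dir paths, the /tmp/rstudio-db location and the image name are assumptions made for illustration. Only --bind, --server-user and --database-config-file come from this thread. The command is built and printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch only: every path and file location below is an illustrative assumption.
staged_root="${STAGED_ROOT:-$(mktemp -d)}"   # stands in for the job's staged root
etc_dir="${staged_root}/etc"                 # will be mounted at /etc/rstudio
db_dir="${staged_root}/db"                   # hypothetical home for the sqlite file
mkdir -p "${etc_dir}" "${db_dir}"

# database.conf: keep the SQLite database out of /var/lib inside the container.
cat > "${etc_dir}/database.conf" <<EOF
provider=sqlite
directory=/tmp/rstudio-db
EOF

# Build the container command as an array; it is only printed here, not run.
cmd=(
  singularity exec
  --bind "${etc_dir}:/etc/rstudio"
  --bind "${db_dir}:/tmp/rstudio-db"
  "${RSTUDIO_SERVER_IMAGE:-rstudio-server.sif}"
  rserver
  --server-user "$(whoami)"
  --database-config-file /etc/rstudio/database.conf
)
printf '%q ' "${cmd[@]}"; echo
```

Because the database directory is a bind mount owned by the job's user, RStudio should not need to change permissions on anything under /var/lib.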

Hello again,

Thank you for your response. I believe I am closer to finding a solution now.

However, I am still facing an issue where the session opens and then immediately closes. I also realized that the host should correspond to the computing node, but it currently does not.

I applied the RStudio-server syntax as you suggested and debugged by running the commands manually before adding them to the script. Everything worked fine during the manual tests. Here is my current script.sh.erb:

#!/usr/bin/env bash

set -x

echo "Script starting..."
echo "Waiting for RStudio Server to open port ${port}..."
echo "TIMING - Starting wait at: $(date)"

# Load the required environment
setup_env () {
  # Load environment modules if available
  if [ -f /etc/profile.d/modules.sh ]; then
    source /etc/profile.d/modules.sh
  fi

  # Load Singularity module if available
  if module avail singularity/3.11.0 &>/dev/null; then
    module load singularity/3.11.0
  fi

  export RSTUDIO_SERVER_IMAGE="/home/oodadmin/Scripts/RStudio/rstudio-server-new.sif"
  export SINGULARITY_BINDPATH="/etc,/media,/mnt,/opt,/srv,/usr,/var"
  export PATH="$PATH:/usr/lib/rstudio-server/bin"
}
setup_env

# Ensure Singularity is available
if ! command -v singularity &> /dev/null
then
    echo "Singularity could not be found, ensure it's installed and available in the PATH."
    exit 1
fi

# Set up temporary directories
export TMPDIR="${PWD}/rstudio-server-tmp"
mkdir -p $TMPDIR/rs/var/lib/rstudio-server
mkdir -p $TMPDIR/rs/var/run/rstudio-server
mkdir -p $TMPDIR/rs/home/rstudio/.rstudio
mkdir -p $TMPDIR/rs/etc/rstudio
chmod -R 777 $TMPDIR

# Generate a secure cookie key
cat /proc/sys/kernel/random/uuid > "$TMPDIR/rs/var/run/rstudio-server/secure-cookie-key"
chmod 666 "$TMPDIR/rs/var/run/rstudio-server/secure-cookie-key"

# Set up the database file with broad permissions
export RSTUDIO_DB_FILE="$TMPDIR/rs/var/lib/rstudio-server/rstudio-os.sqlite"
touch $RSTUDIO_DB_FILE
chmod 666 $RSTUDIO_DB_FILE

# Create updated configuration files
cat << EOF > "$TMPDIR/rs/etc/rstudio/rserver.conf"
# Server Configuration File
www-port=${port}
auth-none=1
server-user=$(whoami)
server-app-armor-enabled=0
database-config-file=/tmp/rs/home/rstudio/.rstudio/database.conf
EOF

cat << EOF > "$TMPDIR/rs/home/rstudio/.rstudio/database.conf"
# sqlite configuration
provider=sqlite
directory=/tmp/rs/home/rstudio/.rstudio
EOF

cat << EOF > "$TMPDIR/rs/etc/rstudio/rsession.conf"
# R Session Configuration File
session-default-working-dir=/tmp/rs/home/rstudio
EOF

echo "Starting up rserver..."

# Execute the rserver command in the background
singularity exec --bind "$TMPDIR/rs/var/lib/rstudio-server:/var/lib/rstudio-server" \
  --bind "$TMPDIR/rs/var/run/rstudio-server:/var/run/rstudio-server" \
  --bind "$TMPDIR/rs/home/rstudio/.rstudio:/home/rstudio/.rstudio" \
  --bind "$TMPDIR/rs/etc/rstudio:/etc/rstudio" \
  --containall --cleanenv --home "$TMPDIR/rs:/home/rstudio" \
  --writable-tmpfs --no-privs "$RSTUDIO_SERVER_IMAGE" \
  /usr/lib/rstudio-server/bin/rserver \
  --auth-none 1 --www-port=${port} \
  --secure-cookie-key-file=/var/run/rstudio-server/secure-cookie-key \
  --database-config-file=/home/rstudio/.rstudio/database.conf \
  --server-user=$(whoami) \
  --server-app-armor-enabled=0 \
  --config-file=/etc/rstudio/rserver.conf &

# Capture PID of rserver process
rserver_pid=$!

echo 'Singularity has exited...'

echo "Waiting for RStudio Server to open port ${port}..."

# Timeout mechanism to wait for the server to start
count=0
while ! nc -z localhost ${port}; do
  sleep 1
  count=$((count+1))
  if [ $count -ge 60 ]; then
    echo "Timed out waiting for RStudio Server to open port ${port}!"
    echo "Capturing log files for debugging:"
    
    # Check if rserver process is still running
    if ps -p $rserver_pid > /dev/null
    then
       echo "rserver process is still running."
    else
       echo "rserver process has exited."
    fi
    
    if [ -f "${TMPDIR}/rsession.log" ]; then
      echo "Content of rsession.log:"
      cat "${TMPDIR}/rsession.log"
    else
      echo "rsession.log not found."
    fi
    
    if [ -f "${HOME}/.local/share/rstudio/log/rserver.log" ]; then
      echo "Content of rserver.log:"
      cat "${HOME}/.local/share/rstudio/log/rserver.log"
    fi
    exit 1
  fi
done

echo "RStudio Server is running on port ${port}"

The session still closes unexpectedly, so I added some debugging messages in the output.log:

set with port 5844
Script starting...
Waiting for RStudio Server to open port 5844...
TIMING - Starting wait at: Mon Jun 10 02:37:23 PM CEST 2024
...
...
...
/home/oodadmin/Scripts/RStudio/rstudio-server-new.sif /usr/lib/rstudio-server/bin/rserver --auth-none 1 --www-port=5844 --secure-cookie-key-file=/var/run/rstudio-server/secure-cookie-key --database-config-file=/home/rstudio/.rstudio/database.conf --server-user=oodadmin --server-app-armor-enabled=0 --config-file=/etc/rstudio/rserver.conf
+ nc -z localhost 5844
Connection to localhost (127.0.0.1) 5844 port [tcp/*] succeeded!
+ echo 'RStudio Server is running on port 5844'
RStudio Server is running on port 5844
++ date
+ echo 'TIMING - Wait ended at: Mon Jun 10 02:37:24 PM CEST 2024'
TIMING - Wait ended at: Mon Jun 10 02:37:24 PM CEST 2024
Discovered RStudio Server listening on port 5844!
TIMING - Wait ended at: Mon Jun 10 02:37:25 PM CEST 2024
Generating connection YAML file...
Cleaning up...

The main issue I am facing now is identifying which script prints the “Generating connection YAML file” and “Cleaning up” messages. Any thoughts on what modifications might fix this part?

I have tried using the default scripts for before.sh.erb, after.sh.erb, and submit.yml, but they did not resolve the issue. The connection.yml file appears to be correct, similar to the one used for Jupyter Notebook, so the reverse proxy configuration should not be the problem.

Any ideas or suggestions would be greatly appreciated!

Thank you!

Our RStudio app mounts the log directory so that we can access the logs from within the container after the job has completed and the container has stopped.

Look at our configurations for how we do that - it’s with mount points and configurations.
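
As a rough sketch of what "mount points and configurations" could look like for logs (this is not OSC's actual setup): it assumes RStudio Server reads /etc/rstudio/logging.conf and honours the logger-type and log-dir keys, and the staged_root, etc_dir and log_dir names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: directory names are hypothetical, and the logging.conf keys
# are an assumption about the installed RStudio Server version.
staged_root="${STAGED_ROOT:-$(mktemp -d)}"   # stands in for the job's staged root
etc_dir="${staged_root}/etc"
log_dir="${staged_root}/logs"
mkdir -p "${etc_dir}" "${log_dir}"

# Ask RStudio to write log files to a directory inside the container ...
cat > "${etc_dir}/logging.conf" <<EOF
[*]
log-level=info
logger-type=file
log-dir=/var/log/rstudio
EOF

# ... and bind that directory to the staged root so the logs outlive the container.
bind_args=(
  --bind "${etc_dir}:/etc/rstudio"
  --bind "${log_dir}:/var/log/rstudio"
)
printf '%s\n' "${bind_args[@]}"
```

With these bind arguments added to the singularity command, the log files would end up under the job's staged root, next to output.log.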

Thank you for the information. After checking the logs, here is the specific issue I need assistance with:

  • My script works when executed manually (line by line) inside the Singularity container, and it successfully opens the port.
  • However, when the script is run from Open OnDemand (OOD), it fails to open the port and it proceeds to Cleaning up.
  • In the output.log, I found the following:
Starting up rserver...
+ rserver_pid=2577832
...
+ singularity exec --bind /home/oodadmin/ondemand/data/sys/dashboard/batch_connect/dev/RStudio/output/03f509f4-18cc-43af-82d4-4a73de67511e/rstudio-server-tmp/rs/var/lib/rstudio-server:/var/lib/rstudio-server --bind /home/oodadmin/ondemand/data/sys/dashboard/batch_connect/dev/RStudio/output/03f509f4-18cc-43af-82d4-4a73de67511e/rstudio-server-tmp/rs/var/run/rstudio-server:/var/run/rstudio-server --bind /home/oodadmin/ondemand/data/sys/dashboard/batch_connect/dev/RStudio/output/03f509f4-18cc-43af-82d4-4a73de67511e/rstudio-server-tmp/rs/home/rstudio/.rstudio:/home/rstudio/.rstudio --bind /home/oodadmin/ondemand/data/sys/dashboard/batch_connect/dev/RStudio/output/03f509f4-18cc-43af-82d4-4a73de67511e/rstudio-server-tmp/rs/etc/rstudio:/etc/rstudio --containall --cleanenv --home /home/oodadmin/ondemand/data/sys/dashboard/batch_connect/dev/RStudio/output/03f509f4-18cc-43af-82d4-4a73de67511e/rstudio-server-tmp/rs:/home/rstudio --writable-tmpfs --no-privs /home/aymane/Scripts/RStudio/rstudio-server-new.sif /usr/lib/rstudio-server/bin/rserver --auth-none 0 --www-port=17925 --secure-cookie-key-file=/var/run/rstudio-server/secure-cookie-key --database-config-file=/home/rstudio/.rstudio/database.conf --server-user=oodadmin --server-app-armor-enabled=0 --config-file=/etc/rstudio/rserver.conf
+ count=1
+ '[' 1 -ge 60 ']'
+ nc -z localhost 17925
Connection to localhost (127.0.0.1) 17925 port [tcp/*] succeeded!
+ echo 'RStudio Server is running on port 17925'
RStudio Server is running on port 17925
Discovered RStudio Server listening on port 17925!
TIMING - Wait ended at: Tue Jun 11 02:08:24 PM CEST 2024
Skipping cleanup step in after.sh.erb
Generating connection YAML file...
Cleaning up...

It seems the script job_script_content.sh is responsible for running these commands. However, I cannot locate this script within the sandbox. I found the following syntax, which can explain the cleanup, even if the connection.yaml was created:

# Wait for script process to finish
wait ${SCRIPT_PID} || clean_up 1

It’s almost as if, once the script has finished executing, RStudio closes. My script.sh.erb does report a PID for rserver and a successful TCP check prior to that:

Starting up rserver...
+ rserver_pid=2577832
+ nc -z localhost 17925
Connection to localhost (127.0.0.1) 17925 port [tcp/*] succeeded!
  • Could you please guide me on where to find the content of job_script_content.sh, or how to avoid a clean_up while RStudio is active?

Thank you for your help!

job_script_content.sh is generated by our libraries.

The script shouldn’t end, it should block. I.e., the last thing you execute in the script should be rserver (or singularity to run the container) and it should block for as long as the RStudio app is running.

If the script does exit, then it’ll do as you say. But if the script blocks then job_script_content.sh will wait as you’ve linked there.

It’s not about avoiding cleanup - it’s about being sure the process blocks and lives for a long duration, so that wait does its job and waits.

If the script has finished executing, that indicates to me that RStudio itself has stopped executing. Again, script.sh.erb needs to block. It cannot background any processes.
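
A minimal illustration of the blocking behaviour, for readers comparing it with the script above. The server_cmd array is a hypothetical stand-in (a short sleep) for the real `singularity exec ... rserver ...` command line. The point is only that the command runs in the foreground with no trailing `&`.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real `singularity exec ... rserver ...` command.
server_cmd=(sleep 2)

start=$SECONDS
echo "Starting up server..."

# Run in the foreground (no trailing '&'): the script blocks on this line
# until the server process exits, so the job wrapper's `wait` keeps waiting.
"${server_cmd[@]}"
server_status=$?

elapsed=$(( SECONDS - start ))
echo "Server exited with status ${server_status} after about ${elapsed}s"
```

In a real script.sh.erb nothing would need to follow the server command, so it could also be started with `exec`.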

If you’re trying to replicate interactively in a shell - then you need to use job_script_content.sh as the entry point/script you’re running.

Thanks for providing some context on the job_script_content.sh.

You were right; I have updated my script.sh.erb to follow the example you provided, and now the RStudio Server sessions remain open as long as needed.

However, I’m struggling with the authentication. When I click the “Connect to RStudio” button, it shows an “Incorrect username/password” error.

I suspect this might be due to the content outside the container, such as connection.yml, not being accessible from within the Singularity container.

Could you please provide more specific guidance on how to resolve this issue? For example:

  • Which folders need to be mounted for authentication?
  • Are there any specific configurations or files that should be accessible within the container to ensure proper authentication?

Any detailed instructions or examples would be greatly appreciated.

Thank you!

Glad to see progress!

We use the --auth-pam-helper-path flag, pointing at a helper script that is within the template/ directory, so it’s stored in the same directory as output.log and job_script_content.sh and so on. Without it, I’m not sure how RStudio would be checking for your actual password.

Here’s the actual auth script we use.
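
For readers without access to that script, the following is a rough sketch of what such a helper might look like, not a copy of OSC's file. It assumes rserver calls the helper with the username as its first argument and supplies the password on standard input, and that the job exports a per-session password in a variable here called RSTUDIO_PASSWORD (a hypothetical name).

```shell
#!/usr/bin/env bash
# Sketch of a PAM-helper-style check. Assumptions: the username arrives as $1,
# the password arrives on stdin, and RSTUDIO_PASSWORD (hypothetical name)
# holds the per-session password exported by the job.
auth_check() {
  local username="${1:-}" password=""

  # Refuse to authenticate if no session password was exported.
  [[ -n "${RSTUDIO_PASSWORD:-}" ]] || return 1

  # Read one line from stdin without echoing it.
  IFS= read -r -s password

  # Succeed only for the job owner presenting the session password.
  [[ "${username}" == "${USER:-}" && "${password}" == "${RSTUDIO_PASSWORD}" ]]
}
```

A real helper file would end with `auth_check "$@"` so that its exit status reports success or failure, and it would need to be executable and visible inside the container at the path given to --auth-pam-helper-path.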