Firstly, my apologies if I should be directly contacting VT-ARC or the Mathworks people about this – I thought surely someone else would eventually ask about this topic (given the intense interest during the tips-n-tricks call) but I’ve not seen anything here, nor in ARC’s github-issues. If there’s a better venue (e.g. discord/slack/etc), please let me know, thanks!
While trying to follow the Dev Guide from VT, I’ve made the following observations and changes.
The guide mentions modifying manifest.yml, but I wonder if it actually means form.yml, since that is where you specify your cluster and customize fields.
Similarly, although the guide does mention updating template/script.sh.erb for the path to the .sif image, I also needed to update several of the bind parameters. Beyond the obvious filesystem differences, it was not clear to me at first that $MATLAB_DIR and $TMPFS seem to be VT-site-specific environment variables. I also had to bind our newer version of readline’s ‘libhistory’.
[jason@wind] $ cd ~/ondemand/dev/bc_vt_matlab_html/template
[jason@wind] $ diff -u script.sh.erb-origBindings script.sh.erb
--- script.sh.erb-origBindings 2021-11-24 12:07:04.000000000 -0700
+++ script.sh.erb 2021-12-10 11:13:59.033674779 -0700
@@ -20,12 +20,13 @@
+echo "SUSPECTED VT-SPECIFIC VARS: MATLAB_DIR=$MATLAB_DIR ... TMPFS=$TMPFS"
export SINGULARITYENV_LD_LIBRARY_PATH=$LD_LIBRARY_PATH
export SINGULARITYENV_PATH=$PATH
singularity run --nv --writable-tmpfs \
- --bind=$MATLAB_DIR:/opt/matlab,$TMPFS:/tmp,/work/${USER},/projects \
- --bind=`pwd`/matlab.rc:/mathworks.rc,/cm,/etc/slurm/slurm.conf \
- --bind=/lib64/libhistory.so.6:/lib/x86_64-linux-gnu/libhistory.so.6 \
+ --bind=/packages/matlab/R2021b:/opt/matlab,/tmp,/projects \
+ --bind=`pwd`/matlab.rc:/mathworks.rc,/etc/slurm/slurm.conf \
+ --bind=/lib64/libhistory.so.7:/lib/x86_64-linux-gnu/libhistory.so.6 \
--bind=/usr/lib64/libmunge.so.2:/lib/x86_64-linux-gnu/libmunge.so.2,/var/run/munge \
--bind=`pwd`/entrypoint.sh:/entrypoint.sh \
/home/jason/ondemand/dev/bc_vt_matlab_html/matlab.sif bash /entrypoint.sh
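Since an unset $MATLAB_DIR or $TMPFS silently turns into a malformed bind entry, a small pre-flight check might help others adapting this script. This is only a sketch: missing_bind_sources is a helper name I made up (it is not part of the VT app), and it only understands the plain comma-separated src[:dest] form used in the bind lines above.

```shell
# Print each host-side path in a singularity --bind spec that is empty or
# does not exist on this host. Hypothetical helper, not part of the VT app.
missing_bind_sources() {
  (
    # Subshell keeps the IFS and globbing changes away from the caller.
    IFS=','
    set -f
    for entry in $1; do
      src=${entry%%:*}   # keep only the host side of src:dest
      # Report unset variables (empty source) as well as missing paths.
      [ -n "$src" ] && [ -e "$src" ] || printf '%s\n' "${src:-<empty>}"
    done
  )
}

# Example call, e.g. just before the singularity command in script.sh.erb:
#   missing_bind_sources "$MATLAB_DIR:/opt/matlab,$TMPFS:/tmp,/projects"
```

Any output at all means at least one bind source needs localizing before the container is launched.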
When I launch my dev Matlab app, it gets stuck “Starting”, and the app/script output indicates that matlab-jupyter-app cannot be found…
[jason@wind] $ cd ~/ondemand/data/sys/dashboard/batch_connect/dev/bc_vt_matlab_html/output/f36b9577-ac0f-4894-9adb-b3d318b6405a
[jason@wind] $ cat output.log
starting before
No modules loaded
Script starting...
Waiting for Matlab to open port 41492...
/home/jason/ondemand/data/sys/dashboard/batch_connect/dev/bc_vt_matlab_html/output/f36b9577-ac0f-4894-9adb-b3d318b6405a
module works
starting singularity
starting Matlab on cn31 using 41492
SUSPECTED VT-SPECIFIC VARS: MATLAB_DIR= ... TMPFS=
retrieved ENV variables from matlab.rc
MWI_APP_PORT=41492
MWI_BASE_URL=/matlab
TMPDIR=/tmp
MWI_EXT_URL=ood.arc.vt.edu
MLM_LICENSE_FILE=/opt/matlab/licenses/network.lic
To use the web-desktop: http://ood.arc.vt.edu/matlab/index.html
starting web matlab
/entrypoint.sh: line 30: matlab-jupyter-app: command not found
Though my app session is stuck starting up, I can stay in the ondemand-generated working directory to leverage the resources it already prepared. When I interactively shell into the container, matlab-jupyter-app is clearly on my $PATH:
[jason@wind] $ singularity shell --nv --writable-tmpfs \
> --bind=/packages/matlab/R2021b:/opt/matlab,/tmp,/projects \
> --bind=`pwd`/matlab.rc:/mathworks.rc,/etc/slurm/slurm.conf \
> --bind=/lib64/libhistory.so.7:/lib/x86_64-linux-gnu/libhistory.so.6 \
> --bind=/usr/lib64/libmunge.so.2:/lib/x86_64-linux-gnu/libmunge.so.2,/var/run/munge \
> --bind=`pwd`/entrypoint.sh:/entrypoint.sh \
> /home/jason/ondemand/dev/bc_vt_matlab_html/matlab.sif
INFO: Could not find any nv files on this host!
Singularity> which matlab-jupyter-app
/usr/local/bin/matlab-jupyter-app
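One difference between the two invocations that I have not confirmed as the cause: script.sh.erb exports SINGULARITYENV_PATH=$PATH, which hands the host job's PATH to the container, whereas my interactive shell above did not set it. If the job's PATH on the compute node lacks /usr/local/bin, that would explain the "command not found". A minimal check, assuming that hypothesis (path_has_dir is a name I made up for illustration):

```shell
# Return 0 if directory $1 is an element of the colon-separated list $2.
# Hypothetical helper for illustration only.
path_has_dir() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Possible use in script.sh.erb before the singularity call:
#   path_has_dir /usr/local/bin "$PATH" || echo "job PATH lacks /usr/local/bin"
```

If the check does report a missing directory, Singularity's SINGULARITYENV_APPEND_PATH variable might be a gentler alternative to overriding PATH outright, though I have not tried it here.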
Now, obviously there are some variables in template/before.sh.erb (that define the ephemeral matlab.rc) that everyone should be localizing; but more to the point, I can modify template/entrypoint.sh to provide the full path to matlab-jupyter-app to move things along.
$ diff -u entrypoint.sh-orig entrypoint.sh
--- entrypoint.sh-orig 2021-12-10 12:50:53.352527908 -0700
+++ entrypoint.sh 2021-12-10 12:51:36.872916964 -0700
@@ -27,4 +27,4 @@
echo ""
echo starting web matlab
-matlab-jupyter-app
+/usr/local/bin/matlab-jupyter-app
Unfortunately, although that definitely got me closer and I can launch the app, I’m still hitting an error and I’m not sure how to go about diagnosing/debugging it:
Now, when I view my app sessions, this Matlab one still appears active/available, and when I click the blue “Connect” button, I get a similar-looking page that makes empty promises about error logs. Clicking the “Start MATLAB Session” button there starts the cycle over, back at the “Starting” screenshot.
The output.log file doesn’t show anything enlightening, and there doesn’t appear to be anything remotely special in my /var/log/ondemand-nginx logs.
$ tail -f output.log
INFO:MATLABProxyApp:MATLAB_LOG_DIR:/tmp/MWI/31511
INFO:MATLABProxyApp:MATLAB_READY_FILE:/tmp/MWI/31511/connector.securePort
INFO:MATLABProxyApp:Starting MATLAB on port 31511
INFO:MATLABProxyApp:Installing handler for signal: 1
INFO:MATLABProxyApp:Installing handler for signal: 2
INFO:MATLABProxyApp:Installing handler for signal: 3
INFO:MATLABProxyApp:Installing handler for signal: 15
MATLAB is selecting SOFTWARE OPENGL rendering.
Discovered Matlab listening on port 18154!
Generating connection YAML file...
< M A T L A B (R) >
Copyright 1984-2021 The MathWorks, Inc.
R2021b (9.11.0.1769968) 64-bit (glnxa64)
September 17, 2021
INFO:MATLABProxyApp:Waiting for MATLAB to exit...
INFO:MATLABProxyApp:MATLAB has exited with errorcode: -9
ERROR:MATLABProxyApp:MATLAB returned an unexpected error. For more details, see the log below.
INFO:MATLABProxyApp:Cleaning up matlab_ready_file.../tmp/MWI/31511/connector.securePort
INFO:MATLABProxyApp:MATLAB_LOG_DIR:/tmp/MWI/31511
INFO:MATLABProxyApp:MATLAB_READY_FILE:/tmp/MWI/31511/connector.securePort
INFO:MATLABProxyApp:Starting MATLAB on port 31511
MATLAB is selecting SOFTWARE OPENGL rendering.
< M A T L A B (R) >
Copyright 1984-2021 The MathWorks, Inc.
R2021b (9.11.0.1769968) 64-bit (glnxa64)
September 17, 2021
INFO:MATLABProxyApp:Waiting for MATLAB to exit...
INFO:MATLABProxyApp:MATLAB has exited with errorcode: -9
ERROR:MATLABProxyApp:MATLAB returned an unexpected error. For more details, see the log below.
Note that this is the entirety of the file, and “the log below” is evidently empty.
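The one clue I can read off the log is the error code itself. Assuming the proxy reports Python-style return codes (where -N means the child was terminated by signal N), -9 would be SIGKILL, which often points at something outside MATLAB killing it, such as a memory limit. A small sketch for decoding such codes (signal_for_exit is a made-up helper name):

```shell
# Translate a Python-style negative return code into a signal name.
# Assumption: -N means "terminated by signal N". Hypothetical helper.
signal_for_exit() {
  code=$1
  case "$code" in
    -*) kill -l "${code#-}" ;;                        # e.g. -9 -> KILL
    *)  echo "exit status $code (not a signal)" ;;
  esac
}

# If this is the scheduler or the OOM killer, Slurm accounting may show it
# (replace <jobid> with the session's job id):
#   sacct -j <jobid> --format=JobID,State,MaxRSS,ReqMem
```

That is only a guess at where to look next; I have not yet confirmed what is sending the signal.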
Suggestions?
Thank you!
Jason Buechler
NAU Monsoon HPC