JupyterLab, python modules, PATHs

The second question I’ve been asked is about Jupyter being turned into JupyterLab (the first was “how do I upload files”).

I found this question/answer, quickly installed jupyterlab in the appropriate module, implemented the check box on the form, and there it is. Fantastic - thank you.

My question is about kernels. I understand the end goal, but I don’t know enough about JupyterLab and kernels, and I’m not confident in my reading of the Ruby in template/script.sh.erb.

From what I can tell, the process is:

  • get JupyterLab working (tick)
  • prepare various other installations in their own modules (R, Scala, Julia, Python etc)
  • load the modules within the template/script.sh.erb?

My primary concern is conflicting python modules (for instance). Am I correct in thinking that JupyterLab will remain in the env it is launched in (e.g. /usr/local/modules/ood-jupyter), but each of the kernels will launch its own self-contained env (e.g. /usr/local/modules/python27) when it is chosen and launched?

i.e., lines 84–170ish are building menu items, which are triggered when they are chosen?

Yes, this works as you’d expect, mostly through Jupyter’s magic, not our own.

How these kernel files get generated is a bit more complicated, but basically it works like this:

We know all the kernels we want up front; that’s that long list of objects called kernels. (Our Julia installations are a little different, mixing and matching user installations and system installations, so I’d focus on how we install the Python kernels.)

We want to write all these files out so Jupyter can pick them up. Setting JUPYTER_PATH="${PWD}/share/jupyter" is how Jupyter knows about them.
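As a concrete sketch (the kernel directory name here just mirrors the example below), the layout and environment variable might be set up in the job script like this:

```shell
# Sketch: kernel specs live under the job's working directory.
# Jupyter searches $JUPYTER_PATH/kernels/*/kernel.json for kernel specs.
mkdir -p "${PWD}/share/jupyter/kernels/sys_python27"
export JUPYTER_PATH="${PWD}/share/jupyter"
```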

Here’s what one of those files looks like when it gets written out. The only really dynamic thing is the first argument, the wrapper script location, which is under the job’s current working directory. We write this wrapper script in the main script.

This file is share/jupyter/kernels/sys_python27/kernel.json:

{
  "display_name": "Python 2.7 [python/2.7 ]",
  "language": "python",
  "argv": [
    "/users/PZS0714/johrstrom/ondemand/data/sys/dashboard/batch_connect/dev/jupyter/output/b11f0145-33c9-4b86-ab18-1fe0018c8b4f/launch_wrapper.sh",
    "python",
    "-m",
    "ipykernel",
    "-f",
    "{connection_file}"
  ],
  "env": {
    "MODULES": "xalt/latest python/2.7 "
  }
}

So you boot Jupyter with, say, python 3.5, but in the launch_wrapper.sh we can load python 2.7 given the environment variable MODULES. (My off-the-top-of-my-head guess is that this works because the kernel is a different process tree, and the two communicate through some file or unix socket.)
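For reference, a minimal wrapper along those lines might look like the following. This is a sketch, not the app’s actual script: it assumes an Lmod/Environment Modules setup provides the module command on the compute node, and it is written out by the main job script.

```shell
# Sketch: write a minimal launch_wrapper.sh into the job's working directory.
cat > launch_wrapper.sh <<'EOF'
#!/usr/bin/env bash
# Load the per-kernel modules advertised in kernel.json's "env" block,
# e.g. MODULES="xalt/latest python/2.7" (assumes Lmod/Environment Modules).
[ -n "${MODULES}" ] && module load ${MODULES}
# Send kernel output to a log in the job's working directory, then
# replace this process with the real kernel command (python -m ipykernel ...).
exec "$@" >> "${PWD}/launch_wrapper.log" 2>&1
EOF
chmod +x launch_wrapper.sh
```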

So that’s the end goal, to write out these kernel files and set JUPYTER_PATH to be able to find them.

You could probably also do all this in bash and jq. Again, the only real variables are getting the kernels to point to $PWD/launch_wrapper.sh (PWD being the job’s working directory) and having the wrapper script redirect output into a log file in the same PWD. Everything else is known up front (unless you have to start looking for user-installed kernels, which is highly variable).
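A minimal bash + jq sketch of writing one of those kernel files (the kernel name and module list are copied from the example above; jq takes care of the JSON quoting):

```shell
# Sketch: generate share/jupyter/kernels/sys_python27/kernel.json with jq.
mkdir -p share/jupyter/kernels/sys_python27

jq -n \
  --arg wrapper "${PWD}/launch_wrapper.sh" \
  --arg modules "xalt/latest python/2.7" \
  '{
     display_name: "Python 2.7 [python/2.7]",
     language: "python",
     argv: [$wrapper, "python", "-m", "ipykernel", "-f", "{connection_file}"],
     env: { MODULES: $modules }
   }' > share/jupyter/kernels/sys_python27/kernel.json
```

To cover a whole list of kernels, you would loop over name/module pairs and run the same jq invocation once per kernel.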

Hope that helps!
