Add an option to source a setup script before starting Jupyter

We have a number of users who work on multiple projects, each of which uses a different environment, and in at least some cases the environments would conflict. The users and we think it would be convenient to handle this by adding an option for specifying a file that would be sourced as part of script.sh.erb. We are also looking at the simpler task of letting users specify a list of modules to be loaded before the main program starts. We are using Jupyter as our test case, but we think it might be worth extending this so that it could be part of other applications.

Our thoughts about error checking are that we should first check whether the file exists and then decide what to do if it does not. It seems straightforward to put that test into script.sh.erb, but I was wondering whether the test could instead go somewhere that lets it throw an error prior to job submission and return the user to the application form.

Any suggestions on this? Please note, we are not Ruby programmers, so please be merciful in your assumptions when answering. :wink:

Sure, I have this option for you here, but I will say maybe you should explore/encourage more reproducible options, like setting the environment up in the notebook itself. If I have notebooks that rely on highly specialized environments, I can’t really share them. Or worse, if that environment itself or the setup gets corrupted, I’d have to figure it out all over again. Having everything nicely contained in the notebook itself is, I think, what I’d be after if I were a regular Jupyter user.

That said, I can still give you this option.

Let’s say you add a form field you call sourced_file. In your submit.yml.erb you can add this check that’ll throw an error if you’ve specified a file to source but it doesn’t exist. So users who leave this empty will continue as normal, and users who try to specify it will get instant feedback as to whether or not it’ll work.

<%-
  if sourced_file.present?
    raise "You're trying to source #{sourced_file}, but it does not exist" unless Pathname(sourced_file.to_s).file?
  end
-%>

Note that this check happens on the web node, so it won’t be able to find anything on a remote host (though you could add an ssh command or something similar if that’s really what you want; we’ve had trouble with ssh calls in the form/submit files, with timeouts and all).
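If it helps to experiment with the check outside of Open OnDemand, here is a minimal standalone sketch of the same idea. The helper name setup_file_error is hypothetical, and the extra readability test is my own assumption about what might be useful, not part of the suggestion above. Because present? comes from ActiveSupport rather than core Ruby, the sketch tests for a blank value with to_s.strip.empty? instead.

```ruby
require "pathname"
require "tempfile"

# Hypothetical helper: returns nil when the value is acceptable,
# otherwise a message that could be passed to `raise`.
def setup_file_error(value)
  path = value.to_s.strip
  return nil if path.empty? # field left blank: nothing to check

  file = Pathname.new(path)
  return "#{path} does not exist or is not a regular file" unless file.file?
  return "#{path} exists but is not readable" unless file.readable?

  nil
end

# Demonstration, with a temporary file standing in for a user's setup script.
Tempfile.create("setup") do |f|
  p setup_file_error("")                    # blank field
  p setup_file_error(f.path)                # existing file
  p setup_file_error("#{f.path}.missing")   # path that does not exist
end
```

As with the original check, this only sees the filesystem of the machine it runs on.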

Thanks, @jeff.ohrstrom !

The Notebooks are typically only a small part of the overall work, I think. Most of the work is done outside the Notebook and has already been set up using a combination of Lmod and some additional variable exporting for software compiled by their group.

They run 100s of batch jobs from these environments, then do visual checking things in the Notebook.

I think trying to replicate that inside the Notebook would be harder, and actually more likely to miss changes from the external setup than passing the external setup into the Notebook.

We have had little success getting the started kernel in the Notebook to register changes like additional library paths and the like, but that may be because we aren’t really Notebook users or programmers ourselves.

I will give your suggestion a try! It looks at first blush to be exactly what we want: Runs on the web server, raises an error, and the instructions are simple enough that even I might get them right. :slight_smile:

Very cool! Let us know how it works out.

Here’s the bit you’d add to the script.sh.erb

<%- if context.sourced_file.present? -%>
source "<%= context.sourced_file %>"
<%- end -%>

Adding the check to the form seems to have worked splendidly.

The addition to script.sh.erb seems to be lacking a certain something…

(Pardon me for slightly changing the variable name.)

At the top of template/script.sh.erb, I had to put something in to define the setup_file variable to prevent an undefined variable error, but I think I have not got it right. I used

<%-
  cuda = (context.node_type == "gpu") ? context.cuda_version : ""
  wrapper = session.staged_root.join("launch_wrapper.sh")
  wrapper_log = session.staged_root.join("launch_wrapper.log")
  setup_file = context.setup_file
  kernels = {}
-%>

thinking that would find the value from the form.

I added this further down just before the jupyter command is invoked

# Source the setup file
<%- if setup_file.present? -%>
echo "The setup file is <%- setup_file -%>"
source "<% setup_file %>"
<%- end -%>

But, in the output file that prints

+ echo 'Setup file is:  '
Setup file is:  
+ source ''

so I clearly failed to get the right incantation at the top, or somehow otherwise mangled it.

Please help me get my mind right. :wink:

Thanks!

My apologies. It was my example that was wrong. I’ve edited that comment. (You need to use <%= ... %> to get the value written into the output as a string.)
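For anyone who wants to see the difference between the two tags for themselves, here is a small sketch using Ruby's standard erb library outside of Open OnDemand. The path is a made-up placeholder standing in for the form value.

```ruby
require "erb"

setup_file = "/path/to/setup.sh" # hypothetical value standing in for the form field

# <% ... %> runs the Ruby code but writes nothing into the output.
silent = ERB.new('source "<% setup_file %>"', trim_mode: "-").result(binding)

# <%= ... %> runs the code and inserts the result as text.
printed = ERB.new('source "<%= setup_file %>"', trim_mode: "-").result(binding)

puts silent   # source ""
puts printed  # source "/path/to/setup.sh"
```

The first form is what produced the empty source '' line in the job output above.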

That seems to have got it.

Thanks for the patience and the pointers!

For the convenience of others, here is a summary. Add the following to the respective sections of the form.yml.erb file (. . . indicates other entries that may appear in the same section):

form:
  . . .
  - setup_file
  . . .
attributes:
  . . .
  setup_file:
    label: "Source this setup file"
  . . .

The following block is added to submit.yml.erb before the ---

<%-
  if setup_file.present?
    raise "You're trying to source #{setup_file}, but it does not exist" unless Pathname(setup_file.to_s).file?
  end
-%>

This block needed to be added to the top of template/script.sh.erb.

<%-
  setup_file = context.setup_file
-%>

Then, in the body of template/script.sh.erb, this got added after any module purge commands, since a purge that ran later would undo module load commands in the setup file.

# Source the setup file
<%- if setup_file.present? -%>
echo "Setup file used was:  <%= setup_file %>"
source "<%= setup_file %>"
<%- else -%>
echo "No setup file!  How did that happen?"
<%- end -%>
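To preview what this fragment renders to without submitting a job, something like the following sketch can be run with plain Ruby. It is only an approximation under stated assumptions: the Context struct and the preview method are hypothetical stand-ins for the context object that Open OnDemand provides, and present? is replaced by a plain-Ruby emptiness test because ActiveSupport is not loaded here.

```ruby
require "erb"

# Hypothetical stand-in for the context object the real template receives.
Context = Struct.new(:setup_file)

# The fragment from the summary, with `.present?` swapped for a plain-Ruby test.
TEMPLATE = <<~'ERB_TEXT'
  <%-
    setup_file = context.setup_file
  -%>
  # Source the setup file
  <%- if !setup_file.to_s.empty? -%>
  echo "Setup file used was:  <%= setup_file %>"
  source "<%= setup_file %>"
  <%- else -%>
  echo "No setup file!  How did that happen?"
  <%- end -%>
ERB_TEXT

# Render the fragment for a given form value and return the resulting shell text.
def preview(value)
  context = Context.new(value)
  ERB.new(TEMPLATE, trim_mode: "-").result(binding)
end

puts preview("/path/to/setup.sh") # renders the echo and source lines
puts preview("")                  # renders the else branch
```

Rendering both cases side by side makes it easy to confirm that a blank form field falls through to the else branch.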

Thanks, again!