Jupyter notebook app with multi-node session and distributed framework

Hello,

Can you please tell me if the Jupyter Notebook / JupyterHub app supports multi-node sessions and distributed frameworks for running jobs on HPC? Thank you!

Hi and welcome!

I don’t know if Jupyter Notebook can. Of course, JupyterHub can but that’s almost a separate product. So much so that if you have JupyterHub already, you kind of don’t want Jupyter + OnDemand integration - JupyterHub just has a lot more to offer and we don’t integrate with it directly.

That said - JupyterLab/Notebook through OnDemand are for interactive jobs. I.e., jobs you’re actively interacting with. We often tell customers to run these programs in batch jobs if they require a long time to run (more than ~6 hours or so). Seems like that’d be the same guidance we’d give our own customers if they asked this question - instead of getting an interactive job with multiple machines, submit the work as a batch job that you don’t actively interact with.

Hi Jeff,

Thank you for your reply.
Well, we are trying to use Open OnDemand as a portal for all users, so it makes sense to have JupyterHub on OOD. So we are using a Singularity image to run JupyterHub on Open OnDemand.

Moreover, aren’t JupyterLab/Notebook on OOD already launched as batch jobs through Open OnDemand? Please find attached a snippet of submit.yml. So, does that mean we can distribute the job across other nodes?

Thank you for your guidance and assistance.

The TL;DR here, I think, is (a) I don’t know how to deploy JupyterHub, so I can’t really help there, and (b) I don’t know how to get JupyterLab/Notebook to work across machines, so while you can schedule many nodes, I don’t know how to tell Jupyter to do work across those nodes.

Technically, yes, but you have to consider the users’ mental model of what’s going on (and their potentially limited knowledge of HPC). Yes, this is a batch job, but you’re interacting with it, as opposed to just submitting it and waiting for it to complete. We get customer tickets where a JupyterLab job has been running for days and the users can no longer connect to it because whatever processing it’s doing is taking too long for the server to actually respond to requests. So we tell those customers to submit the work as a batch job, where they can’t interact with it and have to wait for the job to complete to see the results.

I don’t know how to do this, so I can’t really provide guidance on it. From what I understand (admittedly not a lot), JupyterHub should be a persistent process. That is, you can always connect to it regardless of whether you have jobs running or not. It seems to me that JupyterHub would have to be deployed as a Passenger application that runs on the web node, not as an interactive application that runs on compute nodes. But again, I haven’t really looked into it in any depth - this is just an off-the-top guess.

You can schedule more nodes - the question is how do you boot it all up and make connections between the nodes? Is this something JupyterLab/Notebook can even do? I can’t seem to find any clear documentation on running a notebook across nodes without JupyterHub.
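For what it’s worth, here is a minimal sketch of what requesting multiple nodes from an app’s submit.yml could look like, assuming the Slurm adapter (the options and values are illustrative only, not taken from your attached snippet). The `native` entries are passed straight through to the scheduler, so this only reserves the nodes - it doesn’t make the notebook server itself do anything across them:

```yaml
# Illustrative submit.yml for a batch_connect app (Slurm adapter assumed).
# "native" arguments are handed directly to sbatch, so this reserves two
# nodes, but Jupyter will still only run on the node the job starts on
# unless something inside the notebook distributes the work.
---
batch_connect:
  template: basic
script:
  native:
    - "--nodes=2"
    - "--ntasks-per-node=4"
```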