Serious Jupyter problem in OnDemand

The latest 7.5.x versions of Jupyter Notebook have a very peculiar problem in OnDemand that makes them unusable for every user. To the users it simply appears as "nothing works".

After a lot of sleuthing I pinpointed the problem: the server cannot read any file outside the HOME directory itself, and no subdirectory of HOME is accessible either. The server appears to return 404, but not in a clear way (that may be a Jupyter thing that prefers to keep spinning the irritating wheel rather than giving up). See screenshot:

This applies across the board, so I can't even tell users "just keep your notebooks in the home directory", because as soon as they try to import any module they need subdirectories and the module files within them (interestingly, modules that are part of the virtual environment the kernel is based on, which live in some subdirectory, load just fine).

The problem does not affect version 7.2.1 of Jupyter, so I thought it was a bug on their side and was getting ready to submit a bug report there, but before doing so I tried to reproduce the issue without OnDemand. Easy peasy: take the config file, password, and script out of ondemand/data/sys/dashboard/batch_connect/sys/jupyter753/output/97b5810d-6a4b-4e41-aca6-9d99619d3a55/ and run them in an XFCE session, using the same environment, same everything. Lo and behold, that works just fine.

So it must be some interaction between Jupyter and OnDemand. I suspected tornado (since it affects file serving), but both the v7.2.1 and the v7.5.x environments use tornado v6.5.4, so it must be something else. The output.log provides no clue, just the usual messages in both the working and non-working instances. Nothing in /var/log/ondemand-nginx/$USER/ either.

Do you have any suggestions about what may be causing this issue, or at least what to look for?

Small detail which, I realize now, is not visible in the screenshot: the red "Failed to load resource" error reports a URL such as https://ood.example.com/node/node11.cluster/57395/api/collaboration/session/test%2FUntitled.ipynb?1770399298924

and if I navigate to that URL I do indeed get a 404. On the other hand, if I replace %2F with / I get

{
  "message": "Method Not Allowed",
  "reason": null
}

which I get for any URL of that form, whether it points to an existing file or not.
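For what it's worth, the 404-vs-405 difference matches how percent-encoding interacts with path routing: test%2FUntitled.ipynb is a single path segment, while test/Untitled.ipynb is two, so they hit different routes. A minimal illustration in Python (this just demonstrates the encoding, it doesn't query anything):

```python
from urllib.parse import quote, unquote

# The collaboration endpoint encodes the file path as one path segment:
path = "test/Untitled.ipynb"
encoded = quote(path, safe="")   # safe="" also percent-encodes "/"
print(encoded)                   # test%2FUntitled.ipynb

# A proxy that decodes %2F before forwarding turns the single segment
# back into two, so the backend route no longer matches:
print(unquote(encoded))          # test/Untitled.ipynb
```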

Hey Davide, I apologize for the delay. I wonder if this could be a bug/mismatch in the interaction between Jupyter Notebook and JupyterLab? You mentioned the notebook version, but JupyterLab may require an update to move from 7.2 to 7.5. I'm also curious whether similar links are generated when you run outside OnDemand, and what those look like.

With the version information, I can try and replicate at OSC and hopefully have some more precise suggestions for you.

Thanks @bsingleton for the response, and no worries about the delay!

The kernel (probably irrelevant?) has these versions

$ pip freeze | grep jupyter
jupyter-events==0.12.0
jupyter-lsp==2.3.0
jupyter_client==8.8.0
jupyter_core==5.9.1
jupyter_server==2.17.0
jupyter_server_terminals==0.5.4
jupyterlab==4.5.3
jupyterlab_pygments==0.3.0
jupyterlab_server==2.28.0
$ pip freeze | grep notebook
notebook_shim==0.2.4

whereas the python environment has these

$ pip freeze | grep jupyter
jupyter-collaboration==4.2.1
jupyter-collaboration-ui==2.2.1
jupyter-docprovider==2.2.1
jupyter-events==0.12.0
jupyter-lsp==2.3.0
jupyter-server-ydoc==2.2.1
jupyter-ydoc==3.3.6
jupyter_client==8.8.0
jupyter_core==5.9.1
jupyter_server==2.17.0
jupyter_server_fileid==0.9.3
jupyter_server_proxy==4.4.0
jupyter_server_terminals==0.5.4
jupyterlab==4.5.3
jupyterlab_pygments==0.3.0
jupyterlab_server==2.28.0
jupyterlab_widgets==3.0.16
$ pip freeze | grep notebook
notebook==7.5.3
notebook_shim==0.2.4

and they were both created from scratch right before I ran the experiment, using pip: pip install ipykernel for the kernel and pip install notebook ipympl matplotlib dask-labextension jupyter-collaboration for the python environment. So I think the versions are correct; do you agree?

Also curious if similar links are generated when you run outside ondemand, and what those look like.

Hard to say. I have no clue how those URLs are generated, or how to check what a correctly working one looks like (i.e. how can I check the URL when I get no "Failed to load resource" error?). Guessing at http://localhost:8888/api/collaboration/session/test%2FUntitled.ipynb, I get the same "Method Not Allowed" error (with or without the ? argument).

Thanks

Thanks so much. I was able to replicate at OSC, but only when I explicitly loaded jupyter-collaboration. It changes the URL for file requests from /api/contents/PATH to /api/collaboration/session/PATH, which gives the 404. As a quick fix you could disable collaboration (by not including the jupyter-collaboration extension), and this issue shouldn't happen.
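If uninstalling the package is inconvenient, you should also be able to turn off the server extensions it registers via jupyter_server_config.py. This is only a sketch: the extension names below are taken from the pip listing in this thread, so verify them against the output of jupyter server extension list in your environment.

```python
# jupyter_server_config.py (sketch, names from the pip freeze above):
# disable the RTC server extensions that jupyter-collaboration installs
c.ServerApp.jpserver_extensions = {
    "jupyter_server_ydoc": False,
    "jupyter_server_fileid": False,
}
```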

I still need to look into this more, but it seems strange that this collaboration URL is not used when pulling directly from your HOME. The request also changes from a GET (which works) to a PUT (which fails).

As for why the URL fails, it does seem like Apache introduces some strangeness in its handling of the encoded slashes; that aspect looks most similar to Strange issue with RTC in JupyterHub behind Apache HTTP reverse proxy - JupyterHub - Jupyter Community Forum. It explains why you can only load files within your HOME, since those can be referenced without slashes. Why the API endpoint changes, however, still mystifies me.
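For reference, the workaround discussed in that JupyterHub thread is to make Apache pass encoded slashes through untouched. OnDemand generates its own proxy configuration, so the snippet below is only an illustration of the relevant directives, not something to paste verbatim (the path and backend address are made up):

```apache
# Illustrative only: keep %2F intact instead of decoding it before routing
AllowEncodedSlashes NoDecode
# ...and forward the raw, uncanonicalized URL to the backend
ProxyPass "/node" "http://localhost:8080/node" nocanon
```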


Thank you so much!!!
I thought I had tried that, but maybe I was not diligent enough in my testing.

In fact, the only reason I had jupyter-collaboration in the environment was the hope of being able to reconnect to a running notebook, a feature my users have been clamoring for, for I don't know how long… But that does not work anyway, as described at Restoring computation output after disconnect in Notebook · Issue #641 · jupyter/notebook · GitHub, so removing it is not a big deal for me. And indeed, after removing it, everything is back to normal.

As far as I’m concerned, we may chalk this up as one of the “life is too short for X” things, but if you do end up investigating further, I’ll be curious to know whether you can fix it.

For my needs, I’ll later try jupyverse and/or the native Jupyter solution mentioned at the end of the GitHub ticket above.

Thanks again

EDIT: reviewing my notes and redoing my tests, I noticed that removing jupyter-collaboration fixes the issue for 7.5.3 but not for 7.5.2. It may be that my environment was messed up, and 7.5.3 is good enough for me that I don’t care beyond blocking the other version for my users, but I’m leaving this note here in case others keep banging their head on this one (and to close on my top comment: I did try removing it, but only on 7.5.2).

Unfortunately, even without collaboration and on 7.5.3, the problem reoccurs. Back to the drawing board :frowning:

I am still able to get it working without collaboration, so I am guessing there is just something funky in your environment. Here are some dumps I ran in my notebook for you to compare against:

!jupyter --version


Selected Jupyter core packages...
IPython          : 8.18.1
ipykernel        : 6.31.0
ipywidgets       : not installed
jupyter_client   : 8.6.3
jupyter_core     : 5.8.1
jupyter_server   : 2.17.0
jupyterlab       : 4.5.3
nbclient         : 0.10.2
nbconvert        : 7.17.0
nbformat         : 5.10.4
notebook         : 7.5.3
qtconsole        : not installed
traitlets        : 5.14.3
!jupyter server extension list


Config dir: /users/PZS0714/bsingleton/.jupyter

Config dir: /users/PZS0714/bsingleton/.venv/etc/jupyter
    jupyter_lsp enabled
    - Validating jupyter_lsp...
      jupyter_lsp 2.3.0 OK
    jupyter_server_terminals enabled
    - Validating jupyter_server_terminals...
      jupyter_server_terminals 0.5.4 OK
    jupyterlab enabled
    - Validating jupyterlab...
Extension package jupyterlab took 0.1098s to import
      jupyterlab 4.5.3 OK
    notebook enabled
    - Validating notebook...
      notebook 7.5.3 OK
    notebook_shim enabled
    - Validating notebook_shim...
      notebook_shim  OK

Config dir: /usr/local/etc/jupyter
!pip freeze


anyio==4.12.1
argon2-cffi==25.1.0
argon2-cffi-bindings==25.1.0
arrow==1.4.0
asttokens==3.0.1
async-lru==2.0.5
attrs==25.4.0
babel==2.18.0
beautifulsoup4==4.14.3
bleach==6.2.0
certifi==2026.1.4
cffi==2.0.0
charset-normalizer==3.4.4
comm==0.2.3
debugpy==1.8.20
decorator==5.2.1
defusedxml==0.7.1
exceptiongroup==1.3.1
executing==2.2.1
fastjsonschema==2.21.2
fqdn==1.5.1
h11==0.16.0
httpcore==1.0.9
httpx==0.28.1
idna==3.11
importlib_metadata==8.7.1
ipykernel==6.31.0
ipython==8.18.1
isoduration==20.11.0
jedi==0.19.2
Jinja2==3.1.6
json5==0.13.0
jsonpointer==3.0.0
jsonschema==4.25.1
jsonschema-specifications==2025.9.1
jupyter-events==0.12.0
jupyter-lsp==2.3.0
jupyter_client==8.6.3
jupyter_core==5.8.1
jupyter_server==2.17.0
jupyter_server_terminals==0.5.4
jupyterlab==4.5.3
jupyterlab_pygments==0.3.0
jupyterlab_server==2.28.0
lark==1.3.1
MarkupSafe==3.0.3
matplotlib-inline==0.2.1
mistune==3.2.0
nbclient==0.10.2
nbconvert==7.17.0
nbformat==5.10.4
nest-asyncio==1.6.0
notebook==7.5.3
notebook_shim==0.2.4
overrides==7.7.0
packaging==26.0
pandocfilters==1.5.1
parso==0.8.6
pexpect==4.9.0
platformdirs==4.4.0
prometheus_client==0.24.1
prompt_toolkit==3.0.52
psutil==7.2.2
ptyprocess==0.7.0
pure_eval==0.2.3
pycparser==2.23
Pygments==2.19.2
python-dateutil==2.9.0.post0
python-json-logger==4.0.0
PyYAML==6.0.3
pyzmq==27.1.0
referencing==0.36.2
requests==2.32.5
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rfc3987-syntax==1.1.0
rpds-py==0.27.1
Send2Trash==2.1.0
six==1.17.0
soupsieve==2.8.3
stack-data==0.6.3
terminado==0.18.1
tinycss2==1.4.0
tomli==2.4.0
tornado==6.5.4
traitlets==5.14.3
typing_extensions==4.15.0
tzdata==2025.3
uri-template==1.3.0
urllib3==2.6.3
wcwidth==0.6.0
webcolors==24.11.1
webencodings==0.5.1
websocket-client==1.9.0
zipp==3.23.0

Thanks Braeden, I appreciate you taking the time to double check this for me.

I checked what I had and compared it with yours. Of course there were many differences, but a few seemed the most likely culprits. As we discussed, jupyter-collaboration (which I no longer had installed) is the most important one: with it I could never get things to work, whereas without it the problem occurred about half the time (statistics collected from a handful of repeated tests I ran just now).

So I wiped my old environment and created a new one, which differs from the previous one as follows:

Selected Jupyter core packages...
IPython          : 9.10.0
-ipykernel        : 7.1.0
+ipykernel        : 7.2.0
ipywidgets       : 8.1.8
jupyter_client   : 8.8.0
jupyter_core     : 5.9.1
jupyter_server   : 2.17.0
-jupyterlab       : 4.5.3
+jupyterlab       : 4.5.4
nbclient         : 0.10.4
nbconvert        : 7.17.0
nbformat         : 5.10.4
notebook         : 7.5.3
qtconsole        : not installed
traitlets        : 5.14.3


Config dir: /home/act/.jupyter

Config dir: /home/sw/other/public_venvs/jupyter-v7.5.3/etc/jupyter
    jupyter_lsp enabled
    - Validating jupyter_lsp...
      jupyter_lsp 2.3.0 OK
    jupyter_server_terminals enabled
    - Validating jupyter_server_terminals...
      jupyter_server_terminals 0.5.4 OK
    jupyterlab enabled
    - Validating jupyterlab...
-      jupyterlab 4.5.3 OK
+      jupyterlab 4.5.4 OK
    notebook enabled
    - Validating notebook...
      notebook 7.5.3 OK
    notebook_shim enabled
    - Validating notebook_shim...
      notebook_shim  OK

Config dir: /usr/local/etc/jupyter

But most importantly, the new one lacks the following (which I removed from the diff above to simplify reading):

    dask_labextension enabled
    - Validating dask_labextension...
      dask_labextension 7.0.0 OK
    jupyter_server_proxy enabled
    - Validating jupyter_server_proxy...
      jupyter_server_proxy 4.4.0 OK
    jupyter_server_ydoc enabled
    - Validating jupyter_server_ydoc...
      jupyter_server_ydoc 2.2.1 OK
    jupyter_server_fileid enabled
    - Validating jupyter_server_fileid...
      jupyter_server_fileid 0.9.3 OK

I installed dask_labextension myself, but I think collaboration pulled in the others. I don’t think dask_labextension could cause this problem, but one never knows, so I did not reinstall it for now. The others, on the other hand, all looked suspicious given the behavior we discussed.

Without these (and with the slightly different versions mentioned above), I have been unable to reproduce the problem in a dozen tries. I will now unleash my users on it and see whether, at scale, something reoccurs, but perhaps you might want to check whether server_proxy or fileid can cause the issue for you, and if so warn others about it in the docs.
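If anyone else wants to bisect this, a quick way to see which of the suspect packages are present in a given environment is to probe for them from the Python that serves Jupyter. This is just a convenience sketch (the list of suspects is the one from this thread):

```python
# Check which of the suspect packages are importable in this environment
import importlib.util

suspects = [
    "dask_labextension",
    "jupyter_server_proxy",
    "jupyter_server_ydoc",
    "jupyter_server_fileid",
]
status = {name: importlib.util.find_spec(name) is not None for name in suspects}
for name, present in status.items():
    print(f"{name}: {'installed' if present else 'absent'}")
```

Then remove them one at a time and retest, which is how I narrowed things down here.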

I will eventually need to reinstall dask_labextension, so you can hold off testing it yourself if you don’t have the time/stamina: I will report back here with what I find. But I prefer taking the first step without it, since debugging an erratic problem is already hard even with fewer options to consider.

Thanks again