Make sure that the compute nodes are syncing their writes properly. connection.yml is generated directly on the compute node. If writes from the compute nodes aren’t synced to your file system quickly enough, that would explain the stall: the login node is waiting for connection.yml to exist.
The home directory works much like IPC here. For example, the login node is looking for /jupyter_test/output/694741ad-a664-4eee-924a-c0db3dd9961b/connection.yml
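
One way to check whether slow syncing is the cause is to time how long the file takes to become visible from the login node. The following is only a rough sketch: wait_for_file is a hypothetical helper written for this post (not part of any library), and the timeout and polling interval are arbitrary defaults you would tune for your cluster.

```python
import os
import time


def wait_for_file(path, timeout=60.0, interval=0.5):
    """Poll until `path` exists.

    Returns the number of seconds waited, or None if `timeout`
    seconds pass without the file appearing.
    Hypothetical helper for diagnosis only.
    """
    start = time.monotonic()
    while True:
        if os.path.exists(path):
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)
```

Run it on the login node against the connection.yml path it is waiting on, right after the job starts on the compute node. If it returns only after a long delay, or returns None while the file is already visible on the compute node, the writes are probably not reaching the shared file system fast enough.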