We are working on a project where the OnDemand server and the remote host where we want to launch the remote job (via Linux Host Adapter) don’t share user home directories, i.e. each system has its own local home directory space. This appears to be a problem. Can you please confirm that both the server and the remote host have to share the same home directory (e.g. via a file system mount)?
They need to share some directory. You can change this through the OOD_DATAROOT environment variable. It doesn’t need to be their HOME, but it does need to exist on both sides.
Thanks Jeff, where would the OOD_DATAROOT environment variable be set?
You’ll likely want to set it to something like this. Note that because it’s shared and expected to be writable, you’ll have to take extra care that users can write to their own directory but not to anyone else’s (maybe initialize everyone’s directory with the right permissions?).
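For the permissions part, something along these lines could be a starting point. It is only a sketch: `init_user_dataroot` is a hypothetical helper name, and the shared root is whatever mount both hosts can see. It creates one directory per user and restricts it to the owner.

```shell
# Hypothetical helper: create a private per-user directory under a shared root.
# Arguments: <shared_root> <username>. Prints the directory path on success.
init_user_dataroot() {
  local shared_root="$1"
  local user="$2"
  local dir="${shared_root}/${user}"

  mkdir -p "$dir" || return 1
  # Owner-only access, so users cannot read or write each other's data.
  chmod 700 "$dir" || return 1
  # Changing ownership needs root; skip it when run unprivileged.
  if [ "$(id -u)" -eq 0 ]; then
    chown "$user" "$dir" || return 1
  fi
  printf '%s\n' "$dir"
}
```

An admin could run this as root once per account, for example `init_user_dataroot /nfs/ood_data pierce`, before that user's first session.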
You’ll need to set it for both the dashboard and the job composer.
Also set it for the job composer.
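As a rough illustration of the two settings, assuming the per-app env files live under /etc/ood/config/apps/ and that `${USER}` is expanded when those files are read (both are assumptions to verify on your install), the entries might look like this. The /nfs/ood_data prefix is only an example of a mount visible on both hosts.

```shell
# Assumed file: /etc/ood/config/apps/dashboard/env
# Per-user data root for the dashboard (batch connect sessions are staged below it).
OOD_DATAROOT="/nfs/ood_data/${USER}"

# Assumed file: /etc/ood/config/apps/myjobs/env
# Per-user data root for the job composer; the "myjobs" subdirectory is illustrative.
OOD_DATAROOT="/nfs/ood_data/${USER}/myjobs"
```

These are two separate files, shown together only for comparison.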
After writing that, I wonder if another way to go would be just to use the webserver as the destination host? I’m not sure of all the context here, but that may suit you.
@mcuma are you wanting to set this only for a single interactive app OR only for a single adapter? I don’t think we easily support this at the moment, but if we understand the use case we can open an issue to track work for supporting it.
Hi Eric and Jeff,
what we are trying to do is use the Linux Host Adapter (LHA) to launch an app (Desktop in this case) on the remote host. As mentioned earlier, we have an OOD server which has a local home for the user, and a Linux remote host that also has a local home. We have set up a shared NFS mount that is not home but is R/W accessible on both the OOD server and the remote host, and we use OOD_DATAROOT to point to this shared file system to store the Desktop app job’s data files:
Once the job is launched, we see all the data files in the correct staging directory, but the output.log just has:
/bin/bash: /nfs/ood_data/pierce/batch_connect/sys/bc_desktop/kc_host/output/2c74aad3-2375-4c04-915b-2567ec5755bf/tmp.nSrb8fAufD_sing: No such file or directory
Looking at what happens on our production system, where the home dirs are NFS mounted and shared by the OOD server and the remote host and no OOD_DATAROOT is set, the tmp.sing and tmp.tmux files are put in $HOME. I have a feeling that the tmux and singularity commands that start the session are putting those tmp files somewhere else, so they can’t be found by the subsequent commands that need them, i.e., there’s some inconsistency in how the LHA uses OOD_DATAROOT.
Can you please comment on this, and/or point us to the source code of the LHA so we can better understand the syntax and sequence of the tmux and singularity execution when starting the LHA job?
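In the meantime, one way to check where those temp files actually land is to search the candidate directories on the remote host. This is only a sketch: `find_lha_tmpfiles` is a hypothetical helper, the `tmp.*_sing` pattern is taken from the path in the log above, and `tmp.*_tmux` is a guess by analogy.

```shell
# Hypothetical diagnostic: list LHA-style temp files under the given directories.
# Name patterns are assumptions based on "tmp.nSrb8fAufD_sing" in output.log.
find_lha_tmpfiles() {
  local dir
  for dir in "$@"; do
    # Silently skip directories that do not exist on this host.
    [ -d "$dir" ] || continue
    find "$dir" -maxdepth 6 -type f \
      \( -name 'tmp.*_sing' -o -name 'tmp.*_tmux' \) 2>/dev/null
  done
}
```

Running `find_lha_tmpfiles "$HOME" /nfs/ood_data/pierce` on the remote host right after a failed launch should show whether the files were written under the local home rather than the shared staging directory.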