Interactive app sessions "overlap" across distinct Slurm controllers

Similar to OSC, we share a single filesystem for /home between two clusters, where "cluster" also means a distinct Slurm controller server. What is the strategy to prevent interactive app info from one cluster showing up under the other cluster's “My Interactive Apps” tab? Each cluster uses the same ‘dataroot’ (which I understand to be ~/ondemand/data). I have seen references to an OOD_DATAROOT variable that may be changed for the dashboard app; I think that would be done through the file /etc/ood/config/apps/dashboard/env.

Our current situation: we have a RHEL8 cluster that will replace the RHEL7 cluster we are phasing out this year. Each cluster has independent login nodes and a Slurm controller running on a distinct server, but the clusters explicitly share /home, such that each person accumulates app sessions at:
/home//ondemand/data/sys/dashboard/batch_connect/

The issue is that all interactive app info is picked up by the dashboard, so under “My Interactive Apps” a listing appears for every app session, whether or not it belongs to the cluster of the active dashboard. Further, messages appear that cause confusion, even though they really just mean “this is not the app you are looking for - it belongs to the other cluster you also run jobs on”. The idea behind a unique “DATAROOT” per cluster is to break the ~/ondemand/data degeneracy between the clusters and give each one its own space for the interactive app data (and perhaps other data) managed by the dashboard.

So, is adjusting OOD_DATAROOT adequate to break the degeneracy and provide unique storage for each cluster in /home? Is there a ‘best practices’ consensus for this situation? Are there implementation details to be aware of when making a change of this sort while operations are ongoing?
Thanks

Thanks for the question.

At OSC this is controlled in the form for the particular app, in combination with the cluster config files. So the apps are all listed under Interactive Apps, but when you select an app there are options to choose the cluster, or even some logic that preselects the cluster for certain apps without giving the user a choice.

This can be done using a combination of the app's form.yml and /etc/ood/config/clusters.d/<some_cluster>.yml.

In the form.yml you can set the cluster for the app, so that when the user launches the app from their dashboard (which runs on the web node, not on a cluster) the app already knows which cluster it works on: it submits to whatever the cluster: attribute in its form.yml names, i.e. the corresponding file located at /etc/ood/config/clusters.d/<your_cluster_file>.yml.
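
As a rough sketch of how those two pieces fit together (every name below is a placeholder, not something from your site):

# /etc/ood/config/clusters.d/cluster1.yml (sketch)
---
v2:
  metadata:
    title: "Cluster 1 (RHEL8)"
  login:
    host: "cluster1-login.example.edu"
  job:
    adapter: "slurm"
    bin: "/usr/bin"
    conf: "/etc/slurm/slurm.conf"

# form.yml of an interactive app pinned to that cluster
---
cluster: "cluster1"

With one such file per cluster (each conf: pointing at that cluster's own slurm.conf, for example), the app submits to whichever controller its cluster: attribute names.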

It may be as simple as setting up these cluster files and then pointing to them in your form.yml, keeping the whole problem hidden from the user so they don’t even need to know which cluster to select.

Or, you could give the user a choice of clusters in the form.yml and let them select one. But it sounds like you want to hide this from the user and make the choice for them, in which case just follow the steps above.
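
For the user-choice route, a minimal sketch, assuming a release where the cluster attribute in form.yml accepts a list (my understanding is that the form then presents a cluster selection; the names are placeholders):

# form.yml sketch offering the user a choice of clusters
---
cluster:
  - "cluster1"
  - "cluster2"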

If I misunderstood something or my explanation doesn’t make sense please let me know. Also, you may find these docs useful to look at:
https://osc.github.io/ood-documentation/latest/app-development/interactive.html

Specifically, the “User Form” section may provide useful information.

Thanks for sharing these thoughts, Travis. Much appreciated.

My question runs in a different direction, and involves specifying the location of the database and output information for the active jobs. On our system:

$ ls /home//ondemand/data/sys/dashboard/batch_connect/
db dev sys usr

So the clusters both see the same /home. How can I set a variable to change the ‘root’ of that data? Otherwise both clusters look in the same locations for these files and present all of them, causing confusion for some users.

Do you have 2 different Open OnDemand portals? One for each cluster?

I think the best practice here is to just have 1 portal that can submit jobs to several clusters. That’s the way we run it.

Or you could use a different ondemand_portal in nginx_stage.yml. This setting controls the top-level per-user directory, which is ~/ondemand by default.

This is how we run different logical portals. By that I mean we have completely separate environments for some customers (even though they interact with the same schedulers and file systems).

# Unique name of this OnDemand portal used to namespace multiple hosted portals
# NB: If this is not set then most apps will use default namespace "ondemand"
#
#ondemand_portal: null
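
So, as a sketch (the value is a placeholder), that would look like the following in /etc/ood/config/nginx_stage.yml, after which per-user data lands under ~/cluster1 instead of ~/ondemand:

# /etc/ood/config/nginx_stage.yml on the cluster1 portal (sketch)
ondemand_portal: "cluster1"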

If you configure this, beware of one problem though: when you reconfigure, none of the jobs currently under ~/ondemand will be found. So your users may lose sight of any currently running jobs and/or all of their Job Composer data. You may have to manually copy files to the new directory.
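
If you do go that route, the copy could be as simple as something per-user along these lines (a sketch only; the destination assumes ondemand_portal was set to "cluster1"):

# One-time, per-user migration sketch (paths are assumptions)
cp -a "$HOME/ondemand/data" "$HOME/cluster1/data"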

That said - those cards should probably indicate which cluster the job belongs to, so I can submit that ticket upstream.

Hi, Jeff – Because we have different Slurm controllers for each cluster, we wanted to operate different portals. We have different Slurm controllers because one cluster is running RHEL7 and the other RHEL8. I likely underestimated the flexibility of OnDemand to account for just such situations.

We label our apps with the cluster that they are associated with, so one can distinguish them.

Nonetheless, ideally, I don’t want to show each cluster's jobs on both portals – just on the cluster/portal they are associated with.

I like setting ‘ondemand_portal’, while also respecting the caution about making changes during regular operations. If I understand the comments here and the documentation for nginx_stage.yml, setting it to “custom” will create ~/custom with the file structure discussed above underneath it. Is this correct?

Could you clarify whether setting the namespace to a two-level path would work, such that “ondemand/cluster1” would support ~/ondemand/cluster1/data/sys/dashboard/… ?

Thanks

That’s OK - that’s how we run different clusters too. We just have multiple clusters.d files on each portal. That one portal can talk to different Slurm controllers and treats those separate files as separate clusters. Which is to say, OOD is indeed flexible enough for this, especially now with dynamic forms.
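
In other words, a single portal with something like this (filenames are placeholders):

$ ls /etc/ood/config/clusters.d/
cluster1.yml  cluster2.yml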

Yes.

If you want the directory structure to have a top-level ondemand with subdirectories for each cluster, then I believe OOD_DATAROOT may be the better option, as I’m not sure whether ondemand_portal will accept slashes.

OOD_DATAROOT=$HOME/ondemand/cluster1
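
For example, on each portal's web node that could live in the dashboard env file mentioned earlier in the thread (a sketch; adjust the suffix per cluster):

# /etc/ood/config/apps/dashboard/env on the cluster1 portal (sketch)
# Namespaces this portal's dashboard data under ~/ondemand/cluster1
OOD_DATAROOT="$HOME/ondemand/cluster1"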

In either scheme - if you’re changing the directory, you’ll have to take care of whatever sessions currently exist in the old directory.
