Proposal to add a function to check job scripts

Dear OOD developers and users,

I am Masahiro Nakao at RIKEN R-CCS. We use OOD on Japan’s flagship system, Fugaku, and on some other clusters. We plan to develop a new feature for OOD and would like your comments.

Users want to easily execute real applications on Fugaku via OOD. However, in the Job Composer, users have to write every command themselves (#SBATCH -N 1, mpiexec ./a.out, etc.), so it is not well suited to running specific applications. Thus, we use the webform for Interactive Applications to execute real applications.

The figure above shows our Open OnDemand. When the user clicks “Launch” at the bottom, a job script is generated based on submit.yml.erb and then submitted to our cluster.

There are two issues here. The first is that users cannot see what job script is generated before it is submitted, and some users want to. The second is that users cannot edit the job script. Users may want to add special pre- or post-processing before or after mpiexec.

To solve these issues, we are considering adding a new page. Users tick “Check job script” in the webform and click the “Launch” button to move to the new page. The page is generated from submit.yml.erb. On that page, users can check the job script before submitting it and can also edit it. They then click “Submit” at the bottom to submit the job to the cluster.

We considered developing this as a new Passenger application. In that case, however, we would also need to write code to parse form.yml.erb, and we thought it would be difficult to extract that code from OOD. So we are considering it as an extension of the existing webform.

I’d love to hear your comments on the above ideas.

Best,

We’re rewriting the Job composer to add a lot of this functionality. I’ve added a link to the GH issue below for a demo of what the Project Manager looks like in 3.1 (it was shipped but disabled by default, and only works for Slurm so we’d need some support for Fugaku).

The premise is basically a WYSIWYG form editor. The intention is that users write scripts

So the user would have to write this script:

export OMP_NUM_THREADS=1


# this may actually be an environment variable populated in the form.
# module load genisis/$OOD_GENISIS_MODULE_VERSION
module load genisis/2.1.2_mix


mpiexec ./spdym

But we (OnDemand) would supply all the necessary scheduler directives as CLI arguments without the user having to know/care about them.
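As a rough sketch of that split, the form values could end up as `sbatch` flags while the script itself stays free of directives. The function name `build_sbatch_cmd` and its three arguments are made up for illustration and are not part of OnDemand; `--nodes` and `--time` are standard Slurm options.

```shell
#!/bin/bash
# Hypothetical helper: print the sbatch command line that a form with
# "nodes" and "walltime" fields might produce for a given script.
build_sbatch_cmd() {
  local nodes="$1" walltime="$2" script="$3"
  printf 'sbatch --nodes=%s --time=%s %s\n' "$nodes" "$walltime" "$script"
}

# Example: one node for 30 minutes; the script itself has no #SBATCH lines.
build_sbatch_cmd 1 00:30:00 job.sh
```

The point of printing the command rather than running it is only to show where the scheduler options live: in the submission, not in the user's script.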

OK I’m having trouble getting the gif to autoplay.

The actual gif is too large to upload to Discourse. So, I’ve added the gif to the current github ticket we’re working on for the next release.

I know you’ve added the Fugaku adapter to ood_core. To get the same support for Fugaku in the new project manager, we’d need a few more additions to the adapter. Feel free to open tickets on that repository and I can show you what additional APIs we need to implement.

Interestingly enough - a “web form to generate a shell script” has come up as an idea within the OSC team.

Though I’m a little unconvinced we should be generating the shell script itself. I mean, there are so many things people may want to do in a shell script that I find it untenable. I’m not sure how to get beyond issuing 1 single command.

So with that said - the Project Manager just takes the shell script at face value. The web form that users interact with controls the scheduler options, like how much walltime or how many nodes and so on. So the Project Manager sort of decouples the script you want to submit from the scheduler’s job options - all of the scheduler’s job options are provided as web form controls and submitted as CLI flags.

Beyond that - I think there’s a locality problem here that the Project Manager solves. When you create a project - you’re essentially attaching that project to a filesystem directory. Ostensibly this directory is where all your project files live: source code, shell scripts, input & output data and so on. So all the jobs you submit from within that project have that directory as the submit directory ($SLURM_SUBMIT_DIR or $CWD). Of course a directory or path could be just another form field - but those add up at some point. (The Project Manager will not show hard-coded values in the form - though it does currently in the 3.1 version.)
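To illustrate what the submit directory buys you inside a job script, here is a minimal sketch. `SLURM_SUBMIT_DIR` is the variable Slurm sets inside a job; the helper name `job_workdir` and the fallback to the current directory are assumptions for the example, not something the Project Manager does.

```shell
#!/bin/bash
# Hypothetical helper: resolve the directory a job should run in.
# SLURM_SUBMIT_DIR is set by Slurm inside a job; fall back to the
# current directory when the script is run outside the scheduler.
job_workdir() {
  printf '%s\n' "${SLURM_SUBMIT_DIR:-$PWD}"
}

# Relative paths such as ./spdym then resolve against the project directory.
cd "$(job_workdir)" || exit 1
```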

All that said - if you don’t want to wait for the Project Manager (as it’s currently in development) and want to continue this endeavor, what I’d suggest is using some javascript and a text_area instead of, I guess, a modal that pops up when you check that checkbox.

You can use text_areas for multi-line strings like that. Maybe some javascript to update it when a new version or executable is chosen. The users can then interact with that text_area without context switching to the modal popup.
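For a sense of what the text_area might be pre-filled with, here is a small sketch of the template. In the form itself the update would be done in javascript; this shell version only shows the text being regenerated from a chosen version and executable. The function name `render_job_script` and its arguments are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the script text a text_area could be
# pre-filled with for a chosen module version and executable.
render_job_script() {
  local version="$1" exe="$2"
  cat <<EOF
export OMP_NUM_THREADS=1
module load genisis/${version}
mpiexec ./${exe}
EOF
}

# Example with the values from the script earlier in this thread.
render_job_script 2.1.2_mix spdym
```

Users would then edit the resulting text freely before submitting.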

Also, I don’t think you need the SBATCH directives, because either the form itself will submit the appropriate CLI flags or you can submit the CLI flags in the submit.yml.erb itself. I mean, if number of nodes is a web form field - why should users also have to interact with the SBATCH directives? Or is it the case that the users are just using the form to generate a shell script that they then submit on the CLI?

It’s true that users don’t need to write scheduler directives. That’s a mistake in the picture I drew.

The new Job Composer is very cool. It’s disabled by default, but is there a way to enable it in OOD 3.1.4? We also have a Slurm cluster besides Fugaku, so we’d like to try it there.

What we want to achieve is to make it easy to use applications installed on cluster systems. Thus, we currently use the OOD web form to automatically generate job scripts. In many cases, this is sufficient, but to increase flexibility, we would like to allow users to modify job scripts to some extent. I’m not sure if that’s possible with the new job composer, but I’d love to try it.

All that said - if you don’t want to wait for the Project Manager (as it’s currently in development) and want to continue this endeavor, what I’d suggest is using some javascript and a text_area instead of, I guess, a modal that pops up when you check that checkbox.

This also looks good, I’d like to try it.

Bests,

The RPM installs the directory /var/www/ood/apps/sys/projects as 700 root:root owned - that’s how we disable it. You can’t view the files. Change this directory to say 750 root:staff so that members in the staff Unix group can see it. I wouldn’t recommend 755 open to the public because things will change.
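A minimal sketch of that permission change, wrapped in a function so the path and group are explicit. The helper name `open_app_to_group` is made up; the path and the `staff` group in the usage comment are the ones mentioned above, and it would need to be run as root on a real install.

```shell
#!/bin/bash
# Hypothetical helper: make an app directory readable by one Unix group.
# Usage on a real install (as root), with the path and group from above:
#   open_app_to_group /var/www/ood/apps/sys/projects staff
open_app_to_group() {
  local dir="$1" group="$2"
  # Change the group first, then open read/execute to that group only.
  chgrp "$group" "$dir" && chmod 750 "$dir"
}
```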

The Project Manager takes the shell script you’re submitting at face value. It doesn’t edit or even look at it. We use environment variables set from the web form to pass information from the form to the script.
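On the script side, picking up such a variable could look like the sketch below. `OOD_GENISIS_MODULE_VERSION` is the name from the commented-out line in the script earlier in this thread and is assumed to be exported at submit time; the default value and the helper name `genisis_module` are just for illustration.

```shell
#!/bin/bash
# OOD_GENISIS_MODULE_VERSION is assumed to be exported by the form at
# submit time. The default mirrors the hard-coded version used in the
# earlier example script.
genisis_module() {
  printf 'genisis/%s\n' "${OOD_GENISIS_MODULE_VERSION:-2.1.2_mix}"
}

# In a real job script this would be: module load "$(genisis_module)"
echo "would load: $(genisis_module)"
```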

We’d very much like to hear feedback on the current Project Manager’s direction. If you turn it on, please let us know either here or in github (feel welcome to create new tickets if a feature isn’t already planned).

Also - this popped up in my github feed not that long ago that may be of interest:


I just finished adding the BYU Job Script Generator (with some tweaks) to our OOD instance.


Hi @brandonbiggs

Some files seem to be missing, such as the manifest and perhaps others. Could you tell me what I need to do to deploy it on another infrastructure?

Thank you :)

@jeff.ohrstrom
Thank you for your information.
The project manager application is displayed.

And, if anything, the BYUJobScriptGenerator is closer to what we need.
I would like to use this application as a base to consider expanding it or creating a new one.

Best,

The way I set it up, you don’t need the manifest or other files. I created it as a widget and then put the widget on a custom page.

Here are some additional details on how I configured it. Following the documentation for custom layouts, it explains how to create a new widget. So I took the css, js, and html files from the BYU Job Script repo, combined them into one file, and put it in /etc/ood/config/apps/dashboard/views/widgets/_job_script_generator.html.
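Roughly, the combining step could be scripted like this. It is only a sketch: the helper name `build_widget` and the input file names in the usage comment are placeholders (the actual file names in the BYU repository may differ), and the order of the pieces is one reasonable choice, not necessarily the one used here.

```shell
#!/bin/bash
# Hypothetical helper: inline a stylesheet and a script around an HTML
# fragment and write the result as a single widget partial.
build_widget() {
  local css="$1" js="$2" html="$3" out="$4"
  {
    printf '<style>\n'
    cat "$css"
    printf '</style>\n'
    cat "$html"
    printf '<script>\n'
    cat "$js"
    printf '</script>\n'
  } > "$out"
}

# Usage with placeholder input names:
#   build_widget style.css script.js index.html \
#     /etc/ood/config/apps/dashboard/views/widgets/_job_script_generator.html
```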

After that, I configured /etc/ood/config/ondemand.d/ondemand.yml to have a custom page with the new widget:

Excerpts from ondemand.yml:

nav_bar:
  - title: "Information"
    links:
      - group: "Tools"
      - page: "job_script_generator"
        title: "Job Script Generator"

custom_pages:
  job_script_generator:
    rows:
      - columns:
          - width: 12
            widgets:
              - "job_script_generator"

Hopefully this makes sense.

This was the first app I’ve done like this, so if anyone has done something similar or has any feedback, let me know!


Hi,

We have something similar at TAMU HPRC, but a bit more general. We don’t have a public GitHub repository for this project yet, but we are starting to actively use it on our clusters. We do have a demo on our YouTube channel:

The video only shows the Base environment (to create any custom job), but we have other environments (e.g., R/Python/Matlab/Abaqus) where the user only has to specify env-specific information (e.g. select the main script), and the composer generates the job file (which is always editable). Environments can be specified by users and will not require any re-writing of the composer code.

Hopefully, we can make the repo available sometime this summer.


This looks great. I attended a Tips and Tricks talk a while back and there was some criticism of the BYU Script Generator. My main impression was that it was “busy.” It presented the user with a wall of text and many options that few would need. Some minor changes might be to make an accordion-type layout to expand the less frequently used options. We might really need this generator since we are introducing quite a few constraints and gres options and I’d like users to be able to see them.


I definitely agree, it was very busy. I tried to reduce the number of options and really focus on the primary things that my user base would want. Fortunately the code seemed to be pretty accessible for making changes.

I guess there are different ways to do that in CSS or JS. In our case (at TAMU HPRC), the form is a JSON specification that allows for conditionals, which are evaluated at render time to show or hide specific fields (or groups of fields). That would be another option.