How would you deal with edge cases, such as a job whose start depends on something else? Many of the jobs that were “just queued” are waiting on a resource and have N/A in the START_TIME column; e.g.
14439049 standard st_archi <username> PD N/A 1 (null) (DependencyNeverSatisfied)
Would you have OOD display the “DependencyNeverSatisfied” string as the Projected Start Time?
We’re also interested in this. It’s on my TODO list to try to implement something locally, but I currently have no idea when I’ll have time to try. I feel like I’ve seen other recent discussions about this.
Ric, we’re especially concerned with edge cases. Despite the checks we’ve added to the forms, users frequently submit jobs that can never start, and they have no way of seeing that from the interactive sessions page.
I think it’s an implementation + docs issue. Basically, show users whatever Slurm says, plus a link to docs explaining what that means. How many jobs have a predicted start time is also highly dependent on the local Slurm configuration. I think I would add “eligible” times as well.
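One way to "show whatever Slurm says" is a small helper that falls back to the pending reason when there is no usable start time. This is only a minimal sketch, assuming the start time and reason have already been read from squeue as strings; `start_time_label` is a hypothetical name, not something OOD provides.

```ruby
require 'time'

# Hypothetical helper: choose what to show as the projected start time.
# start  - squeue start time string, or "N/A" when Slurm has no estimate
# reason - squeue pending reason, e.g. "(DependencyNeverSatisfied)"
def start_time_label(start, reason)
  Time.parse(start.to_s).strftime("%B %d, %Y at %I:%M %p")
rescue ArgumentError
  # No parseable time: surface Slurm's own reason instead of guessing.
  text = reason.to_s.delete("()").strip
  (text.empty? || text == "None") ? "Unknown" : "Not scheduled (#{text})"
end
```

With this, a job like the one above would show "Not scheduled (DependencyNeverSatisfied)" rather than a blank or misleading time, and the docs link can explain what the reason means.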
Here’s a start. Tighter integration into the cards would be more work, but this lets you show some extra info to users without any significant changes. This would go in a custom info.md.erb on a per-app basis.
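<%-
require 'open3'
require 'time'

# Looks up a job's state and start time from squeue, caching results briefly.
class CheckJob
  # Per-user file cache so repeated page loads don't call squeue every time.
  @cache = ActiveSupport::Cache::FileStore.new("/users/#{User.new.name}/.cache/OpenOnDemand/", :expires_in => 60.seconds)

  # Returns [state, start_time] for the job, or [] if squeue fails or prints nothing.
  def self.CheckJob(job_id)
    # %T = job state, %S = expected or actual start time; -h drops the header.
    # Passing arguments separately avoids running job_id through a shell.
    o, status = Open3.capture2e("/hpc/sys/apps/slurm/current/bin/squeue", "-j", job_id.to_s, "-ho", "%T,%S")
    return [] unless status.success?
    o.lines.first.to_s.strip.split(',')
  end

  def self.GetJobState(job_id)
    @cache.fetch("#{User.new.name}/queues/#{job_id}", race_condition_ttl: 30.seconds) do
      self.CheckJob(job_id)
    end
  end
end

# True when the string parses as a time; squeue prints N/A when no start time is known.
def valid_time_string?(time_string)
  Time.parse(time_string)
  true
rescue ArgumentError, TypeError
  false
end
-%>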
<%- if queued? -%>
> **Job Status**: <%= CheckJob.GetJobState(job_id)[0] %>
> **Predicted or Actual Start Time**:
<%- if valid_time_string?(CheckJob.GetJobState(job_id)[1]) -%>
<%= Time.parse(CheckJob.GetJobState(job_id)[1]).strftime("%B %d, %Y at %I:%M %p") %>
<%- else -%>
Unknown
<%- end -%>
> For an explanation of the job status values, see https://slurm.schedmd.com/squeue.html#SECTION_JOB-STATE-CODES
> Predicted start times are based on job requests currently in the queue. They are not available for all jobs and
are often not accurate. Jobs usually, but not always, start before the predicted time.
<%- end -%>
There are probably better ways to do that, and you may need to modify how I set up the cache, the path to the Slurm commands, etc.
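If you also want the pending reason (for cases like DependencyNeverSatisfied), the parsing step could look something like the sketch below. It assumes squeue is run with a format such as `%T|%S|%r` (state, start time, reason) instead of the `%T,%S` used above; `parse_squeue_line` is a hypothetical name, and the `|` separator is just my choice to avoid clashing with commas in the output.

```ruby
# Hypothetical parser for one line of `squeue -h -o '%T|%S|%r'` output.
# Returns nil when squeue printed nothing (e.g. the job has already left the queue).
def parse_squeue_line(output)
  line = output.to_s.lines.first.to_s.strip
  return nil if line.empty?

  # Limit the split to 3 fields so a "|" inside the reason is kept intact.
  state, start, reason = line.split("|", 3)
  { state: state, start: start, reason: reason }
end
```

Returning nil for empty output keeps the template from blowing up when the job is no longer in the queue; the view can then just show "Unknown".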