How to download/cache files from form.yml.erb

Hello,
One of our apps (a specialized RStudio app) downloads a JSON file (~50KB) with information needed both to render the app form and to start the app. This is currently done by executing a curl command via an Open3.capture2e call from the app's form.yml.erb template. This is causing a storm of curl command forks and load issues on the OOD node. I am trying to rewrite the app to use a download-and-cache mechanism. Does OOD provide a mechanism to manage such download-and-cache scenarios? Are there any recommendations from users who have come across similar needs?

Hi Raj.

Welcome aboard and thank you for your question.

To make sure I understand your question: you have an app that is making many curl calls to download a JSON file that the app uses, and you want OnDemand to cache that file for the app?

Is that an adequate description?

Thanks,
-gerald

Hi Gerald,

Yes, that is a good description.

Thanks!
Raj

Thanks Raj.

Would you mind pasting your form.yml.erb here please?

Thanks,
-gerald

Please ignore the caching mechanism code in there. It does not work well and it is used with caching disabled.

<%
  require 'json'
  require 'open3'
  load File.expand_path('../config.rb', __FILE__)
 %>
<%
  $api_cookie         = "\"" + AppConfig.api_cookie + "\""
  $api_gw_url         = "\"" + AppConfig.api_gw_url + "/cedar?op=GetActiveReleases&customer=" + AppConfig.api_customer + "&env=" + AppConfig.api_environment + "\""
  $api_releases       = []

  $api_cache_enabled  = AppConfig.use_cache
  $api_cache_int_sec  = AppConfig.use_cache_int_sec
  $api_cache_mode     = AppConfig.use_cache_mode

  $api_cache_encoding = AppConfig.use_cache_encoding
  $api_cache_path     = "/tmp/.ood"
  $api_cache_file_usr = ".ood-cache-$USER.json"
  $api_cache_file     = ".ood-cache.json"


  def log_message(str)
     raise StandardError, str
  end

  def cache_file_name
    if $api_cache_mode == "global"
      return $api_cache_path + "/" + $api_cache_file
    else
      return $api_cache_path + "/" + $api_cache_file_usr
    end
  end

  def cache_file_exists
    cmd = "sh -c \"[ -f " + cache_file_name + " ] && echo 0\""
    output, status = Open3.capture2e(cmd)
    if status.success?
      return true
    end
    return false
  end

  def cache_file_create(data)
    cmd = "echo \"" + data + "\" > " + cache_file_name
    return Open3.capture2e(cmd)
  end

  def cache_file_load
    cmd = "cat " + cache_file_name
    return Open3.capture2e(cmd)
  end

  def cache_file_calc_int_time_sec
    cmd = "sh -c \"expr `date +\"%s\"` - `stat --format=%Y " + cache_file_name + "`\""
    return Open3.capture2e(cmd)
  end

  def cache_file_zero_int_time_sec
    cmd = "touch -t `date +\"%Y%m%d%H%M\"` " + cache_file_name
    return Open3.capture2e(cmd)
  end

  def encode_bash(string)
    return string.gsub(/["]/, '&#34;')
  end

  def decode_bash(string)
    return string.gsub(/&#34;/, '"')
  end

  def diplay_name_value(string)
    sname = string.strip.split(" ")
    case sname[3] 
    when "DEV"
      vertype = 1
    when "TST"
      vertype = 3
    when "UAT"
      vertype = 5
    when "PRD"
      vertype = 9
    else
      vertype = 0
    end
    return "%d.%s" % [vertype, sname[2]]
  end


  def api_load_customer_releases
    cmd = "curl -s -H \"Content-Type: application/json\" -H " + $api_cookie + " -X GET " + $api_gw_url
    $api_releases = []
    release_info = {}
    begin
      output, status = Open3.capture2e(cmd)
      if status.success?
        json_obj = JSON.parse(output)
        if json_obj.nil? || json_obj.empty?
          return [], nil
        end

        json_obj.each do |release|
          item = release['display_name']
          $api_releases.push(item)

          begin
            release_metadata = JSON.parse(release['metadata'])
          rescue => e
            release_metadata = eval(release['metadata'])
          end
          if release_metadata.empty?
            release_info[item] = {'Binds': {}, 'Vars': {}, 'Registry': {}}
          else
            release_info[item] = release_metadata
          end
        end
        $api_releases = $api_releases.sort_by { |v| Gem::Version.new(diplay_name_value(v)) }.reverse
        return release_info.to_json.to_s, status
      end
      return output, status
    rescue => e
      $api_releases.push("unable to fetch releases")
      $api_releases.push(e.message.strip)
      output, status = Open3.capture2e("exit 1")
      return e.message.strip, status
    end
  end

  def load_cache_releases
    begin
      release_info_str, status = api_load_customer_releases
      if status.success?
        data = release_info_str.encode($api_cache_encoding)
        data = encode_bash(data)
        output, status = cache_file_create(data)
        return release_info_str
      else
        cache_file_zero_int_time_sec
        return "{}"
      end
    rescue => e
      cache_file_zero_int_time_sec
      return "{}"
    end
  end

  def load_nocache_releases
    begin
      release_info_str, status = api_load_customer_releases
      if status.success?
        return release_info_str
      end
      # Fall back to an empty JSON object so the form still renders.
      return "{}"
    rescue => e
      return "{}"
    end
  end

  if $api_cache_enabled
    access_time_int_sec = 0

    file_exists = cache_file_exists
    if !file_exists
      release_info_str = load_cache_releases
    else
      access_time_int_sec, status = cache_file_calc_int_time_sec
      int_sec = access_time_int_sec.to_i

      if int_sec > $api_cache_int_sec
        release_info_str = load_cache_releases
      else
        data, status = cache_file_load
        if status.success?
          release_info_str = decode_bash(data)
          release_info = eval(release_info_str)

          if !release_info.nil?
            release_info.each do |key, value|
              $api_releases.push(key)
            end
          end
        end
      end
    end
  else
    release_info_str = load_nocache_releases
  end
-%>

---
cluster: "hpc_cluster"
form:
  - rstudio_home
  - bc_release
  - bc_release_info
  - bc_account # null
  - bc_num_hours
  - bc_num_slots
#  - bc_running_qos
#  - bc_running_queue
  - bc_queue # null
  - num_mem
  - num_gpus
  - advanced_slurm_options
  - bc_email_on_started
attributes:
  rstudio_home:
    widget: "hidden_field"
    label: "Home directory for R Studio"
    value: '${HOME}/home_rstudio'
  bc_release:
    label: "Environment"
  <%- if $api_releases.blank? -%>
    widget: "text_field"
  <%- else -%>
    widget: "select"
    options:
    <%- $api_releases.each do |r| -%>
      - [ "<%= r %>", "<%= r %>" ]
    <%- end -%>
  <%- end -%>
  bc_release_info: '<%= release_info_str %>'
  bc_queue: null
  advanced_slurm_options:
    widget: "text_field"
    help: >
      Advanced users can override gres type, partition, or qos here. Must be written in correct Slurm syntax, e.g. --gres=gpu:volta:1 and be space-delimited if multiple options provided
  bc_num_slots:
    widget: "number_field"
    label: "Allocated CPUs (in cores)"
    required: true
    value: 1
    min: 1
    max: 44
    step: 1
    help: |
      Max amount of CPUs allocated for R Studio container to run (1..44)
  num_mem:
    widget: "number_field"
    label: "Allocated Memory (in GB)"
    required: true
    value: 4
    min: 1
    max: 1024
    step: 1
    help: |
      Max amount of memory allocated for R Studio to run (1..1024)
    id: 'num_mem_var_id'
  num_gpus:
    widget: "number_field"
    label: "Number of GPUs"
    value: 0
    help: |
      Number of GPUs to use for your job. Default is 0
    min: 0
    step: 1
    id: 'num_gpus_var_id'
  bc_account: null



Hi Raj.

Thanks for the info. I know we do have some caching functionality in OOD. What I’m not sure of is whether it will be of any assistance to you.

This document is specific to the interactive app form.
https://osc.github.io/ood-documentation/latest/app-development/interactive/form.html?highlight=cache

I’ll keep looking. There may be others in the community or in our group that have solved this problem already.

Thanks,
-gerald

I think you have 2 strategies.

One is to have root or some system user do this for you, so that everyone just reads that shared file. You can set up a crontab to update it on whatever schedule you like.
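A minimal sketch of the reading side of this first strategy, assuming a cron job (not shown) writes the releases JSON to a shared path. The path /var/cache/ood/releases.json and the helper name read_shared_releases are hypothetical. The point is that the form template can read the file with plain Ruby, so nothing is forked per render:

```ruby
require 'json'

# Hypothetical location of the file that the cron job refreshes.
SHARED_RELEASES_FILE = '/var/cache/ood/releases.json'

# Read the shared cache without forking a subprocess.
# Returns a Hash of release name => metadata, or {} if the file is
# missing, unreadable, or not valid JSON.
def read_shared_releases(path = SHARED_RELEASES_FILE)
  data = JSON.parse(File.read(path))
  data.is_a?(Hash) ? data : {}
rescue SystemCallError, JSON::ParserError
  {}
end

release_info  = read_shared_releases
release_names = release_info.keys # candidates for the select options
```

If the cron job writes to a temporary file and renames it into place, readers should never see a half-written file.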

The second is to do the heavy lifting in an initializer. This will execute when the PUN boots up, so it’ll only do the work once, when the application is starting up. The downside is that the file will only be populated/refreshed when the user's PUN starts, typically at login. So if you need a more aggressive update schedule (e.g., updates every hour) it may not fit (unless you do a lot more work with, say, a recursive ActiveJob).
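A rough sketch of what such an initializer could do, under several assumptions: the URL, the cache location under the user's home directory, the one-hour maximum age, and the helper names fetch_json and refresh_cache are all placeholders, and the API's cookie header from the original template would need to be passed in via the headers argument. It uses Net::HTTP rather than forking curl, and only replaces the cache after the new body parses as JSON:

```ruby
require 'json'
require 'net/http'
require 'uri'
require 'fileutils'

# Fetch the body of +url+ with Net::HTTP instead of forking curl.
# Raises on network errors or non-2xx responses.
def fetch_json(url, headers = {})
  uri = URI(url)
  request = Net::HTTP::Get.new(uri, headers)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https',
                             open_timeout: 5, read_timeout: 10) do |http|
    http.request(request)
  end
  raise "HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  response.body
end

# Rewrite the cache file only when it is missing or older than max_age_sec.
# The block supplies the fresh JSON string. Returns true if refreshed.
def refresh_cache(cache_path, max_age_sec)
  fresh = File.exist?(cache_path) &&
          (Time.now - File.mtime(cache_path)) < max_age_sec
  return false if fresh

  body = yield
  JSON.parse(body) # validate before replacing the old cache
  FileUtils.mkdir_p(File.dirname(cache_path))
  tmp = "#{cache_path}.#{Process.pid}.tmp"
  File.write(tmp, body)
  File.rename(tmp, cache_path) # atomic on the same filesystem
  true
rescue StandardError
  false # keep whatever cache is already there
end

# Inside the initializer this runs once per PUN start.
# The URL below is a placeholder, not a real endpoint.
if defined?(Rails)
  cache = File.join(Dir.home, '.cache', 'ood', 'releases.json')
  refresh_cache(cache, 3600) { fetch_json('https://api.example.org/releases') }
end
```

The form template would then only read the cache file. Because a failed fetch leaves the previous file untouched, a temporary API outage should not blank out the release list.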

Here’s a topic where you can get some inspiration on what such an initializer would look like/do. Be careful with that load File.expand_path('../config.rb', __FILE__) though (it could actually be require_relative 'config'). Rails will recognize every file in the initializers directory as an initializer, treat it like a Ruby script, and load these files in a specific order.
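On the load versus require point, a small standalone demonstration of the Ruby semantics (it does not touch Rails or OOD; the file name demo_config.rb and the helper count_runs are made up for illustration). load re-executes the file on every call, while require executes it only once per process:

```ruby
require 'tmpdir'

$config_runs = 0 # counts how many times the config file body executes

# Load the file twice, then require it twice, and report how many
# times each mechanism actually executed the file body.
def count_runs(path)
  $config_runs = 0
  load path
  load path
  loads = $config_runs
  require path
  require path
  [loads, $config_runs - loads]
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'demo_config.rb')
  File.write(path, "$config_runs += 1\n")
  loads, requires = count_runs(path)
  puts "load: #{loads}, require: #{requires}" # prints "load: 2, require: 1"
end
```

So a config file pulled in with load is re-run each time the code around it runs, whereas require (or require_relative) runs it once for the lifetime of the process.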
