FATAL error "ActionView::Template::Error ((<unknown>): control characters are not allowed at line 1 column 1):"

One of our researchers is getting the errors below during interactive desktop sessions. The session then freezes, so they have to close the noVNC connection and restart it from the session card. It has happened the last couple of times the researcher used the interactive desktop. If someone could tell me what might be causing this, it would be helpful.

error.log-20221023:App 24560 output: [2022-10-22 09:05:26 -0400 ] FATAL ""
error.log-20221023:App 24560 output: [2022-10-22 09:05:26 -0400 ] FATAL "ActionView::Template::Error ((<unknown>): control characters are not allowed at line 1 column 1):"
error.log-20221023:App 24560 output: [2022-10-22 09:05:26 -0400 ] FATAL "8: if session.script_type == "vnc"\n 9: views = []\n 10: views << { title: "noVNC Connection", partial: "novnc", locals: { connect: session.connect, app_title: session.title } }\n 11: views << { title: "Native Instructions", partial: "native_vnc", locals: { connect: session.connect } } if ENV["ENABLE_NATIVE_VNC"]\n 12: else\n 13: views = { partial: "missing_connection" }\n 14: end"
error.log-20221023:App 24560 output: [2022-10-22 09:05:26 -0400 ] FATAL ""
error.log-20221023:App 24560 output: [2022-10-22 09:05:26 -0400 ] FATAL "app/models/batch_connect/session.rb:490:in `connect'\napp/views/batch_connect/sessions/_panel.html.erb:11:in `block (2 levels) in _app_views_batch_connect_sessions__panel_html_erb___3024456082408241642_20320'\napp/helpers/batch_connect/sessions_helper.rb:47:in `block (2 levels) in session_view'\napp/helpers/batch_connect/sessions_helper.rb:47:in `block in session_view'\napp/helpers/batch_connect/sessions_helper.rb:36:in `session_view'\napp/views/batch_connect/sessions/_panel.html.erb:2:in `block in _app_views_batch_connect_sessions__panel_html_erb___3024456082408241642_20320'\napp/helpers/batch_connect/sessions_helper.rb:29:in `block (2 levels) in session_panel'\napp/helpers/batch_connect/sessions_helper.rb:28:in `block in session_panel'\napp/helpers/batch_connect/sessions_helper.rb:3:in `session_panel'\napp/views/batch_connect/sessions/_panel.html.erb:1:in `_app_views_batch_connect_sessions__panel_html_erb___3024456082408241642_20320'\napp/views/batch_connect/sessions/index.js.erb:10:in `block in _app_views_batch_connect_sessions_index_js_erb__3941263890056573280_20300'\napp/views/batch_connect/sessions/index.js.erb:6:in `each'\napp/views/batch_connect/sessions/index.js.erb:6:in `_app_views_batch_connect_sessions_index_js_erb__3941263890056573280_20300'"

Hi and sorry for the troubles.

What version of OOD are you using, and what OS are you on? Is the user connecting through the VDI, or are they using a full Remote Desktop?

It looks like something is going wrong with some ill-formatted YAML, possibly. Could you post the *.yml files for the app here so we can take a look?
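For reference, the "control characters are not allowed" text is the kind of message Ruby's YAML parser (Psych) raises when its input contains a raw control byte. Here is a minimal sketch of that behavior, assuming the text being parsed contains such a byte; `find_control_chars` and `yaml_error` are hypothetical helper names, and the sample data is made up rather than taken from any real session file.

```ruby
require "yaml"

# ASCII characters the YAML reader rejects: C0 controls other than
# tab, line feed, and carriage return, plus DEL.
CONTROL_CHARS = /[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/

# Hypothetical helper: return [line_number, inspected_line] pairs for
# every line of the text that contains a control character.
def find_control_chars(text)
  text.each_line.with_index(1)
      .select { |line, _number| line.match?(CONTROL_CHARS) }
      .map { |line, number| [number, line.inspect] }
end

# Hypothetical helper: return nil if the text parses as YAML,
# otherwise the parser's error message.
def yaml_error(text)
  YAML.safe_load(text)
  nil
rescue Psych::SyntaxError => e
  e.message
end

# Made-up sample: the second line carries a stray \x01 byte.
sample = "host: node01\npassword: abc\x01def\n"

puts yaml_error(sample)
find_control_chars(sample).each { |number, line| puts "line #{number}: #{line}" }
```

Running something like `find_control_chars(File.read(path))` against whichever YAML file the app reads at that point would show whether a stray byte is present and on which line.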

We are using 2.0.28, and it is the full Remote Desktop. This user is the only one hitting the error, and not every time, just the last couple of times they used it. Others use the interactive desktop without issue. Here is the form.yml.erb for the desktop:

<%-
require "open3"

cmd = "/opt/mam/bin/mam-list-accounts -A --quiet --show Name | egrep '[bshlg]c_|[pk][0-9]+'"
allocations = []
user = User.new
cache = ActiveSupport::Cache::FileStore.new("/var/ood/cache/#{user.name}/rh7_desktop", expires_in: 12.hours)
cache.cleanup
# Cache the allocation list so the accounting command does not run on every page load.
allocations = cache.fetch('allocations', race_condition_ttl: 10.seconds) do
  callocations = []
  begin
    output, status = Open3.capture2e(cmd)
    if status.success?
      callocations = output.split("\n").map(&:strip).reject(&:blank?).sort
    else
      raise output
    end
  rescue => e
    # On failure, fall back to an empty list and keep the message for debugging.
    callocations = []
    error = e.message.strip
  end
  callocations
end
# First line of the reservation file; chomp (unlike chomp!) never returns nil.
acii_res = File.readlines("/var/ood/gdesktop-res").first.to_s.chomp
-%>

title: "ACI RHEL7 Interactive Desktop"
cluster: "aci"
form:
  - desktop
  - aci_account
  - bc_num_hours
  - bc_num_slots
  # - bc_num_cores
  - node_type
  - bc_email_on_started
  - acii_reservation
submit: "submit/aci.yml.erb"
attributes:
  desktop:
    label: "Desktop Environment"
    widget: select
    options:
      - [ "MATE Gnome 2", "mate" ]
  aci_account:
    label: "Allocation"
    help: "Please select an allocation from the drop-down."
    widget: select
    options:
      - [ "open", "open" ]
      <%- if !allocations.blank? -%>
      <%- allocations.each do |a| -%>
      - [ "<%= a %>", "<%= a %>" ]
      <%- end -%>
      <%- end -%>

  # bc_num_slots: 1
  # bc_num_cores:
  #   label: "Number of Cores"
  #   widget: number_field
  #   min: 1
  #   max: 16
  #   step: 1
  #   value: 4
  bc_num_hours:
    value: 1
  bc_queue: "open"
  node_type:
    widget: select
    label: "Node type"
    help: |
      - ACI-i - (4 cores) Use an ACI-i node that has GPU GL acceleration, 40 cores,
        and 256GB of RAM. Only available for open account submissions.
      - ACI-b Standard Core - (4 cores) Use an ACI-b node without GPU GL
        acceleration, 20 available cores, Infiniband interconnect, and 256GB total RAM.
        Available for open account and allocation account submissions.
      - ACI-b Basic Core - (4 cores) Use an ACI-b node without GPU GL
        acceleration, 20 available cores, and 128GB total RAM.
        Available for open account and allocation account submissions.
      - ACI-b Himem Core - (4 cores) Use an ACI-b node without GPU GL
        acceleration, 40 available cores, and 1TB total RAM.
        Available for open account and allocation account submissions.
    options:
      - [
          "ACI-i", "ppn=4:acii:rhel7",
          data-option-for-open: true,
          data-option-for-sc_default: false,
          data-option-for-bc_default: false,
          data-option-for-gc_default: false,
          data-option-for-hc_default: false,
          data-option-for-lc_default: false,
          data-option-for-lc_icds-training: false,
          data-option-for-p100_default: false,
          data-option-for-k80_default: false,
          data-option-for-gc_x1p100_default: false,
          data-option-for-gc_x4p100_default: false
        ]
      - [
          "ACI-b Standard Core", "ppn=4:stmem:rhel7",
          data-option-for-open: true,
          data-option-for-sc_default: true,
          data-option-for-bc_default: false,
          data-option-for-gc_default: false,
          data-option-for-hc_default: false,
          data-option-for-lc_default: false,
          data-option-for-lc_icds-training: false,
          data-option-for-p100_default: false,
          data-option-for-k80_default: false,
          data-option-for-gc_x1p100_default: false,
          data-option-for-gc_x4p100_default: false
        ]
      - [
          "ACI-b Basic Core", "ppn=4:basic:rhel7",
          data-option-for-open: true,
          data-option-for-bc_default: true,
          data-option-for-sc_default: false,
          data-option-for-gc_default: false,
          data-option-for-hc_default: false,
          data-option-for-lc_default: false,
          data-option-for-lc_icds-training: false,
          data-option-for-p100_default: false,
          data-option-for-k80_default: false,
          data-option-for-gc_x1p100_default: false,
          data-option-for-gc_x4p100_default: false
        ]
      - [
          "ACI-b GPU Core (NO GL Acceleration)", "ppn=4:gpu:rhel7:gpus=1",
          data-option-for-open: false,
          data-option-for-gc_default: true,
          data-option-for-p100_default: true,
          data-option-for-k80_default: true,
          data-option-for-gc_x1p100_default: true,
          data-option-for-gc_x4p100_default: true,
          data-option-for-bc_default: false,
          data-option-for-sc_default: false,
          data-option-for-lc_default: false,
          data-option-for-lc_icds-training: false,
          data-option-for-hc_default: false
        ]
      - [
          "ACI-b Himem Core", "ppn=4:himem:rhel7",
          data-option-for-open: true,
          data-option-for-hc_default: true,
          data-option-for-bc_default: false,
          data-option-for-sc_default: false,
          data-option-for-gc_default: false,
          data-option-for-lc_default: false,
          data-option-for-lc_icds-training: false,
          data-option-for-p100_default: false,
          data-option-for-k80_default: false,
          data-option-for-gc_x1p100_default: false,
          data-option-for-gc_x4p100_default: false
        ]
      - [
          "ACI-b Legacy Core", "ppn=4:legacy:rhel7",
          data-option-for-open: true,
          data-option-for-hc_default: false,
          data-option-for-bc_default: false,
          data-option-for-sc_default: false,
          data-option-for-gc_default: false,
          data-option-for-lc_default: true,
          data-option-for-lc_icds-training: true,
          data-option-for-p100_default: false,
          data-option-for-k80_default: false,
          data-option-for-gc_x1p100_default: false,
          data-option-for-gc_x4p100_default: false
        ]
  acii_reservation: "<%= acii_res %>"

Here is the submit.yml.erb file.


batch_connect:
  template: vnc

Could you attach the raw files? Otherwise, finding the bad YAML is almost impossible.
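If it helps, here is a rough sketch of how a form.yml.erb could be checked outside the dashboard: render the ERB first, then parse the result as YAML. `render_and_parse` is a hypothetical helper, the template is a cut-down stand-in rather than the real form, and the `allocations` list is passed in by hand instead of coming from OOD. The real file also relies on Rails helpers such as `blank?` and on `User.new`, which plain Ruby does not provide, so treat this as an illustration only.

```ruby
require "erb"
require "yaml"

# Hypothetical helper: render an ERB template string with the given
# local variables, then parse the rendered text as YAML.
# trim_mode "-" enables the <%- ... -%> tags used in the form file.
def render_and_parse(template, vars = {})
  scope = binding
  vars.each { |name, value| scope.local_variable_set(name, value) }
  rendered = ERB.new(template, trim_mode: "-").result(scope)
  YAML.safe_load(rendered)
end

# Cut-down stand-in for the allocation drop-down (not the real form).
TEMPLATE = <<~'TEMPLATE'
  attributes:
    aci_account:
      widget: select
      options:
        - [ "open", "open" ]
        <%- allocations.each do |a| -%>
        - [ "<%= a %>", "<%= a %>" ]
        <%- end -%>
TEMPLATE

form = render_and_parse(TEMPLATE, allocations: %w[sc_default bc_default])
puts form.dig("attributes", "aci_account", "options").inspect
```

A Psych::SyntaxError from `render_and_parse` would point at the rendered YAML rather than the template, which is why the raw files are easier to check than pasted text.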

form.yml (5.9 KB)
submit.yml (35 Bytes)

Nothing jumps out in these files as a bad YAML problem.

I have a few other questions:

  1. What is the pattern of frequency? Every n minutes, each day? Or is it erratic with no real pattern?
  2. When they restart, are they clicking that same session card to connect again rather than relaunching the whole app?
  3. After the reconnect, does it continue to happen?

It is very strange that only this user is having the issue, and on a system app, so I’m trying to gather better context. Thanks!

I looked at the error logs for this user across all 4 of our servers and found only 2 instances over the last month, although they do complain that their interactive desktop freezes up on them. I found these while investigating the freeze-ups, and this is the only thing I’ve found. We moved to multiple OOD servers behind an haproxy frontend, and we were getting freeze-ups because the haproxy server was running out of file descriptors, but we have not had that issue since raising the file descriptor limit. None of our other users have complained of freezes since that fix. I am still investigating and am waiting for answers to questions I posed to the user; I may have more questions once I get them.

Here is an observation from the researcher:

Maybe the fatal error could also be when I step away from the tab for a bit, and when I get back, I get an error “New connection has been rejected with reason: Authentication failed”. But this sort of thing happens with RStudio also, where it wants me to log back into RStudio.

While searching for that error message, I found a possible answer to what is described above, so I have set “passenger_pool_idle_time” to a value that should get rid of that error and have asked the researcher to try again.

Thanks for the information!

It sounds like the user is leaving the session open for quite a while and just hitting timeouts, as you say, so the nginx stage setting you are trying is the best bet.

Let me know if you hit any more issues.