Hello! This issue is driving me crazy trying to understand it… let’s see if an expert can spot something I’m missing.
With OOD v2.0.31 (it also happened in v2.0.29), our users occasionally hit the following: while using OOD (just the main web interface or the file explorer, not connecting to a running job or anything), their request to OOD’s Apache gets proxied to a compute node, usually one running someone else’s Jupyter/RStudio session (our main jobs), and the user gets a 404 error instead of the requested content…
Here are some logs of that happening (error log):
### This one I believe is ok
[Wed Jun 07 08:19:04.214379 2023] [lua:info] [pid 6992:tid 140239509112576] [client 10.125.118.145:22794] res_content_type="application/json; charset=utf-8" time_user_map="0.002" remote_user="user1@company.com" local_user="user1" req_filename="proxy:http://localhost/pun/sys/dashboard/files/fs//s3" req_status="200" log_hook="ood" req_content_type="" res_content_length="" res_location="" res_content_location="" req_handler="proxy-server" req_accept_language="en-us,en;q=0.9" req_cache_control="" req_uri="/pun/sys/dashboard/files/fs/s3" req_port="443" req_protocol="HTTP/1.1" req_is_https="true" req_accept="application/json" time_proxy="2065.308" req_is_websocket="false" req_referer="https://ondemand.company.com/pun/sys/dashboard/files/fs//s3/go-shared-nextflow/project/dev/aws-batch-nextflow-quickstart/rvegesna-alternate-splicing/bin" req_method="GET" req_hostname="ondemand.company.com" req_accept_charset="" req_origin="" req_user_ip="10.125.118.145" log_time="2023-06-07T12:19:04.214049.0Z" req_accept_encoding="gzip, deflate, br" res_content_language="" req_server_name="ondemand.company.com" req_user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36" res_content_disp="" res_content_encoding="", referer: https://ondemand.company.com/pun/sys/dashboard/files/fs//s3/go-shared-nextflow/project/dev/aws-batch-nextflow-qu
ickstart/rvegesna-alternate-splicing/bin
### These two are trying to reach someone else’s instance (node/10.125.237.56) while the user was checking files from the dashboard…
[Wed Jun 07 08:20:46.309485 2023] [lua:info] [pid 6992:tid 140240324908800] [client 10.125.118.145:27268] res_content_language="" remote_user="user1@company.com" log_time="2023-06-07T12:20:46.309383.0Z" req_accept_language="en-us,en;q=0.9" req_user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36" req_content_type="" req_cache_control="" req_accept="text/css,*/*;q=0.1" req_method="GET" time_proxy="6.022" req_server_name="ondemand.company.com" res_content_encoding="" time_user_map="0.003" req_protocol="HTTP/1.1" req_user_ip="10.125.118.145" req_referer="https://ondemand.company.com/pun/sys/dashboard/files/fs//s3" res_content_disp="" req_is_websocket="false" res_location="" req_filename="proxy:http://10.125.237.56:53184/node/10.125.237.56/53184/static/style/bootstrap-theme.min.css?v=8b2f045cb5b4d5ad346f6e816aa2566829a4f5f2783ec31d80d46a57de8ac0c3d21fe6e53bcd8e1f38ac17fcd06d12088bc9b43e23b5d1da52d10c6b717b22b3" log_hook="ood" req_accept_encoding="gzip, deflate, br" req_handler="proxy-server" local_user="user1" req_port="443" res_content_location="" req_status="200" req_hostname="ondemand.company.com" req_is_https="true" req_origin="" res_content_type="text/css" req_uri="/node/10.125.237.56/53184/static/style/bootstrap-theme.min.css" res_content_length="23411" req_accept_charset="", referer: https://ondemand.grits
tone.com/pun/sys/dashboard/files/fs//s3
[Wed Jun 07 08:20:46.310252 2023] [lua:info] [pid 6992:tid 140239530092288] [client 10.125.118.145:27270] req_accept_encoding="gzip, deflate, br" time_proxy="6.799" req_protocol="HTTP/1.1" req_content_type="" req_server_name="ondemand.company.com" req_status="200" req_origin="" req_method="GET" req_port="443" local_user="user1" req_filename="proxy:http://10.125.237.56:53184/node/10.125.237.56/53184/static/style/index.css?v=30372e3246a801d662cf9e3f9dd656fa192eebde9054a2282449fe43919de9f0ee9b745d7eb49d3b0a5e56357912cc7d776390eddcab9dac85b77bdb17b4bdae" res_content_location="" res_location="" req_referer="https://ondemand.company.com/pun/sys/dashboard/files/fs//s3" remote_user="user1@company.com" req_hostname="ondemand.company.com" req_accept="text/css,*/*;q=0.1" res_content_disp="" res_content_type="text/css" time_user_map="0.003" req_cache_control="" res_content_encoding="" req_user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36" req_is_https="true" req_handler="proxy-server" log_hook="ood" res_content_language="" res_content_length="1414" log_time="2023-06-07T12:20:46.310165.0Z" req_accept_charset="" req_is_websocket="false" req_accept_language="en-us,en;q=0.9" req_uri="/node/10.125.237.56/53184/static/style/index.css" req_user_ip="10.125.118.145", referer: https://ondemand.company.com/pun/sys/dashboard/files/fs//s3
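To find entries like the two above in bulk, a minimal sketch could scan the error log for `/node` or `/rnode` requests whose referer is a `/pun` page. This assumes the log keeps the `key="value"` format shown above. The function names are hypothetical, and the check is only a heuristic: a legitimate launch of a session from a dashboard page would presumably match as well, so hits are candidates, not proof.

```python
import re
import sys
from urllib.parse import urlsplit

# Matches the key="value" pairs written by the OOD log hook.
# Assumption: values never contain an embedded double quote.
PAIR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_entry(line):
    """Return the key="value" fields of one error-log line as a dict."""
    return dict(PAIR_RE.findall(line))


def is_suspicious(fields):
    """Heuristic: a /node or /rnode request whose referer is a /pun page."""
    uri = fields.get("req_uri", "")
    referer_path = urlsplit(fields.get("req_referer", "")).path
    return uri.startswith(("/node/", "/rnode/")) and referer_path.startswith("/pun/")


def suspicious_entries(lines):
    """Yield (local_user, req_uri, req_referer) for each candidate entry."""
    for line in lines:
        fields = parse_entry(line)
        if fields.get("log_hook") == "ood" and is_suspicious(fields):
            yield (
                fields.get("local_user", ""),
                fields.get("req_uri", ""),
                fields.get("req_referer", ""),
            )


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_misrouted.py /path/to/error.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for user, uri, referer in suspicious_entries(handle):
            print(user, uri, referer, sep="\t")
```

Grouping the output by user and by target host/port should show whether the affected users always land on the same backend.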
Here are the corresponding access logs:
10.125.118.145 - user1@company.com [07/Jun/2023:08:19:02 -0400] "GET /pun/sys/dashboard/files/fs//s3 HTTP/1.1" 200 8321 "https://ondemand.company.com/pun/sys/dashboard/files/fs//s3/go-shared-nextflow/project/dev/aws-batch-nextflow-quickstart/rvegesna-alternate-splicing/bin" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36"
10.125.118.145 - user1@company.com [07/Jun/2023:08:20:46 -0400] "GET /node/10.125.237.56/53184/static/style/bootstrap-theme.min.css?v=8b2f045cb5b4d5ad346f6e816aa2566829a4f5f2783ec31d80d46a57de8ac0c3d21fe6e53bcd8e1f38ac17fcd06d12088bc9b43e23b5d1da52d10c6b717b22b3 HTTP/1.1" 200 23411 "https://ondemand.company.com/pun/sys/dashboard/files/fs//s3" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36"
10.125.118.145 - user1@company.com [07/Jun/2023:08:20:54 -0400] "GET /node/10.125.237.56/53184/static/style/index.css?v=30372e3246a801d662cf9e3f9dd656fa192eebde9054a2282449fe43919de9f0ee9b745d7eb49d3b0a5e56357912cc7d776390eddcab9dac85b77bdb17b4bdae HTTP/1.1" 200 1414 "https://ondemand.company.com/oidc?iss=https%3A%2F%2Fgritstone.okta.com" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36"
Here is part of my apache ood-portal.conf:
# Reverse proxy traffic to backend webserver through IP sockets:
#
# https://ondemand.company.com:443/node/HOST/PORT/index.html
# #=> http://HOST:PORT/node/HOST/PORT/index.html
#
<LocationMatch "^/node/(?<host>[^/]+)/(?<port>\d+)">
AuthType openid-connect
Require valid-user
# ProxyPassReverse implementation
Header edit Location "^[^/]+//[^/]+" ""
# ProxyPassReverseCookieDomain implementation
Header edit* Set-Cookie ";\s*(?i)Domain[^;]*" ""
# ProxyPassReverseCookiePath implementation
Header edit* Set-Cookie ";\s*(?i)Path[^;]*" ""
Header edit Set-Cookie "^([^;]+)" "$1; Path=/node/%{MATCH_HOST}e/%{MATCH_PORT}e"
LuaHookFixups node_proxy.lua node_proxy_handler
</LocationMatch>
# Reverse "relative" proxy traffic to backend webserver through IP sockets:
#
# https://ondemand.company.com:443/rnode/HOST/PORT/index.html
# #=> http://HOST:PORT/index.html
#
<LocationMatch "^/rnode/(?<host>[^/]+)/(?<port>\d+)(?<uri>/.*|)">
AuthType openid-connect
Require valid-user
# ProxyPassReverse implementation
Header edit Location "^([^/]+//[^/]+)|(?=/)|^([\./]{1,}(?<!/))" "/rnode/%{MATCH_HOST}e/%{MATCH_PORT}e"
# ProxyPassReverseCookieDomain implementation
Header edit* Set-Cookie ";\s*(?i)Domain[^;]*" ""
# ProxyPassReverseCookiePath implementation
Header edit* Set-Cookie ";\s*(?i)Path[^;]*" ""
Header edit Set-Cookie "^([^;]+)" "$1; Path=/rnode/%{MATCH_HOST}e/%{MATCH_PORT}e"
LuaHookFixups node_proxy.lua node_proxy_handler
</LocationMatch>
# Reverse proxy traffic to backend PUNs through Unix domain sockets:
#
# https://ondemand.company.com:443/pun/dev/app/simulations/1
# #=> unix:/path/to/socket|http://localhost/pun/dev/app/simulations/1
#
SetEnv OOD_PUN_URI "/pun"
<Location "/pun">
AuthType openid-connect
Require valid-user
ProxyPassReverse "http://localhost/pun"
# ProxyPassReverseCookieDomain implementation (strip domain)
Header edit* Set-Cookie ";\s*(?i)Domain[^;]*" ""
# ProxyPassReverseCookiePath implementation (less restrictive)
Header edit* Set-Cookie ";\s*(?i)Path\s*=(?-i)(?!\s*/pun)[^;]*" "; Path=/pun"
SetEnv OOD_PUN_SOCKET_ROOT "/var/run/ondemand-nginx"
SetEnv OOD_PUN_MAX_RETRIES "5"
LuaHookFixups pun_proxy.lua pun_proxy_handler
</Location>
# Control backend PUN for authenticated user:
# NB: See mod_ood_proxy for more details.
#
# https://ondemand.company.com:443/nginx/stop
# #=> stops the authenticated user's PUN
#
SetEnv OOD_NGINX_URI "/nginx"
<Location "/nginx">
AuthType openid-connect
Require valid-user
LuaHookFixups nginx.lua nginx_handler
</Location>
# Redirect root URI to specified URI
#
# https://ondemand.company.com:443/
# #=> https://ondemand.company.com:443/pun/sys/dashboard
#
RedirectMatch ^/$ "/pun/sys/dashboard"
# Redirect logout URI to specified redirect URI
#
# https://ondemand.company.com:443/logout
# #=> https://ondemand.company.com:443/oidc?logout=https%3A%2F%2Fondemand.company.com
#
Redirect "/logout" "/oidc?logout=https%3A%2F%2Fondemand.company.com"
# OpenID Connect redirect URI:
#
# https://ondemand.company.com:443/oidc
# #=> handled by mod_auth_openidc
#
<Location "/oidc">
AuthType openid-connect
Require valid-user
</Location>
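As a sanity check on which block handles which URI, here is a rough Python model of the `LocationMatch` patterns above. It is only a sketch: `classify` is a hypothetical helper, Python spells named groups `(?P<name>...)` where the Apache config uses `(?<name>...)`, and Apache's real section merging is more involved than this first-match lookup.

```python
import re

# Same patterns as the two LocationMatch blocks, in Python's named-group syntax.
NODE_RE = re.compile(r"^/node/(?P<host>[^/]+)/(?P<port>\d+)")
RNODE_RE = re.compile(r"^/rnode/(?P<host>[^/]+)/(?P<port>\d+)(?P<uri>/.*|)")


def classify(uri):
    """Return (block, host, port) for the config block this URI path falls under."""
    for name, pattern in (("node", NODE_RE), ("rnode", RNODE_RE)):
        match = pattern.match(uri)
        if match:
            return name, match.group("host"), int(match.group("port"))
    # <Location "/pun"> covers /pun itself and everything below it.
    if uri == "/pun" or uri.startswith("/pun/"):
        return "pun", None, None
    return "other", None, None


if __name__ == "__main__":
    for uri in (
        "/pun/sys/dashboard/files/fs/s3",
        "/node/10.125.237.56/53184/static/style/index.css",
    ):
        print(uri, "->", classify(uri))
```

Under this model the dashboard URI only ever falls under the `/pun` block, while the `req_uri` values logged for the two bad requests already start with `/node/`, so they fall under the `/node` block.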
To give more context/information:
- Other users can keep working normally while only one or a few of them are affected by the problem
- Users can work around the issue when it happens by opening an incognito session and logging in to OOD again
- Admins can mitigate the issue when it happens by restarting httpd (the problem then goes away for every user facing it)