Reverse proxy without Location header modification

Hello everyone,

I’m trying to get a Galaxy server running as an interactive app in Open OnDemand and have it working for the most part. The only issue I ran into is the data source redirection tool which passes a Location header to an external location.

Looking at the Apache config generator at ood-portal.conf.erb:225, I noticed that the reverse proxy strips out the protocol and domain from the Location header. As a result, the URL that the tool is trying to redirect the user to gets changed from something like https://www.ebi.ac.uk/ena/browser/text-search to /ena/browser/text-search, which lands on the OnDemand server and returns a 404.

Is there a particular reason for editing the location headers for the reverse proxy? For my testing, removing it fixed my issue, but are there consequences to removing this rule?

Header edit Location "^[^/]+//[^/]+" ""
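To make the effect of this directive concrete, here is a small Python sketch that applies the same pattern with re.sub. Python's re module stands in for Apache's regex engine (an assumption, though this pattern uses no engine-specific syntax), and the helper name strip_origin is made up for illustration.

```python
import re

# Same pattern as the generated directive: scheme, "//", then host[:port].
ORIGIN = re.compile(r"^[^/]+//[^/]+")


def strip_origin(location: str) -> str:
    """Mimic `Header edit Location "^[^/]+//[^/]+" ""` (hypothetical helper)."""
    # count=1 mirrors `Header edit`, which replaces only the first match.
    return ORIGIN.sub("", location, count=1)


external = "https://www.ebi.ac.uk/ena/browser/text-search"
print(strip_origin(external))  # /ena/browser/text-search
```

The pattern never looks at which host it matched, so an external redirect loses its origin just like an internal one.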

Thanks in advance.

Hello and welcome!

I think the documentation on how the reverse proxy works, and on when to use rnode_uri vs. node_uri, will help here. It depends largely on how the application handles HTTP requests.

Did you already configure the proxy in your ood_portal.yml? It’s disabled by default.
https://osc.github.io/ood-documentation/latest/reference/files/ood-portal-yml.html#configure-reverse-proxy

You can control how requests are proxied with the node_uri and rnode_uri settings in that file, and it is also worth checking host_regex. The 404 is likely caused by how the redirect is built from the regular expression or by how those keys are set.

Are you able to get a sample request from the galaxy server?

There are also docs on verifying that the reverse proxy works as expected with the Galaxy server, if you can just ssh to the compute node and start it up:
https://osc.github.io/ood-documentation/latest/how-tos/app-development/interactive/setup/enable-reverse-proxy.html?highlight=rnode#verify-it-works

Hi @travert!

Yes, I have both the rnode_uri and node_uri already configured and they’re both working fine. For the Galaxy server, I’m using the node_uri reverse proxy and have configured Galaxy to handle the /<node_uri>/<host>/<port> path. Galaxy’s working and for the most part, appears to be functional.

The only issue is with the ‘Location’ HTTP header. As an example, a request to one of the data tools results in Galaxy returning a 302 redirect to an external website. This is what the browser sends (minus any irrelevant headers):

GET /node/h1/19896/tool_runner/data_source_redirect?tool_id=ebi_sra_main HTTP/1.1
Host: ood.example.com
Referer: https://ood.example.com/node/h1/19896/

The response from the Galaxy server is a redirect for this particular tool. In this case, it would go to https://www.ebi.ac.uk/ena/data/search by passing a Location header with its response.

However, the problem I ran into is that the Apache config generated by ood-portal.conf.erb removes the leading protocol and domain and results in the redirect going to the local Open OnDemand server instead:

HTTP/1.1 302 Found
Date: Tue, 02 Jan 2024 21:53:45 GMT
Server: uvicorn
Content-Security-Policy: frame-ancestors https://ood.example.com;
Strict-Transport-Security: max-age=63072000; includeSubDomains; preload
x-frame-options: SAMEORIGIN
location: /ena/data/search?GALAXY_URL=https%3A//ood.example.com/node/h1/19896/tool_runner%3Ftool_id%3Debi_sra_main

This would redirect the user to https://ood.example.com/ena/data/search instead of the external website at https://www.ebi.ac.uk/ena/data/search.
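The reason the stripped header lands on the OnDemand host is that a browser resolves a relative Location against the URL it just requested. A minimal sketch of that resolution using Python's urllib.parse.urljoin, with the URLs taken from the example above (the query string on the Location value is left out for brevity):

```python
from urllib.parse import urljoin

# URL of the request that Galaxy answered with the 302.
request_url = (
    "https://ood.example.com/node/h1/19896/tool_runner/"
    "data_source_redirect?tool_id=ebi_sra_main"
)

# Location value after the proxy stripped the scheme and host.
rewritten_location = "/ena/data/search"

# A relative reference is resolved against the request URL.
resolved = urljoin(request_url, rewritten_location)
print(resolved)  # https://ood.example.com/ena/data/search
```

An absolute Location value would be used as-is, which is why leaving the external URL intact fixes the redirect.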

The redirect works as expected once I remove the Header edit Location "^[^/]+//[^/]+" "" line from the ood-portal.conf.erb file and restart Open OnDemand, but I’m wondering what consequences this may have and whether there’s a better way.

Cheers

Are you able to share the relevant parts of the ood_portal.yml where you’ve set the rnode_uri and node_uri?

By default, that Header directive shouldn’t be changing anything, at least that is the behavior I’d expect. But I’m not sure what happens if you set both of those in the ood_portal.yml, as I don’t quite know how OOD would tell which request uses a relative path and which uses a full path. So this has me wondering whether setting both confuses OOD, so that it strips the full path to match what rnode_uri wants, which then breaks requests for the Galaxy server.

Looking into this more, I’m starting to wonder about this too. I’ve not had to dig into this code before, and the more I look at it and read what the docs claim, the more confused I get. I want to look into this some more today to try to see why that header would need to be adjusted there.

It seems like it works since we’ve not had many others speak up, but that doesn’t mean others haven’t gone in and removed that piece of code like you and just not let us know.

The only thing which makes sense to me is that the Location header edit was written with the assumption that all redirects are destined for the application behind the reverse proxy and never for anywhere external.

This assumption does make sense especially when applications are typically bound to listen on localhost or 127.0.0.1 and absolute URL redirects such as http://127.0.0.1:8080/node/host/port/foo should go to /node/host/port/foo.

I was thinking of changing the regex to replace anything that points to localhost or a local IP address to the relative path while somehow allowing external domains… but my regex skills aren’t that great.

Here’s my pseudo-regex idea. Would this make sense?

Header edit Location "^[^/]+//( localhost or 127.0.0.1 )" ""

That makes sense. So the issue here is that the Galaxy server is issuing a redirect to an external site, and that redirect is being caught by this rule.

I think the type of regex you are looking for would be something like:

Header edit Location "^[^/]+//(localhost|127\.0\.0\.1)(:[0-9]+)?" ""

Thanks!

I’m now using this directive for the node_uri reverse proxy. I’m adding it here in case someone else stumbles onto the same broken redirect issue as I did.

Header edit Location "^https?://(localhost|127\.[0-9]+\.[0-9]+\.[0-9]+)(:[0-9]+)?" ""

Any Location URL that points to localhost or a loopback IP gets its protocol, host, and port removed, leaving only the relative URI behind. All other URLs should be left unchanged.
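For anyone who wants to check the pattern before touching their portal config, here is a rough Python sketch that runs it against a few sample Location values. Python's re module is standing in for Apache's regex engine (an assumption), and edit_location is a hypothetical helper name.

```python
import re

# Pattern from the directive above; only loopback origins are matched.
LOOPBACK = re.compile(
    r"^https?://(localhost|127\.[0-9]+\.[0-9]+\.[0-9]+)(:[0-9]+)?"
)


def edit_location(location: str) -> str:
    """Apply the loopback-only edit once (hypothetical helper)."""
    return LOOPBACK.sub("", location, count=1)


samples = [
    "http://127.0.0.1:8080/node/host/port/foo",  # loopback IP with port
    "http://localhost/node/host/port/foo",       # localhost, no port
    "https://www.ebi.ac.uk/ena/data/search",     # external site
]
for url in samples:
    print(url, "->", edit_location(url))
```

One caveat: the pattern does not anchor the end of the host name, so a host that merely starts with "localhost" (for example localhost.example.com) would have that prefix stripped as well. That is unlikely to matter in practice, but a trailing lookahead such as (?=/|$) might tighten it if needed.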

I will report back if I notice any breakage with this change.