Reverse proxy response body substitution?

Problem statement:

  • URLs and paths in the response body don’t match the front-end URL structure.

Background:

Howdy,

We at Texas A&M are working on the Cryosparc app. The Cryosparc web server code has lots of hardcoded absolute paths in it that get returned in the body of a response. In particular, the prefix /<node uri>/<host>/<port> is not included, so the browser can’t reach the node. This affects, for example, the CSS/JS source files needed to display the page and the form action buttons that make Cryosparc do things. Neither /node nor /rnode works in this case, because the reverse proxy only modifies the request and response headers, not the body. There is no way to configure Cryosparc to use either a root URL prefix or relative paths to make it compatible. Apparently, the Cryosparc developers are too busy to be willing to address this issue. And finally, since Cryosparc is proprietary software, we can’t simply jump into the source code and fix it ourselves. Here’s a recent topic in their forum summarizing the situation.

A cursory glance at some Google results tells me that this type of problem is actually very common for people using reverse proxies. Ideally, you would like to have control over the source of your content, but sometimes you just don’t. Even if we got the Cryosparc source code changed to make it compatible with either /node or /rnode, the next proprietary software would just give us the same trouble again. I would rather be able to adapt on our side than rely on other people to write good code.

Potential solution:

Apache has a module, mod_substitute, which replaces matching text in the body of a response. Among its example use cases, the documentation even describes using it to fix this exact type of reverse proxy URL mismatch.

(Nginx has a module that does the same thing, ngx_http_sub_module, in case that matters.)

What if we created another node URI that’s pretty much the same as /rnode but additionally fixes all of the absolute paths in the response body by inserting the /<node uri>/<host>/<port> prefix? This would allow the browser to recursively fetch whatever it needs from the node.
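
To make the idea concrete, here is a rough Python sketch of the kind of substitution I have in mind. It is only an illustration of the rewrite, not an Apache configuration and not how mod_substitute is actually set up. The function name add_prefix, the /snode URI, and the host and port in the usage lines are all made up for the example. It only touches href, src, and action attributes whose values start with a single slash, so paths embedded elsewhere (for instance inside JavaScript strings) would need more rules.

```python
import re

# Matches href/src/action attributes whose quoted value is a root-relative
# path: it starts with exactly one "/" (so "//cdn..." and "https://..." are
# left alone). This attribute list is an assumption for the sketch.
_ATTR_PATTERN = re.compile(
    r"""(?P<attr>\b(?:href|src|action)\s*=\s*)"""
    r"""(?P<quote>["'])(?P<path>/(?!/)[^"']*)(?P=quote)""",
    re.IGNORECASE,
)


def add_prefix(body: str, prefix: str) -> str:
    """Insert `prefix` in front of root-relative paths found in `body`."""
    prefix = prefix.rstrip("/")

    def _rewrite(match: re.Match) -> str:
        path = match.group("path")
        # Skip paths that already carry the prefix, so the rewrite is
        # safe to apply more than once.
        if path == prefix or path.startswith(prefix + "/"):
            return match.group(0)
        quote = match.group("quote")
        return f"{match.group('attr')}{quote}{prefix}{path}{quote}"

    return _ATTR_PATTERN.sub(_rewrite, body)


# Hypothetical values: "/snode" stands in for the new node URI, and the
# host and port are placeholders.
example_prefix = "/snode/c001.example.edu/39000"
example_body = (
    '<link href="/css/app.css">'
    '<script src="//cdn.example.com/x.js"></script>'
    '<form action="/jobs/run"></form>'
)
rewritten = add_prefix(example_body, example_prefix)
```

With these placeholder values, the stylesheet link and the form action gain the prefix, while the protocol-relative script URL is left untouched.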

I am not a web developer, but I am pretty sure this just involves adding a small handful of Apache directives to the file ood-portal.conf.erb (and, of course, copying and adapting every reference to /rnode everywhere).

Am I thinking correctly? Could this work? Is it worth it?