Question about the reverse proxy when compute node hostnames are unresolvable from the OOD portal server

Hi team,

I have a question about the reverse proxy. I am integrating multiple AWS ParallelClusters into one OOD portal running in AWS (but not in the same VPC). Each AWS pcluster has its compute nodes under its own domain (from Route 53). For example, pcluster1 has compute nodes like ondemand-m7i2xlarge-dy-node-[1-100] under the domain ood1.pcluster, and another cluster has ondemand-m7i2xlarge-dy-node-[1-100] under the domain ood2.pcluster.

The OOD portal can connect to both pclusters, and the remote desktop job is submitted and starts in the Slurm job queue on the pclusters. I can confirm that the VNC jobs have started and that the VNC port is open. But I only see the noVNC logo and then hit the “failed to connect to server” error.

I assume the problem is the reverse proxy, and I am confused after thinking about how it works with multiple pclusters. Assume the OOD portal server is running on 192.1.1.1, with one compute node, say ondemand-m7i2xlarge-dy-node-1.ood1.pcluster (with IP 10.1.1.1), and another node, say ondemand-m7i2xlarge-dy-node-1.ood2.pcluster (with IP 10.2.2.2). What node/rnode should I use here? It is not realistic to guarantee that the OOD portal server will search all the Route 53 domains (ood1.pcluster and ood2.pcluster in the example). In fact, we are supposed to run the OOD portal in a different VPC, and none of the oodX.pcluster domains can be resolved from the OOD portal. So the short name ondemand-m7i2xlarge-dy-node-1 would not be a good choice for node/rnode? I tried using the IP address (10.1.1.1 and 10.2.2.2 are example IPs, and they are directly accessible from the OOD portal here).

Any suggestions or guidance on setting the node/rnode name when it cannot be resolved by the OOD portal server, as in the example above?
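For anyone debugging the same symptom, one way to confirm from the portal host which names actually resolve is a small check like the following. This is only a sketch: the function name `resolve_all` and its `resolver` parameter are made up for illustration, and it says nothing about how OOD itself looks up hosts.

```python
import socket


def resolve_all(hostnames, resolver=socket.gethostbyname):
    """Map each hostname to its IP address, or None if it does not resolve.

    `resolver` defaults to the system resolver; it is a parameter only so
    the function can be exercised without real DNS.
    """
    results = {}
    for name in hostnames:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror (name lookup failure) is a subclass of OSError.
            results[name] = None
    return results
```

Calling `resolve_all` on the portal host with both the short name (ondemand-m7i2xlarge-dy-node-1) and the fully qualified names under ood1.pcluster and ood2.pcluster shows which form, if any, the portal can resolve.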

Thanks in advance!
Tao

I’m not 100% sure how AWS networking works, but surely you can open network connectivity even if they’re not in the same VPC. You’d just need some sort of routing rules between the OOD portal host and the compute nodes, and possibly some DNS configuration or a forwarder. Googling “aws dns resolution between vpcs” seems to turn up some results, with this one being the official AWS documentation.

Hi Jeff, thanks for the reply. I have solved the problem (though not perfectly). I am posting this for reference in case anyone has a similar issue.
Network connectivity is not the problem; name resolution is. ParallelCluster requires compute node hostnames to live under Route 53, and those names are only visible within the cluster's own VPC. We added named to the OOD portal server and forward DNS requests, based on the ParallelCluster domain name, to the corresponding DNS server in each VPC. With that in place, the VNC connection from the OOD portal to the compute node works. The prerequisites are: 1) each ParallelCluster has its own domain name (no duplicate ParallelCluster names), and 2) we need to work out the mapping between each VPC and its DNS server.
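The per-domain forwarding described above could be expressed as BIND forward zones. The thread does not include the actual named.conf, so the following is only a sketch that renders such stanzas from a domain-to-DNS-server map; the function name `render_forward_zones` and the forwarder addresses in the usage note are hypothetical.

```python
import ipaddress


def render_forward_zones(zone_forwarders):
    """Render BIND forward-zone stanzas from {domain: [dns_server_ip, ...]}."""
    stanzas = []
    for zone, servers in sorted(zone_forwarders.items()):
        if not servers:
            raise ValueError(f"no DNS servers given for zone {zone!r}")
        # ipaddress.ip_address raises ValueError on a malformed address.
        addrs = [str(ipaddress.ip_address(s)) for s in servers]
        forwarders = " ".join(f"{a};" for a in addrs)
        stanzas.append(
            f'zone "{zone}" {{\n'
            f"    type forward;\n"
            f"    forward only;\n"
            f"    forwarders {{ {forwarders} }};\n"
            f"}};\n"
        )
    return "\n".join(stanzas)
```

For example, `render_forward_zones({"ood1.pcluster": ["10.1.0.10"], "ood2.pcluster": ["10.2.0.10"]})` yields one zone block per cluster domain, where 10.1.0.10 and 10.2.0.10 stand in for whatever DNS server is reachable in each VPC.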

Thanks for your help, Jeff!


Tao, sorry for the late reply. I have been traveling.

You have the right idea. Each ParallelCluster uses “cluster_name.pcluster” as the suffix for its DNS names. For the OOD servers to resolve the compute node DNS names, you need to add a Route 53 DNS resolver in the OOD VPC. This is very similar to your workaround.