OOD AWS integrations

Can anyone brief me on the current status of AWS integrations for OOD? I would like to build better documentation around deploying on AWS and operating hybrid HPC.


Related to this, the AWS plugin for Slurm (aws-samples/aws-plugin-for-slurm on GitHub, plugin-v2 branch) creates auto-scaling cloud partitions in on-premises environments. The remaining challenge is data movement.

From a cloud-first perspective, AWS ParallelCluster boots complete HPC environments on AWS in minutes, and could be embedded as an app inside OOD. It could also be used for 1-click deployment of OOD (see, e.g., the GitHub repositories aws-samples/no-tears-cluster, "1-Click Cluster Deployment with AWS ParallelCluster", and aws-samples/1click-hpc, "Run your HPC Cluster on AWS with 1-Click").

Hi and welcome!

We have native support for a product called Cloudy Cluster, though their documentation only mentions GCP. Their images have started to ship with Open OnDemand baked in. We did build support for AWS on our side, but have only tested it on GCP (we use their abstraction layer for scheduling and managing instances).

Other than that, it’s basically the same as running on premises. You’d need LDAP (or some other user management scheme) and a scheduler like Slurm. OnDemand doesn’t manage infrastructure (and we don’t intend to), so whether the scheduler is in the cloud or on premises really doesn’t matter to us, as long as it accepts requests from the logged-in user (that is, that user’s Linux UID/GID).
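To make that concrete: OnDemand learns about a scheduler from a cluster definition file under /etc/ood/config/clusters.d/, and that file looks the same whether the Slurm controller runs on AWS or on premises. Here is a rough Python sketch that renders a minimal one. The function name, the title, the login host, and the /usr/bin path are placeholders I made up for illustration, so adjust them to your site; treat it as a starting point, not a tested deployment recipe.

```python
from textwrap import dedent


def render_cluster_config(title: str, login_host: str, slurm_bin: str = "/usr/bin") -> str:
    """Return the text of a minimal OnDemand cluster definition for Slurm.

    All arguments are site-specific placeholders (assumptions, not defaults
    shipped by OnDemand).
    """
    return dedent(f"""\
        ---
        v2:
          metadata:
            title: "{title}"
          login:
            host: "{login_host}"
          job:
            adapter: "slurm"
            bin: "{slurm_bin}"
        """)


if __name__ == "__main__":
    # Hypothetical values; the output would be saved as something like
    # /etc/ood/config/clusters.d/aws_cluster.yml on the OnDemand host.
    print(render_cluster_config("AWS Cloud Cluster", "head-node.example.com"))
```

The web node still needs working Slurm client commands and the same user accounts (matching UID/GID) as the cluster, which is the part OnDemand leaves to you.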

That said, we don’t really support cloud products like AWS CfnCluster or their equivalents on Azure and GCP (yet!). Hope that helps!