Feasibility of Open OnDemand

Hey everyone! :slight_smile:
I am new here and just have one quick question to see if Open OnDemand might be a good fit for something I am working on:

I would like to host a large language model (LLM) on our HPC infrastructure and make it accessible to multiple users through a front-end web app. The idea is to have “one” shared model running (ideally on a GPU), instead of each user spinning up their own instance and burning extra compute.

Is it possible to build something like this using Open OnDemand where users interact with the same backend model via a shared app?

Thanks in advance!

Conceptually, yes, it’s possible, but it really depends more on your underlying system and configuration than on anything Open OnDemand specific.

Whenever somebody asks whether Open OnDemand can do X, my default answer is: can a knowledgeable user on your system do X with some combination of existing software, workflows, and configuration? If so, then yes, Open OnDemand can do X.

In this particular situation, the solution boils down to: how would you naturally allow multiple users to access a single running job / process? There are a variety of ways I could see this happening, depending on which resource manager you use and how the system is configured. For example, maybe the job runs under a community account that everyone has sudo access to. Or maybe you primarily interact with the process via files, in which case group permissions could be set appropriately on those files (see the sketch below). And so on.
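To make the file-based option concrete, here's a minimal sketch in Python, assuming a hypothetical setup where the single shared model job watches a group-writable spool directory for prompt files. The directory path, file suffix, and group setup are all placeholders for whatever your site would actually use, not anything Open OnDemand provides:

```python
# Minimal sketch: a per-user front end drops prompt files into a spool
# directory that the single shared model job polls. Everything below
# (paths, suffixes, the group arrangement) is hypothetical.
import os
import stat
import tempfile

SPOOL_DIR = "/scratch/llm-shared/inbox"  # hypothetical group-writable dir

def submit_prompt(prompt: str) -> str:
    """Write a prompt file where the shared model job can pick it up."""
    # mkstemp gives each request a unique filename, so concurrent
    # submissions from different users don't clobber each other.
    fd, path = tempfile.mkstemp(suffix=".prompt", dir=SPOOL_DIR)
    with os.fdopen(fd, "w") as handle:
        handle.write(prompt)
    # Make the file group-readable so the community account running the
    # model job can read it; setting the setgid bit on SPOOL_DIR itself
    # would make new files inherit the shared group automatically.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP)
    return path
```

The shared job would then read each prompt, run inference once on the one GPU-resident model, and write the response somewhere the submitting user can read it back; the same group-permission logic applies in the other direction.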


Thank you, Alan, for the detailed reply. :slight_smile: