We’ve set up a VS Code (code-server) app on our OOD instance (big thanks to NMSU for sharing their code). When code-server runs in a Slurm job, what’s a reasonable number of tasks and/or CPUs/task?
I’m aware that VS Code can spawn a lot of subprocesses, so I’m wondering if users can see some benefits with > 1 task and/or CPU/task.
Thanks for posting the question. I am not sure what you mean by “task”, but maybe that’s a Slurm thing.
Here’s the configuration we use at OSC:
Ultimately, though, this is going to be site dependent in my view. Some sites will have far more users running code-server than others, and what those users are doing in code-server also affects resource usage. So getting a feel for your own site is probably the best advice you could get.
I would say that using 1 CPU for code-server seems like a bad idea given how resource-intensive it can be. If the workload is relatively light and the cluster has plenty of available resources, then 2 CPUs per task should be sufficient. But if the workload is heavier or the cluster is more limited in terms of resources, then 4 CPUs per task might be more appropriate.
In a Slurm job, a task is a process. Slurm has the separate concept of CPUs/task, which is used for multi-threaded apps. So here’s what you would request for different kinds of programs:
Multi-process, single-threaded: m tasks, 1 CPU/task.
Single-process, multi-threaded: 1 task, n CPUs/task.
Multi-process, multi-threaded: m tasks, n CPUs/task.
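As a rough sketch of the mapping above, here is a small Python helper that turns a process count and a per-process thread count into the corresponding Slurm options. The helper name `slurm_flags` and the example counts are made up for illustration; only `--ntasks` and `--cpus-per-task` are real Slurm options.

```python
def slurm_flags(processes, threads_per_process):
    """Map a program shape to Slurm options (hypothetical helper).

    processes           -> --ntasks         (one Slurm task per process)
    threads_per_process -> --cpus-per-task  (one CPU per thread in a task)
    """
    if processes < 1 or threads_per_process < 1:
        raise ValueError("both counts must be at least 1")
    return [
        f"--ntasks={processes}",
        f"--cpus-per-task={threads_per_process}",
    ]


# Example shapes; the counts 4 and 8 are arbitrary illustrations.
shapes = [
    ("multi-process, single-threaded", 4, 1),
    ("single-process, multi-threaded", 1, 8),
    ("multi-process, multi-threaded", 4, 8),
]
for label, m, n in shapes:
    print(f"{label}: {' '.join(slurm_flags(m, n))}")
```

The resulting strings can be passed to sbatch or srun, or written as #SBATCH lines in a job script.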
So I’m not sure if VS Code is primarily multi-process or multi-threaded. My intuition (just based on running it on my Mac with the Activity Monitor) is that it’s spawning a bunch of single-threaded subprocesses. So I think I should request multiple tasks in my Slurm job, each with 1 CPU/task.
I did some really quick-and-dirty profiling in the Slurm job. It looks like requesting multiple tasks (--ntasks=whatever) and 1 CPU/task (--cpus-per-task=1) gives appropriate CPU utilization. In the job, there are a bunch of subprocesses spawned by code-server. At most ntasks cores are active at any time, with the subprocesses context-switching in and out.
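For anyone who wants to repeat this kind of check, here is a minimal Python sketch, not the exact method used above. It assumes a Linux compute node with the procps `ps` command, and it assumes the code-server processes show up with `node` in their command name (code-server runs on Node.js, but the name may differ at your site). The function names are hypothetical.

```python
import os
import subprocess


def parse_ps(text, name_filter="node"):
    """Parse `ps -eo pid,nlwp,comm` output into (pid, threads, command) tuples.

    name_filter is an assumption: it keeps only commands containing "node".
    """
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # skip the header and any malformed lines
        pid, nlwp, comm = parts
        if name_filter in comm:
            rows.append((int(pid), int(nlwp), comm.strip()))
    return rows


def snapshot(name_filter="node"):
    """Run ps once and return the matching processes with their thread counts."""
    result = subprocess.run(
        ["ps", "-eo", "pid,nlwp,comm"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout, name_filter)


def allowed_cpus():
    """Number of CPUs this process may run on (Linux-only call)."""
    return len(os.sched_getaffinity(0))
```

Calling `snapshot()` and `allowed_cpus()` from a terminal inside the job shows how many matching processes exist and how many CPUs they can share. Note that a thread count above 1 does not mean all of those threads are busy, and `allowed_cpus()` only reflects the Slurm allocation if the site confines jobs with task affinity or cgroups.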