Fork/thread/resource exhaustion (in the Files app?)

Background:

I’ve had issues where my OOD instance basically stops working for some users after a while (typically after a few days).
I really don’t have that many users, maybe 20 or so running sessions when it breaks. It typically only breaks for one or two users; others can continue to log in just fine.

It’s not easy to reproduce, and I’ve struggled to get the right information from my logs.
I’ve suspected for a while that it is due to some resource exhaustion on the OS, and on some occasions (not all) the errors clearly indicate as much, e.g.

ERROR: Cannot fork() a new process: Resource temporarily unavailable
ERROR: boost::thread_resource_error: Resource temporarily unavailable

I’m still not sure if it was caused by my setup being too limited or by some process running amok.
My OOD instance is running via podman, launched via a systemd service.
I’ve tried to ensure things like LimitNOFILE and ulimit aren’t too constrained, but I’m honestly not sure if I’ve missed something simple, because I’m not actually sure what is hitting the limit.

Even frantically clicking around in every app, I can’t manage to reproduce it intentionally myself.
Based on some of the logs I sometimes suspect the Files app, and I can break it by traversing to a directory with an enormous number of files, which results in the error

Error occurred when attempting to access /pun/sys/dashboard/files/fs//cephyr/NOBACKUP/Datasets/OpenImages-V6/images/train

and forces me to kill the hung process on the server node (which was stuck at 100% CPU), otherwise I can’t load the portal at all. That said, this symptom isn’t quite what users are seeing.

Question: Any tuning of OS resource limits?

Maybe I’m just missing some tuning, either for the OS itself, the container, or the systemd service launching it? I’m very open to suggestions!

Sorry for the trouble.

What is the output of ulimit when you go to check the OS resources? There’s also the option of something like sysctl to see if anything is set strangely for the filesystem there, e.g. the maximum number of open files.
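
For example, something along these lines on the host (these are standard kernel keys, nothing OOD-specific assumed):

# ulimit -a                                 # per-shell/per-process limits
# sysctl fs.file-max fs.nr_open fs.file-nr  # kernel-wide open-file limits and current usage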

The thing that also jumps out is the use of podman. There may be some resource limit inside the container itself that is being hit. When you start up podman, are you setting anything to do with memory or the ulimit there?
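
One non-invasive way to check (the container name here is a placeholder, substitute your own):

# podman exec -it <container> cat /proc/1/limits   # limits as seen by PID 1 inside the container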

Host is a Rocky 8.6 VM with 16 cores and 32 GB of RAM that just runs this one OOD instance. Barely any load or memory usage on the host ever.

Haven’t tuned anything special, so I guess pretty much default values all the way through.

On the OS itself:

# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127313
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 127313
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

The essentials of the compose file are nothing special, just a few settings and each mounted directory:

# cat ondemand-compose.yml 
version: "3.8"
services: 
  ondemand:
    image: localhost/ondemand:production
    restart: always
    hostname: roberto
    network_mode: host
    volumes:
      - /apps:/apps
       .... (etc.) 

I set nothing special when launching this, just a standard podman-compose invocation.

I manage it with a systemd service:

# cat /etc/systemd/ondemand.service 
[Unit]
Description=Open Ondemand
After=network.target

[Service]
LimitNOFILE=infinity
LimitNPROC=infinity
Restart=always

# Compose up
ExecStart=podman-compose -f /root/ood_configs/ondemand-compose.yml up

# Compose down, remove containers and volumes
ExecStop=podman-compose -f /root/ood_configs/ondemand-compose.yml down

[Install]
WantedBy=multi-user.target

Initially I didn’t have any of the Limit* variables set, but they get default values from systemd, so I tried cranking LimitNOFILE up (as I’ve seen that be a problem in other applications in the past).
But that didn’t help: within about a week some users were having issues again.
I’ve set them all the way to infinity now, but I won’t know if that makes a difference until users start complaining.
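
To at least confirm what the running service actually gets (assuming the unit is called ondemand.service, as above):

# systemctl show ondemand.service -p LimitNOFILE -p LimitNPROC
# cat /proc/$(systemctl show ondemand.service -p MainPID --value)/limits   # limits of the main process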

You can set the ulimits for a container. Do you know what the limits are inside the container?

I just checked on my own personal machine and for whatever reason the container had a higher limit than the OS.

[src(master)] 🐥  runit ubuntu:20.04
WARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers
root@6850b30c2581:/# cat /proc/self/limits | grep open
Max open files            4096                 4096                 files
root@6850b30c2581:/# exit
[src(master)] 🐺  cat /proc/self/limits | grep open
Max open files            1024                 4096                 files

In /etc/containers/containers.conf there’s this setting:

# A list of ulimits to be set in containers by default, specified as
# "<ulimit name>=<soft limit>:<hard limit>", for example:
# "nofile=1024:2048"
# See setrlimit(2) for a list of resource names.
# Any limit not specified here will be inherited from the process launching the
# container engine.
# Ulimits has limits for non privileged container engines.
#
#default_ulimits = [
#  "nofile=1280:2560",
#]

I’m not super familiar with podman-compose, but there’s also a --ulimit CLI flag to podman run. Again, I don’t know if you can pass it through podman-compose, but the conf setting should work for you.
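
As a sketch, if you were launching it by hand rather than through compose (the values here are just an example, and I’ve reused the image name from your compose file):

# podman run --ulimit nofile=65536:65536 localhost/ondemand:production   # soft:hard open-files limit; other options omitted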

Yes, I did also check the container, and saw the same higher limits there. Though since the OS itself had smaller limits, I didn’t quite trust those.

# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127313
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1048576
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4194304
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

I’m also not sure which limit I should be looking at, or how much I should expect OOD to require.
It’s definitely fork that is failing (I got another error a few days ago):

[Thu Feb  2 18:50:17 2023] cgroup: fork rejected by pids controller in /machine.slice/libpod-19978c9ae7cd8a1b54b0e89bcabfaf17e314a4481f01a34f416866334557c398.scope

I’ve added TasksMax=infinity to my service file now; I’ll have to leave it for a few weeks to see whether it helps (the default TasksMax for systemd services seems to be “80%”, which I have no idea how to interpret). If this fails, I’ll just start running it outside the service file to exclude any possibility that systemd is imposing the constraints.
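
For reference, the effective values can be read back from systemd (unit name as above):

# systemctl show ondemand.service -p TasksMax   # effective task limit for this unit
# systemctl show -p DefaultTasksMax             # systemd-wide default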

Well, systemd service was a red herring.

It’s definitely this limit I’m hitting: --pids-limit (see the Podman documentation), which defaults to only 2048 processes.

Unfortunately, it looks like both podman-compose and podman kube play ignore the resource-limits section. So much for using handy YAML files for all the bind mounts.

I’m very surprised I’m the only one to have encountered this, given OOD’s preference for forking.

I think running OOD in a container is advanced. Which is to say, few sites do this. (I’m guessing).

Can’t say I understand that; I find it to be by far the easiest (pretty much the only possible) way to set up a somewhat complex web application like this and be able to update it easily.

Anyway, I’ve hunted this down exhaustively now, so it’s my duty to document it for posterity:
podman defaults to a maximum of 2048 pids per container, which is unfortunately just enough to make OOD work long enough for these errors to be hard to diagnose. Look through dmesg | grep 'cgroup: fork rejected' to see if you have hit the limit.
You can also check the current and max pid counts inside your container, for example with:

# podman exec -it ood_container cat /sys/fs/cgroup/pids/pids.{max,current}
2048
232
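
(That path is for cgroup v1, which is what this Rocky 8 host uses. On a cgroup v2 host the equivalent files should sit directly in the container’s cgroup root, something like:)

# podman exec -it ood_container cat /sys/fs/cgroup/pids.max /sys/fs/cgroup/pids.current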

Unfortunately, increasing this max value is much harder than it should be. Only podman run supports the --pids-limit=xxx flag; it can’t be used with podman start for some reason.
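
So if you start the container directly rather than via compose, a sketch would be (image name taken from the compose file above, other options omitted):

# podman run -d --name ondemand --pids-limit 4096 localhost/ondemand:production
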
Creating a Kubernetes-style YAML file seemed promising:

spec:
  containers:
    - resources:
        limits:
          pids: 4096
...

but it does nothing, as it is just (quietly) ignored by podman kube play ood.yaml, and there isn’t any way of supplying a command line flag here either, despite this playbook approach being what they promote.

So maybe podman-compose, using the valid compose syntax for limits?

version: "3.8"
services: 
  ondemand:
    deploy:
      resources:
        limits:
          pids: 4096

This too is unfortunately just quietly ignored by podman-compose.
However, it does work with docker compose, if you assign it a positive value. The 0 or -1 that is supposed to mean max/unlimited does not work even with docker compose.

So, the only way left to solve this at all (as far as I can tell) is to set a global option for podman via the containers.conf file:

# cat /etc/containers/containers.conf 
[containers]
pids_limit = 4096  # 0 = unlimited
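
Since the service file above does a compose down/up, a restart recreates the container with the new default, and the same check as before should confirm it (container name as in the earlier example):

# systemctl restart ondemand.service
# podman exec -it ood_container cat /sys/fs/cgroup/pids/pids.max   # should now report 4096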
