I just upgraded OnDemand (from 1.3.7 to 1.8.20). However, when I access the portal after authenticating, I get:
Error – nginx: [emerg] bind() to unix:/var/run/ondemand-nginx/XXXXX/passenger.sock failed (98: Address already in use)
nginx: [emerg] bind() to unix:/var/run/ondemand-nginx/XXXXX/passenger.sock failed (98: Address already in use)
nginx: [emerg] bind() to unix:/var/run/ondemand-nginx/XXXXX/passenger.sock failed (98: Address already in use)
nginx: [emerg] bind() to unix:/var/run/ondemand-nginx/XXXXX/passenger.sock failed (98: Address already in use)
nginx: [emerg] bind() to unix:/var/run/ondemand-nginx/XXXXX/passenger.sock failed (98: Address already in use)
nginx: [emerg] still could not bind()
Among other things, I have tried completely removing the user’s folder under /var/run/ondemand-nginx and downgrading to 1.8.12. No luck so far.
Here is some data suggesting that two processes are competing for the socket file, which is probably the cause. One of them is running as root and the other looks like a child process running as the user.
Hi and welcome! What are the processes 3310 and 3310? I take it they’re nginx processes, but I’m wondering if they’re defunct or running.
When you complete step 2 (/opt/ood/nginx_stage/sbin/nginx_stage nginx_clean --user=$USER), I would check whether these processes are still up. Also be sure it removes the socket file.
[root@faa43c1fc202 ~]# sudo /opt/ood/nginx_stage/sbin/nginx_stage nginx_clean --user jeff
[root@faa43c1fc202 ~]# ls /var/run/ondemand-nginx/jeff/ -lrt
total 0
[root@faa43c1fc202 ~]# ps -elf | grep nginx
0 S root 636 431 0 80 0 - 2297 - 17:59 pts/0 00:00:00 grep --color=auto nginx
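The checks in the transcript above can be wrapped in a few small shell functions. This is a minimal sketch, not part of nginx_stage: the function names are made up for illustration, the socket path is taken from the error message, and it assumes `ps` (procps) and `ss` (iproute2) are installed.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names below are hypothetical, not part of nginx_stage.

# Report whether a passenger.sock is left behind for one user.
# run_root defaults to the directory named in the error message.
check_pun_socket() {
  local ood_user="$1"
  local run_root="${2:-/var/run/ondemand-nginx}"
  local sock="${run_root}/${ood_user}/passenger.sock"
  if [ -S "$sock" ]; then
    echo "socket present: ${sock}"
  else
    echo "no socket: ${sock}"
  fi
}

# List nginx processes with their state. A STAT starting with Z marks a
# defunct process. The [n] stops grep from matching its own command line.
list_nginx_procs() {
  ps -eo pid,ppid,user,stat,args | grep '[n]ginx' || echo "no nginx processes"
}

# Show which processes, if any, are listening on the given socket path.
# Run as root to see processes owned by other users.
socket_listeners() {
  ss -xlp | grep -F -- "$1" || echo "nothing listening on $1"
}
```

After running nginx_clean, something like `check_pun_socket jeff; list_nginx_procs; socket_listeners /var/run/ondemand-nginx/jeff/passenger.sock` should show whether anything is still holding the socket.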
I confirmed that even if you force kill the nginx master process, it’s smart enough to remove the socket file. But even if it doesn’t, you can try to force kill the nginx master (and any child processes, if they exist) and force remove the socket file.
[root@faa43c1fc202 ~]# ps -elf | grep nginx
5 S root 670 1 0 80 0 - 23026 - 18:00 ? 00:00:00 nginx: master process (jeff) -c /var/lib/ondemand-nginx/config/puns/jeff.conf
5 S jeff 671 670 0 80 0 - 26668 - 18:00 ? 00:00:00 nginx: worker process
0 S root 727 431 0 80 0 - 2297 - 18:00 pts/0 00:00:00 grep --color=auto nginx
[root@faa43c1fc202 ~]# kill 670
[root@faa43c1fc202 ~]# ps -elf | grep nginx
0 S root 741 431 0 80 0 - 2297 - 18:01 pts/0 00:00:00 grep --color=auto nginx
[root@faa43c1fc202 ~]# ps -elf | grep jeff
4 S jeff 1 0 0 80 0 - 2974 - 17:51 ? 00:00:00 /bin/bash /entrypoint.sh
4 S jeff 420 0 0 80 0 - 3007 - 17:52 pts/0 00:00:00 /bin/bash
0 S root 743 431 0 80 0 - 2297 - 18:01 pts/0 00:00:00 grep --color=auto jeff
[root@faa43c1fc202 ~]# ls /var/run/ondemand-nginx/jeff/ -lrt
total 0
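If the socket file does get left behind, the force kill and force remove mentioned above could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, the process title pattern is copied from the ps output above, and it needs to run as root with `pkill` available.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical. Run as root.

# Force kill the per-user nginx master and any workers it left behind.
# The pattern mirrors the process title in the ps output above,
# e.g. "nginx: master process (jeff)".
kill_pun() {
  local ood_user="$1"
  pkill -KILL -f "nginx: master process \(${ood_user}\)" || true
  pkill -KILL -u "$ood_user" -f "nginx: worker process" || true
}

# Remove a leftover socket file so the next bind() can succeed.
remove_stale_socket() {
  local ood_user="$1"
  local run_root="${2:-/var/run/ondemand-nginx}"
  local sock="${run_root}/${ood_user}/passenger.sock"
  if [ -e "$sock" ]; then
    rm -f -- "$sock" && echo "removed ${sock}"
  else
    echo "nothing to remove at ${sock}"
  fi
}
```

For the user in the transcript this would be `kill_pun jeff; remove_stale_socket jeff`. Only remove the socket once no nginx process for that user is left, otherwise a running PUN loses its listener.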
I suspect this is caused by some resource constraint. Here are steps to debug and try to get an strace of nginx starting up.
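One way to capture such a trace is sketched below. It assumes strace is installed, that `nginx_stage pun --user` is the command that starts the per-user nginx on your install, and that /tmp/pun-strace.log is an acceptable place for the output. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: trace_pun_start and find_bind_failures are hypothetical helpers.

# Start the user's PUN under strace, following child processes (-f) and
# writing the trace to a file (-o). Run as root.
trace_pun_start() {
  local ood_user="$1"
  local out="${2:-/tmp/pun-strace.log}"
  strace -f -o "$out" /opt/ood/nginx_stage/sbin/nginx_stage pun --user "$ood_user"
}

# Pull the failing bind() calls out of a trace file.
find_bind_failures() {
  grep -E 'bind\(.*EADDRINUSE' "$1"
}
```

Running `trace_pun_start jeff` followed by `find_bind_failures /tmp/pun-strace.log` should show which bind() calls fail, and the lines just before them in the trace may hint at what else touched the socket.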