Good afternoon. Two users in a large environment recently started experiencing 502 proxy errors. Is there a specific place I can check for more details about what is causing the errors?
I have checked the following so far:
/var/log/httpd
ondemand_exporter_access.log
ssl_error_log
ondemand_exporter_error.log
error_log
access_log
/var/log/ondemand-nginx
I also checked the user subdirectories in this location.
The strange thing is that the users mentioned these errors after opening a console session while opening ood. Any advice would be appreciated. I can post logs if that would help. Thanks.
/var/log/ondemand-nginx/$USER/error.log would be the best bet here, I think. Then you may want to do a ps check on the host to find their PUNs and see whether the processes are defunct.
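For anyone following along, here is a rough sketch of what that ps check could look like if scripted. It is only an illustration under some assumptions: the helper names (`parse_ps`, `pun_processes`, `list_pun_processes`) are made up for this example, and it assumes a PUN can be recognized by "nginx" appearing in the command line together with the username, either as the process owner or somewhere in the arguments (the master process typically runs as root).

```python
import subprocess
from typing import List, NamedTuple


class Proc(NamedTuple):
    pid: int
    owner: str
    stat: str
    args: str

    @property
    def defunct(self) -> bool:
        # A process state beginning with "Z" is a zombie (defunct) process.
        return self.stat.startswith("Z")


def parse_ps(text: str) -> List[Proc]:
    """Parse the output of `ps -eo pid,user,stat,args` (header line included)."""
    procs = []
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        pid, owner, stat, args = parts
        procs.append(Proc(int(pid), owner, stat, args))
    return procs


def pun_processes(procs: List[Proc], user: str) -> List[Proc]:
    """Assumption: a PUN process mentions nginx and is owned by, or names, the user."""
    return [p for p in procs
            if "nginx" in p.args and (p.owner == user or user in p.args)]


def list_pun_processes(user: str) -> List[Proc]:
    # Note: ps may truncate long usernames in the USER column.
    out = subprocess.run(["ps", "-eo", "pid,user,stat,args"],
                         capture_output=True, text=True, check=True).stdout
    return pun_processes(parse_ps(out), user)
```

Calling `list_pun_processes("someuser")` on the web host and checking each result's `defunct` flag and PID should show whether an old PUN is still hanging around. The substring match on the username is deliberately loose, so review the results by eye before acting on them.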
You should likely kill it, yes. There’s a crontab entry that should have stopped it, but I guess it didn’t. In any case, there’s a command to stop it so that you don’t have to find its process ID and kill it manually.
Thanks so much. Trying now. I appreciate your help.
“nginx-stage” does not seem to exist on my device. I do have an /etc/ood/config/nginx-stage.yml.
Can I just manually kill the process using the kill command?
*reread your last message. I will kill it manually and have the user retry*
Welp. The user is now getting: “The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.”
Thanks Jeff. Is there a process I need to start now?
You’re now getting a 500 server error instead of a 502. You can look at the same ondemand-nginx logs for that as well.
You (as the admin) shouldn’t need to start the PUN. But it is much safer to use nginx_stage to stop the PUN and its children. If you used kill <PID> to stop the PUN, you likely left a child behind or killed the wrong process.
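As a hedged illustration of the nginx_stage route, the sketch below builds the command rather than hard-coding a kill. The install path, the `nginx` subcommand with `--user` and `--signal`, and the list of accepted signals are my understanding of a default install and should be checked against `nginx_stage --help` on your host. The function names are hypothetical, and the command needs to be run as root.

```python
import subprocess
from typing import List, Optional

# Assumed default install location; adjust if nginx_stage lives elsewhere.
NGINX_STAGE = "/opt/ood/nginx_stage/sbin/nginx_stage"

# Signals believed to be accepted by the `nginx` subcommand (verify with --help).
VALID_SIGNALS = ("stop", "quit", "reopen", "reload")


def pun_command(user: str, signal: Optional[str] = None) -> List[str]:
    """Build the nginx_stage command for one user's PUN.

    With no signal, the command asks nginx_stage to start the PUN.
    """
    cmd = [NGINX_STAGE, "nginx", "--user", user]
    if signal is not None:
        if signal not in VALID_SIGNALS:
            raise ValueError(f"unsupported signal: {signal}")
        cmd += ["--signal", signal]
    return cmd


def run_pun_command(user: str, signal: Optional[str] = None) -> int:
    """Run the command and return its exit code (requires root)."""
    return subprocess.run(pun_command(user, signal), check=False).returncode
```

For example, `run_pun_command("someuser", "stop")` would ask nginx_stage to shut down that user's PUN and its children, which avoids hunting for the right PID by hand.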
I was monitoring the error.log when the user tried and nothing showed up. I killed a root nginx process for the user as well as an nginx process for the user from the 12th.
Hey Jeff. I was able to get the 2nd user going using the nginx-stage command to kill their defunct job(s). Thank you. Is there anything I can do for the other user where I killed the processes manually?
I tried to reload and restart the user process with the nginx-stage command and got errors since I killed the processes. I just ran the command again without specifying a “signal” and I didn’t get an error. The user is going to test things. Thanks for your help and patience Jeff.