Auth errors would be 400-level, like 401 Unauthorized or 403 Forbidden. Are you sure it’s an auth problem? Do your Apache logs indicate you’re unable to auth?
Your Per User Nginx (PUN) log file, /var/log/ondemand-nginx/$USER/error.log, is likely to have some info about this. I assume you can auth and you’re failing to start the PUN.
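One rough way to check whether requests are failing at auth (4xx) or further along (5xx) is to tally status classes per authenticated user from the Apache access log. The sketch below assumes the standard combined log format; the function name `tally_status` is made up for illustration, and the regex accepts typographic quotes only because forum software tends to convert pasted quotes.

```python
import re
from collections import Counter

# Apache "combined" access-log layout: ip, identd, user, [time], "request", status, ...
# Both straight and typographic quotes are accepted around the request field.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'["“](?P<request>[^"”]*)["”] (?P<status>\d{3}) '
)


def tally_status(lines):
    """Count (user, status class) pairs, e.g. ('hal9000', '5xx').

    The user is '-' when the request carried no authenticated user.
    Lines that do not look like access-log entries are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue
        status_class = match.group("status")[0] + "xx"
        counts[(match.group("user"), status_class)] += 1
    return counts
```

Passing an open access-log file to `tally_status` and printing the resulting counter would show, per user, how many responses were 4xx versus 5xx. A named user paired with 5xx suggests auth succeeded and something behind it failed.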
Thank you very much for your reply. I have checked the log files as you suggested, and there is no $USER folder for the account I used to sign in. Is there something wrong with SSL?
/var/log/httpd24/hal.aaaa.bbbb.cccc.dddd_access_ssl.log
192.168.20.202 - hal9000 [06/Oct/2019:09:37:23 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - hal9000 [06/Oct/2019:09:38:34 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - hal9000 [06/Oct/2019:09:38:36 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - - [06/Oct/2019:09:38:40 -0500] “GET / HTTP/1.1” 302 236 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - - [06/Oct/2019:09:38:40 -0500] “GET / HTTP/1.1” 302 236 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - hal9000 [06/Oct/2019:09:38:40 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - - [06/Oct/2019:09:39:00 -0500] “-” 408 - “-” “-”
192.168.20.202 - - [07/Oct/2019:05:57:59 -0500] “GET / HTTP/1.1” 302 236 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - hal9000 [07/Oct/2019:05:57:59 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - - [07/Oct/2019:05:58:19 -0500] “-” 408 - “-” “-”
192.168.20.202 - hal9000 [07/Oct/2019:06:18:37 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - hal9000 [07/Oct/2019:06:18:38 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - hal9000 [07/Oct/2019:06:20:35 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - hal9000 [07/Oct/2019:06:20:36 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - hal9000 [07/Oct/2019:06:20:49 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - - [07/Oct/2019:06:21:08 -0500] “-” 408 - “-” “-”
192.168.20.202 - - [07/Oct/2019:19:15:49 -0500] “GET / HTTP/1.1” 302 236 “-” “Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0”
192.168.20.202 - - [07/Oct/2019:19:15:49 -0500] “GET /pun/sys/dashboard HTTP/1.1” 401 381 “-” “Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0”
192.168.20.202 - dmu [07/Oct/2019:19:15:58 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0”
192.168.20.202 - - [07/Oct/2019:19:15:59 -0500] “GET /favicon.ico HTTP/1.1” 404 209 “-” “Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0”
192.168.20.202 - dmu [07/Oct/2019:19:26:12 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0”
192.168.20.202 - - [07/Oct/2019:19:26:12 -0500] “GET /favicon.ico HTTP/1.1” 404 209 “-” “Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0”
192.168.20.202 - dmu [07/Oct/2019:19:26:13 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0”
192.168.20.202 - - [07/Oct/2019:19:26:13 -0500] “GET /favicon.ico HTTP/1.1” 404 209 “-” “Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0”
192.168.20.202 - - [07/Oct/2019:19:26:15 -0500] “GET /favicon.ico HTTP/1.1” 404 209 “-” “Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0”
192.168.20.202 - dmu [07/Oct/2019:20:13:00 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0”
192.168.20.202 - - [08/Oct/2019:05:55:39 -0500] “GET / HTTP/1.1” 302 236 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - - [08/Oct/2019:05:55:39 -0500] “GET /pun/sys/dashboard HTTP/1.1” 401 381 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - - [08/Oct/2019:05:55:39 -0500] “GET / HTTP/1.1” 302 236 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - - [08/Oct/2019:05:55:39 -0500] “GET /pun/sys/dashboard HTTP/1.1” 401 381 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - dmu [08/Oct/2019:05:55:47 -0500] “GET /pun/sys/dashboard HTTP/1.1” 500 527 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
192.168.20.202 - - [08/Oct/2019:05:55:48 -0500] “GET /favicon.ico HTTP/1.1” 404 209 “https://hal.ncsa.illinois.edu:8888/pun/sys/dashboard” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36”
/var/log/httpd24/error_log
AH00558: httpd: Could not reliably determine the server’s fully qualified domain name, using 192.168.20.129. Set the ‘ServerName’ directive globally to suppress this message
[Sun Oct 06 09:38:31.136420 2019] [http2:warn] [pid 10371] AH10034: The mpm module (prefork.c) is not supported by mod_http2. The mpm determines how things are processed in your server. HTTP/2 has more demands in this regard and the currently selected mpm will just not do. This is an advisory warning. Your server will continue to work, but the HTTP/2 protocol will be inactive.
[Sun Oct 06 09:38:31.137065 2019] [lbmethod_heartbeat:notice] [pid 10371] AH02282: No slotmem from mod_heartmonitor
[Sun Oct 06 09:38:31.140872 2019] [mpm_prefork:notice] [pid 10371] AH00163: Apache/2.4.34 (Red Hat) OpenSSL/1.0.2k-fips configured – resuming normal operations
[Sun Oct 06 09:38:31.140899 2019] [core:notice] [pid 10371] AH00094: Command line: ‘/opt/rh/httpd24/root/usr/sbin/httpd -D FOREGROUND’
[Mon Oct 07 06:18:34.549671 2019] [mpm_prefork:notice] [pid 10371] AH00170: caught SIGWINCH, shutting down gracefully
[Mon Oct 07 06:18:34.659548 2019] [suexec:notice] [pid 24840] AH01232: suEXEC mechanism enabled (wrapper: /opt/rh/httpd24/root/usr/sbin/suexec)
AH00558: httpd: Could not reliably determine the server’s fully qualified domain name, using 192.168.20.129. Set the ‘ServerName’ directive globally to suppress this message
[Mon Oct 07 06:18:34.691233 2019] [http2:warn] [pid 24840] AH10034: The mpm module (prefork.c) is not supported by mod_http2. The mpm determines how things are processed in your server. HTTP/2 has more demands in this regard and the currently selected mpm will just not do. This is an advisory warning. Your server will continue to work, but the HTTP/2 protocol will be inactive.
[Mon Oct 07 06:18:34.691876 2019] [lbmethod_heartbeat:notice] [pid 24840] AH02282: No slotmem from mod_heartmonitor
[Mon Oct 07 06:18:34.695669 2019] [mpm_prefork:notice] [pid 24840] AH00163: Apache/2.4.34 (Red Hat) OpenSSL/1.0.2k-fips configured – resuming normal operations
[Mon Oct 07 06:18:34.695698 2019] [core:notice] [pid 24840] AH00094: Command line: ‘/opt/rh/httpd24/root/usr/sbin/httpd -D FOREGROUND’
[Mon Oct 07 06:20:42.284938 2019] [mpm_prefork:notice] [pid 24840] AH00170: caught SIGWINCH, shutting down gracefully
[Mon Oct 07 06:20:42.388483 2019] [suexec:notice] [pid 25022] AH01232: suEXEC mechanism enabled (wrapper: /opt/rh/httpd24/root/usr/sbin/suexec)
AH00558: httpd: Could not reliably determine the server’s fully qualified domain name, using 192.168.20.129. Set the ‘ServerName’ directive globally to suppress this message
[Mon Oct 07 06:20:42.418602 2019] [http2:warn] [pid 25022] AH10034: The mpm module (prefork.c) is not supported by mod_http2. The mpm determines how things are processed in your server. HTTP/2 has more demands in this regard and the currently selected mpm will just not do. This is an advisory warning. Your server will continue to work, but the HTTP/2 protocol will be inactive.
[Mon Oct 07 06:20:42.419207 2019] [lbmethod_heartbeat:notice] [pid 25022] AH02282: No slotmem from mod_heartmonitor
[Mon Oct 07 06:20:42.423370 2019] [mpm_prefork:notice] [pid 25022] AH00163: Apache/2.4.34 (Red Hat) OpenSSL/1.0.2k-fips configured – resuming normal operations
[Mon Oct 07 06:20:42.423404 2019] [core:notice] [pid 25022] AH00094: Command line: ‘/opt/rh/httpd24/root/usr/sbin/httpd -D FOREGROUND’
[Mon Oct 07 19:26:07.932577 2019] [mpm_prefork:notice] [pid 25022] AH00170: caught SIGWINCH, shutting down gracefully
[Mon Oct 07 19:26:08.042351 2019] [suexec:notice] [pid 1823] AH01232: suEXEC mechanism enabled (wrapper: /opt/rh/httpd24/root/usr/sbin/suexec)
AH00558: httpd: Could not reliably determine the server’s fully qualified domain name, using 192.168.20.129. Set the ‘ServerName’ directive globally to suppress this message
Sorry, $USER here is your username. For example, my logs show up in /var/log/ondemand-nginx/johrstrom/error.log, where johrstrom is my username. Looks like your username is hal9000, so /var/log/ondemand-nginx/hal9000/error.log may be your ticket.
I don’t think Apache would boot with bad SSL configs. I’m still guessing there’s something wrong in booting up your PUN; that’s where I’ve usually seen 500 errors. We really only use Apache as a proxy, and behind it is the PUN.
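To make the path substitution concrete, here is a small sketch that builds the per-user log path and returns its last few lines. The log root is the one quoted in this thread; the helper names `pun_error_log` and `tail_pun_log` are hypothetical, and a missing file is treated as "no PUN log yet" rather than as an error.

```python
from pathlib import Path

# Root directory as quoted in this thread; adjust if your install differs.
PUN_LOG_ROOT = Path("/var/log/ondemand-nginx")


def pun_error_log(username, root=PUN_LOG_ROOT):
    """Return the expected PUN error log path for a given username."""
    return Path(root) / username / "error.log"


def tail_pun_log(username, lines=20, root=PUN_LOG_ROOT):
    """Return the last `lines` lines of the user's PUN error log.

    Returns None when the log file does not exist, which may mean the
    PUN was never started for that user.
    """
    if lines <= 0:
        return []
    log = pun_error_log(username, root)
    if not log.is_file():
        return None
    text = log.read_text(encoding="utf-8", errors="replace")
    return text.splitlines()[-lines:]
```

Calling `tail_pun_log("hal9000")` on the web host (with permission to read the log directory) would either show the most recent PUN errors or return None, in which case the failure is probably happening before the PUN is launched.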
Thank you so much for your help. We have found the problem: a line was commented out in the OpenLDAP configuration file, so the service could not find the certificate. It is working now!
OK cool. Sorry to lead you on a wild goose chase with the PUN business. You had indicated an LDAP and SSL problem and I didn’t see either. Apache 500 errors could be all sorts of things, and clearly you were on to something, probably because you had just enabled it or were working on it. Sorry again for the confusion.
The idea is that you have wrapper scripts on the x86 machine that is running OnDemand: instead of installing the Slurm client binaries on that machine, you use ssh wrapper scripts to execute the Slurm binaries on another host (a login node?) of the ppc64le cluster. Of course, you would need to set up trust between the two so you could ssh without being prompted for authentication. Perhaps this would work.
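A minimal sketch of such a wrapper, assuming key-based or host-based ssh trust is already in place. The host name `login.example.org`, the remote directory `/usr/bin`, and the function names are placeholders for illustration, not values from this thread.

```python
import shlex
import subprocess

# Placeholder values: substitute a real login node of the ppc64le cluster
# and the directory where its Slurm binaries live.
REMOTE_HOST = "login.example.org"
REMOTE_BIN_DIR = "/usr/bin"


def build_ssh_command(binary, args, host=REMOTE_HOST, bin_dir=REMOTE_BIN_DIR):
    """Build the argv that runs a remote Slurm binary over ssh.

    Each argument is shell-quoted because ssh hands the remote command
    to a shell on the other side.
    """
    remote = " ".join(shlex.quote(part) for part in [f"{bin_dir}/{binary}", *args])
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # so a missing trust setup shows up as an error rather than a hang.
    return ["ssh", "-o", "BatchMode=yes", host, remote]


def run_remote(binary, args):
    """Run the remote binary, passing stdin/stdout/stderr through, and return its exit code."""
    return subprocess.run(build_ssh_command(binary, args)).returncode
```

One small script per Slurm command (for example sbatch, squeue, scancel) could then call `run_remote` with its own name and `sys.argv[1:]` and exit with the returned code, so callers on the OnDemand host see the same interface as the real binaries.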