We have a weird problem. We have spent the last few months setting up Open OnDemand (2.0): customizing, branding, adding applications, etc. It seems to work fine for almost all users.
Two users seem to have the same problem: all the customization, branding, and applications
disappear, and they see a default (vanilla) version of Open OnDemand. More specifically:
1) One user used it a few months ago without problems, and the problem appeared a few days ago;
2) The other started using Open OnDemand a few days ago and has had this problem from the start.
We checked all the logs, but there are no errors, and authentication works fine.
We tried a few things without success:
- Deleted the ondemand folder in their home folder;
- Deleted their files in /var/lib/ondemand-nginx/config/puns/ on the OnDemand server;
- Emptied the web browser cache, tried different browsers, tried private browser tabs, etc.
I checked the permissions of the files and folders for apps etc., but nothing seems wrong there.
Any help will be highly appreciated.
I think we’re going to have to see the browser’s dev tools, specifically the console tab. If it were a server-side issue, it would affect many more people. But since it’s just one or two, there must be something on their side. I’m thinking the browser is pulling these assets down (the CSS and so on) but somehow failing to apply them. Any errors should be in the browser’s console. A screenshot of the actual page may help too.
Sorry for the delayed reply. We had to wait for the users to send us the screenshots, as we cannot reproduce this problem with test accounts. This is how it looked, plus the console.
Also, in the Clusters menu the name of the cluster does not appear for shell access.
For comparison, this is what a (working) user sees:
Thanks for the help.
Got it, thanks. It does look like it’s all server side. When you said “customization & branding” I was thinking about CSS and the navbar color, not missing pinned apps.
So, OK, this is actually a bit easier than if it weren’t all server side. My guess is they’re not able to access those apps. Do you have them behind some FACL permissions? My guess is that the user literally cannot read the files on the system drive.
If you’re able to impersonate them with a command like this, hop on your machine and see if they can see those app files.
sudo -u foop ls -lrt /var/www/ood/apps/sys/jupyter
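To check several paths in one go, a small wrapper along these lines might help. It is only a sketch: the function name report_readable is made up for illustration and is not part of Open OnDemand, and it tests access for whoever runs it, so it has to be run as the affected user.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Open OnDemand): for each given path,
# print whether the user running this shell can read it.
# A path that does not exist is also reported as DENIED.
report_readable() {
  local path
  for path in "$@"; do
    if [ -r "$path" ]; then
      printf 'OK      %s\n' "$path"
    else
      printf 'DENIED  %s\n' "$path"
    fi
  done
}
```

With the function loaded in a bash session, something like `sudo -u foop bash -c "$(declare -f report_readable); report_readable /var/www/ood/apps/sys/jupyter"` would run it as that user.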
Actually, it is kind of both, and partial:
- Applications: not appearing. The permissions are quite normal, as instructed in the manual. I tried the command you sent and it works fine; he can see the files. The applications are world readable;
- “customization & branding”: although the colouring and the icons appear as normal on the user’s screen, the “alma_remote_access” does not appear, the active jobs throw an error, and in the job builder the templates are not working. Also, the customization paths do not appear.
I should also emphasize that right now we have around 15 “early adopters” and only two of them have this problem. None of the other users have any problem.
OK - How about the cluster.d files? Can they read those?
What’s the active jobs error? You should be able to pull those from
Yes, he can read the cluster.d files.
Sorry, there is no active jobs error; the list is simply empty (even when showing all jobs).
I asked for a few more pictures:
This is from the Job Composer:
The log did not show any errors: normal output plus some warnings about deprecated variables. Let me know if you want me to attach it.
If anyone has any idea how to resolve or investigate this, it would be highly appreciated.
We also have trouble reproducing it, as it only appears for a small number of specific users.
So we actually need to ask the users to log in to check whether a remedy works or not.
They can’t or didn’t read those files, or your clusters have group ACLs applied to them. Can you spot check your cluster.d files to be sure they don’t have group blocklists or allowlists?
That’s what’s causing this issue, I’m sure: either they cannot read these files at all or there’s an allowlist they’re not a part of. If it works for other folks, then you have to determine why these folks are different.
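One quick way to do that spot check is to list the cluster config files that contain an active acls key. This is a rough sketch, not an official tool: the function name list_cluster_acls is invented here, the directory is passed in as an argument (on a standard install the cluster configs live under /etc/ood/config/clusters.d), and it only looks for an uncommented `acls:` line without parsing the YAML.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the cluster config files in a directory
# that contain an uncommented "acls:" key.
list_cluster_acls() {
  local dir=$1
  # -l prints only the names of matching files. The pattern requires
  # "acls:" to be the first non-blank text on the line, so lines such
  # as "# acls:" are skipped. Errors from unmatched globs are hidden.
  grep -lE '^[[:space:]]*acls:' "$dir"/*.yml "$dir"/*.yaml 2>/dev/null || true
}
```

For example, `list_cluster_acls /etc/ood/config/clusters.d` prints nothing if no cluster file has an acls section.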
There should be a crontab entry to stop PUNs that are inactive. I’d make sure that’s running and maybe stop theirs. Maybe something in the user initialization process didn’t work?
Thanks for the help. We managed to find the problem in our cluster.d configuration files. I am sharing our configuration in case it is useful for other users:
We set up our Dex with LDAP and a filter on the AD group for our HPC users (memberOf=…).
Then, in our cluster.d configuration file, we set the acls to our group (admins), forgetting the earlier setup in Dex. Although one would expect only the admins to be allowed to log in, it seems to have a weird effect: all users in the AD group in Dex can log in, but some of them (seemingly at random, whether or not they belong to the admin group) see the minimal setup shown above.
We commented out the acls section in the cluster.d configuration file entirely, and now everything seems to work fine for the users.
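For anyone checking a similar setup, comparing an affected user's groups with the group named in the acls section is a quick test. The sketch below is only an illustration: group_in_list is a made-up helper, and it assumes group names contain no spaces, which may not hold for AD groups.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a group name appears in a
# space-separated group list, such as the output of `id -Gn USER`.
# Assumption: group names contain no spaces.
group_in_list() {
  local wanted=$1 list=$2 name
  for name in $list; do
    [ "$name" = "$wanted" ] && return 0
  done
  return 1
}
```

On the OnDemand server, `group_in_list admins "$(id -Gn foop)" && echo "in acls group" || echo "not in acls group"` shows which side of the ACL a given user falls on.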
Thanks for the help! Open OnDemand is great!