Problems trying to build from source using ansible role

dtenenba · May 9, 2022, 11:14pm

Hi,

I’m trying to build Open OnDemand from source because our cluster runs Ubuntu 18.04.

So I am using this ansible role: GitHub - OSC/ood-ansible: An ansible playbook for Open Ondemand

I changed the one occurrence of install_from_src: false to install_from_src: true.
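An alternative to editing the role's files is to override the variable on the command line, since extra vars take precedence over role defaults. A minimal sketch, assuming the playbook file is called playbook.yml (a guess, the thread never names it). The wrapper function `run_from_source` and the `ANSIBLE_PLAYBOOK` override are hypothetical and exist only so the command can be dry-run:

```shell
# Hypothetical wrapper: run the playbook with install_from_src forced to true.
# $1: playbook path (assumed default: playbook.yml)
run_from_source() {
    # The JSON form keeps the value a boolean; the key=value form
    # (-e install_from_src=true) would pass the string "true" instead.
    "${ANSIBLE_PLAYBOOK:-ansible-playbook}" "${1:-playbook.yml}" \
        --extra-vars '{"install_from_src": true}'
}
```

Setting `ANSIBLE_PLAYBOOK=echo` prints the arguments instead of running Ansible. This only addresses how the variable is set; it is not known to fix the errors shown below.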

I get this result when running a playbook that calls the role:

ERROR! no action detected in task. This often indicates a misspelled module name, or incorrect module path.

The error appears to have been in '/home/dtenenba/dev/ood-ansible/ood-ansible/tasks/deps.yml': line 57, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:


- name: install all the gems we need
  ^ here

Just for fun I commented out lines 57-65 of tasks/deps.yml and then got this:

TASK [ood-ansible : include scl related overrides] *****************************
fatal: [192.168.0.122]: FAILED! => {"msg": "The conditional check '(not install_from_src) and (ansible_os_family == \"RedHat\" and ansible_distribution_major_version < '8')' failed. The error was: error while evaluating conditional ((not install_from_src) and (ansible_os_family == \"RedHat\" and ansible_distribution_major_version < '8')): 'install_from_src' is undefined\n\nThe error appears to have been in '/home/dtenenba/dev/ood-ansible/ood-ansible/tasks/main.yml': line 8, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: include scl related overrides\n  ^ here\n"}
	to retry, use: --limit @/home/dtenenba/dev/ood-ansible/playbook.retry

Any ideas about these issues? I am very much an Ansible newbie but I think I figured out how to run a role from a playbook…

Thanks

gbyrket · May 10, 2022, 1:21pm

Hi Dan. Thanks for posting.

I will have to look into the specifics here. I’m not sure of the solution off the top of my head.

Thanks,
-gerald

jeff.ohrstrom · May 10, 2022, 3:16pm

You need the community.general collection from Ansible Galaxy (it provides the gem module). I opened a ticket on the repo about this.

dtenenba · May 10, 2022, 4:40pm

Thanks, not sure if I should comment here or in the GitHub issue, but here is what happens when I check whether I have this collection and then try to install it:

ubuntu@ood-build:~$ ansible-galaxy collection list
- the role collection was not found
ubuntu@ood-build:~$ sudo ansible-galaxy collection install community.general
- downloading role 'collection', owned by
 [WARNING]: - collection was NOT installed successfully: Content has no field
named 'owner'

ERROR! - you can use --ignore-errors to skip failed roles and finish processing the list.

jeff.ohrstrom · May 10, 2022, 4:48pm

I don’t think you want to use sudo here. You can install and run all of this as the ubuntu user. The playbook will occasionally raise privileges, but all the source materials (config, roles and so on) can be user-owned.

dtenenba · May 10, 2022, 4:56pm

Thanks for that. Turns out I needed a newer version of ansible.
The playbook/role is running right now. We shall see how far I get…
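The earlier "downloading role 'collection'" message is what an Ansible release without the `collection` subcommand prints. A minimal sketch of a pre-flight check, assuming the subcommand needs at least Ansible 2.9 (to my knowledge the release that added `ansible-galaxy collection install`; the role itself may need something newer). The helper names `version_ge` and `install_community_general` are hypothetical:

```shell
# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: install community.general for the current user,
# but only if the installed Ansible is new enough.
# $1: minimum Ansible version (assumed default: 2.9)
install_community_general() {
    min="${1:-2.9}"
    # First line looks like "ansible 2.9.6" or "ansible [core 2.12.5]".
    current=$(ansible --version | head -n 1 | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)
    if version_ge "$current" "$min"; then
        # No sudo: the collection lands in the user's own collections directory.
        ansible-galaxy collection install community.general
    else
        echo "Ansible $current is older than $min; upgrade Ansible first" >&2
        return 1
    fi
}
```

Nothing runs until `install_community_general` is called, so the file can be sourced safely.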

dtenenba · May 10, 2022, 5:21pm

OK, running the role/playbook ended with a 0 exit code and the following status:

PLAY RECAP *********************************************************************
127.0.0.1                  : ok=42   changed=35   unreachable=0    failed=0    skipped=27   rescued=0    ignored=1

There was also a failure (see below) but I am not sure if it is critical or not.

The next question is, what do I do to get ondemand running? Or should it already be running? Port 80 seems to be running a default installation of Apache (it shows the default apache home page). There is nothing on port 443.

If I run service ondemand status it shows a seemingly unrelated service:

$ sudo service ondemand status
● ondemand.service - Set the CPU Frequency Scaling governor
   Loaded: loaded (/lib/systemd/system/ondemand.service; enabled; vendor preset:
   Active: inactive (dead)
Condition: start condition failed at Tue 2022-05-10 16:37:33 UTC; 41min ago

So, assuming OOD was installed properly, how do I get it running?

Thanks.

This is the one failure I got:

RUNNING HANDLER [ood-ansible : update ood portal] ******************************
fatal: [127.0.0.1]: FAILED! => {"changed": true, "cmd": "/opt/ood/ood-portal-generator/sbin/update_ood_portal --force", "delta": "0:00:00.139320", "end": "2022-05-10 17:01:48.700051", "msg": "non-zero return code", "rc": 1, "start": "2022-05-10 17:01:48.560731", "stderr": "/usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': cannot load such file -- bcrypt (LoadError)\n\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'\n\tfrom /opt/ood/ood-portal-generator/lib/ood_portal_generator/dex.rb:4:in `<top (required)>'\n\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'\n\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'\n\tfrom /opt/ood/ood-portal-generator/lib/ood_portal_generator.rb:9:in `<top (required)>'\n\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'\n\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'", "stderr_lines": ["/usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': cannot load such file -- bcrypt (LoadError)", "\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'", "\tfrom /opt/ood/ood-portal-generator/lib/ood_portal_generator/dex.rb:4:in `<top (required)>'", "\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'", "\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'", "\tfrom /opt/ood/ood-portal-generator/lib/ood_portal_generator.rb:9:in `<top (required)>'", "\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'", "\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'"], "stdout": "", "stdout_lines": []}
...ignoring

jeff.ohrstrom · May 10, 2022, 5:31pm

What version did you install? ood_source_version looks to default to 2.0.9, which is fairly old, but if you set it to master it may be flaky. Because we’re still working on native Debian support, I haven’t updated the Debian pieces of that role for a bit.

dtenenba · May 10, 2022, 5:51pm

Yes, I just used the default, 2.0.9. I changed it to master and this happened:

TASK [ood-ansible : build the project (this will take some time)] **************
fatal: [127.0.0.1]: FAILED! => {"ansible_job_id": "956073475688.23890", "changed": true, "cmd": "rake build -mj$(nproc) > build.out 2>&1", "delta": "0:00:02.007572", "end": "2022-05-10 17:50:34.017597", "finished": 1, "msg": "non-zero return code", "rc": 1, "start": "2022-05-10 17:50:32.010025", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

PLAY RECAP *********************************************************************
127.0.0.1                  : ok=23   changed=8    unreachable=0    failed=1    skipped=3    rescued=0    ignored=0

This seems to be the relevant part of build.out:

[2/4] Fetching packages...

nokogiri-1.13.4-x86_64-linux requires ruby version < 3.2.dev, >= 2.6, which is
incompatible with the current version, ruby 2.5.1p57
rake aborted!
Command failed with status (5): [bin/bundle install --jobs 4 --retry 2 --wi...]
/tmp/ood-build/ondemand/lib/tasks/build.rb:20:in `block (4 levels) in <top (required)>'
/tmp/ood-build/ondemand/lib/tasks/build.rb:19:in `block (3 levels) in <top (required)>'
/tmp/ood-build/ondemand/lib/tasks/build.rb:17:in `each'
/tmp/ood-build/ondemand/lib/tasks/build.rb:17:in `block (2 levels) in <top (required)>'
/var/lib/gems/2.5.0/gems/rake-13.0.3/exe/rake:27:in `<top (required)>'
Tasks: TOP => build => build:all => build:dashboard => build:gems
(See full trace by running task with --trace)
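Because the build task redirects everything into build.out (`rake build ... > build.out 2>&1`), the Ansible output above has empty stdout and stderr. A small sketch for pulling the failure context out of that log. The function name `show_build_failure` is hypothetical, and the log path in the usage comment is an assumption based on the backtrace:

```shell
# Print the lines surrounding "rake aborted!" in a build log, with line numbers.
# $1: path to the log file
# $2: number of context lines before and after the match (default 5)
show_build_failure() {
    grep -n -B "${2:-5}" -A "${2:-5}" 'rake aborted!' "$1"
}

# Usage (path is an assumption, not confirmed in the thread):
# show_build_failure /tmp/ood-build/ondemand/build.out
```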

jeff.ohrstrom · May 10, 2022, 7:18pm

Can you try v2.0.23?

dtenenba · May 10, 2022, 8:17pm

Here is the output:

TASK [ood-ansible : clean up to ensure proper build] ***************************
fatal: [127.0.0.1]: FAILED! => {"changed": true, "cmd": "rake clean", "delta": "0:00:00.176850", "end": "2022-05-10 20:16:12.041558", "msg": "non-zero return code", "rc": 1, "start": "2022-05-10 20:16:11.864708", "stderr": "rake aborted!\nLoadError: cannot load such file -- bcrypt\n/tmp/ood-build/ondemand/lib/tasks/development.rb:6:in `block in <top (required)>'\n/tmp/ood-build/ondemand/lib/tasks/development.rb:3:in `<top (required)>'\n/tmp/ood-build/ondemand/Rakefile:17:in `<top (required)>'\n/var/lib/gems/2.5.0/gems/rake-13.0.3/exe/rake:27:in `<top (required)>'\n(See full trace by running task with --trace)", "stderr_lines": ["rake aborted!", "LoadError: cannot load such file -- bcrypt", "/tmp/ood-build/ondemand/lib/tasks/development.rb:6:in `block in <top (required)>'", "/tmp/ood-build/ondemand/lib/tasks/development.rb:3:in `<top (required)>'", "/tmp/ood-build/ondemand/Rakefile:17:in `<top (required)>'", "/var/lib/gems/2.5.0/gems/rake-13.0.3/exe/rake:27:in `<top (required)>'", "(See full trace by running task with --trace)"], "stdout": "", "stdout_lines": []}

PLAY RECAP *********************************************************************
127.0.0.1                  : ok=22   changed=7    unreachable=0    failed=1    skipped=3    rescued=0    ignored=0

jeff.ohrstrom · May 10, 2022, 9:15pm

I’ll look into it on my side.

jeff.ohrstrom · May 11, 2022, 12:58pm

I have sad news to report. 18.04 will never be a target for 2.1 and beyond.

2.1 just has higher dependencies than Ubuntu 18 has to offer.

It’s even too hard to backport all the packaging work we did for OOD 2.1 on Ubuntu 20.04 to OOD 2.0 on Ubuntu 18.04.

That said, my continuous integration does install from source on 18, so I’ll continue to look into that. Alternatively, I’d suggest getting an Ubuntu 20 VM, as it will make this instance of Open OnDemand much easier for you to maintain.

jeff.ohrstrom · May 11, 2022, 1:19pm

Also, regarding your issue with the playbook: I think running apt install ruby-bcrypt will fix it.

dtenenba · May 11, 2022, 5:42pm

OK, I will look into what’s involved in setting up a 20.04 machine.

In the meantime I have set up a 20.04 machine in the cloud so I can mess around and I had no trouble installing OOD from source. Now I have a dumb question - I don’t have a browser (or a graphical desktop on the machine) so I can’t access it at localhost via a browser. I went into /etc/ood/config/ood_portal.yml and changed the servername to the FQDN of the machine. But now I can’t figure out how to restart OOD so that it will stop redirecting to localhost. I tried restarting Apache and that didn’t do it. I’d prefer not to reboot as I may get a different IP address…

So how can I restart or otherwise cause the ood_portal.yml to be re-read?
Thanks
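For reference, restarting Apache alone does not re-read ood_portal.yml; the Apache config is generated from it by update_ood_portal. A minimal sketch, assuming the generator sits at the path shown in the earlier handler output and that the service is named apache2 (the Ubuntu default). The function name `reload_portal_config` and the `SUDO` override are hypothetical:

```shell
# Regenerate the Apache config from /etc/ood/config/ood_portal.yml,
# then restart Apache so the new config takes effect.
# Set SUDO=echo to print the commands instead of running them.
reload_portal_config() {
    ${SUDO:-sudo} /opt/ood/ood-portal-generator/sbin/update_ood_portal &&
    ${SUDO:-sudo} systemctl restart apache2
}
```

The role's own handler calls the generator with `--force`, as seen in the earlier failure output; that flag is left out here so hand edits to the generated file are not overwritten silently.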

jeff.ohrstrom · May 11, 2022, 5:57pm

Don’t install from source on Ubuntu 20. Even if you install the packages manually, running the playbook with the configure tag will still do what you want.

See this issue for how to install our nightly 2.1. It is a nightly version, but there are no known bugs with it.

You can follow this ticket if we do find a bug in nightly or there’s some breaking change (for you a breaking change would just mean the documentation is off).

github.com/OSC/ondemand

Features Currently in 2.1 Nightly

opened 07:11PM - 24 Jan 22 UTC

johrstrom

This ticket is meant to be a list of all the upcoming features in 2.1 that folks can refer to to see if there's anything they may want to check out early in the [nightly](https://yum.osc.edu/ondemand/nightly/) releases before the official 2.1 release. It will also indicate a list of things that break compatibility in the nightly releases.

Look at [the nightly milestone](https://github.com/OSC/ondemand/milestone/13) for bugs that you may run into.

## :star: Features

* Support for per-cluster filesystems in #1409.
* Preset apps just launch making 'quick icons' in #1815.
* Modules can be automatically loaded in batch connect apps in #1930. Examples are `auto_modules_matlab` or `auto_modules_rstudio` to load matlab or rstudio modules into the form.
* Use `auto_primary_group` in batch connect apps to automatically set the charge-back project to the users' primary group in #1964.

## :bug: Bugs

No known bugs.

## :stop_sign: Breaking Changes

* Default authentication has changed from OIDC to nothing in #1982. See development documentation for install instructions: https://osc.github.io/ood-documentation/develop/
* Radio buttons in batch connect app's form.yml have changed the positioning of the value and human friendly version of the description. The human friendly description label is now second, and the actual value is first. This was done to be more in line with how `select` widget options are defined.

```yml
# 2.0 behavior
- [ 'description', 'value' ]
# is now this in 2.1
- ['value', 'description']
```

dtenenba · May 11, 2022, 6:35pm

I appreciate your help. I started over and followed these instructions. Seemed to install ok, but now port 80 is just serving a default apache page and I don’t see any processes running that seem like part of OOD. How do I start it up? Sorry for this basic question but I don’t see this documented…

jeff.ohrstrom · May 11, 2022, 6:44pm

It’s all good. In fact, it’s asked so often that we rewrote the default install and the docs for it.

Since you’re running a nightly, these docs are more appropriate. What you see is expected and you’re here at step #4.
https://osc.github.io/ood-documentation/develop/installation/install-software.html#verify-installation

dtenenba · May 11, 2022, 6:57pm

Showing virtual hosts doesn’t work:

# sudo /sbin/apache2 -S
[Wed May 11 18:48:22.843249 2022] [core:warn] [pid 8579] AH00111: Config variable ${APACHE_RUN_DIR} is not defined
apache2: Syntax error on line 80 of /etc/apache2/apache2.conf: DefaultRuntimeDir must be a valid directory, absolute or relative to ServerRoot

Also, /etc/apache2/sites-available/ood-portal.conf is empty (0 bytes) - is that expected?

dtenenba · May 11, 2022, 7:02pm

OK, got it to work by omitting the absolute path to apache2:

# apache2 -S
VirtualHost configuration:
*:80                   ip-[REDACTED].us-west-2.compute.internal (/etc/apache2/sites-enabled/000-default.conf:1)
ServerRoot: "/etc/apache2"
Main DocumentRoot: "/var/www/html"
Main ErrorLog: "/var/log/apache2/error.log"
Mutex rewrite-map: using_defaults
Mutex lua-ivm-shm: using_defaults
Mutex proxy: using_defaults
Mutex default: dir="/var/run/apache2/" mechanism=default
Mutex watchdog-callback: using_defaults
PidFile: "/var/run/apache2/apache2.pid"
Define: DUMP_VHOSTS
Define: DUMP_RUN_CFG
User: name="www-data" id=33
Group: name="www-data" id=33

Doesn’t seem to be using any virtual hosts?
