Problems trying to build from source using ansible role

Hi,

I’m trying to build Open Ondemand from source as our cluster runs on Ubuntu 18.04.

So I am using this ansible role: OSC/ood-ansible on GitHub (an Ansible playbook for Open OnDemand).

I changed the one occurrence of install_from_src: false to install_from_src: true.
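For anyone following along: instead of editing the role's files, the same variable can be overridden when the playbook is run. This is only a sketch. It assumes the playbook is named playbook.yml (inferred from the retry file name in the output below), and ANSIBLE_PLAYBOOK_BIN is a hypothetical knob added here for dry runs, not something Ansible or the role defines.

```shell
# Run a playbook with install_from_src forced to a real boolean true.
# ANSIBLE_PLAYBOOK_BIN is a hypothetical override for dry runs; neither
# ansible nor the ood-ansible role knows about it.
run_ood_playbook() {
    playbook="${1:-playbook.yml}"
    # JSON extra vars keep the value a boolean; key=value would pass a string.
    "${ANSIBLE_PLAYBOOK_BIN:-ansible-playbook}" "$playbook" \
        -e '{"install_from_src": true}'
}
```

Calling `run_ood_playbook playbook.yml` runs the playbook with the override applied. Extra vars take precedence over role defaults, so the role checkout can stay unmodified.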

I get this result when running a playbook that calls the role:

ERROR! no action detected in task. This often indicates a misspelled module name, or incorrect module path.

The error appears to have been in '/home/dtenenba/dev/ood-ansible/ood-ansible/tasks/deps.yml': line 57, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:


- name: install all the gems we need
  ^ here

Just for fun I commented out lines 57-65 of tasks/deps.yml and then got this:

TASK [ood-ansible : include scl related overrides] *****************************
fatal: [192.168.0.122]: FAILED! => {"msg": "The conditional check '(not install_from_src) and (ansible_os_family == \"RedHat\" and ansible_distribution_major_version < '8')' failed. The error was: error while evaluating conditional ((not install_from_src) and (ansible_os_family == \"RedHat\" and ansible_distribution_major_version < '8')): 'install_from_src' is undefined\n\nThe error appears to have been in '/home/dtenenba/dev/ood-ansible/ood-ansible/tasks/main.yml': line 8, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: include scl related overrides\n  ^ here\n"}
	to retry, use: --limit @/home/dtenenba/dev/ood-ansible/playbook.retry

Any ideas about these issues? I am very much an Ansible newbie but I think I figured out how to run a role from a playbook…

Thanks

Hi Dan. Thanks for posting.

I will have to look into the specifics here. I’m not sure of the solution off the top of my head.

Thanks,
-gerald

You need the community.general collection from Ansible Galaxy (it provides the gem module). I opened a ticket on the repo for the same.

Thanks, not sure if I should comment here or in the GitHub issue, but here is what happens when I try to see if I have this role and then try to install it:

ubuntu@ood-build:~$ ansible-galaxy collection list
- the role collection was not found
ubuntu@ood-build:~$ sudo ansible-galaxy collection install community.general
- downloading role 'collection', owned by
 [WARNING]: - collection was NOT installed successfully: Content has no field
named 'owner'

ERROR! - you can use --ignore-errors to skip failed roles and finish processing the list.

I’m not sure you want to use sudo here. You can install and execute all of this as the ubuntu user. The playbook will occasionally raise privileges, but all the source materials (config, roles, and so on) can be user-owned.

Thanks for that. Turns out I needed a newer version of ansible.
The playbook/role is running right now. We shall see how far I get…
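In case it helps others who hit the same "downloading role 'collection'" message: a quick way to check the installed Ansible version before trying ansible-galaxy collection commands. This is a rough sketch. The 2.10 default threshold is my assumption (as far as I recall, collection support arrived around 2.9 and `collection list` in 2.10), not a requirement stated by the role, and the helper names are made up for this example.

```shell
# Extract the version number from the first line of `ansible --version`.
# Handles both "ansible 2.5.1" and "ansible [core 2.12.4]".
parse_ansible_version() {
    sed -n '1s/[^0-9]*\([0-9][0-9.]*\).*/\1/p'
}

# True when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (GNU coreutils).
version_ge() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether the installed ansible meets a minimum version.
# The 2.10 default is an assumption, not a documented requirement of the role.
check_ansible() {
    min="${1:-2.10}"
    have="$(ansible --version 2>/dev/null | parse_ansible_version)"
    if [ -n "$have" ] && version_ge "$have" "$min"; then
        echo "ansible $have is new enough (>= $min)"
    else
        echo "ansible ${have:-(not found)} does not meet minimum $min" >&2
        return 1
    fi
}
```

Running `check_ansible` first, and then `ansible-galaxy collection install community.general` as the regular user (no sudo, per the advice above), should avoid the confusing role-download error.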

OK, running the role/playbook ended with a 0 exit code and the following status:

PLAY RECAP *********************************************************************
127.0.0.1                  : ok=42   changed=35   unreachable=0    failed=0    skipped=27   rescued=0    ignored=1

There was also a failure (see below) but I am not sure if it is critical or not.

The next question is, what do I do to get ondemand running? Or should it already be running? Port 80 seems to be running a default installation of Apache (it shows the default apache home page). There is nothing on port 443.

If I run service ondemand status it shows a seemingly unrelated service:

$ sudo service ondemand status
● ondemand.service - Set the CPU Frequency Scaling governor
   Loaded: loaded (/lib/systemd/system/ondemand.service; enabled; vendor preset:
   Active: inactive (dead)
Condition: start condition failed at Tue 2022-05-10 16:37:33 UTC; 41min ago

So, assuming OOD was installed properly, how do I get it running?

Thanks.

This is the one failure I got:

RUNNING HANDLER [ood-ansible : update ood portal] ******************************
fatal: [127.0.0.1]: FAILED! => {"changed": true, "cmd": "/opt/ood/ood-portal-generator/sbin/update_ood_portal --force", "delta": "0:00:00.139320", "end": "2022-05-10 17:01:48.700051", "msg": "non-zero return code", "rc": 1, "start": "2022-05-10 17:01:48.560731", "stderr": "/usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': cannot load such file -- bcrypt (LoadError)\n\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'\n\tfrom /opt/ood/ood-portal-generator/lib/ood_portal_generator/dex.rb:4:in `<top (required)>'\n\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'\n\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'\n\tfrom /opt/ood/ood-portal-generator/lib/ood_portal_generator.rb:9:in `<top (required)>'\n\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'\n\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'", "stderr_lines": ["/usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': cannot load such file -- bcrypt (LoadError)", "\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'", "\tfrom /opt/ood/ood-portal-generator/lib/ood_portal_generator/dex.rb:4:in `<top (required)>'", "\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'", "\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'", "\tfrom /opt/ood/ood-portal-generator/lib/ood_portal_generator.rb:9:in `<top (required)>'", "\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'", "\tfrom /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'"], "stdout": "", "stdout_lines": []}
...ignoring

What version did you install? ood_source_version looks to default to 2.0.9, which is fairly old, but setting it to master may be flaky. Because we’re still working on native Debian support, I haven’t updated the Debian pieces of that role for a while.

Yes, I just used the default, 2.0.9. I changed it to master and this happened:

TASK [ood-ansible : build the project (this will take some time)] **************
fatal: [127.0.0.1]: FAILED! => {"ansible_job_id": "956073475688.23890", "changed": true, "cmd": "rake build -mj$(nproc) > build.out 2>&1", "delta": "0:00:02.007572", "end": "2022-05-10 17:50:34.017597", "finished": 1, "msg": "non-zero return code", "rc": 1, "start": "2022-05-10 17:50:32.010025", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

PLAY RECAP *********************************************************************
127.0.0.1                  : ok=23   changed=8    unreachable=0    failed=1    skipped=3    rescued=0    ignored=0

This seems to be the relevant part of build.out:

[2/4] Fetching packages...

nokogiri-1.13.4-x86_64-linux requires ruby version < 3.2.dev, >= 2.6, which is
incompatible with the current version, ruby 2.5.1p57
rake aborted!
Command failed with status (5): [bin/bundle install --jobs 4 --retry 2 --wi...]
/tmp/ood-build/ondemand/lib/tasks/build.rb:20:in `block (4 levels) in <top (required)>'
/tmp/ood-build/ondemand/lib/tasks/build.rb:19:in `block (3 levels) in <top (required)>'
/tmp/ood-build/ondemand/lib/tasks/build.rb:17:in `each'
/tmp/ood-build/ondemand/lib/tasks/build.rb:17:in `block (2 levels) in <top (required)>'
/var/lib/gems/2.5.0/gems/rake-13.0.3/exe/rake:27:in `<top (required)>'
Tasks: TOP => build => build:all => build:dashboard => build:gems
(See full trace by running task with --trace)
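A small helper for pulling lines like these out of a long build.out. It is only a sketch: the patterns come from the excerpt above rather than from any exhaustive list of Bundler or rake errors, and the function name is made up for this example.

```shell
# Print the lines of a build log that usually explain a failed bundle/rake run.
# The patterns are taken from the excerpt above, not from an exhaustive list.
build_failures() {
    log="${1:-build.out}"
    grep -n -E 'requires ruby version|incompatible with the current version|rake aborted!|Command failed' "$log"
}
```

Usage would be `build_failures /path/to/build.out`. Judging by the paths in the trace, the log probably sits in /tmp/ood-build/ondemand, but that location is an inference from the output, not something confirmed.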

Can you try v2.0.23?

Here is the output:

TASK [ood-ansible : clean up to ensure proper build] ***************************
fatal: [127.0.0.1]: FAILED! => {"changed": true, "cmd": "rake clean", "delta": "0:00:00.176850", "end": "2022-05-10 20:16:12.041558", "msg": "non-zero return code", "rc": 1, "start": "2022-05-10 20:16:11.864708", "stderr": "rake aborted!\nLoadError: cannot load such file -- bcrypt\n/tmp/ood-build/ondemand/lib/tasks/development.rb:6:in `block in <top (required)>'\n/tmp/ood-build/ondemand/lib/tasks/development.rb:3:in `<top (required)>'\n/tmp/ood-build/ondemand/Rakefile:17:in `<top (required)>'\n/var/lib/gems/2.5.0/gems/rake-13.0.3/exe/rake:27:in `<top (required)>'\n(See full trace by running task with --trace)", "stderr_lines": ["rake aborted!", "LoadError: cannot load such file -- bcrypt", "/tmp/ood-build/ondemand/lib/tasks/development.rb:6:in `block in <top (required)>'", "/tmp/ood-build/ondemand/lib/tasks/development.rb:3:in `<top (required)>'", "/tmp/ood-build/ondemand/Rakefile:17:in `<top (required)>'", "/var/lib/gems/2.5.0/gems/rake-13.0.3/exe/rake:27:in `<top (required)>'", "(See full trace by running task with --trace)"], "stdout": "", "stdout_lines": []}

PLAY RECAP *********************************************************************
127.0.0.1                  : ok=22   changed=7    unreachable=0    failed=1    skipped=3    rescued=0    ignored=0

I’ll look into it on my side.

I have sad news to report. 18.04 will never be a target for 2.1 and beyond.

2.1 just has higher dependencies than Ubuntu 18 has to offer.

It’s even too hard to backport all the packaging work we did for OOD 2.1 on Ubuntu 20.04 to OOD 2.0 on Ubuntu 18.04.

That said, my continuous integration still installs from source on 18, so I’ll continue to look into that. In the meantime, I’d suggest getting an Ubuntu 20 VM, as it’s going to make this instance of Open OnDemand much easier for you to maintain.

Also, regarding your issue with the playbook: I think installing the bcrypt package with apt install ruby-bcrypt will fix it.
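To confirm whether the Ruby on the build host can actually load bcrypt, before and after installing the package, something along these lines should do. It is a sketch only: RUBY_BIN is a hypothetical override for pointing at a specific Ruby, and the function names are invented for this example.

```shell
# Succeeds when the given Ruby library can be required.
# RUBY_BIN is a hypothetical override (e.g. to point at a different Ruby).
ruby_can_load() {
    "${RUBY_BIN:-ruby}" -e 'require ARGV[0]' "$1" 2>/dev/null
}

# Print a one-line verdict for a library, e.g. bcrypt.
report_gem() {
    if ruby_can_load "$1"; then
        echo "$1: loadable"
    else
        echo "$1: missing"
    fi
}
```

`report_gem bcrypt` prints "bcrypt: missing" when the require fails, which matches the LoadError in the handler and rake output above. After `sudo apt install ruby-bcrypt` it should flip to "loadable", assuming the role uses the same system Ruby.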

OK, I will look into what’s involved in setting up a 20.04 machine.

In the meantime I have set up a 20.04 machine in the cloud so I can mess around and I had no trouble installing OOD from source. Now I have a dumb question - I don’t have a browser (or a graphical desktop on the machine) so I can’t access it at localhost via a browser. I went into /etc/ood/config/ood_portal.yml and changed the servername to the FQDN of the machine. But now I can’t figure out how to restart OOD so that it will stop redirecting to localhost. I tried restarting Apache and that didn’t do it. I’d prefer not to reboot as I may get a different IP address…

So how can I restart or otherwise cause the ood_portal.yml to be re-read?
Thanks
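For reference, ood_portal.yml is not read by Apache directly. The generator script (the same update_ood_portal that the handler runs in the log earlier in this thread) turns it into an Apache config, so a plain Apache restart does not pick up the change. A minimal sketch of the two steps, assuming Ubuntu's apache2 service name; SUDO is a hypothetical override so the commands can be printed instead of run.

```shell
# Regenerate the Apache config from /etc/ood/config/ood_portal.yml, then
# restart Apache so the new servername takes effect.
# SUDO is a hypothetical override so the steps can be printed instead of run.
reload_ood_portal() {
    sudo_cmd="${SUDO:-sudo}"
    "$sudo_cmd" /opt/ood/ood-portal-generator/sbin/update_ood_portal &&
        "$sudo_cmd" systemctl restart apache2
}
```

`SUDO=echo reload_ood_portal` shows the two commands without running them, and plain `reload_ood_portal` runs them. No reboot is involved, so the machine's IP address should not change.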

Don’t install from source on Ubuntu 20. Even if you install the packages manually, running the playbook with the configure tag will still do what you want.

See this issue for how to install our nightly 2.1. It is a nightly version, but there are no known bugs with it.

You can follow this ticket if we do find a bug in nightly or there’s some breaking change (for you a breaking change would just mean the documentation is off).

I appreciate your help. I started over and followed these instructions. Seemed to install ok, but now port 80 is just serving a default apache page and I don’t see any processes running that seem like part of OOD. How do I start it up? Sorry for this basic question but I don’t see this documented…

It’s all good. Indeed, it’s such a frequently asked question that we rewrote the default install and the docs for it.

Since you’re running a nightly, these docs are more appropriate. What you see is expected and you’re here at step #4.
https://osc.github.io/ood-documentation/develop/installation/install-software.html#verify-installation

Showing virtual hosts doesn’t work:

# sudo /sbin/apache2 -S
[Wed May 11 18:48:22.843249 2022] [core:warn] [pid 8579] AH00111: Config variable ${APACHE_RUN_DIR} is not defined
apache2: Syntax error on line 80 of /etc/apache2/apache2.conf: DefaultRuntimeDir must be a valid directory, absolute or relative to ServerRoot

Also, /etc/apache2/sites-available/ood-portal.conf is empty (0 bytes) - is that expected?
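A note on the AH00111 warning: on Debian and Ubuntu the apache2 binary expects variables such as APACHE_RUN_DIR to come from /etc/apache2/envvars, which the apache2ctl wrapper loads before calling the binary. A sketch that prefers the wrapper and otherwise sources envvars first (the function name is made up for this example):

```shell
# Dump the virtual host configuration with Apache's environment loaded.
show_vhosts() {
    if command -v apache2ctl >/dev/null 2>&1; then
        # The wrapper sources /etc/apache2/envvars for us.
        apache2ctl -S
    elif [ -r /etc/apache2/envvars ]; then
        # Source envvars in a subshell so the current shell stays clean.
        ( . /etc/apache2/envvars && apache2 -S )
    else
        echo "apache2ctl and /etc/apache2/envvars not found" >&2
        return 1
    fi
}
```

Run it as root or through sudo, as with the command above.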

OK, got it to work by omitting the absolute path to apache2:

# apache2 -S
VirtualHost configuration:
*:80                   ip-[REDACTED].us-west-2.compute.internal (/etc/apache2/sites-enabled/000-default.conf:1)
ServerRoot: "/etc/apache2"
Main DocumentRoot: "/var/www/html"
Main ErrorLog: "/var/log/apache2/error.log"
Mutex rewrite-map: using_defaults
Mutex lua-ivm-shm: using_defaults
Mutex proxy: using_defaults
Mutex default: dir="/var/run/apache2/" mechanism=default
Mutex watchdog-callback: using_defaults
PidFile: "/var/run/apache2/apache2.pid"
Define: DUMP_VHOSTS
Define: DUMP_RUN_CFG
User: name="www-data" id=33
Group: name="www-data" id=33

Doesn’t seem to be using any virtual hosts?
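The output lists only 000-default.conf, which fits with ood-portal.conf being empty. A quick way to report the state of the generated config is sketched below. The sites-enabled path is an assumption based on the usual Debian/Ubuntu layout, and the check only reports what is on disk; it does not say whether an empty file is expected for a given OOD version.

```shell
# Classify a config file as missing, empty, or present (non-empty).
conf_state() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ ! -s "$1" ]; then
        echo "empty"
    else
        echo "present"
    fi
}

# Report on the generated portal config and whether it is enabled.
# The sites-enabled path is an assumption based on the usual Debian/Ubuntu layout.
portal_report() {
    echo "sites-available: $(conf_state /etc/apache2/sites-available/ood-portal.conf)"
    echo "sites-enabled:   $(conf_state /etc/apache2/sites-enabled/ood-portal.conf)"
}
```

`portal_report` prints one line per location, which is handy to paste into a reply here.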