Can I ask for Ubuntu support? Or are there any install instructions for Ubuntu? I did not see an “install from source” option either, which is a bit surprising given that the project has HPC roots.
@dipeit time permitting we can look into what it would take to support Debian-based systems natively. We have not had much demand for this so far. Can you tell us a little more about your architecture / site?
As for installation from source: our installation process used to be very manual up until our 1.3 release which is where we cut over to RPM based installation. Those installation instructions are dated, but still available for older versions: https://osc.github.io/ood-documentation/release-1.2/installation.html
I am also interested in an Ubuntu/Debian installation document or packages. Our cluster runs Ubuntu, along with most of our support servers. I can provide additional architecture information and/or some testing help if necessary.
Folks,
I spent the second half of my Thursday experimenting with installation on Ubuntu and found that the installation of system dependencies is going to be complicated. OnDemand uses SCLs which Ubuntu does not have. Installation then becomes a task of trying to ensure compatible versions of the OnDemand apps, Ruby, Passenger and Apache. Ubuntu 18.x LTS provides Ruby 2.5 by default which the OOD apps have not been tested against. Snap can be used to install Ruby 2.4 but the available versions of Apache and Passenger require the Apt-provided Ruby 2.5, and there are no Snap alternatives for Apache or Passenger. Tangentially, I have read anecdotal evidence that Snap’d software takes up more storage and is slower to boot.
I’m happy to announce that we’ve created an Ansible role for Open OnDemand that I’ve been testing on Ubuntu Bionic.
It should be said, as it is in the README, that this is still a work in progress. The runs I’ve done last week and today produce some zombie processes, which is to say this install procedure is not yet production ready, not by a long shot. So, use patience and caution as we update it.
Hi everyone. With my employer, I’m in the process of setting up an Ubuntu 18.04 cluster running openPBS with Singularity containers. I’m getting up to speed with Open OnDemand configuration. I don’t have a background with Ansible. I’m going through some online Ansible courses now to at least understand how to use Ansible roles. I haven’t figured out how to make use of the Ansible role created here yet.
If there is a step-by-step guide on how to implement this Ansible role, or a good resource someone has on using an existing Ansible role, please send it my way. I’m also open to general discussion about Open OnDemand on Ubuntu.
If you have container support on your local machine, I’d suggest running through the test cases for the role. That may be a clean way to get familiar with ansible because things ‘just work’ (even if they’re run through the testing framework molecule).
As you can see from the default test case, it simply runs through the role. That’s as simple a playbook as you can get: 1 role, no variables. Of course all the defaults are for CentOS/RHEL, but it’s a container, so you can at least get your hands dirty with ansible and the role.
You can run these commands to get that all set up (the current working directory being the role’s directory):
pip install -r molecule/requirements.txt
molecule converge
Taking that a step further with playbooks, inventories and config files, here’s what I have on hand to test this role in an ubuntu:20.04 container. Your inventory file may differ if you’re using docker or a VM you can modify (and perhaps throw away).
[jeff 04:48:59 ansible()] 🐼 cat playbooks/open-ondemand.yml
- hosts: ondemand-hosts
  roles:
    - ondemand
[jeff 04:49:03 ansible()] 🐍 cat inventories/localhost-containers
[ondemand-hosts]
ubuntu ansible_connection=podman ansible_python_interpreter=/usr/bin/python3
[jeff 04:51:01 ansible()] 🐠 cat conf/ood-src.yml
ood_source_version: "v1.8.19"
install_from_src: true
# ubuntu 20.04 location
ruby_lib_dir: "/usr/lib/x86_64-linux-gnu/ruby/2.7.0"
[jeff 04:52:01 ansible()] 🐯 ansible-playbook -i inventories/localhost-containers --extra-vars=@conf/ood-src.yml playbooks/open-ondemand.yml
Another note on testing in containers is that the default container needs sudo, python3 and python3-pip installed, so you can’t use an off-the-shelf ubuntu container.
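For anyone who wants to build such an image themselves, here is a rough sketch that writes a Containerfile adding those three packages on top of stock ubuntu:20.04. The helper name, the scratch directory and the image tag in the comments are made up for illustration; the container name ubuntu is chosen only so that it matches the host entry in the inventory above.

```shell
#!/usr/bin/env bash
# Sketch: write a Containerfile that adds the packages the test container
# needs on top of stock ubuntu:20.04. The helper name and the image tag
# used in the comments below are hypothetical.
write_ood_test_containerfile() {
    local dir="$1"
    mkdir -p "$dir"
    cat > "$dir/Containerfile" <<'EOF'
FROM ubuntu:20.04
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y sudo python3 python3-pip
EOF
}

write_ood_test_containerfile "${TMPDIR:-/tmp}/ood-test-image"
# Build and start it from that directory (not executed here; assumes podman):
#   podman build -t ood-test-ubuntu -f Containerfile .
#   podman run -d --name ubuntu ood-test-ubuntu sleep infinity
```

The same Containerfile should work with docker in place of podman, in which case the inventory would use ansible_connection=docker instead.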
Hope that helps!
Also - I’m now seeing that the role names are different, which could be confusing.
This is what my ansible roles directory looks like. I’ve symlinked ood-ansible with ondemand, so that’s why I’m able to reference that role that way.
[jeff 05:00:24 images(master)] 🐭 ls ~/.ansible/roles/ -l
drwxrwxr-x. 13 jeff jeff 4096 Mar 5 12:51 ondemand
lrwxrwxrwx. 1 jeff jeff 8 Jan 23 2020 ood-ansible -> ondemand
Thanks for the quick reply @jeff.ohrstrom
For testing purposes, I do have a couple of VMs set up (one server and one compute node). I’ve already installed Ansible through the PPA. I need to get my head wrapped around configuring and using Ansible playbooks and roles. I’ll work through what you provided. Let me know if you’d like to fork this off to another topic.
Getting back to this. After trying to get this Ansible role working in an Ubuntu 18.04 VM and running into numerous dependency issues, I grabbed an Ubuntu 20.04 container and shelled into it with Singularity to install the Ansible role. I had to install numerous dependent packages that weren’t included, but got all of the molecule/requirements.txt packages to install.
However, now I got this error with molecule. Any thoughts on how to handle it?
Singularity> molecule converge
--> Test matrix
└── default
├── dependency
├── create
├── prepare
└── converge
--> Scenario: 'default'
--> Action: 'dependency'
Skipping, missing the requirements file.
Skipping, missing the requirements file.
--> Scenario: 'default'
--> Action: 'create'
--> Sanity checks: 'docker'
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 394, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.8/http/client.py", line 1230, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1276, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1225, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1004, in _send_output
self.send(msg)
File "/usr/lib/python3.8/http/client.py", line 944, in send
self.connect()
File "/usr/local/lib/python3.8/dist-packages/docker/transport/unixconn.py", line 43, in connect
sock.connect(self.unix_socket)
FileNotFoundError: [Errno 2] No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "/usr/local/lib/python3.8/dist-packages/urllib3/util/retry.py", line 532, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/local/lib/python3.8/dist-packages/urllib3/packages/six.py", line 734, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 394, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.8/http/client.py", line 1230, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1276, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1225, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1004, in _send_output
self.send(msg)
File "/usr/lib/python3.8/http/client.py", line 944, in send
self.connect()
File "/usr/local/lib/python3.8/dist-packages/docker/transport/unixconn.py", line 43, in connect
sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/docker/api/client.py", line 214, in _retrieve_server_version
return self.version(api_version=False)["ApiVersion"]
File "/usr/local/lib/python3.8/dist-packages/docker/api/daemon.py", line 181, in version
return self._result(self._get(url), json=True)
File "/usr/local/lib/python3.8/dist-packages/docker/utils/decorators.py", line 46, in inner
return f(self, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/docker/api/client.py", line 237, in _get
return self.get(url, **self._set_request_timeout(kwargs))
File "/usr/local/lib/python3.8/dist-packages/requests/sessions.py", line 555, in get
return self.request('GET', url, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.8/dist-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/molecule", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/molecule/command/converge.py", line 104, in converge
base.execute_cmdline_scenarios(scenario_name, args, command_args, ansible_args)
File "/usr/local/lib/python3.8/dist-packages/molecule/command/base.py", line 104, in execute_cmdline_scenarios
execute_scenario(scenario)
File "/usr/local/lib/python3.8/dist-packages/molecule/command/base.py", line 146, in execute_scenario
execute_subcommand(scenario.config, action)
File "/usr/local/lib/python3.8/dist-packages/molecule/command/base.py", line 135, in execute_subcommand
return command(config).execute()
File "/usr/local/lib/python3.8/dist-packages/molecule/command/create.py", line 94, in execute
self._config.provisioner.create()
File "/usr/local/lib/python3.8/dist-packages/molecule/provisioner/ansible.py", line 722, in create
pb.execute()
File "/usr/local/lib/python3.8/dist-packages/molecule/provisioner/ansible_playbook.py", line 104, in execute
self._config.driver.sanity_checks()
File "/usr/local/lib/python3.8/dist-packages/molecule_docker/driver.py", line 234, in sanity_checks
docker_client = docker.from_env()
File "/usr/local/lib/python3.8/dist-packages/docker/client.py", line 96, in from_env
return cls(
File "/usr/local/lib/python3.8/dist-packages/docker/client.py", line 45, in __init__
self.api = APIClient(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/docker/api/client.py", line 197, in __init__
self._version = self._retrieve_server_version()
File "/usr/local/lib/python3.8/dist-packages/docker/api/client.py", line 221, in _retrieve_server_version
raise DockerException(
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
You’re trying to get ansible to connect to a docker container, which it can’t because the docker socket file doesn’t exist.
I don’t think ansible has support for Singularity. When I look at connection plugins I see docker, podman, lxc, lxd, jail and chroot but no singularity.
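A quick way to confirm this before running molecule is to check whether a Docker socket exists at all. This is only a sketch: it assumes the usual default socket path /var/run/docker.sock and ignores DOCKER_HOST, and the function name is made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given path is a Unix socket.
# The default path is the usual Docker daemon socket location; that is an
# assumption and will be wrong if DOCKER_HOST points somewhere else.
docker_socket_available() {
    local sock="${1:-/var/run/docker.sock}"
    [ -S "$sock" ]
}

if docker_socket_available; then
    echo "Docker socket found; molecule's docker driver may be able to connect."
else
    echo "No Docker socket found; molecule's docker driver will fail as above."
fi
```

Inside a Singularity shell the second message is what I would expect, since no Docker daemon is running there.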
If you have issues with the 18.04 VM, feel free to open tickets on GitHub and I can try to sort through them.
This seems like an obvious step before moving to more dashboards.
Portability is very important. The majority of AI/DL work is on Ubuntu, and there are the RHEL licensing and CentOS changes to consider. I have written a lot of RPMs and RPM spec files, so if we have the spec files we should be able to automate builds for other platforms fairly simply.
I am looking at the Ansible for OSC right now, so that may work.
It seems to meet the basic requirements of what I need, except for Ubuntu setup and updates.
I am starting to look at:
- Which packages are needed
- Any package modifications
- Configurations.
Then we should be able to take this and build *.deb packages and *.rpm packages, or a ‘script-based’ install for the others.
I am starting to go through this now. So if anyone wants to work together to make this work on Ubuntu I am starting on that.
Or, if I need to ‘reinvent’ it to make it work on Ubuntu, I will just make a similar package that has the basic functionality for Ubuntu:
- Slurm
- JupyterLab/Notebook
- TurboVNC - web
- Grafana style dashboard
- OpenLDAP - option
- View files and scheduling resources
- Anaconda/Python Virtual env to control versions
Thank you!
Mark
We’re definitely going to be shipping .deb packages at some point.
I started this project to track our progress: Ubuntu Packaging · GitHub
Getting back to this after having a new cluster up and running on-site.
My new cluster is based on Ubuntu 18.04 with OpenPBS for a job scheduler. I have the latest Ansible setup on my main head node and pinging all compute nodes. I’ve cloned the ood-ansible repo and am currently trying to use the Ansible role to deploy OOD.
I’m trying to figure out exactly how to use this Ansible role. I’m a novice user of Ansible so maybe I’m not understanding something obvious. Any help with using this role is welcomed. Thanks!
We’re starting to publish .deb files for 20.04 if that’s of interest to you, though the ansible role doesn’t have support for installing them yet.
That said, here’s the test playbook we use. You’ll need the variable install_from_src set to true. Then you’ll need an inventory. If you want to run the playbook on the host itself, then I guess it’d be localhost. Otherwise you’ll initiate it all from, say, your own laptop, and the inventory entry is the FQDN of the server you want to install on.
Maybe you’re looking for storing and finding roles?
https://docs.ansible.com/ansible/latest/user_guide/playbooks_reuse_roles.html#storing-and-finding-roles
Or maybe you want to pull osc.open_ondemand from galaxy?
https://docs.ansible.com/ansible/latest/galaxy/user_guide.html#installing-collections
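To make the two inventory cases concrete, here is a minimal sketch that writes a one-role playbook and a matching inventory into a scratch directory. The group name ood_servers, the file names, the helper name and ood.example.edu are placeholders rather than anything the role requires, and the role name osc.open_ondemand assumes you installed it from galaxy under that name.

```shell
#!/usr/bin/env bash
# Sketch: generate a one-role playbook and a matching inventory.
# Everything named here (ood_servers, file names, the helper itself) is a
# placeholder for illustration, not something the role requires.
write_ood_playbook_files() {
    local dir="$1" target="$2"
    mkdir -p "$dir"

    # Playbook: one role plus the install_from_src variable mentioned above.
    cat > "$dir/open-ondemand.yml" <<'EOF'
- hosts: ood_servers
  roles:
    - osc.open_ondemand
  vars:
    install_from_src: true
EOF

    # Inventory: run locally on the host itself, or list the remote FQDN.
    if [ "$target" = "localhost" ]; then
        printf '[ood_servers]\nlocalhost ansible_connection=local\n' > "$dir/inventory"
    else
        printf '[ood_servers]\n%s\n' "$target" > "$dir/inventory"
    fi
}

# Example: files for running against the local machine.
write_ood_playbook_files "${TMPDIR:-/tmp}/ood-playbook-demo" localhost
# Then, from that directory (not executed here):
#   ansible-playbook -i inventory open-ondemand.yml
```

Passing an FQDN such as ood.example.edu instead of localhost gives the "run it from your laptop" case, where ansible connects over SSH. You will likely also need privilege escalation (for example --become) for the install tasks; that part is my assumption, so check the role's README.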
Thanks. Looking through the YAML files, I had been wondering if I need to install from source by setting the install_from_src flag.
My uncertainty seems to be with what to pass to the command ansible-playbook and also where to configure my cluster for OOD. From what you said, it seems I need to make my own playbook similar to converge.yml. Would I configure my cluster with /etc/ood/config/clusters.d/<cluster_key>.yml after building and installing from the role?
I’ll check out the role available in Ansible Galaxy.
Hello,
I’m very interested in testing Open OnDemand on an Ubuntu 20.04 cluster.
@Chase did you make progress in installing Open OnDemand? For your interest, I’m experimenting with a Vagrant virtual Slurm cluster and at least succeeded in importing the playbook and adding some variables to it.
This is here. Note this is, for now, failing at the task “TASK [osc.open_ondemand : build the project (this will take some time)]” (see the readme on the GitHub repo) because of some rake build issues I have not investigated yet.
@jeff.ohrstrom you mention providing deb packages for Ubuntu 20.04. That would probably be more convenient than building from source with the Ansible Galaxy role. Would you mind indicating where the deb files can be downloaded?
The deb packages at this time only exist for OnDemand 2.1 which still only has unstable nightly releases. The stable releases for OnDemand 2.0 do not have deb packages.
Has anyone installed OnDemand for ubuntu 20.04? Thanks for any help.