Add Ansible playbook and roles to configure the host and deploy Koina server #194
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| # Ansible: Koina server deployment | ||
|
|
||
| This directory contains the Ansible playbooks and roles used to provision Koina inference servers. The `koina_server.yml` playbook and the roles it includes perform the main orchestration. Use this README as a quick reference for installing role requirements, running the playbook, and finding common variables and templates. | ||
|
|
||
| _NOTE: The playbook automatically obtains TLS/SSL certificates for the specified domains from Let's Encrypt using Certbot._ | ||
|
|
||
| ## Quick start | ||
|
|
||
| 1. From the repo root, change into the Ansible directory and install the collections and roles listed in `requirements.yml`: | ||
| ```bash | ||
| cd docs/server/deployment/ansible | ||
| ansible-galaxy install -r requirements.yml | ||
| ``` | ||
|
|
||
| 2. Update the inventory file `hosts`, the variables in `koina_server.yml` (domain names, email address, paths, etc.), and, if needed, the Nginx templates in `templates/nginx/` for your environment. | ||
|
|
||
| 3. Run the main playbook (example inventory file and variables live in the repo): | ||
| ```bash | ||
| ansible-playbook koina_server.yml --ask-become-pass | ||
| ``` | ||
| If no `ansible.cfg` selects the `hosts` inventory for you, pass it explicitly with `-i hosts`. Adjust the inventory path, extra-vars, and become options for your environment. | ||
|
|
||
| ## What is included | ||
|
|
||
| - `koina_server.yml`: main playbook that composes the server setup (Docker, Koina/Triton, services, etc.). | ||
| - `requirements.yml`: Ansible roles (pinned to specific versions) and collections required by the playbook. | ||
| - `roles/`: local roles used by the playbook (for example `koina-server` and `nvidia-container-toolkit`). | ||
| - `templates/`: templates used by roles (includes the Nginx vhost templates consumed by the nginx role). | ||
| - `defaults/` in each role: role-level default variables (review these before overriding anything). | ||
| - `tasks/`, `handlers/`, `files/`: standard Ansible role layout for each role. | ||
|
|
||
| ## Variables & configuration | ||
|
|
||
| - Primary variables and the flow of configuration are defined in `koina_server.yml` and the defaults files of each role (`roles/<role>/defaults/main.yml`). | ||
| - Override values per-host or per-group using `host_vars/` or `group_vars/` or pass via `-e` on the command line. | ||
| - Review and adjust the following before running the playbook: | ||
| - `koina_server.yml` for the overall orchestration and variable flow. | ||
| - `roles/*/defaults/main.yml` to find complete variable names and defaults. | ||
| - Ensure that the `docker_user` variable in `koina_server.yml` is set to the user that should have permissions to run Docker commands (usually the default user on the server, e.g., `ubuntu`). | ||
|
|
||
| ## Nginx templates | ||
|
|
||
| - Nginx virtual host templates live under `templates/nginx` and are consumed by the `geerlingguy.nginx` role. Modify or copy these templates to customize upstreams, SSL, or proxy rules before running the playbook. | ||
|
|
||
| Example templates path: | ||
| ``` | ||
| templates/nginx/koina.conf.j2 | ||
| templates/nginx/koinarpc.conf.j2 | ||
| ``` | ||
|
|
||
| ### TLS/SSL Certificates | ||
| - The playbook uses the `geerlingguy.certbot` role to automatically obtain and renew TLS/SSL certificates from Let's Encrypt using Certbot. | ||
| - Ensure that the domain names specified in the variables resolve to your server's IP address before running the playbook. | ||
| - Port 80 must be open and accessible for the HTTP-01 challenge used by Let's Encrypt. | ||
| - The email address provided in the variables is used for important account notifications from Let's Encrypt. |
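The override options listed under "Variables & configuration" behave differently depending on where a variable is defined. In Ansible's precedence order, a play's `vars:` block outranks inventory `group_vars/` and `host_vars/`, so values set directly in `koina_server.yml` (for example `KOINA_SERVER_NAME`) can only be overridden with `-e`, whereas role defaults such as `koina_shm_size` can also be overridden from `group_vars/`. A minimal sketch of an extra-vars file, assuming a hypothetical file name `overrides.yml` and placeholder values:

```yaml
# overrides.yml (hypothetical file name; every value below is a placeholder)
# Pass it with: ansible-playbook koina_server.yml -e @overrides.yml
KOINA_SERVER_NAME: "koina.example.org"
KOINA_RPC_SERVER_NAME: "rpc.koina.example.org"
ADMIN_EMAIL_ADDRESS: "ops@example.org"
docker_user: "ubuntu"
```

Extra vars take precedence over every other variable source, which is why this form works even for the values hard-coded in the play.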
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| [koina_servers] | ||
| koina-bi-01 ansible_host=<YOUR_SERVER_IP> | ||
|
|
||
| [all:vars] | ||
| ansible_ssh_user=ubuntu # Change this if your server uses a different user (it is assumed that this user has sudo privileges) | ||
| ansible_ssh_private_key_file=~/.ssh/id_rsa # Path to your SSH private key |
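For readers who prefer YAML inventories, the INI file above could be expressed roughly as follows. This is a sketch, not part of the PR: the file name `inventory.yml` is hypothetical, and it uses `ansible_user`, the current name for the legacy `ansible_ssh_user` alias.

```yaml
# inventory.yml (hypothetical file name), equivalent to the INI `hosts` file
all:
  vars:
    # The user is assumed to have sudo privileges on the server
    ansible_user: ubuntu
    # Path to your SSH private key
    ansible_ssh_private_key_file: "~/.ssh/id_rsa"
  children:
    koina_servers:
      hosts:
        koina-bi-01:
          ansible_host: "<YOUR_SERVER_IP>"
```

It would be passed the same way as the INI file, for example with `-i inventory.yml`.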
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| --- | ||
| # Playbook to configure the host and deploy the Koina server | ||
| - hosts: koina_servers | ||
| become: true | ||
| vars_files: | ||
| - secret_group_vars/all.yml | ||
| vars: | ||
| koinarpc_docker_port: 8500 | ||
| koinahttp_docker_port: 8501 | ||
| koinametrics_docker_port: 8502 | ||
| KOINA_SERVER_NAME: "yourdomain.com" # Change this to your domain | ||
| KOINA_RPC_SERVER_NAME: "rpc.yourdomain.com" # Change this to your domain | ||
| ADMIN_EMAIL_ADDRESS: "admin@yourdomain.com" # Change this to your email address for Certbot/Let's Encrypt | ||
| KOINA_CONTAINER_DIR: "/opt/koina" # Directory to store Koina Docker container data and the compose file | ||
| docker_user: "ubuntu" # Change this to the user that should have Docker permissions | ||
| pre_tasks: | ||
| - name: Update and upgrade apt packages | ||
| ansible.builtin.apt: | ||
| update_cache: true | ||
| upgrade: dist | ||
|
|
||
| - name: Install dependencies | ||
| ansible.builtin.apt: | ||
| name: | ||
| - python3 | ||
| - python3-venv | ||
| - python3-pip | ||
| - ufw | ||
| - ubuntu-drivers-common | ||
| state: present | ||
| update_cache: true | ||
| roles: | ||
| - role: geerlingguy.docker | ||
| vars: | ||
| docker_users: | ||
| - "{{ docker_user }}" | ||
|
|
||
| - role: firewall | ||
|
|
||
| - role: nvidia-driver | ||
|
|
||
| - role: nvidia-container-toolkit | ||
|
|
||
| - role: geerlingguy.nginx | ||
| vars: | ||
| nginx_remove_default_vhost: true | ||
| nginx_vhosts: | ||
| - server_name: "{{ KOINA_SERVER_NAME }}" | ||
| template: "{{ playbook_dir }}/templates/nginx/koina.conf.j2" | ||
| - server_name: "{{ KOINA_RPC_SERVER_NAME }}" | ||
| template: "{{ playbook_dir }}/templates/nginx/koinarpc.conf.j2" | ||
| ssl_certificate_path: '/etc/letsencrypt/live/{{ KOINA_SERVER_NAME }}/fullchain.pem' | ||
| ssl_certificate_key_path: '/etc/letsencrypt/live/{{ KOINA_SERVER_NAME }}/privkey.pem' | ||
|
|
||
| - role: geerlingguy.certbot | ||
| vars: | ||
| certbot_create_if_missing: true | ||
| certbot_create_extra_args: '' | ||
| certbot_create_method: standalone | ||
| certbot_admin_email: "{{ ADMIN_EMAIL_ADDRESS }}" | ||
| certbot_create_standalone_stop_services: | ||
| - nginx | ||
| certbot_certs: | ||
| - domains: | ||
| - "{{ KOINA_SERVER_NAME }}" | ||
| - "{{ KOINA_RPC_SERVER_NAME }}" | ||
| webroot: '/var/www/certbot' | ||
|
|
||
| - role: koina-server | ||
| vars: | ||
| koina_container_dir: "{{ KOINA_CONTAINER_DIR }}" |
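The README states that certificates are renewed automatically, but the play above does not set any renewal options and therefore relies on the role's defaults. A sketch of making that explicit, assuming the pinned `geerlingguy.certbot` release exposes the `certbot_auto_renew*` variables described in that role's README (worth verifying against version 5.4.1); the schedule values are illustrative only:

```yaml
- role: geerlingguy.certbot
  vars:
    # Variable names assumed from the role's README; values are examples
    certbot_auto_renew: true
    certbot_auto_renew_user: root
    certbot_auto_renew_hour: "3"
    certbot_auto_renew_minute: "30"
    certbot_auto_renew_options: "--quiet"
```

These keys would sit alongside the existing `certbot_create_*` and `certbot_certs` settings rather than replace them.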
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| --- | ||
| collections: | ||
| - name: community.general | ||
| source: https://galaxy.ansible.com | ||
| - name: community.docker | ||
| source: https://galaxy.ansible.com | ||
| roles: | ||
| - name: geerlingguy.docker | ||
| version: 7.6.0 | ||
| - name: geerlingguy.certbot | ||
| version: 5.4.1 | ||
| - name: geerlingguy.nginx | ||
| version: 3.2.0 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # Ansible role: firewall | ||
|
|
||
| Role to install and configure the required firewall on target hosts (Ubuntu only). | ||
|
|
||
| ## Features | ||
| - Configures UFW with predefined rules | ||
| - Denies all incoming connections by default | ||
| - Allows all outgoing connections by default | ||
| - Opens specific ports for SSH, HTTP, HTTPS | ||
|
|
||
| ## Requirements | ||
| - Sudo/root privileges on target hosts | ||
|
|
||
| ## Example playbook | ||
| ```yaml | ||
| - hosts: all | ||
| become: true | ||
| roles: | ||
| - firewall | ||
| ``` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| --- | ||
| allowed_ports: | ||
| - 22 | ||
| - 80 | ||
| - 443 |
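Because `allowed_ports` is a role default, it can be overridden where the role is applied. A minimal sketch, assuming you want to expose one additional TCP port (the value `8502` is only an example, chosen because it matches the metrics port used elsewhere in this PR):

```yaml
- hosts: koina_servers
  become: true
  roles:
    - role: firewall
      vars:
        # Overrides the role default; list every port that must stay open
        allowed_ports:
          - 22
          - 80
          - 443
          - 8502  # example only
```

Note that the override replaces the whole list, so port 22 must be repeated to keep SSH reachable.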
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| galaxy_info: | ||
| author: Sanjay Srikakulam | ||
| description: Ansible role to configure the required firewall on target hosts. | ||
| company: Forschungszentrum Jülich | ||
|
|
||
| # If the issue tracker for your role is not on github, uncomment the | ||
| # next line and provide a value | ||
| # issue_tracker_url: http://example.com/issue/tracker | ||
|
|
||
| # Choose a valid license ID from https://spdx.org - some suggested licenses: | ||
| # - BSD-3-Clause (default) | ||
| # - MIT | ||
| # - GPL-2.0-or-later | ||
| # - GPL-3.0-only | ||
| # - Apache-2.0 | ||
| # - CC-BY-4.0 | ||
| license: MIT | ||
|
|
||
| min_ansible_version: "2.1" | ||
|
|
||
| # If this is a Container Enabled role, provide the minimum Ansible Container version. | ||
| # min_ansible_container_version: | ||
|
|
||
| # | ||
| # Provide a list of supported platforms, and for each platform a list of versions. | ||
| # If you don't wish to enumerate all versions for a particular platform, use 'all'. | ||
| # To view available platforms and versions (or releases), visit: | ||
| # https://galaxy.ansible.com/api/v1/platforms/ | ||
| # | ||
| platforms: | ||
| - name: Ubuntu | ||
| versions: | ||
| - all | ||
|
|
||
| galaxy_tags: [] | ||
| # List tags for your role here, one per line. A tag is a keyword that describes | ||
| # and categorizes the role. Users find roles by searching for tags. Be sure to | ||
| # remove the '[]' above, if you add tags to this list. | ||
| # | ||
| # NOTE: A tag is limited to a single word comprised of alphanumeric characters. | ||
| # Maximum 20 tags per role. | ||
|
|
||
| dependencies: [] | ||
| # List your role dependencies here, one per line. Be sure to remove the '[]' above, | ||
| # if you add dependencies to this list. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| --- | ||
| - name: Install required packages for firewall | ||
| ansible.builtin.apt: | ||
| name: | ||
| - ufw | ||
| state: present | ||
| update_cache: true | ||
|
|
||
| - name: Enable UFW | ||
| community.general.ufw: | ||
| state: enabled | ||
|
|
||
| - name: Allow required incoming ports | ||
| community.general.ufw: | ||
| rule: allow | ||
| port: '{{ item }}' | ||
| proto: tcp | ||
| loop: "{{ allowed_ports }}" | ||
|
|
||
| - name: Allow all outgoing traffic | ||
| community.general.ufw: | ||
| default: allow | ||
| direction: outgoing | ||
|
|
||
| - name: Deny all other incoming traffic | ||
| community.general.ufw: | ||
| default: deny | ||
| direction: incoming |
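The task order in this diff enables UFW before the allow rule for port 22 is added, which may interrupt the SSH session Ansible is using on a fresh host. A possible reordering, shown with indentation restored, adds the rules and default policies first and enables the firewall last. This is a sketch of an alternative, not the PR's actual file:

```yaml
---
- name: Install required packages for firewall
  ansible.builtin.apt:
    name:
      - ufw
    state: present
    update_cache: true

# Add the allow rules before the firewall becomes active
- name: Allow required incoming ports
  community.general.ufw:
    rule: allow
    port: "{{ item }}"
    proto: tcp
  loop: "{{ allowed_ports }}"

- name: Allow all outgoing traffic
  community.general.ufw:
    default: allow
    direction: outgoing

- name: Deny all other incoming traffic
  community.general.ufw:
    default: deny
    direction: incoming

# Enable last, once SSH (port 22) is already permitted
- name: Enable UFW
  community.general.ufw:
    state: enabled
```

The modules and parameters are the same ones used in the original tasks; only the order changes.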
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| # Ansible role: koina-server | ||
|
|
||
| Role to provision and configure a Koina inference server. | ||
|
|
||
| This role installs and configures the components required to run a Koina server (Docker + NVIDIA runtime, Triton model repository deployment steps, service configuration). It is intended for use from the repository's Ansible playbook [koina_server.yml](https://github.com/wilhelm-lab/koina/tree/main/docs/server/deployment/ansible/koina_server.yml). | ||
|
|
||
| ## Features | ||
| - Deploys Koina container with GPU support. | ||
|
|
||
| ## Requirements | ||
| - A target host with sudo privileges. | ||
| - Internet access to download packages and model artifacts. | ||
| - Docker, NVIDIA Container Toolkit, NVIDIA drivers. | ||
| - The other roles in the Ansible roles directory as well as the [koina_server.yml](https://github.com/wilhelm-lab/koina/tree/main/docs/server/deployment/ansible/koina_server.yml) playbook. | ||
|
|
||
| ## Role variables | ||
| Define role variables in your playbook or inventory group_vars/host_vars. Typical variables include (examples only — adjust for your environment): | ||
|
|
||
| - koina_container_name: "koina-server" | ||
| - koina_container_dir: "" | ||
| - koinarpc_docker_port: 8500 | ||
| - koinahttp_docker_port: 8501 | ||
| - koinametrics_docker_port: 8502 | ||
| - koina_shm_size: '8gb' | ||
|
|
||
| (Note: Adapt these variables, and the Docker Compose template in the role's `templates/` directory, to your specific needs.) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| --- | ||
| koina_container_name: "koina-server" | ||
| koina_container_dir: "/opt/koina" | ||
| koinarpc_docker_port: 8500 | ||
| koinahttp_docker_port: 8501 | ||
| koinametrics_docker_port: 8502 | ||
| koina_shm_size: "2gb" |
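The role README lists `koina_shm_size: '8gb'` as an example, while the default here is `"2gb"`. If a larger shared-memory size is wanted, one way to set it is where the role is applied in the playbook. A minimal sketch, with `8gb` taken from the README example rather than from any measured requirement:

```yaml
- role: koina-server
  vars:
    koina_container_dir: "{{ KOINA_CONTAINER_DIR }}"
    # Overrides the role default of "2gb"; size this for your models
    koina_shm_size: "8gb"
```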
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| --- |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| galaxy_info: | ||
| author: Sanjay Srikakulam | ||
| description: Ansible role to deploy and manage KOINA inference server | ||
| company: Forschungszentrum Jülich | ||
|
|
||
| # If the issue tracker for your role is not on github, uncomment the | ||
| # next line and provide a value | ||
| # issue_tracker_url: http://example.com/issue/tracker | ||
|
|
||
| # Choose a valid license ID from https://spdx.org - some suggested licenses: | ||
| # - BSD-3-Clause (default) | ||
| # - MIT | ||
| # - GPL-2.0-or-later | ||
| # - GPL-3.0-only | ||
| # - Apache-2.0 | ||
| # - CC-BY-4.0 | ||
| license: MIT | ||
|
|
||
| min_ansible_version: "2.1" | ||
|
|
||
| # If this is a Container Enabled role, provide the minimum Ansible Container version. | ||
| # min_ansible_container_version: | ||
|
|
||
| # | ||
| # Provide a list of supported platforms, and for each platform a list of versions. | ||
| # If you don't wish to enumerate all versions for a particular platform, use 'all'. | ||
| # To view available platforms and versions (or releases), visit: | ||
| # https://galaxy.ansible.com/api/v1/platforms/ | ||
| # | ||
| platforms: | ||
| - name: Ubuntu | ||
| versions: | ||
| - all | ||
|
|
||
| galaxy_tags: [] | ||
| # List tags for your role here, one per line. A tag is a keyword that describes | ||
| # and categorizes the role. Users find roles by searching for tags. Be sure to | ||
| # remove the '[]' above, if you add tags to this list. | ||
| # | ||
| # NOTE: A tag is limited to a single word comprised of alphanumeric characters. | ||
| # Maximum 20 tags per role. | ||
|
|
||
| dependencies: [] | ||
| # List your role dependencies here, one per line. Be sure to remove the '[]' above, | ||
| # if you add dependencies to this list. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| --- | ||
| - name: Ensure koina container directory exists | ||
| ansible.builtin.file: | ||
| path: "{{ koina_container_dir }}" | ||
| state: directory | ||
| owner: root | ||
| group: root | ||
| mode: '0755' | ||
|
|
||
| - name: Template docker-compose.yml | ||
| ansible.builtin.template: | ||
| src: docker-compose.yml.j2 | ||
| dest: "{{ koina_container_dir }}/docker-compose.yml" | ||
| owner: root | ||
| group: root | ||
| mode: '0644' | ||
|
|
||
| - name: Start koina server container | ||
| community.docker.docker_compose_v2: | ||
| project_src: "{{ koina_container_dir }}" | ||
| files: | ||
| - docker-compose.yml | ||
| state: present | ||
|
|
||
| # - name: Ensure Koina server is running by curl health endpoint | ||
| # ansible.builtin.uri: | ||
| # url: "http://localhost:{{ koinahttp_docker_port }}/v2/health/ready" | ||
| # method: GET | ||
| # return_content: true | ||
| # status_code: 200 | ||
| # register: koina_health_check | ||
| # retries: 3 | ||
| # delay: 10 | ||
| # until: koina_health_check.status == 200 | ||
|
Comment on lines +25 to +34
Contributor: Is there a reason this is included as comments?
Author: Because the server startup and the loading of all the models takes more than 15 to 20 minutes, hence checking if this is |
||
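The commented-out check retries 3 times with a 10-second delay, which covers only about 30 seconds, far less than the 15 to 20 minutes of startup mentioned in the reply. One option would be to keep the check but make it opt-in with a longer retry budget. The sketch below assumes a hypothetical switch `koina_wait_for_ready` that does not exist in the PR; 180 retries at 10 seconds gives roughly 30 minutes:

```yaml
- name: Wait for the Koina server to report ready
  ansible.builtin.uri:
    url: "http://localhost:{{ koinahttp_docker_port }}/v2/health/ready"
    method: GET
    status_code: 200
  register: koina_health_check
  # Keep polling until the endpoint answers with HTTP 200
  until: (koina_health_check.status | default(0)) == 200
  retries: 180
  delay: 10
  # Hypothetical opt-in variable; the check is skipped unless it is set to true
  when: koina_wait_for_ready | default(false) | bool
```

With the switch left at its default, the play finishes as soon as the container is started, which matches the current behavior of the PR.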
Can you delete this file?
Yes, this PR deletes that file.