6 changes: 6 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -23,3 +23,9 @@ models/ms2pip/**/*.json
# Hide symlink in models/repo
models/repo/*
node_modules

# Ansible
*ansible.log
*.vault_password
*collections/
*geerlingguy*
4 changes: 2 additions & 2 deletions README.md
@@ -2,7 +2,7 @@

## Accessing a public server
### cURL
Here is an example HTTP request using only cURL, sending a POST request with a JSON body. You can find examples for all available models at https://koina.wilhelmlab.org/.

```bash
curl "https://koina.wilhelmlab.org/v2/models/Prosit_2019_intensity/infer" \
@@ -101,7 +101,7 @@ For examples of how to access models using Python, you can check out [our OpenAP
Koina depends on [docker](https://docs.docker.com/engine/install/) and [nvidia-container-toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/overview.html).
It has only been tested on Linux (Debian/Ubuntu) with Nvidia GPUs.

-You can find an ansible script that installs all dependencies [here](docs/server/).
+You can find an Ansible playbook that installs all dependencies and sets up the Koina server [here](docs/server/deployment/ansible/).

### How to run it
After installing the dependencies, you can pull the Docker image and run it. If you have multiple GPUs installed on your server, you can choose which one is used by modifying `--gpus '"device=0"'`. The time it takes to pull the image depends on your connection speed; the first pull might take up to 5 min. Due to the layered design of Docker images, updating to the latest version will likely take only seconds, depending on the amount of changes. When the server is first started, model files are downloaded from Zenodo. The duration of this also depends on connection speed and might take ~10 min. Once the models are downloaded, server startup takes ~2 minutes.
Binary file removed docs/server/.gpu-driver.yaml.swp
Contributor

Can you delete this file?

Author

Yes, this PR deletes that file.

Binary file not shown.
55 changes: 55 additions & 0 deletions docs/server/deployment/ansible/README.md
@@ -0,0 +1,55 @@
# Ansible: Koina server deployment

This directory contains the Ansible playbooks and roles used to provision Koina inference servers. The main orchestration is performed by the `koina_server.yml` playbook and its included roles. Use this README as a quick reference for installing role requirements, running the playbook, and finding common variables and templates.

_NOTE: The playbook automatically obtains TLS/SSL certificates for the specified domains from Let's Encrypt using Certbot._

## Quick start

1. From the repo root, change into this directory and install the collections and roles listed in `requirements.yml`:
```bash
cd docs/server/deployment/ansible
ansible-galaxy install -r requirements.yml
```

2. Update the inventory file `hosts`, the Nginx templates in `templates/nginx/`, and the variables in `koina_server.yml` (domain names, email address, paths, etc.) as needed for your environment.

3. Run the main playbook against the `hosts` inventory (an example inventory file and variables live in the repo):
```bash
ansible-playbook -i hosts koina_server.yml --ask-become-pass
```
Adjust the inventory path, extra-vars, and become options for your environment.

## What is included

- `koina_server.yml`: main playbook that composes the server setup (Docker, Koina/Triton, services, etc.).
- `requirements.yml`: Ansible roles (version-pinned) and collections required by the playbooks.
- `roles/`: local roles used by the playbook (for example `koina-server` and `nvidia-container-toolkit`).
- `templates/`: templates used by roles (includes the Nginx vhost templates for the nginx role).
- `defaults/` in each role: role-level default variables; review these before overriding.
- `tasks/`, `handlers/`, `files/`: standard Ansible role layout for each role.

## Variables & configuration

- Primary variables and the flow of configuration are defined in `koina_server.yml` and the defaults files of each role (`roles/<role>/defaults/main.yml`).
- Override role defaults per host or per group using `host_vars/` or `group_vars/`, or pass values with `-e` on the command line. Values set under `vars:` in `koina_server.yml` take precedence over `host_vars/` and `group_vars/`, so change those in the playbook itself or with `-e`.
- Review and adjust as needed:
  - `koina_server.yml` for the overall orchestration and variable flow.
  - `roles/*/defaults/main.yml` to find complete variable names and defaults.
- Ensure that the `docker_user` variable in `koina_server.yml` is set to the user that should have permissions to run Docker commands (usually the default user on the server, e.g., `ubuntu`).
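
As a minimal sketch of a per-group override, a `group_vars/koina_servers.yml` file placed next to the inventory or the playbook could look like the following. The file name is an assumption based on the `koina_servers` group in the `hosts` inventory, and the values are illustrative only. Both variables shown exist only as role defaults, so `group_vars/` can override them; variables that the playbook sets under `vars:` would not be affected by this file.

```yaml
# group_vars/koina_servers.yml (hypothetical file; values are examples only)
# Applies to every host in the koina_servers inventory group.
koina_shm_size: "8gb"                # the koina-server role default is "2gb"
koina_container_name: "koina-server" # same as the role default, shown for illustration
```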

## Nginx templates

- Nginx virtual host templates live under `templates/nginx` and are consumed by the `geerlingguy.nginx` role. Modify or copy these templates to customize upstreams, SSL, or proxy rules before running the playbook.

Example template paths:
```
templates/nginx/koina.conf.j2
templates/nginx/koinarpc.conf.j2
```

### TLS/SSL Certificates
- The playbook uses the `geerlingguy.certbot` role to automatically obtain and renew TLS/SSL certificates from Let's Encrypt using Certbot.
- Ensure that the domain names specified in the variables point to your server's IP address before running the playbook.
- Port 80 must be open and accessible for the HTTP-01 challenge used by Let's Encrypt.
- The email address provided in the variables is used for important account notifications from Let's Encrypt.
6 changes: 6 additions & 0 deletions docs/server/deployment/ansible/hosts
@@ -0,0 +1,6 @@
[koina_servers]
koina-bi-01 ansible_host=<YOUR_SERVER_IP>

[all:vars]
# Change this if your server uses a different user (this user is assumed to have sudo privileges)
ansible_ssh_user=ubuntu
# Path to your SSH private key
ansible_ssh_private_key_file=~/.ssh/id_rsa
71 changes: 71 additions & 0 deletions docs/server/deployment/ansible/koina_server.yml
@@ -0,0 +1,71 @@
---
# Playbook to deploy a Koina server on a VM
- hosts: koina_servers
  become: true
  vars_files:
    - secret_group_vars/all.yml
  vars:
    koinarpc_docker_port: 8500
    koinahttp_docker_port: 8501
    koinametrics_docker_port: 8502
    KOINA_SERVER_NAME: "yourdomain.com" # Change this to your domain
    KOINA_RPC_SERVER_NAME: "rpc.yourdomain.com" # Change this to your domain
    ADMIN_EMAIL_ADDRESS: "admin@yourdomain.com" # Change this to your email address for Certbot/Let's Encrypt
    KOINA_CONTAINER_DIR: "/opt/koina" # Directory to store Koina Docker container data and the compose file
    docker_user: "ubuntu" # Change this to the user that should have Docker permissions
  pre_tasks:
    - name: Update and upgrade apt packages
      ansible.builtin.apt:
        update_cache: true
        upgrade: dist

    - name: Install dependencies
      ansible.builtin.apt:
        name:
          - python3
          - python3-venv
          - python3-pip
          - ufw
          - ubuntu-drivers-common
        state: present
        update_cache: true
  roles:
    - role: geerlingguy.docker
      vars:
        docker_users:
          - "{{ docker_user }}"

    - role: firewall

    - role: nvidia-driver

    - role: nvidia-container-toolkit

    - role: geerlingguy.nginx
      vars:
        nginx_remove_default_vhost: true
        nginx_vhosts:
          - server_name: "{{ KOINA_SERVER_NAME }}"
            template: "{{ playbook_dir }}/templates/nginx/koina.conf.j2"
          - server_name: "{{ KOINA_RPC_SERVER_NAME }}"
            template: "{{ playbook_dir }}/templates/nginx/koinarpc.conf.j2"
        ssl_certificate_path: '/etc/letsencrypt/live/{{ KOINA_SERVER_NAME }}/fullchain.pem'
        ssl_certificate_key_path: '/etc/letsencrypt/live/{{ KOINA_SERVER_NAME }}/privkey.pem'

    - role: geerlingguy.certbot
      vars:
        certbot_create_if_missing: true
        certbot_create_extra_args: ''
        certbot_create_method: standalone
        certbot_admin_email: "{{ ADMIN_EMAIL_ADDRESS }}"
        certbot_create_standalone_stop_services:
          - nginx
        certbot_certs:
          - domains:
              - "{{ KOINA_SERVER_NAME }}"
              - "{{ KOINA_RPC_SERVER_NAME }}"
            webroot: '/var/www/certbot'

    - role: koina-server
      vars:
        koina_container_dir: "{{ KOINA_CONTAINER_DIR }}"
13 changes: 13 additions & 0 deletions docs/server/deployment/ansible/requirements.yml
@@ -0,0 +1,13 @@
---
collections:
  - name: community.general
    source: https://galaxy.ansible.com
  - name: community.docker
    source: https://galaxy.ansible.com
roles:
  - name: geerlingguy.docker
    version: 7.6.0
  - name: geerlingguy.certbot
    version: 5.4.1
  - name: geerlingguy.nginx
    version: 3.2.0
20 changes: 20 additions & 0 deletions docs/server/deployment/ansible/roles/firewall/README.md
@@ -0,0 +1,20 @@
# Ansible role: firewall

Role to install and configure the required firewall on target hosts (Ubuntu only).

## Features
- Configures UFW with predefined rules
- Denies all incoming connections by default
- Allows all outgoing connections by default
- Opens specific ports for SSH, HTTP, HTTPS

## Requirements
- Sudo/root privileges on target hosts

## Example playbook
```yaml
- hosts: all
  become: true
  roles:
    - firewall
```
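
To open additional ports, the `allowed_ports` variable from the role defaults can be overridden where the role is applied. The following is a sketch of one way to do that, not part of the role's documented interface; port 8502 is only an illustrative value. Because the override replaces the whole default list, SSH, HTTP, and HTTPS are listed again.

```yaml
- hosts: all
  become: true
  roles:
    - role: firewall
      vars:
        # Overriding replaces the default list (22, 80, 443),
        # so the defaults are repeated here.
        allowed_ports:
          - 22
          - 80
          - 443
          - 8502  # illustrative extra port, not required by the role
```

The role opens each listed port for TCP only, matching the `proto: tcp` setting in its tasks.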
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
allowed_ports:
- 22
- 80
- 443
45 changes: 45 additions & 0 deletions docs/server/deployment/ansible/roles/firewall/meta/main.yml
@@ -0,0 +1,45 @@
galaxy_info:
  author: Sanjay Srikakulam
  description: Ansible role to configure the required firewall on target hosts.
  company: Forschungszentrum Jülich

  # If the issue tracker for your role is not on github, uncomment the
  # next line and provide a value
  # issue_tracker_url: http://example.com/issue/tracker

  # Choose a valid license ID from https://spdx.org - some suggested licenses:
  # - BSD-3-Clause (default)
  # - MIT
  # - GPL-2.0-or-later
  # - GPL-3.0-only
  # - Apache-2.0
  # - CC-BY-4.0
  license: MIT

  min_ansible_version: "2.1"

  # If this a Container Enabled role, provide the minimum Ansible Container version.
  # min_ansible_container_version:

  #
  # Provide a list of supported platforms, and for each platform a list of versions.
  # If you don't wish to enumerate all versions for a particular platform, use 'all'.
  # To view available platforms and versions (or releases), visit:
  # https://galaxy.ansible.com/api/v1/platforms/
  #
  platforms:
    - name: Ubuntu
      versions:
        - all

  galaxy_tags: []
    # List tags for your role here, one per line. A tag is a keyword that describes
    # and categorizes the role. Users find roles by searching for tags. Be sure to
    # remove the '[]' above, if you add tags to this list.
    #
    # NOTE: A tag is limited to a single word comprised of alphanumeric characters.
    #       Maximum 20 tags per role.

dependencies: []
  # List your role dependencies here, one per line. Be sure to remove the '[]' above,
  # if you add dependencies to this list.
28 changes: 28 additions & 0 deletions docs/server/deployment/ansible/roles/firewall/tasks/main.yml
@@ -0,0 +1,28 @@
---
- name: Install required packages for firewall
  ansible.builtin.apt:
    name:
      - ufw
    state: present
    update_cache: true

- name: Enable UFW
  community.general.ufw:
    state: enabled

- name: Allow required incoming ports
  community.general.ufw:
    rule: allow
    port: '{{ item }}'
    proto: tcp
  loop: "{{ allowed_ports }}"

- name: Allow all outgoing traffic
  community.general.ufw:
    default: allow
    direction: outgoing

- name: Deny all other incoming traffic
  community.general.ufw:
    default: deny
    direction: incoming
26 changes: 26 additions & 0 deletions docs/server/deployment/ansible/roles/koina-server/README.md
@@ -0,0 +1,26 @@
# Ansible role: koina-server

Role to provision and configure a Koina inference server.

This role creates the Koina container directory, templates a Docker Compose file into it, and starts the Koina (Triton) container with GPU support. Docker, the NVIDIA drivers, and the NVIDIA Container Toolkit must already be installed on the host (see Requirements). It is intended for use from the repository's Ansible playbook [koina_server.yml](https://github.com/wilhelm-lab/koina/tree/main/docs/server/deployment/ansible/koina_server.yml).

## Features
- Deploys Koina container with GPU support.

## Requirements
- A target host with sudo privileges.
- Internet access to download packages and model artifacts.
- Docker, NVIDIA Container Toolkit, NVIDIA drivers.
- The other roles in the Ansible `roles/` directory and the [koina_server.yml](https://github.com/wilhelm-lab/koina/tree/main/docs/server/deployment/ansible/koina_server.yml) playbook.

## Role variables
Define role variables in your playbook or in inventory `group_vars`/`host_vars`. Typical variables include (examples only; adjust for your environment):

- koina_container_name: "koina-server"
- koina_container_dir: ""
- koinarpc_docker_port: 8500
- koinahttp_docker_port: 8501
- koinametrics_docker_port: 8502
- koina_shm_size: '8gb'

(Note: Adapt the variables and the Docker Compose template in the role's `templates/` directory to your specific needs.)
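
## Example playbook
A minimal sketch of applying the role on its own, assuming Docker, the NVIDIA drivers, and the NVIDIA Container Toolkit are already installed on the target host. The `koina_servers` host group and the values shown are illustrative; any variable left out falls back to the role defaults.

```yaml
- hosts: koina_servers   # assumed inventory group name
  become: true
  roles:
    - role: koina-server
      vars:
        koina_container_dir: "/opt/koina"  # where docker-compose.yml is written
        koinahttp_docker_port: 8501
        koina_shm_size: "8gb"              # example value; adjust to your hardware
```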
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
koina_container_name: "koina-server"
koina_container_dir: "/opt/koina"
koinarpc_docker_port: 8500
koinahttp_docker_port: 8501
koinametrics_docker_port: 8502
koina_shm_size: "2gb"
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
---
45 changes: 45 additions & 0 deletions docs/server/deployment/ansible/roles/koina-server/meta/main.yml
@@ -0,0 +1,45 @@
galaxy_info:
  author: Sanjay Srikakulam
  description: Ansible role to deploy and manage KOINA inference server
  company: Forschungszentrum Jülich

  # If the issue tracker for your role is not on github, uncomment the
  # next line and provide a value
  # issue_tracker_url: http://example.com/issue/tracker

  # Choose a valid license ID from https://spdx.org - some suggested licenses:
  # - BSD-3-Clause (default)
  # - MIT
  # - GPL-2.0-or-later
  # - GPL-3.0-only
  # - Apache-2.0
  # - CC-BY-4.0
  license: MIT

  min_ansible_version: "2.1"

  # If this a Container Enabled role, provide the minimum Ansible Container version.
  # min_ansible_container_version:

  #
  # Provide a list of supported platforms, and for each platform a list of versions.
  # If you don't wish to enumerate all versions for a particular platform, use 'all'.
  # To view available platforms and versions (or releases), visit:
  # https://galaxy.ansible.com/api/v1/platforms/
  #
  platforms:
    - name: Ubuntu
      versions:
        - all

  galaxy_tags: []
    # List tags for your role here, one per line. A tag is a keyword that describes
    # and categorizes the role. Users find roles by searching for tags. Be sure to
    # remove the '[]' above, if you add tags to this list.
    #
    # NOTE: A tag is limited to a single word comprised of alphanumeric characters.
    #       Maximum 20 tags per role.

dependencies: []
  # List your role dependencies here, one per line. Be sure to remove the '[]' above,
  # if you add dependencies to this list.
34 changes: 34 additions & 0 deletions docs/server/deployment/ansible/roles/koina-server/tasks/main.yml
@@ -0,0 +1,34 @@
---
- name: Ensure koina container directory exists
  ansible.builtin.file:
    path: "{{ koina_container_dir }}"
    state: directory
    owner: root
    group: root
    mode: '0755'

- name: Template docker-compose.yml
  ansible.builtin.template:
    src: docker-compose.yml.j2
    dest: "{{ koina_container_dir }}/docker-compose.yml"
    owner: root
    group: root
    mode: '0644'

- name: Start koina server container
  community.docker.docker_compose_v2:
    project_src: "{{ koina_container_dir }}"
    files:
      - docker-compose.yml
    state: present

# - name: Ensure Koina server is running by curl health endpoint
#   ansible.builtin.uri:
#     url: "http://localhost:{{ koinahttp_docker_port }}/v2/health/ready"
#     method: GET
#     return_content: true
#     status_code: 200
#   register: koina_health_check
#   retries: 3
#   delay: 10
#   until: koina_health_check.status == 200
Comment on lines +25 to +34
Contributor

Is there a reason this is included as comments?

Author

Server startup and the loading of all the models take more than 15 to 20 minutes, so checking readiness from this task is not feasible. That is why I commented it out.