# Setup
- use `./get_ollama` to install Ollama in user space
- to run an experiment:
  - `export OLLAMA_NUM_PARALLEL=32`
  - `export OLLAMA_MAX_QUEUE=1024`
  - `./ollama serve`
- in a different shell, run the client(s):
  - `curl http://localhost:11434/api/generate -d '{"model": "llama3:latest", "prompt": "hi", "stream": false}'`
- for RP-based experiments, run `./test.py` as the driver, which uses `./client.py` for the client tasks
Note that the RP experiments require the `ollama` Python module to be installed in the virtualenv the client tasks run in; a minimal client sketch is shown below.
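For illustration, here is a minimal sketch of what such a client task might look like. It fires a batch of concurrent requests through the `ollama` Python module so that the server-side `OLLAMA_NUM_PARALLEL` setting is actually exercised; the model name, batch size, and overall structure are assumptions and may differ from the actual `./client.py`.

```python
# Hypothetical client-task sketch; the actual ./client.py may differ.
import time
from concurrent.futures import ThreadPoolExecutor

import ollama  # requires the `ollama` package in the task's virtualenv

MODEL = "llama3:latest"  # assumed: same model as in the curl example above
N_REQUESTS = 32          # assumed: matches OLLAMA_NUM_PARALLEL

def query(i: int) -> float:
    """Send one generate request and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    ollama.generate(model=MODEL, prompt="hi")
    return time.perf_counter() - start

with ThreadPoolExecutor(max_workers=N_REQUESTS) as pool:
    latencies = list(pool.map(query, range(N_REQUESTS)))

print(f"mean latency: {sum(latencies) / len(latencies):.2f}s")
```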
# Results

## Frontier
- running Ollama on CPUs only does not seem to yield any scaling behavior
  whatsoever: after an initial warmup, query response times are constant (but
  noisy).
- GPU utilization on Frontier requires the `rocm/6.0.0` module to be loaded,
  and the allocation must use `--gres=gpu:8` to obtain the correct
  permissions. Ollama then 'sees' the GPUs; however, it hangs when trying to
  use them.
- support suggested using `module load craype-accel-amd-gfx90a` (TODO AM:
  test)
## Polaris
- GPUs are used out of the box; plots for the initial runs are under
  `polaris/`.
- We cannot determine what portion of a query's response time is spent in the
  Ollama wait queue and what portion is spent in the model. TODO:
  investigate Ollama profiling (see the sketch after this list).
- The current plots do not yet take warmup into account.
- TODO: query rate plots over time (`exec_stop` events)
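A possible starting point for the profiling TODO above: the final (non-streamed) `/api/generate` response carries per-request timing fields (`total_duration`, `load_duration`, `prompt_eval_duration`, `eval_duration`, all in nanoseconds). The sketch below compares the client-side wall time against these server-reported durations; the difference bounds the time spent outside the model. Whether `total_duration` includes Ollama's internal queue wait would still need verification, and everything beyond the documented field names is an assumption.

```python
# Sketch: split a query's response time into server-reported components
# vs. client wall time. The timing fields come from the Ollama API docs;
# whether total_duration covers the internal wait queue needs verification.
import time
import requests

URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

start = time.perf_counter()
resp = requests.post(
    URL, json={"model": "llama3:latest", "prompt": "hi", "stream": False}
)
wall = time.perf_counter() - start
data = resp.json()

server = data["total_duration"] / 1e9  # total server-side time, in seconds
model = (data.get("prompt_eval_duration", 0) + data.get("eval_duration", 0)) / 1e9

print(f"client wall time : {wall:.3f}s")
print(f"server total     : {server:.3f}s")
print(f"model eval       : {model:.3f}s")
print(f"outside server   : {wall - server:.3f}s  (network + queueing, approx.)")
```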
## RP integration
- the branch `feature/service_startup_2` simplifies agent service deployment
and integration.