LlamaOrch is a simple Bash-based CLI orchestrator for managing LLMs served by llama.cpp's llama-server. It lets you start, stop, list, and monitor models with ease.
- Interactive model selection with fzf (or numbered fallback)
- Start, stop and monitor llama-server instances
- Per-model config files — customize any llama-server flag
- Live status with port detection, PID tracking and clickable URLs
- Log tailing for real-time output
- Zero dependencies beyond Bash (fzf is optional)
The first pre-built package is now released. You can find the release here,
or quick-install the latest version with curl:

```sh
curl -fsSL https://raw.githubusercontent.com/alasgarovs/llamaorch/main/install | bash
```

Or install manually from source:

```sh
git clone https://github.com/alasgarovs/llamaorch.git
cd llamaorch
sh src/configure
```

This installs the `llamaorch` command to `~/.local/bin/llamaorch` and sets up the config directory at `~/.llamaorch/`.
| Command | Description |
|---|---|
| `run` | Launch a model with interactive selection |
| `stop` | Gracefully stop a running model |
| `restart` | Stop and start a running model automatically |
| `ps` | Show status of all configured models (ports, PIDs, URLs) |
| `ls` | List all available model configs |
| `log` | Tail the log file for a model (Ctrl+C to exit) |
| `create <name>` | Create a new model config file and open it in nano |
| `edit` | Edit an existing model config in nano |
| `rm` | Delete a model config, PID file and log |
| `help` | Display command reference |
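For illustration, the stop mechanism can be pictured as "read the PID file, send SIGTERM, clean up". This is a sketch of the general approach, not LlamaOrch's actual code; the `stop_model` helper and its argument are hypothetical:

```shell
# Hypothetical sketch of a graceful stop via a per-model PID file.
# stop_model is not a real LlamaOrch function.
stop_model() {
  pid_file="$1"
  if [ -f "$pid_file" ]; then
    pid=$(cat "$pid_file")
    # SIGTERM gives llama-server a chance to shut down cleanly
    kill "$pid" 2>/dev/null
    rm -f "$pid_file"
    echo "stopped $pid"
  else
    echo "not running"
  fi
}
```

Usage would look like `stop_model ~/.llamaorch/pids/my-model.pid`.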
Model configs live in `~/.llamaorch/config/` as individual `.sh` scripts. Each script is a standard Bash file that launches llama-server with your desired flags.

```bash
#!/bin/bash
llama-server \
  -m ~/.llamaorch/models/example-model.gguf \
  -ngl 28 \
  -c 6144 \
  -t 6 \
  -b 192 \
  --ubatch-size 64 \
  --flash-attn off \
  --cont-batching \
  --port 18080 \
  --host 0.0.0.0
```

The `--port` flag is required for live status detection. LlamaOrch parses it automatically.
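How that parsing works internally isn't documented here; one plausible sketch (an assumption, not LlamaOrch's actual implementation) is to grep the flag out of the config script:

```shell
# Hypothetical sketch: extract the --port value from a model config script.
# Handles both "--port 18080" and "--port=18080" forms; prints the first match.
get_port() {
  grep -oE -- '--port[ =]+[0-9]+' "$1" | grep -oE '[0-9]+' | head -n1
}
```

For example, `get_port ~/.llamaorch/config/my-model.sh` would print `18080` for the config shown above.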
```
~/.llamaorch/
├── bin/
│   └── llamaorch        # main executable
├── config/
│   ├── default          # example config
│   └── my-model.sh      # your model configs
├── pids/
│   ├── my-model.pid     # PID files
│   └── my-model.log     # log files
└── models/              # place your .gguf files here
```
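Under this layout, listing available models (as `ls` does) amounts to enumerating `config/`. The helper below is a hypothetical sketch, not LlamaOrch's own code:

```shell
# Hypothetical sketch: print model names found in a config directory,
# stripping the path and the .sh extension.
list_models() {
  for f in "$1"/*.sh; do
    [ -e "$f" ] || continue   # directory has no .sh configs
    basename "$f" .sh
  done
}
```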
- `llama-server` (from llama.cpp) installed and in `$PATH`
- `fzf` (optional — provides fuzzy-finder UI; falls back to numbered menu)
- `lsof` (for port/PID detection)
⭐ Star us on GitHub if you find this project helpful!
