ADI plugin for local LLM inference on Apple Silicon using the Uzu engine.
- 🚀 Apple Silicon Optimized: ~35 tokens/sec on M2 (Llama-3.2-1B)
- 🔒 100% Local: No network, fully offline inference
- 📦 Pre-built Binaries: No build tools required for users
- ⚡ Fast Installation: `adi plugin install adi.llm.uzu`
- 🎯 Simple API: CLI and programmatic access
For Users (recommended):

```sh
# Install pre-built binary from plugin registry
adi plugin install adi.llm.uzu
```

For Developers:

```sh
# Requirements: Metal Toolchain
xcodebuild -downloadComponent MetalToolchain

# Build plugin
cargo build --release

# Install locally
adi plugin install --local target/release/libadi_llm_uzu_plugin.dylib
```

CLI usage:

```sh
# Load a model
adi llm-uzu load models/llama-3.2-1b.gguf

# Generate text
adi llm-uzu generate models/llama-3.2-1b.gguf "Explain Rust ownership"

# List loaded models
adi llm-uzu list

# Show model info
adi llm-uzu info models/llama-3.2-1b.gguf

# Unload a model
adi llm-uzu unload models/llama-3.2-1b.gguf
```

Use the inference service from other plugins or applications:
Register the service dependency in `plugin.toml`:

```toml
[[requires]]
id = "adi.llm.inference"
version = "^1.0.0"
```
Then call the service from your code:

```rust
let args = json!({
    "model_path": "models/llama-3.2-1b.gguf",
    "prompt": "Hello, world!",
    "max_tokens": 128,
    "temperature": 0.7
});
let result = service.invoke("generate", &args)?;
```

Download GGUF models from a model hub such as Hugging Face.
Recommended models:
- Llama 3.2 1B/3B - Fast, general purpose
- Qwen 2.5 1B/3B - Multilingual
- Gemma 2B - Efficient, high quality
Requirements:
- macOS with Apple Silicon (M1/M2/M3+)
- Model files in GGUF format
Benchmarks:

| Model | Apple M2 (tokens/sec) |
|---|---|
| Llama-3.2-1B | ~35 |
| Qwen-2.5-1B | ~33 |
| Gemma-2B | ~28 |
vs OpenAI/Anthropic:
- ✅ Free (no API costs)
- ✅ Private (100% local)
- ✅ Fast (no network latency)
- ❌ Smaller models (less capable)
vs lib-client-ollama:
- ✅ Faster on Apple Silicon
- ✅ Lower overhead (no server)
- ❌ macOS only
- ❌ Fewer features
Troubleshooting:

Install from the registry:

```sh
adi plugin install adi.llm.uzu
```

Check that the model file exists:

```sh
ls -lh models/llama-3.2-1b.gguf
```

Ensure:
- You're on Apple Silicon (M1/M2/M3)
- The model is in GGUF format
- The model fits in memory
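The checks above can be scripted. A minimal pre-flight sketch, assuming the example model path used elsewhere in this README; the format check relies on the fact that valid GGUF files begin with the four ASCII bytes `GGUF`:

```shell
MODEL="models/llama-3.2-1b.gguf"   # illustrative path; adjust to your model

# Architecture check: Apple Silicon Macs report "arm64"
uname -m

# Format check: the first four bytes of a valid GGUF file are "GGUF"
if [ -f "$MODEL" ]; then
  head -c 4 "$MODEL"
  echo
else
  echo "model file not found: $MODEL"
fi
```

If the format check prints anything other than `GGUF`, the file is not a GGUF model (for example, a partial download or a different format).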
License: MIT
Contributions welcome! Open an issue or PR on GitHub.
- Uzu - Inference engine
- lib-client-uzu - Rust client
- ADI - Agent development infrastructure