feat: add smart-llama - dynamic model loading layer for llama.cpp #1

Merged
fdematos merged 7 commits into main from feature/smart-llama
Jan 25, 2026

@fdematos

Summary

  • Go server wrapping llama-server as subprocess for dynamic model loading
  • Single active model with automatic swap when a different model is requested
  • OpenAI-compatible API (/v1/chat/completions, /v1/models, /health)
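
The single-active-model behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the `Backend` interface, `StartFunc`, `Swapper`, and `Ensure` names are hypothetical, and a fake backend stands in for the real llama-server subprocess.

```go
package main

import (
	"fmt"
	"sync"
)

// Backend is a hypothetical handle on one running llama-server process.
type Backend interface {
	Stop() error
}

// StartFunc launches a backend for the named model (hypothetical signature).
type StartFunc func(model string) (Backend, error)

// Swapper keeps at most one backend alive at a time.
type Swapper struct {
	mu      sync.Mutex
	start   StartFunc
	active  string
	backend Backend
}

func NewSwapper(start StartFunc) *Swapper { return &Swapper{start: start} }

// Ensure makes model the active one. If a different model is loaded, its
// backend is stopped before the new one is started. It reports whether a
// backend was (re)started.
func (s *Swapper) Ensure(model string) (bool, error) {
	s.mu.Lock()
	defer s.mu.Unlock()

	if s.backend != nil && s.active == model {
		return false, nil // requested model is already loaded
	}
	if s.backend != nil {
		if err := s.backend.Stop(); err != nil {
			return false, fmt.Errorf("stop %q: %w", s.active, err)
		}
		s.backend, s.active = nil, ""
	}
	b, err := s.start(model)
	if err != nil {
		return false, fmt.Errorf("start %q: %w", model, err)
	}
	s.backend, s.active = b, model
	return true, nil
}

// Active returns the name of the currently loaded model, or "" if none.
func (s *Swapper) Active() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.active
}

// fakeBackend counts Stop calls so the swap behaviour can be observed.
type fakeBackend struct{ stopped *int }

func (f fakeBackend) Stop() error { *f.stopped++; return nil }

func main() {
	stopped := 0
	s := NewSwapper(func(model string) (Backend, error) {
		return fakeBackend{stopped: &stopped}, nil
	})
	for _, m := range []string{"model-a", "model-a", "model-b"} {
		swapped, err := s.Ensure(m)
		fmt.Println(m, swapped, err)
	}
	fmt.Println("active:", s.Active(), "stops:", stopped)
}
```

Holding the mutex for the whole swap serializes concurrent requests for different models, which is the simplest way to guarantee that only one backend exists at any moment. Whether the actual server queues or rejects requests during a swap is not stated in this PR.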

Features

  • Config: YAML-based configuration for models (native llama-server args format)
  • Process Management: Subprocess lifecycle with health checks, graceful shutdown
  • Reverse Proxy: Transparent proxying to internal llama-server
  • CI/CD: GitHub Actions for tests/build on push/PR + multi-platform release on tags

Structure

cmd/smart-llama/       - Entry point
internal/config/       - YAML config loading
internal/process/      - Subprocess management
internal/proxy/        - HTTP reverse proxy
internal/server/       - HTTP server + routing
.github/workflows/     - CI + Release workflows

Testing

All packages include unit tests. Run with:

go test ./...

- Go server wrapping llama-server as subprocess
- Single active model with automatic swap on request
- OpenAI-compatible API (/v1/chat/completions, /v1/models)
- YAML-based model configuration (native llama-server args)
- Process lifecycle management with health checks
- Reverse proxy to llama-server

Includes:
- internal/config: YAML config loading
- internal/process: subprocess management (spawn/kill/health)
- internal/proxy: HTTP reverse proxy
- internal/server: HTTP server with model swap logic
- GitHub Actions: CI (tests/build) + Release (multi-platform binaries)
- Upgrade Go version from 1.21 to 1.23.5
- Disable setup-go cache to avoid tar extraction errors
- Add smart-llama binary to .gitignore
@fdematos merged commit 987dd02 into main on Jan 25, 2026
2 checks passed