Run powerful AI models locally with zero compromise on privacy. A beautiful desktop application that brings enterprise-grade language models to your machine with OpenAI-compatible APIs.
Features • Installation • Quick Start • Documentation • Contributing
AgentDock is not just another AI tool: it's your complete AI infrastructure running entirely on your machine. Imagine having the power of ChatGPT, but with full control, zero costs, and complete privacy. That's AgentDock.
- **Tired of API costs?** Run unlimited AI requests without paying per token
- **Privacy concerns?** Your data never leaves your machine
- **Need offline AI?** Work anywhere, no internet required
- **Want customization?** Fine-tune and switch models instantly
- **Developer-friendly?** Drop-in replacement for OpenAI's API
> "The easiest way to run local AI models with a professional-grade API that works with all your existing tools and code."
>
> - Works with LangChain, AutoGPT, Continue.dev, and more
> - Drop-in replacement for OpenAI's API
> - Beautiful UI + Powerful API
```python
# That's it. Your code doesn't change.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5000/v1",  # Point to AgentDock
    api_key="sk-agentdock-admin"
)

# Works exactly like OpenAI
response = client.chat.completions.create(
    model="llama-2-7b-chat",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True  # Streaming support!
)
```

**What You Get:**
- `/v1/chat/completions` - Chat completions with streaming
- `/v1/models` - List available models
- Bearer token authentication
- Works with LangChain, LlamaIndex, AutoGPT, Continue.dev
- Network accessible - use from other devices on your LAN
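If you would rather not pull in the `openai` SDK, the same endpoints can be called over plain HTTP. Here is a minimal standard-library sketch that builds (but does not send) an authenticated `GET /v1/models` request, using the default key shown in the example above:

```python
import urllib.request

BASE_URL = "http://localhost:5000/v1"   # AgentDock's default API address
API_KEY = "sk-agentdock-admin"          # default key used throughout this README

def models_request(base_url: str = BASE_URL, api_key: str = API_KEY) -> urllib.request.Request:
    """Build an authenticated GET /v1/models request (bearer-token auth)."""
    return urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

req = models_request()
print(req.full_url)                     # http://localhost:5000/v1/models
print(req.get_header("Authorization"))  # Bearer sk-agentdock-admin
```

Sending it is one more line, `urllib.request.urlopen(req)`, once AgentDock is running.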
- **Built-in Swagger UI** - Interactive API documentation at `/swagger`
- **Analytics Dashboard** - Track API usage, response times, token counts
- **API Key Management** - Create, revoke, and manage multiple keys
- **Request Logging** - Full request/response logging for debugging
- **API Playground** - Test endpoints with code examples in Python, Node.js, cURL
| Feature | Description |
|---|---|
| GPU Acceleration | CUDA (NVIDIA), ROCm (AMD), SYCL (Intel), Metal (Apple Silicon) |
| CPU Optimization | AVX, AVX2, AVX512 instruction sets |
| Memory Management | Automatic context size optimization |
| Multi-Model Support | Switch models without restart |
| Streaming Responses | Real-time token generation |
- **100% Local** - No data ever leaves your machine
- **API Authentication** - Bearer token security
- **No Telemetry** - We don't track anything
- **Data Control** - All conversations stored locally
- **Offline Capable** - Works without internet after setup
Get started in 3 minutes:

1. **Download the installer**
   - Visit Releases
   - Choose your platform:
     - Windows: `AgentDock-Setup-x.x.x.exe` (Installer) or `.exe` (Portable)
     - macOS: `AgentDock-x.x.x.dmg` (Intel) or `.arm64.dmg` (Apple Silicon)
     - Linux: `AgentDock-x.x.x.AppImage` (Universal) or `.deb` (Debian/Ubuntu)

2. **Install & Launch**
   - Run the installer
   - AgentDock starts automatically
   - First launch: the app detects your GPU and downloads the optimal llama.cpp binaries (~100-300 MB)

3. **Download a Model**
   - Go to the Models tab
   - Click Recommended for You
   - Download the suggested model (or search for others)
   - The model auto-loads when the download completes

4. **Start Using**
   - Chat: test the model in the chat interface
   - API: your OpenAI-compatible API is running at `http://localhost:5000/v1`
   - Swagger: explore the API docs at `http://localhost:5000/swagger`
**Prerequisites:**

- Node.js (for the Electron app and frontend)
- .NET 8 SDK (for the backend)
- Git
Quick Setup:
```bash
# Clone the repository
git clone https://github.com/KauanCerqueira/AgentDock.git
cd AgentDock

# Automatic setup (detects GPU, downloads binaries, installs deps)
# Windows PowerShell:
.\setup.ps1

# macOS/Linux:
chmod +x setup.sh && ./setup.sh

# Start development server
npm run dev
```

**What the setup script does:**
- Detects your GPU (NVIDIA → CUDA, AMD → ROCm, Intel → SYCL, None → CPU)
- Downloads appropriate llama.cpp binaries from official releases
- Installs all npm dependencies
- Sets up the backend and frontend
- You're ready to code!
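The GPU-detection step can be pictured with a toy sketch. This is illustrative only, not the actual logic in `setup.ps1`/`setup.sh`; it assumes the vendor CLI tools (`nvidia-smi`, `rocm-smi`, `sycl-ls`) are on PATH when the corresponding GPU stack is installed:

```python
import shutil

def detect_gpu_backend() -> str:
    """Toy heuristic: pick a llama.cpp build flavor by probing vendor CLIs."""
    if shutil.which("nvidia-smi"):   # NVIDIA driver present -> CUDA build
        return "cuda"
    if shutil.which("rocm-smi"):     # AMD ROCm stack -> ROCm build
        return "rocm"
    if shutil.which("sycl-ls"):      # Intel oneAPI -> SYCL build
        return "sycl"
    return "cpu"                     # safe fallback: CPU-only binaries

print(detect_gpu_backend())
```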
**Option 1: Development Mode (for contributors)**

```bash
npm run dev
```

This starts:
- Backend API: `http://localhost:5000`
- Frontend UI: `http://localhost:5173`
- Electron app in development mode

**Option 2: Production Build**

```bash
npm run build
npm start
```

**Option 3: Backend Only (if you just want the API)**

```bash
cd src/AgentDock.Backend
dotnet run
```

**1. Python Example (Most Popular)**

```python
from openai import OpenAI

# Connect to AgentDock
client = OpenAI(
    base_url="http://localhost:5000/v1",
    api_key="sk-agentdock-admin"
)

# Simple completion
response = client.chat.completions.create(
    model="llama-2-7b-chat.Q2_K.gguf",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to calculate fibonacci numbers"}
    ],
    temperature=0.7,
    max_tokens=500
)

print(response.choices[0].message.content)

# Streaming example
stream = client.chat.completions.create(
    model="llama-2-7b-chat.Q2_K.gguf",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

**2. Node.js / TypeScript**
```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://localhost:5000/v1',
  apiKey: 'sk-agentdock-admin'
});

async function chat() {
  const response = await client.chat.completions.create({
    model: 'llama-2-7b-chat.Q2_K.gguf',
    messages: [
      { role: 'user', content: 'Explain async/await in JavaScript' }
    ]
  });
  console.log(response.choices[0].message.content);
}

chat();
```

**3. cURL (Terminal)**
```bash
curl http://localhost:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-agentdock-admin" \
  -d '{
    "model": "llama-2-7b-chat.Q2_K.gguf",
    "messages": [
      {
        "role": "user",
        "content": "What is the meaning of life?"
      }
    ],
    "temperature": 0.8,
    "max_tokens": 200
  }'
```

**4. LangChain Integration**
```python
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

# Point to AgentDock
llm = ChatOpenAI(
    base_url="http://localhost:5000/v1",
    api_key="sk-agentdock-admin",
    model="llama-2-7b-chat.Q2_K.gguf"
)

# Use anywhere in LangChain
messages = [HumanMessage(content="Translate 'hello' to French")]
response = llm.invoke(messages)
print(response.content)
```

Your AgentDock API can be accessed from any device on your network:
1. **Find your machine's IP address**
   - Windows: `ipconfig` → look for IPv4 Address
   - macOS/Linux: `ifconfig` or `ip addr` → look for inet
   - Example: `192.168.1.100`

2. **Update your base URL**

   ```python
   client = OpenAI(
       base_url="http://192.168.1.100:5000/v1",  # Use your IP
       api_key="sk-agentdock-admin"
   )
   ```

3. **Firewall**: ensure port 5000 is allowed through your firewall
Use Cases:
- Run AgentDock on a powerful desktop, access it from a laptop or tablet
- Share with team members on the same network
- Run on a home server, access from anywhere in your house
```
+------------------------------------------------------------------+
|                            AgentDock                             |
+------------------------------------------------------------------+
|                                                                  |
|   +-------------+      +--------------+      +----------------+  |
|   |  Electron   |----->|   React UI   |----->|  .NET Backend  |  |
|   |  Desktop    |      |  (Frontend)  |      |     (API)      |  |
|   +-------------+      +--------------+      +--------+-------+  |
|                                                       |          |
|                                              +--------v-------+  |
|                                              |   llama.cpp    |  |
|                                              |  (Inference)   |  |
|                                              +--------+-------+  |
|                                                       |          |
|                                              +--------v-------+  |
|                                              |   AI Models    |  |
|                                              |    (.gguf)     |  |
|                                              +----------------+  |
|                                                                  |
+------------------------------------------------------------------+
```
Technology Stack:
| Layer | Technologies |
|---|---|
| Desktop | Electron, Node.js |
| Frontend | React 18, TypeScript, TailwindCSS, Vite, React Router |
| Backend | .NET 8, ASP.NET Core, Swagger/OpenAPI |
| AI Engine | llama.cpp (CPU/CUDA/ROCm/Metal/Vulkan) |
| Models | GGUF format (Llama, Mistral, Phi, etc.) |
```
AgentDock/
│
├── electron/                        # Electron desktop application
│   ├── main.js                      # Main process (app lifecycle)
│   ├── preload.js                   # Preload scripts (security bridge)
│   └── dev.js                       # Development launcher
│
├── src/
│   ├── AgentDock.Backend/           # .NET 8 Web API
│   │   ├── Controllers/             # API endpoints
│   │   │   ├── ChatController.cs            # Chat completions
│   │   │   ├── ModelsController.cs          # Model management
│   │   │   ├── OpenAIController.cs          # OpenAI-compatible routes
│   │   │   ├── AnalyticsController.cs       # Usage analytics
│   │   │   └── ...
│   │   │
│   │   ├── Services/                # Business logic
│   │   │   ├── SettingsService.cs
│   │   │   ├── SystemMonitorService.cs
│   │   │   ├── LogsService.cs
│   │   │   └── ...
│   │   │
│   │   ├── Infrastructure/          # External integrations
│   │   │   ├── Llama/
│   │   │   │   ├── LlamaCppService.cs        # llama.cpp HTTP client
│   │   │   │   └── LlamaLifecycleService.cs  # Process management
│   │   │   │
│   │   │   └── HuggingFace/
│   │   │       ├── HuggingFaceService.cs     # Model search & details
│   │   │       ├── ModelDownloadManager.cs   # Download queue
│   │   │       └── ModelRecommendationService.cs
│   │   │
│   │   ├── Core/                    # Domain models
│   │   │   ├── Interfaces/
│   │   │   └── Models/
│   │   │
│   │   ├── models/                  # AI model files (.gguf)
│   │   ├── bin/llama/               # llama.cpp binaries
│   │   └── appsettings.json         # Configuration
│   │
│   └── AgentDock.UI/                # React frontend
│       ├── src/
│       │   ├── components/          # Reusable UI components
│       │   │   ├── Layout.tsx
│       │   │   ├── ModelBrowser.tsx
│       │   │   ├── ModelDetailsDrawer.tsx
│       │   │   └── ui/              # shadcn/ui components
│       │   │
│       │   ├── pages/               # Main application pages
│       │   │   ├── Dashboard.tsx
│       │   │   ├── Chat.tsx
│       │   │   ├── Models.tsx
│       │   │   ├── DownloadManager.tsx
│       │   │   ├── DownloadedModels.tsx
│       │   │   ├── APIPlayground.tsx
│       │   │   ├── Analytics.tsx
│       │   │   └── Settings.tsx
│       │   │
│       │   ├── api/                 # API client
│       │   ├── hooks/               # Custom React hooks
│       │   ├── lib/                 # Utilities
│       │   ├── locales/             # i18n translations
│       │   └── types/               # TypeScript types
│       │
│       └── public/                  # Static assets
│
├── llama.cpp/                       # llama.cpp binaries (zips)
│   ├── llama-b7648-bin-win-cpu-x64.zip
│   ├── llama-b7648-bin-win-cuda-12.4-x64.zip
│   ├── llama-b7648-bin-win-vulkan-x64.zip
│   └── models/                      # Optional: model storage
│
├── package.json                     # npm dependencies & scripts
├── electron-builder.json            # Electron build configuration
├── setup.ps1                        # Windows setup script
├── setup.sh                         # Linux/macOS setup script
└── README.md                        # You are here!
```
| Endpoint | Method | Description |
|---|---|---|
| `/v1/chat/completions` | POST | Create chat completion (streaming supported) |
| `/v1/models` | GET | List available models |

| Endpoint | Method | Description |
|---|---|---|
| `/api/models` | GET | List local GGUF models |
| `/api/models/search` | GET | Search HuggingFace models |
| `/api/models/suggestions` | GET | Get hardware-based recommendations |
| `/api/models/download` | POST | Start model download |
| `/api/models/download/{id}` | GET | Check download progress |
| `/api/models/downloaded` | GET | List downloaded models |
| `/api/analytics` | GET | Get API usage analytics |
| `/api/engine/health` | GET | Check llama.cpp server health |
| `/swagger` | GET | Interactive API documentation |
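As a sketch of how the download-progress endpoint might be consumed, here is a generic polling helper. The `status` field and its `completed`/`failed` values are assumptions, not a confirmed schema; check `/swagger` for the real response shape.

```python
import time
from typing import Callable

def wait_for_download(fetch_status: Callable[[], dict],
                      poll_seconds: float = 2.0,
                      max_polls: int = 150) -> dict:
    """Poll a status-returning callable until it reports a terminal state."""
    status: dict = {}
    for _ in range(max_polls):
        status = fetch_status()
        # "status", "completed", "failed" are assumed field names/values
        if status.get("status") in ("completed", "failed"):
            return status
        time.sleep(poll_seconds)
    return status  # last seen status if we timed out
```

In practice, `fetch_status` would wrap an authenticated `GET /api/models/download/{id}` call.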
**appsettings.json (Backend Configuration)**

```jsonc
{
  "Llama": {
    "BaseUrl": "http://127.0.0.1:8080",
    "Port": 8080,
    "Host": "127.0.0.1",
    "DefaultModel": "llama-2-7b-chat.Q2_K.gguf",
    "ModelsPath": "models",
    "ExecutablePath": "bin/llama/llama-server.exe",
    "GpuLayers": 35,       // 0 for CPU-only, 35+ for GPU
    "ContextSize": 4096,   // Model context window
    "RequestTimeout": 300
  },
  "Security": {
    "ApiKey": "sk-agentdock-admin"  // Change this!
  }
}
```

**Environment Variables (Optional)**

```bash
# Override default configuration
LLAMA_PORT=8080
LLAMA_GPU_LAYERS=35
API_KEY=your-secure-key-here
```

| Component | Requirement |
|---|---|
| OS | Windows 10/11, macOS 11+, Ubuntu 20.04+ |
| CPU | x64 processor with AVX support |
| RAM | 8 GB (can run small models) |
| Storage | 10 GB + model sizes (2-40 GB per model) |
| GPU | Optional (CPU-only works fine) |
| Component | Recommendation |
|---|---|
| RAM | 16-32 GB (for 7B-13B models) |
| GPU | NVIDIA RTX 3060+ (12GB VRAM) or AMD RX 6800+ |
| Storage | SSD with 50+ GB free |
| GPU Vendor | Technology | Models Supported |
|---|---|---|
| NVIDIA | CUDA 12.4+ | GeForce GTX 1060+, RTX series, Tesla, A100 |
| AMD | ROCm 5.0+ | RX 6000+, Radeon VII, MI series |
| Intel | SYCL/oneAPI | Arc A-series, Iris Xe |
| Apple | Metal | M1, M2, M3 (all variants) |
| Universal | Vulkan | Any GPU with Vulkan 1.2+ |
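The `GpuLayers` setting in the backend configuration controls how many transformer layers llama.cpp offloads to the GPU. A rough back-of-envelope heuristic (ours, not AgentDock's actual logic): offload the fraction of layers whose weights fit in about 90% of VRAM, leaving headroom for the KV cache.

```python
def gpu_layers_for_vram(vram_gb: float, model_file_gb: float, n_layers: int = 32) -> int:
    """Heuristic: fraction of the model fitting in ~90% of VRAM, scaled to layer count."""
    usable_gb = vram_gb * 0.9            # keep headroom for KV cache and buffers
    fraction = min(1.0, usable_gb / model_file_gb)
    return int(fraction * n_layers)

# A ~4.1 GB Q4 7B file on a 12 GB card fits entirely:
print(gpu_layers_for_vram(12, 4.1))  # 32
```

If the result is too aggressive (out-of-memory errors at load), dial the layer count down.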
We love contributions! AgentDock is a community-driven project and we welcome developers of all skill levels.
- **Report Bugs**: Found an issue? Open a bug report
- **Suggest Features**: Have an idea? Request a feature
- **Improve Docs**: Documentation can always be better
- **Translate**: Help us reach more users (i18n support built-in!)
- **Design**: UI/UX improvements welcome
- **Code**: Implement features, fix bugs, optimize performance
1. **Fork & Clone**

   ```bash
   git clone https://github.com/YOUR_USERNAME/AgentDock.git
   cd AgentDock
   ```

2. **Setup Development Environment**

   ```bash
   ./setup.ps1   # Windows
   ./setup.sh    # Linux/macOS
   ```

3. **Create a Feature Branch**

   ```bash
   git checkout -b feature/amazing-new-feature
   ```

4. **Make Your Changes**
   - Write clean, readable code
   - Follow existing code style
   - Add comments for complex logic
   - Update documentation if needed

5. **Test Your Changes**

   ```bash
   npm run dev     # Test in development mode
   npm run build   # Ensure production build works
   ```

6. **Commit with Conventional Commits**

   ```bash
   git commit -m "feat: add amazing new feature"
   ```

   **Commit Types:**
   - `feat:` New feature
   - `fix:` Bug fix
   - `docs:` Documentation changes
   - `style:` Code formatting (no logic changes)
   - `refactor:` Code refactoring
   - `perf:` Performance improvements
   - `test:` Adding tests
   - `chore:` Build/tooling changes

7. **Push & Create PR**

   ```bash
   git push origin feature/amazing-new-feature
   ```

   Then open a Pull Request on GitHub with a clear description.
- Automated Checks: CI/CD runs tests and builds
- Code Review: Maintainers review your code
- Feedback: We may request changes
- Approval: Once approved, we merge!
- Release: Your contribution ships in the next release
New to the project? Look for issues labeled `good first issue`.
We follow the Contributor Covenant. Be respectful, inclusive, and constructive.
Q: Do I need to pay for anything?
A: No! AgentDock is 100% free and open-source. You only pay for the electricity to run it on your machine. No subscriptions, no API costs.
Q: Is my data private?
A: Absolutely. Everything runs locally on your machine. No data is ever sent to external servers (except when downloading models from HuggingFace, which is a one-time thing).
Q: Can I use this commercially?
A: Yes! AgentDock is MIT licensed. Use it however you want: personal, commercial, enterprise. Just keep the license file.
Q: What models can I use?
A: Any GGUF model from HuggingFace or elsewhere. Popular choices:
- Llama 2 (7B, 13B, 70B)
- Mistral (7B)
- Phi-2 (2.7B - great for low-end hardware)
- Code Llama (7B, 13B, 34B)
- Mixtral (8x7B)
Q: How much RAM do I need?
A: Depends on the model:
- 2-3B models: 4-6 GB RAM
- 7B models: 8-12 GB RAM
- 13B models: 16-24 GB RAM
- 70B models: 64+ GB RAM (or use smaller quantizations)
AgentDock shows you compatibility before downloading!
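The figures above follow a simple rule of thumb that can be written down explicitly. This is illustrative, not AgentDock's compatibility check: 4.5 bits per weight approximates a Q4_K-style quantization, and the fixed overhead varies with context size.

```python
def est_ram_gb(params_billion: float, bits_per_weight: float = 4.5,
               overhead_gb: float = 1.5) -> float:
    """Approximate RAM: quantized weight size plus fixed runtime overhead."""
    return params_billion * bits_per_weight / 8 + overhead_gb

print(round(est_ram_gb(7), 1))    # ~5.4 GB; the 8-12 GB guidance adds OS/context headroom
print(round(est_ram_gb(13), 1))   # ~8.8 GB
```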
Q: Do I need a powerful GPU?
A: No! AgentDock works great on CPU-only. GPU just makes it faster. Even a GTX 1060 can give you 5-10x speedup.
Q: Can I use this with LangChain/AutoGPT/etc?
A: Yes! Just point the `base_url` to `http://localhost:5000/v1`. Any tool that supports OpenAI's API will work.
Q: How do I update models?
A: Just download a new one from the Models page. You can have multiple models and switch between them instantly.
- Multi-Model Support - Run multiple models simultaneously
- Model Fine-Tuning - UI for LoRA fine-tuning
- Voice Input/Output - TTS and STT integration
- Plugins System - Extend functionality with plugins
- Cloud Sync - Sync settings across devices (optional)
- Docker Support - Run AgentDock in containers
- Function Calling - OpenAI function calling API
- Vision Models - Support for LLaVA and other vision models
- Model Merging - Merge multiple models in the UI
- Collaborative workspaces
- Built-in RAG (Retrieval Augmented Generation)
- Model quantization tools
- Prompt template library
- Mobile companion app
Vote on features: GitHub Discussions
MIT License
Copyright (c) 2024-2026 AgentDock Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
See LICENSE file for full details.
AgentDock stands on the shoulders of giants:
- llama.cpp - The incredible C++ inference engine that makes this all possible
- OpenAI - For the API specification that became the industry standard
- HuggingFace - For hosting and democratizing AI models
- Electron - For making cross-platform desktop apps easy
- React - For the amazing UI framework
- .NET - For the powerful backend framework
- Tailwind CSS - For making styling actually enjoyable
And to all our contributors who make AgentDock better every day!
- **Discord**: Join our community (coming soon!)
- **Issues**: Report bugs
- **Discussions**: Feature requests & ideas
- **Email**: support@agentdock.dev
- **Twitter**: @AgentDock (coming soon!)
If AgentDock helps you, please consider giving us a star. It helps others discover the project!
Made with ❤️ by developers, for developers

Privacy-first • Open-source • Community-driven