Supercharge your AI assistant with local LLM access: run powerful AI models right on your own computer, no internet required.
A Python MCP server that exposes your local Ollama models as tools for AI assistants like Windsurf, VS Code, Claude Desktop, and more.
Connect your local LLMs to any MCP-compatible AI assistant. No cloud APIs needed.
| Tool | What it does |
|---|---|
| `ollama_chat` | Chat with any local model (multi-turn, tool-calling) |
| `ollama_generate` | Generate text completions |
| `ollama_embed` | Create vector embeddings |
| `ollama_list` | List installed models |
| `ollama_show` | Inspect model details |
| `ollama_pull` | Download new models |
| `ollama_delete` | Remove models |
| `ollama_ps` | List running models |
Prerequisites: Python 3.10+, Ollama running locally
```bash
pip install mcp-ollama-python
```

Add to your MCP config (`mcp_config.json`):

```json
{
  "mcpServers": {
    "ollama": {
      "command": "py",
      "args": ["-m", "mcp_ollama_python"],
      "disabled": false
    }
  }
}
```

Restart your editor — done. Your AI assistant can now use local Ollama models.
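If your Ollama instance runs on a non-default host or port, the standard MCP `env` key can pass environment variables to the server process. The sketch below assumes the server (or the underlying Ollama client) honors `OLLAMA_HOST`; check the Configuration guide for the variables this server actually reads.

```json
{
  "mcpServers": {
    "ollama": {
      "command": "py",
      "args": ["-m", "mcp_ollama_python"],
      "env": { "OLLAMA_HOST": "http://localhost:11434" },
      "disabled": false
    }
  }
}
```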
Type in your AI assistant's chat:
```
MCP Tool: ollama / ollama_chat — Use model llama3.1 and explain quantum computing
```
- 🔧 8 MCP tools — Full Ollama SDK access
- 🔄 Hot-swap architecture — Drop a file in `tools/`, it's auto-discovered
- 🎯 Type-safe — Pydantic models throughout
- 🚀 Lightweight — Minimal dependencies, fast startup
- 🔌 Universal — Works with any MCP client
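The hot-swap auto-discovery above can be sketched with stdlib `importlib`. This is a minimal illustration under assumptions, not this server's actual loader: the `tools/` layout and the convention of exposing a `run` callable per module are hypothetical here.

```python
import importlib.util
import os
import pathlib
import tempfile


def discover_tools(tools_dir: str) -> dict:
    """Load every .py file in tools_dir and collect callables named 'run'."""
    registry = {}
    for path in pathlib.Path(tools_dir).glob("*.py"):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        if hasattr(module, "run"):
            registry[path.stem] = module.run
    return registry


# Demo: create a throwaway tools directory containing one tool file.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "echo_tool.py"), "w") as f:
    f.write("def run(x):\n    return x\n")

tools = discover_tools(demo_dir)
print(sorted(tools))          # ['echo_tool']
print(tools["echo_tool"](5))  # 5
```

Dropping another file into the directory and re-running discovery registers it with no change to the server code, which is the whole point of the hot-swap design.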
| Guide | Description |
|---|---|
| Installation | Setup and prerequisites |
| Available Tools | All tools with examples |
| Configuration | Environment variables, model config |
| Windsurf Integration | Complete Windsurf setup guide |
| VS Code Integration | VS Code setup |
| Architecture | How it works, adding tools |
| Server Control | Start/stop/manage the server |
| Interactive Manager | Menu-driven management UI |
| Development | Contributing, code quality |
Made with ❤️ using Python, Poetry, and Ollama