Python CLI tools for chatting with Ollama models and experimenting with multi-AI collaboration.
- Session-Based Conversation - Natural chat that persists, with spec extraction and workflow triggering
- Workflow Framework - Deterministic multi-agent pipelines with feedback loops and audit trails
- Tool-Calling Agent - AI agent that can read/write files and run commands
- Multi-Persona Collaboration - Two AI personas working together on tasks
- Interactive Chat Room - @ mention multiple personas in real-time
- Batch Processing - Process markdown files and generate code
```bash
# Launch the REPL
python3 cli.py

# You'll see:
# ╔═══════════════════════════════════════╗
# ║             ollama-chat               ║
# ║       Multi-agent AI workflows        ║
# ╚═══════════════════════════════════════╝
# [gemma3:1b]>

# Then use slash commands:
/help                    # Show all commands
/session                 # List all sessions
/session my-project      # Start/resume session
/workflow "Build a cache"  # Run workflow
/agent "List files"      # Tool-calling agent
/chat                    # Simple chat mode
/collab "Design API"     # Two-persona collaboration
/room                    # Multi-persona chat room
/models                  # List available models
/quit                    # Exit
```

You can also run commands directly:
```bash
python3 cli.py session
python3 cli.py models
```

Requirements:

- Python 3.10+
- Ollama running locally (default port 11434)
- For Claude personas: `ANTHROPIC_API_KEY` environment variable set
```bash
# Clone the repo
git clone https://github.com/TaylorHuston/ollama-chat.git
cd ollama-chat

# Install dependencies
pip install -r requirements.txt

# Pull recommended models
ollama pull llama3.2:3b   # Required for tool-calling agent
ollama pull gemma3:1b     # Good for fast chat/collaboration
ollama pull qwen2.5:0.5b  # Ultra-lightweight option

# Install as editable package (in a venv)
python -m venv .venv
source .venv/bin/activate
pip install -e .

# Now you can use:
ollama-chat conv my-project
oc workflow "Build X"     # 'oc' is a short alias
```

Dependencies:

- `langchain-core` - Base LangChain abstractions
- `langchain-ollama` - Ollama integration
- `langchain-anthropic` - Claude integration
- `langgraph` - Workflow state graphs
- `rich` - Terminal formatting
- `requests` - HTTP client
Launch the REPL with `python3 cli.py` and use these slash commands. Each feature can also be accessed via the standalone Python scripts.
The primary tool for interactive design work. Named, persistent conversations that can extract specs and trigger implementation workflows.
```bash
# In the REPL:
/session                       # List all sessions
/session my-project            # Start or resume a session
/session my-project --new      # Force create new session
/session my-project --history  # Show conversation history
/session my-project --spec     # Show extracted spec
/session my-project --delete   # Delete a session
```

In-chat commands:

```
/help        Show available commands
/history     Show full conversation history
/summary     Show session metadata
/summarize   Extract structured spec from conversation (AI-powered)
/spec        Show saved spec
/workflow    Run implementation workflow using saved spec
/clear       Clear conversation history
/save        Force save session
/quit        Exit
```
Typical workflow:

1. Launch: `python3 cli.py`
2. Start a session: `/session api-design`
3. Have a natural conversation to design your feature
4. Type `/summarize` to extract a structured spec
5. Type `/workflow` to automatically implement it
6. Resume later: `/session api-design`
Deterministic multi-agent pipelines with conditional routing, feedback loops, and full audit trails.
```bash
# In the REPL:
/workflow "Write a function to merge two sorted lists"
/workflow --persist "Implement quicksort"  # Save run history
/workflow --list-runs                      # List previous runs
/workflow --inspect <run-id>               # Inspect a specific run
/workflow --visualize                      # Show workflow structure

# Set model before running:
/model gemma3:1b
/workflow "Build a cache class"
```

Built-in workflow: `spec_implement_review`

```
spec (write detailed specification)
  ↓
implement (generate code from spec)
  ↓
review (score 0-100 + feedback)
  ↓
score >= threshold? → DONE
  ↓ (no)
implement (with feedback) ← loop
```
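The routing above can be sketched in plain Python. This is a conceptual illustration only, not the project's actual implementation; the `implement` and `review` stubs and the threshold of 80 are hypothetical stand-ins for the LLM-backed nodes:

```python
# Conceptual sketch of the spec -> implement -> review feedback loop.
# LLM calls are stubbed out; only the control flow is illustrated.

def implement(spec: str, feedback: str) -> str:
    """Stub for the 'implement' node: regenerate code, folding in feedback."""
    return f"code for: {spec}" + (" (revised)" if feedback else "")

def review(code: str) -> tuple[int, str]:
    """Stub for the 'review' node: pretend the first draft scores low."""
    return (95, "") if "revised" in code else (60, "handle edge cases")

def run_review_loop(spec: str, threshold: int = 80, max_iters: int = 3) -> str:
    """Loop implement -> review until the score clears the threshold."""
    feedback, code = "", ""
    for _ in range(max_iters):
        code = implement(spec, feedback)   # "implement" node
        score, feedback = review(code)     # "review" node
        if score >= threshold:             # conditional edge -> DONE
            break
    return code

print(run_review_loop("merge two sorted lists"))
# -> code for: merge two sorted lists (revised)
```

The `max_iters` cap mirrors the iteration control a workflow engine needs so a persistently low score cannot loop forever.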
Creating custom workflows:

```python
from workflow import (
    Workflow, LLMNode, SpecWriterNode,
    ImplementerNode, ReviewerNode, ToolNode
)

# Define a custom workflow
workflow = (
    Workflow("my_workflow")
    .add_node("plan", LLMNode(
        model="gemma3:1b",
        system_prompt="You are a planner...",
        prompt_template="Plan this: {task}",
        output_key="plan"
    ))
    .add_node("execute", ToolNode(
        model="llama3.2:3b",
        prompt_template="Execute this plan: {plan}"
    ))
    .add_edge("plan", "execute")
    .set_entry("plan")
)

result = workflow.run({"task": "Create a hello world file"}, persist=True)
```

Available node types:
| Node | Purpose |
|---|---|
| `LLMNode` | Base node - invoke LLM with prompt template |
| `SpecWriterNode` | Write detailed specifications |
| `ImplementerNode` | Generate code from specs |
| `ReviewerNode` | Review code, output score + feedback |
| `ToolNode` | LLM with tool-calling capabilities |
An AI agent with filesystem and shell access. Can autonomously complete tasks by calling tools.
```bash
# In the REPL:
/agent "List all Python files in this directory"
/agent "Read requirements.txt and summarize the dependencies"
/agent "Create a hello.py file that prints Hello World"
/agent "What's the current git branch?"
/agent   # Interactive mode - enter multiple tasks
```

Available tools:
| Tool | Description |
|---|---|
| `read_file(path)` | Read file contents |
| `write_file(path, content)` | Create or overwrite a file |
| `list_files(path)` | List directory contents |
| `run_command(command)` | Execute shell commands |
| `search_files(pattern, path)` | Find files by glob pattern |
Model requirements: The agent requires a model that supports tool/function calling:

- `llama3.2:3b` (default, recommended)
- `llama3.1`
- `mistral`

Models like `gemma3:1b` and `qwen2.5:0.5b` do NOT support tool calling.
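The core of any tool-calling agent is a dispatch loop: the model emits a structured call naming a tool and its arguments, and the agent executes it. A minimal sketch in plain Python (not the project's actual `agent.py`; the call format shown is a hypothetical illustration, with tool names mirroring the table above):

```python
# Conceptual sketch of an agent's tool-dispatch step. A tool-capable model
# would emit structured calls; here we hand-construct one.

import subprocess
from pathlib import Path

TOOLS = {
    "read_file": lambda path: Path(path).read_text(),
    "write_file": lambda path, content: Path(path).write_text(content),
    "list_files": lambda path=".": [p.name for p in Path(path).iterdir()],
    "run_command": lambda command: subprocess.run(
        command, shell=True, capture_output=True, text=True
    ).stdout,
}

def dispatch(tool_call: dict):
    """Execute one tool call of the form {"name": ..., "args": {...}}."""
    return TOOLS[tool_call["name"]](**tool_call["args"])

# Example structured call, as a tool-capable model might produce it:
result = dispatch({"name": "run_command", "args": {"command": "echo hello"}})
print(result.strip())
```

This is why models without tool-calling support cannot drive the agent: they never emit the structured `{"name": ..., "args": ...}` payload the dispatcher needs.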
Basic single-model chat interface.
```bash
# In the REPL:
/models                                 # List available models
/chat "What is the capital of France?"  # One-shot message
/model gemma3:1b                        # Set model
/chat                                   # Interactive chat mode
```

Two AI personas collaborate on a task, passing context back and forth.
```bash
# In the REPL:
/collab "Design a key-value store"  # Default: Architect + Developer
/collab --list                      # List available personas
```

Chat with multiple AI personas using @ mentions.
```bash
# In the REPL:
/room   # Start with default personas (architect, developer)
```

Chat room commands:

```
@architect Design a REST API for a todo app
@claude What do you think of Architect's design?
@all Let's write a haiku together

/add critic        # Add a persona to the room
/remove developer  # Remove a persona
/list              # Show active personas
/personas          # Show all available personas
/clear             # Clear conversation history
/quit              # Exit the chat room
```
Process markdown files with AI and extract code to runnable files.
```bash
# In the REPL:
/batch                            # Process INPUT.md -> output.py
/batch -i prompt.md -o result.py  # Custom input/output
/batch -p claude                  # Use Claude persona
```

Project structure:

```
ollama-chat/
├── cli.py             # Unified REPL entry point
├── pyproject.toml     # Package configuration
├── conversation.py    # Session-aware chat with /summarize and /workflow
├── workflow.py        # Deterministic multi-agent workflows (LangGraph)
├── agent.py           # Tool-calling autonomous agent
├── chat.py            # Simple single-model chat
├── collab.py          # Two-persona collaboration
├── chat_room.py       # Interactive multi-persona chat room
├── batch.py           # Markdown to code batch processing
├── sessions.py        # Session management CLI and library
├── handoffs.py        # JSON persistence for workflow audit trails
├── personas.py        # Shared persona/LLM utilities (LangChain)
├── tools.py           # LangChain tool definitions
├── personas.json      # Persona configuration
├── requirements.txt   # Python dependencies
├── sessions/          # Persistent conversation sessions (gitignored)
│   └── {session}/
│       ├── meta.json
│       ├── history.json
│       ├── spec.md
│       └── output.py
└── workflow_runs/     # Workflow audit trails (gitignored)
    └── {timestamp}_{workflow}/
        ├── 00_input.json
        ├── 01_spec.json
        ├── 02_implement.json
        └── final.json
```
Two Interaction Modes:

- Conversational - Natural back-and-forth to design specs (`conversation.py`)
- Workflow - Deterministic execution with review gates (`workflow.py`)
Handoff Files:
When workflows run with `--persist`, each node writes a JSON "handoff" file containing:
- Input state it received
- Output state it produced
- Timestamp and duration
- Any errors
This creates a full audit trail for debugging and reproducibility.
The project uses LangChain for:
- Unified API - Same code works with Ollama and Claude
- Streaming - Real-time response output
- Tool Calling - Structured function calls for the agent
- Message Management - Proper conversation history handling
Workflows use LangGraph for:
- State Graphs - Define nodes and edges declaratively
- Conditional Routing - Branch based on state (e.g., review score)
- Iteration Control - Built-in loop management
All personas are defined in `personas.json`. You can add, modify, or remove personas by editing this file.

```json
{
  "architect": {
    "name": "Architect",
    "model": "gemma3:1b",
    "backend": "ollama",
    "system_prompt": "You are a senior software architect..."
  },
  "claude": {
    "name": "Claude",
    "model": "claude-sonnet-4-20250514",
    "backend": "claude",
    "system_prompt": "You are Claude, an AI assistant by Anthropic..."
  }
}
```

Fields:

- `name` - Display name for the persona
- `model` - Model identifier (Ollama model name or Claude model ID)
- `backend` - Either `ollama` or `claude`
- `system_prompt` - Instructions that define the persona's behavior
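The `backend` field is what selects the LLM class at load time. A minimal sketch of that dispatch (illustrative only; `personas.py` may wire this up differently, and the mapping to class names is an assumption based on the dependency list):

```python
# Sketch of loading personas.json and choosing an LLM backend per persona.
# We return (class_name, model) tuples instead of instantiating, so the
# sketch runs without langchain installed.

import json

PERSONAS_JSON = """
{
  "architect": {"name": "Architect", "model": "gemma3:1b",
                "backend": "ollama", "system_prompt": "You are an architect."},
  "claude": {"name": "Claude", "model": "claude-sonnet-4-20250514",
             "backend": "claude", "system_prompt": "You are Claude."}
}
"""

def make_llm(persona: dict) -> tuple[str, str]:
    """Map a persona's backend field to the matching LangChain chat class."""
    if persona["backend"] == "ollama":
        return ("ChatOllama", persona["model"])      # from langchain_ollama
    if persona["backend"] == "claude":
        return ("ChatAnthropic", persona["model"])   # from langchain_anthropic
    raise ValueError(f"unknown backend: {persona['backend']}")

personas = json.loads(PERSONAS_JSON)
for key, cfg in personas.items():
    cls, model = make_llm(cfg)
    print(f"{key}: {cls}({model!r})")
```

Adding a new backend is then a matter of one more branch in the dispatch plus the corresponding LangChain integration package.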
| Persona | Backend | Model | Role |
|---|---|---|---|
| `architect` | Ollama | gemma3:1b | High-level system design |
| `developer` | Ollama | gemma3:1b | Implementation focus |
| `critic` | Ollama | gemma3:1b | Find flaws, suggest improvements |
| `creative` | Ollama | gemma3:1b | Novel and creative approaches |
| `claude` | Claude | claude-sonnet-4-20250514 | General-purpose assistant |
Make sure Ollama is running:

```bash
ollama serve
```

Pull the required model:

```bash
ollama pull gemma3:1b
```

The agent requires a tool-capable model. Use `llama3.2:3b` or `llama3.1`:

```bash
ollama pull llama3.2:3b
python3 agent.py -m llama3.2:3b "your task"
```

Set your Anthropic API key:

```bash
export ANTHROPIC_API_KEY="your-key-here"
```

Use smaller models:

- `qwen2.5:0.5b` (397 MB)
- `gemma3:1b` (815 MB)
- `llama3.2:3b` (2.0 GB)
MIT