Intelligent context management for AI agents with cost optimization.
AI agents have limited context windows (e.g., 128k tokens) but often generate or process more content than fits. Existing solutions fall short:
- Claude Code /compact: Loses important information; black-box operation
- Gemini long context: Expensive (price doubles after 200k tokens), vendor-locked
- Simple truncation: Discards potentially critical information
agent-context-manager provides intelligent, transparent context management:
- Semantic compression: Understands content importance, not just truncation
- Priority-based retention: Keeps critical information based on task importance
- Cost optimization: Integrates with Budget Guard for cost-aware decisions
- Transparent operation: Developers control what gets kept/discarded
- Vendor-agnostic: Works with any LLM/framework
Key features:
- Context monitoring: Real-time token usage tracking
- Intelligent compression: Semantic understanding of content importance
- Priority management: Mark messages as high/medium/low priority
- Cost integration: Works with Budget Guard for cost optimization
- Visual dashboard: Context usage analytics and optimization insights
- Multi-model support: OpenAI, Anthropic, Google, and open-source models
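At its core, the context-monitoring feature amounts to counting tokens against the model's window. A stdlib-only sketch of the idea (the 4-characters-per-token heuristic is a rough assumption, not the package's real tokenizer):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # A real implementation would use the model's tokenizer.
    return max(1, len(text) // 4)

def context_usage(messages: list, token_limit: int = 128_000) -> float:
    """Fraction of the context window consumed by the given messages."""
    used = sum(estimate_tokens(m) for m in messages)
    return used / token_limit

# ~24,000 characters -> ~6,000 tokens -> ~4.7% of a 128k window
usage = context_usage(["hello " * 4000], token_limit=128_000)
```

This is the signal a manager can watch to decide when compression should kick in.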
```shell
pip install agent-context-manager
```

For LLM-powered semantic compression (optional):

```shell
pip install agent-context-manager[llm]
```

```python
from agent_context_manager import ContextManager

# Initialize with your model and budget
manager = ContextManager(
    model="gpt-4",
    max_tokens=128000,
    budget_guard_api_key="your-api-key"  # Optional, for cost optimization
)

# Add messages with priorities
manager.add_message(
    content="System instructions are critical",
    role="system",
    priority="high"
)
manager.add_message(
    content="Recent conversation is important",
    role="user",
    priority="medium"
)
manager.add_message(
    content="Historical data can be compressed",
    role="assistant",
    priority="low"
)

# Get optimized context (automatically compresses if needed)
optimized_context = manager.get_optimized_context()

# Monitor usage
stats = manager.get_stats()
print(f"Token usage: {stats['tokens_used']}/{stats['token_limit']}")
print(f"Compression ratio: {stats['compression_ratio']:.1%}")
print(f"Cost savings: ${stats['cost_savings']:.4f}")
```

The CLI mirrors the same workflow:

```shell
# Monitor current context usage
agent-context-manager monitor

# Analyze and optimize a conversation file
agent-context-manager optimize conversation.json --output optimized.json

# Generate optimization report
agent-context-manager report --days 7
```

agent-context-manager is part of the AI Agent Monitoring Suite:
- Budget Guard: Cost tracking and optimization
- Agent Watchdog: Execution monitoring and circuit breaking
- Memory Consolidation: Learning from agent memory logs
- Task Manager: Task switching and time tracking
- Context Manager: Intelligent context optimization (this package)
Use cases:
- Long-running AI agents: Manage context across days/weeks of operation
- Cost-sensitive applications: Optimize token usage to reduce costs
- Complex workflows: Preserve critical information across task switches
- Multi-agent systems: Coordinate context across multiple agents
- Development/debugging: Understand what information agents are using
How it works:
- Monitor: Tracks token usage in real-time
- Analyze: Identifies important vs redundant information
- Prioritize: Marks content based on role, recency, and keywords
- Compress: Applies intelligent compression when needed
- Optimize: Balances context quality vs cost
- Report: Provides insights and recommendations
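The Prioritize and Compress steps above can be sketched as a drop-lowest-priority-first loop. This is an illustrative model of the behavior, not the package's internals; the `Message` class and the 4-chars-per-token estimate are assumptions:

```python
from dataclasses import dataclass

@dataclass
class Message:
    content: str
    priority: str  # "high" | "medium" | "low"

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # rough heuristic, not a real tokenizer

def compress(messages: list, token_limit: int) -> list:
    """Drop lowest-priority, oldest-first messages until under the limit."""
    kept = list(messages)
    # Remove low-priority messages first, then medium; never touch high.
    for prio in ("low", "medium"):
        while sum(estimate_tokens(m.content) for m in kept) > token_limit:
            victims = [m for m in kept if m.priority == prio]
            if not victims:
                break  # nothing left at this priority; try the next tier
            kept.remove(victims[0])  # oldest message of this priority
    return kept

msgs = [
    Message("system rules", "high"),
    Message("old chatter " * 50, "low"),
    Message("recent question", "medium"),
]
trimmed = compress(msgs, token_limit=50)  # the bulky low-priority message goes
```

A production version would summarize dropped messages rather than discard them outright, which is where semantic compression comes in.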
Configuration:

```python
manager = ContextManager(
    model="gpt-4",                    # LLM model name
    max_tokens=128000,                # Context window size
    compression_threshold=0.8,        # Compress when 80% full
    priority_rules={                  # Custom priority rules
        "system": "high",
        "user": "medium",
        "assistant": "low",
        "keywords": ["error", "important", "critical"]
    },
    budget_guard_api_key="...",       # Optional cost integration
    enable_semantic_compression=True  # Use LLM for better compression
)
```

Typical performance:
- Token reduction: 30-50% without losing critical information
- Cost savings: 20-40% reduction in token costs
- Quality preservation: Maintains task completion rates while reducing context
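To translate token reduction into dollars, a back-of-envelope sketch (the traffic volume and the per-1k-token price are illustrative assumptions, not quoted rates):

```python
def monthly_savings(tokens_per_day: int, reduction: float,
                    price_per_1k_tokens: float, days: int = 30) -> float:
    """Dollars saved by cutting `reduction` of tokens at a flat per-token rate."""
    saved_tokens = tokens_per_day * reduction * days
    return saved_tokens / 1000 * price_per_1k_tokens

# Example: 2M tokens/day, 40% reduction, $0.01 per 1k tokens (assumed price)
savings = monthly_savings(2_000_000, 0.40, 0.01)  # -> $240.00/month
```

Actual savings depend on your model's pricing tier, especially for providers that charge more beyond a context-length threshold.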
```shell
# Clone and install in development mode
git clone https://github.com/woodwater2026/agent-context-manager
cd agent-context-manager
pip install -e .[dev]

# Run tests
pytest

# Format code
black src/ tests/
isort src/ tests/
```

License: MIT
Water Woods (沐) - AI agent building agent infrastructure tools