Open
Labels: enhancement (New feature or request)
Description
Summary
Enable hardware-aware task routing in the multi-agent swarm, intelligently distributing work across different model sizes and GPU configurations.
Problem
The swarm currently treats all agents equally regardless of the underlying hardware. On high-performance local setups (dual RTX 4090s, mixed GPU configurations), there's no way to route lightweight tasks to faster/smaller models while reserving large models for complex reasoning.
Proposal
Implement hardware-aware task scheduling with the following features:
- Model routing: Route syntax-level verification (lint, format, type check) to fast quantized models (e.g., Qwen 1.5B) while reserving 8B/32B models for deep architectural planning
- GPU-aware scheduling: Detect available GPUs and their VRAM, distribute agents accordingly
- Latency-optimized dispatch: Measure per-model response latency and route time-sensitive tasks (interactive feedback) to faster models
- Cost-aware batching: Group similar small tasks for batch processing on smaller models
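As a rough illustration of the GPU-aware scheduling item, the sketch below places agents onto GPUs by free VRAM. `GpuInfo`, `AgentSpec`, and `place_agents` are hypothetical names introduced for this example, and the VRAM figures are assumed to be supplied by whatever hardware detection is used. It is a simple greedy strategy, not a proposal for the final algorithm.

```rust
/// Hypothetical snapshot of one GPU. The values are assumed to come from
/// a hardware-monitoring layer that is not shown here.
#[derive(Debug, Clone)]
pub struct GpuInfo {
    pub index: usize,
    pub free_vram_mib: u64,
}

/// Hypothetical description of an agent and the VRAM its model needs.
#[derive(Debug, Clone)]
pub struct AgentSpec {
    pub name: String,
    pub vram_mib: u64,
}

/// Greedy placement: handle the largest agents first and put each one on
/// the GPU that currently has the most free VRAM.
/// Returns (agent name, GPU index) pairs plus the names of agents that fit nowhere.
pub fn place_agents(
    gpus: &[GpuInfo],
    agents: &[AgentSpec],
) -> (Vec<(String, usize)>, Vec<String>) {
    // Track remaining VRAM per GPU as (index, free MiB).
    let mut free: Vec<(usize, u64)> =
        gpus.iter().map(|g| (g.index, g.free_vram_mib)).collect();

    // Sort agents by VRAM requirement, largest first.
    let mut order: Vec<&AgentSpec> = agents.iter().collect();
    order.sort_by(|a, b| b.vram_mib.cmp(&a.vram_mib));

    let mut placed = Vec::new();
    let mut unplaced = Vec::new();
    for agent in order {
        // Candidate GPU: the one with the most remaining VRAM.
        let best = free.iter_mut().max_by_key(|slot| slot.1);
        if let Some(slot) = best {
            if slot.1 >= agent.vram_mib {
                slot.1 -= agent.vram_mib;
                placed.push((agent.name.clone(), slot.0));
                continue;
            }
        }
        unplaced.push(agent.name.clone());
    }
    (placed, unplaced)
}
```

Agents that do not fit on any GPU are reported rather than silently dropped, so the caller can decide whether to queue them or fall back to a smaller model.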
Implementation Ideas
- Extend `Config` with a `[models]` section supporting multiple model profiles with capability tags
- Add `ModelRouter` that selects the best model for each task based on complexity estimation
- Use the existing `sysinfo` dependency for hardware detection
- Integrate with `nvml_wrapper` (already in deps) for GPU monitoring
- Task complexity heuristic: token count, file count, language complexity → model selection
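The complexity heuristic could look roughly like the sketch below. Only the name `ModelRouter` comes from this issue; `TaskStats`, `ModelTier`, `complexity_score`, the weights, and the threshold are all assumptions made for illustration and would need tuning against real tasks.

```rust
/// Hypothetical tiers matching the fast/deep split described in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelTier {
    Fast,
    Deep,
}

/// Hypothetical inputs to the heuristic: token count, file count,
/// and a per-language weight (e.g. 1.0 for simple, higher for complex).
#[derive(Debug, Clone)]
pub struct TaskStats {
    pub token_count: usize,
    pub file_count: usize,
    pub language_weight: f32,
}

/// Combine the three signals into one score.
/// The constants (1 point per 1000 tokens, 2 points per file) are placeholders.
pub fn complexity_score(task: &TaskStats) -> f32 {
    let tokens = task.token_count as f32 / 1000.0;
    let files = task.file_count as f32;
    (tokens + 2.0 * files) * task.language_weight
}

/// Minimal router: anything at or above the threshold goes to the deep tier.
pub struct ModelRouter {
    pub deep_threshold: f32,
}

impl ModelRouter {
    pub fn select(&self, task: &TaskStats) -> ModelTier {
        if complexity_score(task) >= self.deep_threshold {
            ModelTier::Deep
        } else {
            ModelTier::Fast
        }
    }
}
```

With a threshold of 10.0, a 500-token single-file task scores 2.5 and stays on the fast tier, while a 12000-token, six-file task with a language weight of 1.5 scores 36.0 and is routed to the deep tier.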
Example Config
[models.fast]
endpoint = "http://localhost:8001/v1"
model = "Qwen/Qwen3.5-1.5B"
capabilities = ["syntax", "format", "lint"]
[models.deep]
endpoint = "http://localhost:8002/v1"
model = "Qwen/Qwen3.5-32B"
capabilities = ["architecture", "refactor", "security"]
Relevant Code
- src/tools/swarm_tool.rs: swarm dispatch
- src/orchestration/: multi-agent orchestration
- src/config/mod.rs: model profiles already exist (resolve_model())
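To make the capability tags from the Example Config concrete, here is a minimal standalone sketch of how profiles might be represented and looked up by capability. `ModelProfile`, `example_profiles`, and `find_profile` are hypothetical names; this does not reflect how the existing `resolve_model()` works, and the profiles are hard-coded instead of parsed from TOML to keep the example dependency-free.

```rust
/// Hypothetical in-memory form of one `[models.<name>]` entry.
#[derive(Debug, Clone)]
pub struct ModelProfile {
    pub name: String,
    pub endpoint: String,
    pub model: String,
    pub capabilities: Vec<String>,
}

/// Builds the two profiles from the Example Config by hand.
/// A real implementation would deserialize them from the config file.
pub fn example_profiles() -> Vec<ModelProfile> {
    let tags = |items: &[&str]| items.iter().map(|s| s.to_string()).collect::<Vec<_>>();
    vec![
        ModelProfile {
            name: "fast".to_string(),
            endpoint: "http://localhost:8001/v1".to_string(),
            model: "Qwen/Qwen3.5-1.5B".to_string(),
            capabilities: tags(&["syntax", "format", "lint"]),
        },
        ModelProfile {
            name: "deep".to_string(),
            endpoint: "http://localhost:8002/v1".to_string(),
            model: "Qwen/Qwen3.5-32B".to_string(),
            capabilities: tags(&["architecture", "refactor", "security"]),
        },
    ]
}

/// Returns the first profile that advertises the requested capability, if any.
pub fn find_profile<'a>(
    profiles: &'a [ModelProfile],
    capability: &str,
) -> Option<&'a ModelProfile> {
    profiles
        .iter()
        .find(|p| p.capabilities.iter().any(|c| c == capability))
}
```

Returning `Option` leaves the fallback policy to the caller, for example defaulting to the deep profile when no tag matches.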