HPC & Hardware-Aware Task Scheduling for Multi-Agent Swarm #56

@galic1987

Summary

Enable hardware-aware task routing in the multi-agent swarm, intelligently distributing work across different model sizes and GPU configurations.

Problem

The swarm currently treats all agents equally regardless of the underlying hardware. On high-performance local setups (dual RTX 4090s, mixed GPU configurations), there's no way to route lightweight tasks to faster/smaller models while reserving large models for complex reasoning.

Proposal

Implement hardware-aware task scheduling with four parts:

  • Model routing: Route syntax-level verification (lint, format, type check) to fast quantized models (e.g., Qwen 1.5B) while reserving 8B/32B models for deep architectural planning
  • GPU-aware scheduling: Detect available GPUs and their VRAM, distribute agents accordingly
  • Latency-optimized dispatch: Measure per-model response latency and route time-sensitive tasks (interactive feedback) to faster models
  • Cost-aware batching: Group similar small tasks for batch processing on smaller models
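The latency-optimized dispatch item could be sketched as a per-model exponentially weighted moving average (EWMA) of response times, with time-sensitive tasks sent to whichever candidate currently has the lowest estimate. This is a minimal sketch, not existing project code: `LatencyTracker`, its methods, and the choice of EWMA smoothing are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical helper: smoothed response latency per model.
/// None of these names exist in the project; they only illustrate the idea.
pub struct LatencyTracker {
    /// Weight of the newest sample, in (0, 1]. Higher reacts faster to change.
    alpha: f64,
    /// Model name -> smoothed latency in milliseconds.
    ewma_ms: HashMap<String, f64>,
}

impl LatencyTracker {
    pub fn new(alpha: f64) -> Self {
        assert!(alpha > 0.0 && alpha <= 1.0, "alpha must be in (0, 1]");
        Self { alpha, ewma_ms: HashMap::new() }
    }

    /// Record one observed round-trip time for `model`.
    /// The first sample seeds the average; later samples are blended in.
    pub fn record(&mut self, model: &str, latency: Duration) {
        let sample = latency.as_secs_f64() * 1000.0;
        let alpha = self.alpha;
        self.ewma_ms
            .entry(model.to_string())
            .and_modify(|avg| *avg = alpha * sample + (1.0 - alpha) * *avg)
            .or_insert(sample);
    }

    /// Current smoothed latency for `model`, if any sample was recorded.
    pub fn estimate_ms(&self, model: &str) -> Option<f64> {
        self.ewma_ms.get(model).copied()
    }

    /// Among `candidates`, return the model with the lowest smoothed latency.
    /// Models without samples are skipped; returns None if none have samples.
    pub fn fastest<'a>(&self, candidates: &[&'a str]) -> Option<&'a str> {
        candidates
            .iter()
            .copied()
            .filter_map(|m| self.estimate_ms(m).map(|ms| (m, ms)))
            .min_by(|a, b| a.1.total_cmp(&b.1))
            .map(|(m, _)| m)
    }
}
```

A dispatcher would call `record` after each completed request and `fastest` when an interactive task arrives. Skipping unmeasured models is one possible policy; a real scheduler might instead probe them once so they are never starved.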

Implementation Ideas

  • Extend Config with a [models] section supporting multiple model profiles with capability tags
  • Add ModelRouter that selects the best model for each task based on complexity estimation
  • Use the existing sysinfo dependency for hardware detection
  • Integrate with nvml_wrapper (already in deps) for GPU monitoring
  • Task complexity heuristic: token count, file count, language complexity → model selection
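To make the complexity heuristic concrete, here is a rough sketch of how a `ModelRouter` might fold token count, file count, and a language weight into one score and compare it against a threshold. The struct fields, the scoring formula, and the weights are all assumptions chosen for illustration; real values would need tuning against actual swarm workloads.

```rust
/// Hypothetical task features, mirroring the heuristic inputs listed above.
#[derive(Debug, Clone, Copy)]
pub struct TaskFeatures {
    pub token_count: usize,
    pub file_count: usize,
    /// Assumed per-language difficulty weight; 1.0 is a neutral baseline.
    pub language_weight: f64,
}

/// Two tiers matching the `fast` and `deep` profiles in the example config.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelTier {
    Fast,
    Deep,
}

/// Hypothetical router: tasks scoring at or above `deep_threshold` go to Deep.
pub struct ModelRouter {
    pub deep_threshold: f64,
}

impl ModelRouter {
    /// Illustrative score: thousands of tokens plus two points per file,
    /// scaled by the language weight. The constants are placeholders.
    pub fn score(&self, task: &TaskFeatures) -> f64 {
        let tokens_k = task.token_count as f64 / 1000.0;
        let files = task.file_count as f64;
        (tokens_k + 2.0 * files) * task.language_weight
    }

    /// Pick a tier by comparing the score against the threshold.
    pub fn select(&self, task: &TaskFeatures) -> ModelTier {
        if self.score(task) >= self.deep_threshold {
            ModelTier::Deep
        } else {
            ModelTier::Fast
        }
    }
}
```

With `deep_threshold` set to 10.0, a single-file lint pass of 500 tokens scores 2.5 and stays on the fast tier, while a six-file, 12,000-token refactor with a language weight of 1.5 scores 36.0 and is sent to the deep tier.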

Example Config

[models.fast]
endpoint = "http://localhost:8001/v1"
model = "Qwen/Qwen3.5-1.5B"
capabilities = ["syntax", "format", "lint"]

[models.deep]
endpoint = "http://localhost:8002/v1"
model = "Qwen/Qwen3.5-32B"
capabilities = ["architecture", "refactor", "security"]
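One way the capability tags in this config could be consumed is a simple lookup that returns the first profile advertising a requested capability. The sketch below builds the two profiles by hand to stay dependency-free; `ModelProfile`, `profile_for`, and `example_profiles` are hypothetical names, and the project's actual `Config` types may look quite different.

```rust
/// Hypothetical in-memory form of one `[models.<name>]` table.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelProfile {
    pub name: String,
    pub endpoint: String,
    pub model: String,
    pub capabilities: Vec<String>,
}

/// Small constructor helper so the example data stays readable.
fn profile(name: &str, endpoint: &str, model: &str, caps: &[&str]) -> ModelProfile {
    ModelProfile {
        name: name.to_string(),
        endpoint: endpoint.to_string(),
        model: model.to_string(),
        capabilities: caps.iter().map(|c| c.to_string()).collect(),
    }
}

/// The two profiles from the example config, written out by hand.
/// A real implementation would deserialize them from the config file.
pub fn example_profiles() -> Vec<ModelProfile> {
    vec![
        profile(
            "fast",
            "http://localhost:8001/v1",
            "Qwen/Qwen3.5-1.5B",
            &["syntax", "format", "lint"],
        ),
        profile(
            "deep",
            "http://localhost:8002/v1",
            "Qwen/Qwen3.5-32B",
            &["architecture", "refactor", "security"],
        ),
    ]
}

/// Return the first profile whose capability list contains `capability`.
pub fn profile_for<'a>(
    profiles: &'a [ModelProfile],
    capability: &str,
) -> Option<&'a ModelProfile> {
    profiles
        .iter()
        .find(|p| p.capabilities.iter().any(|c| c.as_str() == capability))
}
```

Here `profile_for(&profiles, "lint")` resolves to the `fast` profile and `"security"` to `deep`. A capability that no profile lists returns `None`, which leaves the fallback policy (default model, error, or queueing) as an open design choice.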

Relevant Code

  • src/tools/swarm_tool.rs — swarm dispatch
  • src/orchestration/ — multi-agent orchestration
  • src/config/mod.rs — model profiles already exist (resolve_model())

Labels: enhancement (New feature or request)
