Agentic Infra

AI-Powered Kubernetes Infrastructure Management Platform

Built on Agno · Pluggable MCP + Skills Architecture · K3s Native

Overview

Agentic Infra turns natural language into infrastructure operations. It uses LLM-powered agents to handle Kubernetes cluster deployment (Day 0/1) and intelligent operations (Day 2) — from bootstrapping a K3s cluster via SSH to diagnosing pod crashes through conversational AI.

Key Features

K3s Auto-Deploy — Provision lightweight Kubernetes clusters across bare-metal or VMs via SSH, no manual intervention
Pluggable MCP Servers — 6 MCP servers (87+ tools) dynamically discovered and loaded from YAML config
Pluggable Skills — Domain knowledge hot-loaded from filesystem, auto-matched to user queries
Multi-Agent Teams — 5 specialized agents organized into 2 teams with routing and coordination
GitOps Native — ArgoCD-driven deployments with human-approval gates
Full Observability — VictoriaMetrics + kube-prometheus-stack + VictoriaLogs

Architecture

┌─────────────────────────────────────────────────────────┐
│                       Web Portal                        │
│                Next.js + AG-UI Protocol                 │
└────────────────────────────┬────────────────────────────┘
                             │
┌────────────────────────────▼────────────────────────────┐
│                     AgentOS Runtime                     │
│                Agno Framework + FastAPI                 │
│                                                         │
│ ┌─────────────┐  ┌───────────────┐  ┌─────────────────┐ │
│ │  OpsTeam    │  │   InfraTeam   │  │    Workflows    │ │
│ │   (route)   │  │ (coordinate)  │  │                 │ │
│ │ ┌─────────┐ │  │ ┌───────────┐ │  │ Bootstrap       │ │
│ │ │Monitor  │ │  │ │InfraDeploy│ │  │ ComponentDeploy │ │
│ │ │Logging  │ │  │ │Middleware │ │  │                 │ │
│ │ │Maintain │ │  │ │Monitor    │ │  │                 │ │
│ │ └─────────┘ │  │ └───────────┘ │  └─────────────────┘ │
│ └──────┬──────┘  └───────┬───────┘                      │
│        │                 │           ┌───────────────┐  │
│        │     Skills      │           │ PromptBuilder │  │
│        │ ┌────────────┐  │           └───────────────┘  │
│        │ │ k3s-ops    │  │                              │
│        │ │ k8s-diag   │  │                              │
│        │ │ monitor    │  │                              │
│        │ │ logging    │  │                              │
│        │ │ infra      │  │                              │
│        │ │ middleware │  │                              │
│        │ └────────────┘  │                              │
└────────┼─────────────────┼──────────────────────────────┘
         │                 │
┌────────▼─────────────────▼──────────────────────────────┐
│                       MCP Servers                       │
│                                                         │
│ ┌───────────────┐  ┌───────────────┐  ┌───────────────┐ │
│ │ linux-mcp     │  │ k8s-mcp       │  │ argocd-mcp    │ │
│ │ SSH ops       │  │ apiserver     │  │ GitOps        │ │
│ └───────────────┘  └───────────────┘  └───────────────┘ │
│ ┌───────────────┐  ┌───────────────┐  ┌───────────────┐ │
│ │ deploy-mcp    │  │ gitops-mcp    │  │ aianalysis-mcp│ │
│ │ Helm/kubectl  │  │ Git/manifests │  │ Prometheus    │ │
│ └───────────────┘  └───────────────┘  └───────────────┘ │
└─────────────────────────────────────────────────────────┘

Project Structure

agentic-infra/
├── agent-os/                  # AgentOS core service
│   ├── agents/                # 5 specialized agents
│   ├── teams/                 # 2 team orchestrators (route + coordinate)
│   ├── workflows/             # Cluster bootstrap + component deploy
│   ├── tools/                 # MCPRegistry, AgentFactory, PromptBuilder
│   ├── skills/                # 6 pluggable skill packs
│   │   ├── k3s-ops/           #   K3s deploy & maintenance
│   │   ├── k8s-diagnostic/    #   K8s troubleshooting
│   │   ├── monitoring-ops/    #   VictoriaMetrics operations
│   │   ├── logging-ops/       #   VictoriaLogs operations
│   │   ├── infra-deploy-guide/#   Infrastructure deployment guide
│   │   └── middleware-ops/    #   Middleware operations
│   ├── prompts/               # Layered prompt builder
│   └── knowledge/             # Runbooks
├── mcp-servers/               # 6 MCP servers (87+ tools)
│   ├── linux-mcp/             #   SSH remote ops (22 tools)
│   ├── k8s-mcp/               #   K8s apiserver direct (19 tools)
│   ├── argocd-mcp/            #   ArgoCD management (11 tools)
│   ├── deploy-mcp/            #   Helm + kubectl (21 tools)
│   ├── gitops-mcp/            #   Git + manifest gen (14 tools)
│   └── aianalysis-mcp/        #   Observability queries
├── charts/                    # Helm values templates
├── portal/                    # Next.js web UI
└── deploy/                    # Docker Compose + K8s manifests

Quick Start

Prerequisites

Python 3.12+
Docker & Docker Compose
At least one LLM API key (Anthropic or DeepSeek)

Local Development

# Clone
git clone git@github.com:clcc2019/agentic-infra.git
cd agentic-infra

# Configure
cp .env.example .env
# Edit .env — set at least one LLM API key

# Launch full stack
cd deploy && docker-compose up -d

# Access
# Portal:      http://localhost:3000
# AgentOS API: http://localhost:8000/docs

AgentOS Only (Dev Mode)

uv venv --python 3.12 && source .venv/bin/activate
uv pip install -e .
fastapi dev agent-os/app.py

Agent Capabilities

Agent	Model	Responsibility	MCP Servers
MonitorAgent	DeepSeek	Metrics queries, alert rules, dashboards	aianalysis-mcp
LoggingAgent	DeepSeek	Log search & analysis, collection config	aianalysis-mcp
MaintenanceAgent	Claude	Pod/node diagnostics, cluster health checks	aianalysis-mcp + k8s-mcp + linux-mcp
InfraDeployAgent	Claude	K3s deploy, infra bootstrap (ArgoCD, VictoriaMetrics, VictoriaLogs)	linux-mcp + k8s-mcp + argocd-mcp + deploy-mcp + gitops-mcp
MiddlewareAgent	Claude	Middleware lifecycle (Redis/MySQL/Kafka)	argocd-mcp + deploy-mcp + gitops-mcp
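
The table above can be read as a mapping from each agent to the MCP servers it may call. Below is a minimal Python sketch of that mapping with a reverse lookup. `AGENT_SERVERS` and `agents_for_server` are illustrative names, not identifiers from the project, and the real wiring may be expressed differently.

```python
# Agent -> MCP server mapping, transcribed from the Agent Capabilities table.
# AGENT_SERVERS and agents_for_server are hypothetical names for illustration.
AGENT_SERVERS: dict[str, set[str]] = {
    "MonitorAgent": {"aianalysis-mcp"},
    "LoggingAgent": {"aianalysis-mcp"},
    "MaintenanceAgent": {"aianalysis-mcp", "k8s-mcp", "linux-mcp"},
    "InfraDeployAgent": {
        "linux-mcp", "k8s-mcp", "argocd-mcp", "deploy-mcp", "gitops-mcp",
    },
    "MiddlewareAgent": {"argocd-mcp", "deploy-mcp", "gitops-mcp"},
}


def agents_for_server(server: str) -> list[str]:
    """Return the agents that list the given MCP server, sorted by name."""
    return sorted(
        agent for agent, servers in AGENT_SERVERS.items() if server in servers
    )


if __name__ == "__main__":
    # Which agents would be affected if linux-mcp were unavailable?
    print(agents_for_server("linux-mcp"))
```

A reverse lookup like this is handy when judging the impact of taking one MCP server offline.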

MCP Servers

Server	Port	Tools	Description
linux-mcp	8085	22	SSH remote execution, system info, service management, network checks
k8s-mcp	8086	19	K8s apiserver direct — Pod/Node/Namespace/Resource CRUD, exec, logs
argocd-mcp	8082	11	ArgoCD app lifecycle, sync, rollback, repo management
deploy-mcp	8083	21	Helm install/upgrade/rollback, kubectl resource ops
gitops-mcp	8084	14	Git clone/commit/push, Helm values update, manifest gen
aianalysis-mcp	8081	—	Prometheus, VictoriaLogs, alerting, tracing
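
As a quick cross-check of the "87+ tools" figure, the sketch below sums the tool counts listed in the table and builds a local URL per server. The `/mcp` path and the `localhost` host are assumptions modeled on the config example in this README, and `MCP_SERVERS`, `known_tool_total`, and `local_url` are illustrative names.

```python
from typing import Optional

# Port and tool count per MCP server, transcribed from the table above.
# None marks the count the table leaves blank (aianalysis-mcp).
MCP_SERVERS: dict[str, tuple[int, Optional[int]]] = {
    "linux-mcp": (8085, 22),
    "k8s-mcp": (8086, 19),
    "argocd-mcp": (8082, 11),
    "deploy-mcp": (8083, 21),
    "gitops-mcp": (8084, 14),
    "aianalysis-mcp": (8081, None),
}


def known_tool_total(servers: dict[str, tuple[int, Optional[int]]]) -> int:
    """Sum the tool counts that the table actually states."""
    return sum(count for _, count in servers.values() if count is not None)


def local_url(name: str) -> str:
    """Build a URL for a server; host and '/mcp' path are assumptions."""
    port, _ = MCP_SERVERS[name]
    return f"http://localhost:{port}/mcp"


if __name__ == "__main__":
    print(known_tool_total(MCP_SERVERS))
    print(local_url("k8s-mcp"))
```

The five servers with a stated count add up to 87, which matches the "87+" wording once the unlisted aianalysis-mcp tools are included.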

Pluggable MCP Config

All MCP servers are declared in mcp_servers.yaml — add new servers without touching agent code:

servers:
  - name: my-custom-mcp
    url: http://localhost:9090/mcp
    description: My custom operations server
    enabled: true
    tags: [custom, ops]
    trigger_keywords: [custom, special]
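
To illustrate how `enabled` and `trigger_keywords` could drive server selection, here is a minimal Python sketch. It works on entries shaped like the YAML above after parsing. The whole-word matching rule, the `select_servers` helper, and the second `disabled-mcp` entry are assumptions for illustration; the project's MCPRegistry may match differently.

```python
# Entries shaped like mcp_servers.yaml after parsing with a YAML library.
# "disabled-mcp" is a hypothetical second entry, added only for the demo.
SERVERS = [
    {
        "name": "my-custom-mcp",
        "url": "http://localhost:9090/mcp",
        "enabled": True,
        "tags": ["custom", "ops"],
        "trigger_keywords": ["custom", "special"],
    },
    {
        "name": "disabled-mcp",
        "url": "http://localhost:9091/mcp",
        "enabled": False,
        "tags": ["ops"],
        "trigger_keywords": ["special"],
    },
]


def select_servers(query: str, servers: list[dict]) -> list[str]:
    """Return names of enabled servers with a trigger keyword in the query.

    Matching is case-insensitive on whole whitespace-separated words
    (an assumed rule, not taken from the project).
    """
    words = set(query.lower().split())
    selected = []
    for server in servers:
        keywords = {k.lower() for k in server.get("trigger_keywords", [])}
        if server.get("enabled", False) and words & keywords:
            selected.append(server["name"])
    return selected


if __name__ == "__main__":
    print(select_servers("run the special job", SERVERS))
```

Disabled entries are skipped even when a keyword matches, which is what lets a server stay declared in the file while switched off.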

Runtime management via Admin API:

# Register new MCP at runtime
curl -X POST http://localhost:8000/api/admin/mcp/register \
  -H 'Content-Type: application/json' \
  -d '{"name":"my-mcp","url":"http://localhost:9090/mcp"}'

# Check status
curl http://localhost:8000/api/admin/mcp/status
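
The same registration call can be prepared from Python with only the standard library. This sketch builds the request object without sending it; the endpoint and payload fields come from the curl example above, while `build_register_request` is an illustrative helper name. The response format is not documented here, so the sketch makes no assumption about it.

```python
import json
import urllib.request

# Base path taken from the curl examples above.
ADMIN_BASE = "http://localhost:8000/api/admin/mcp"


def build_register_request(name: str, url: str) -> urllib.request.Request:
    """Build the POST request that registers an MCP server at runtime."""
    body = json.dumps({"name": name, "url": url}).encode("utf-8")
    return urllib.request.Request(
        f"{ADMIN_BASE}/register",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_register_request("my-mcp", "http://localhost:9090/mcp")
    # Passing req to urllib.request.urlopen() would send it; that needs a
    # running AgentOS instance, so this demo only prints what would be sent.
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```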

Skills System

Skills are self-contained knowledge packs (Markdown + scripts + references) that are auto-discovered from the filesystem:

skills/k3s-ops/
├── SKILL.md              # Instructions + trigger keywords
├── scripts/
│   ├── install_k3s_server.sh
│   ├── install_k3s_agent.sh
│   ├── k3s_health_check.sh
│   └── k3s_backup.sh
└── references/
    └── k3s_config_options.md

Progressive loading: Discovery (keyword match) → Activation (inject into prompt) → Execution (run scripts)
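
The first two stages of progressive loading can be sketched in a few lines of Python. This is an illustration under stated assumptions: it expects each `SKILL.md` to carry a line of the form `keywords: a, b`, which is a guessed format, and the function names are hypothetical. The Execution stage (running the skill's scripts) is left out.

```python
from pathlib import Path


def discover_skills(root: Path) -> dict[str, Path]:
    """Discovery: map skill name -> SKILL.md for each pack under root."""
    return {p.parent.name: p for p in sorted(root.glob("*/SKILL.md"))}


def read_keywords(skill_md: Path) -> set[str]:
    """Parse trigger keywords from a 'keywords: a, b' line (assumed format)."""
    for line in skill_md.read_text(encoding="utf-8").splitlines():
        if line.lower().startswith("keywords:"):
            raw = line.split(":", 1)[1]
            return {k.strip().lower() for k in raw.split(",") if k.strip()}
    return set()


def activate(query: str, root: Path) -> str:
    """Activation: return prompt text for skills whose keywords occur in query.

    Matching is a plain case-insensitive substring test, so a keyword such
    as 'logs' would also match 'catalogs'; a real matcher may be stricter.
    """
    q = query.lower()
    parts = []
    for name, skill_md in discover_skills(root).items():
        if any(keyword in q for keyword in read_keywords(skill_md)):
            text = skill_md.read_text(encoding="utf-8")
            parts.append(f"## Skill: {name}\n{text}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Assumes the current directory contains a skills/ folder.
    print(activate("deploy a k3s cluster", Path("skills")))
```

Only matching skills are read into the prompt, which keeps the context small when many packs are installed.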

Workflows

ClusterBootstrap — Full cluster initialization

Validate Cluster → Deploy ArgoCD → Deploy Monitoring → Deploy Logging → Global Verify
                      ↑                  ↑                  ↑
                 Human Approval     Human Approval     Human Approval
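
The gated sequence above can be sketched as a loop that asks for approval before each deploy step. This is a simplified Python illustration, not the project's Agno workflow. It assumes approval is requested before a gated step runs and that a rejection stops the workflow; `STEPS`, `run_bootstrap`, and both callbacks are hypothetical names.

```python
from typing import Callable

# (step name, requires human approval), following the diagram above.
STEPS: list[tuple[str, bool]] = [
    ("Validate Cluster", False),
    ("Deploy ArgoCD", True),
    ("Deploy Monitoring", True),
    ("Deploy Logging", True),
    ("Global Verify", False),
]


def run_bootstrap(
    approve: Callable[[str], bool],
    execute: Callable[[str], None],
) -> list[str]:
    """Run steps in order and return the names of the steps that ran.

    A gated step runs only if approve(step) is true; the first rejection
    stops the workflow (assumed behaviour).
    """
    completed = []
    for name, gated in STEPS:
        if gated and not approve(name):
            break
        execute(name)
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Approve everything and just print each step as it "runs".
    run_bootstrap(lambda step: True, lambda step: print("running:", step))
```

Passing the approval decision in as a callback keeps the sequencing logic independent of how the approval is collected, for example from the web portal or a CLI prompt.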

K3s Deployment — End-to-end via SSH

User: "Deploy K3s cluster on 3 servers"
  │
  ├─ linux_get_system_info  →  Pre-flight checks
  ├─ linux_upload_script    →  Upload install script
  ├─ linux_execute_command  →  Install K3s server
  ├─ linux_read_file        →  Read node token
  ├─ linux_execute_command  →  Join worker nodes
  └─ nodes_list + pods_list →  Verify cluster
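
For a concrete picture of the call order, the sketch below expands the sequence above into an ordered plan for one server node and any number of workers. The tool names come from the diagram; the single-server topology, the per-host ordering, and the `build_k3s_plan` helper are assumptions for illustration. The agent chooses the actual calls at runtime.

```python
def build_k3s_plan(server: str, workers: list[str]) -> list[tuple[str, str, str]]:
    """Return an ordered list of (tool, target, purpose) tuples.

    Assumes one K3s server node; every other host joins as a worker.
    """
    hosts = [server, *workers]
    plan = [("linux_get_system_info", h, "pre-flight checks") for h in hosts]
    plan += [("linux_upload_script", h, "upload install script") for h in hosts]
    plan.append(("linux_execute_command", server, "install K3s server"))
    plan.append(("linux_read_file", server, "read node token"))
    plan += [("linux_execute_command", w, "join worker node") for w in workers]
    plan.append(("nodes_list", "cluster", "verify nodes"))
    plan.append(("pods_list", "cluster", "verify pods"))
    return plan


if __name__ == "__main__":
    # The IP addresses are placeholders.
    for tool, target, purpose in build_k3s_plan("10.0.0.1", ["10.0.0.2", "10.0.0.3"]):
        print(f"{tool:24} {target:10} {purpose}")
```

The node token is read from the server before any worker joins, because each worker needs that token to register with the cluster.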

Tech Stack

Layer	Technology
Agent Framework	Agno — AgentOS + Teams + Workflows
LLM	Claude (complex reasoning) + DeepSeek (lightweight queries)
Tool Protocol	MCP — Model Context Protocol
Frontend	Next.js + AG-UI Protocol
K8s Distribution	K3s — Lightweight Kubernetes
IaC	Helm Charts + ArgoCD GitOps
Monitoring	VictoriaMetrics + kube-prometheus-stack
Logging	VictoriaLogs

License

Apache License 2.0
