Professional DevOps skills for Clawdbot
Bring enterprise-grade DevOps expertise to your Clawdbot instance with safe, auditable execution.
Built for Clawdbot - The AI assistant that actually helps with DevOps
Be a part of Agentic Ops Builders - to build Agentic DevOps Automation together with fellow builders.
DevOps Execution Engine is a comprehensive skill package for Clawdbot that transforms it into a professional DevOps assistant.
Clawdbot is the only AI assistant that:
- Actually executes commands (not just suggests them like ChatGPT)
- Integrates with your infrastructure (kubectl, AWS CLI, Terraform, etc.)
- Provides human-in-the-loop safety (approve before execution)
- Maintains audit trails (complete accountability)
- Works across platforms (Telegram, Discord, WhatsApp, CLI, web)
This skill package extends Clawdbot with 11 production-ready DevOps skills, giving it deep domain expertise in:
- Kubernetes operations and debugging
- Cloud cost optimization
- Incident response
- Infrastructure as Code
- Container management
- And much more...
Primary Platform: Clawdbot ⭐ (Full integration)
Also Compatible With:
- LangChain - Can be adapted as custom tools
- AutoGPT/BabyAGI - Execution engine can be integrated
- Custom AI Agents - Core modules are platform-agnostic Node.js
Note: Full functionality (approval workflow, audit logging, skill integration) works best with Clawdbot's architecture.
DevOps teams face a dilemma:
- ✅ AI can diagnose issues faster than humans
- ❌ But you can't let AI execute commands blindly in production
- ✅ Manual execution is slow and error-prone
- ❌ But automation without oversight is dangerous
Current options are inadequate:
- Pure automation → Risky, no human oversight
- Manual everything → Slow, defeats the purpose of AI
- ChatGPT → Can't actually execute, just suggests commands
DevOps Execution Engine bridges the gap:
- AI Diagnoses - Analyzes logs, metrics, cluster state
- AI Generates Plan - Creates detailed, reviewable execution plan
- Human Approves - You review and approve (or reject)
- AI Executes Safely - Runs with monitoring, rollback ready
- AI Verifies - Confirms success and logs everything
The result: AI speed + Human judgment = Safe, fast operations
- No auto-execution - Always requires human approval for risky operations
- Risk classification - Every action rated LOW/MEDIUM/HIGH/CRITICAL
- Rollback plans - Every plan includes how to undo
- Audit trail - Complete log of who approved what and when
- Pre/post validation - Checks before and after execution
11 Production-Ready DevOps Skills:
- Kubernetes - Debug, deploy, manage (k8s-debug, k8s-deploy, argocd-gitops)
- Cloud - AWS operations and cost optimization (aws-ops, cost-optimization)
- Infrastructure - Terraform, Docker operations (terraform-workflow, docker-ops)
- Operations - Incident response, log analysis, health checks (incident-response, log-analysis, system-health)
- Development - Git workflows and best practices (git-workflow)
Incident Response:
You: SEV1 - API is down!
AI: [diagnoses → identifies database crash → generates recovery plan]
You: [reviews plan → approves]
AI: [executes → restores service → verifies → logs incident]
Result: 5-minute recovery instead of 30-minute scramble
Cost Optimization:
You: Find AWS cost savings
AI: [analyzes → identifies $3,250/month in waste]
You: [reviews idle resources → approves cleanup]
AI: [terminates safely → verifies → updates inventory]
Result: $39,000/year saved with full audit trail
Safe Deployments:
You: Deploy api v2.5.0 with canary strategy
AI: [generates multi-stage canary plan → monitors each stage]
You: [approves each stage after review]
AI: [deploys → monitors → promotes → completes]
Result: Zero-downtime deployment with human gates
With this skill package installed, your Clawdbot can:
✅ Diagnose production issues in seconds (not hours)
✅ Generate safe execution plans with full rollback procedures
✅ Execute approved changes with monitoring and verification
✅ Respond to incidents with structured playbooks
✅ Optimize cloud costs and find waste automatically
✅ Deploy safely with canary strategies and human gates
✅ Maintain complete audit trails for compliance
DevOps Teams Using Clawdbot who want:
- Professional-grade DevOps expertise built-in
- Safe execution with human oversight
- Domain knowledge for Kubernetes, AWS, Docker, Terraform
- Incident response capabilities
- Cost optimization insights
- Complete audit trails for compliance
Platform Engineers who want to:
- Give their team a 24/7 DevOps assistant
- Standardize operations with tested playbooks
- Reduce mean-time-to-recovery (MTTR)
- Onboard new team members faster
Solo DevOps/SREs who want:
- A second pair of eyes before executing
- Quick diagnosis without searching docs
- Structured incident response
- Cost optimization without manual analysis
You need Clawdbot installed first:
# Install Clawdbot
npm install -g clawdbot
# Start the gateway
clawdbot gateway start
# Verify it's running
clawdbot status
📚 New to Clawdbot? Check out docs.clawd.bot for the installation guide.
Optional tools (depending on what you'll manage):
- kubectl for Kubernetes operations
- aws CLI for AWS operations
- terraform for IaC operations
- docker for container operations
# 1. Clone this skill package
git clone https://github.com/agenticdevops/devops-execution-engine.git
cd devops-execution-engine
# 2. Install into your Clawdbot instance
clawdbot skills:install .
# 3. Verify installation
clawdbot skills:list | grep devops-execution-engine
That's it! Your Clawdbot now has professional DevOps expertise.
Start with read-only operations to build trust:
# Start Clawdbot chat
clawdbot chat
Then try these safe commands:
You: Check cluster health
You: List all pods across namespaces
You: Show recent Kubernetes events
You: Analyze system resource usage
All read-only, zero risk. Get familiar with how it works.
When you're ready to let AI execute (with your approval):
You: I have pods in CrashLoopBackOff, can you fix them?
AI: [diagnoses the issue]
[generates detailed execution plan]
[shows you exactly what will be done]
[waits for your approval]
You: yes
AI: [executes step-by-step with progress updates]
[verifies the fix worked]
[logs everything to audit trail]
You're always in control. Review every plan before approving.
┌─────────────────────────────────────────────────────────────┐
│ 1. You: "Fix the crashloop pods" │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 2. AI Diagnoses (read-only, safe) │
│ - Checks pod status │
│ - Analyzes logs │
│ - Reviews events │
│ - Identifies root cause │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 3. AI Generates Execution Plan │
│ │
│ 📋 PLAN: Fix CrashLoopBackOff │
│ Risk: MEDIUM | Time: ~5min │
│ │
│ Steps: │
│ 1. Increase memory limit 256Mi → 512Mi │
│ 2. Wait for rollout (5min) │
│ 3. Verify all pods running │
│ │
│ Rollback: kubectl rollout undo deployment/api │
│ │
│ Approve? (yes/no) │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 4. YOU REVIEW & APPROVE │
│ - Read the plan │
│ - Understand impact │
│ - Check rollback procedure │
│ - Approve or reject │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 5. AI Executes (only after approval) │
│ ✓ Step 1: Patching deployment... done │
│ ✓ Step 2: Waiting for rollout... done (2m 15s) │
│ ✓ Step 3: Verifying pods... all running │
│ │
│ ✅ Complete! Logged to audit trail │
└─────────────────────────────────────────────────────────────┘
Every action creates an audit entry:
{
  "timestamp": "2026-01-26T13:00:00Z",
  "plan_id": "plan-20260126-001",
  "action": "kubectl patch deployment",
  "risk": "MEDIUM",
  "status": "success",
  "approver": "your-name",
  "duration_seconds": 135
}
Full transparency. Full accountability.
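As an illustration of how an entry like the one above could be assembled, here is a minimal Node.js sketch. The function name `buildAuditEntry` and its argument shape are assumptions made for this example, not the package's actual logger API; only the output field names are taken from the sample entry.

```javascript
// Hypothetical helper: builds an audit entry with the same fields as the
// sample entry. The function name and argument names are assumptions.
function buildAuditEntry({ planId, action, risk, status, approver, startedAt, finishedAt }) {
  const allowedRisks = ["LOW", "MEDIUM", "HIGH", "CRITICAL"];
  if (!allowedRisks.includes(risk)) {
    throw new Error(`Unknown risk level: ${risk}`);
  }
  if (!approver) {
    throw new Error("An audit entry needs an approver");
  }
  return {
    // Drop milliseconds so the value matches the "2026-01-26T13:00:00Z" style.
    timestamp: startedAt.toISOString().replace(/\.\d{3}Z$/, "Z"),
    plan_id: planId,
    action,
    risk,
    status,
    approver,
    // Date subtraction yields milliseconds; convert to whole seconds.
    duration_seconds: Math.round((finishedAt - startedAt) / 1000),
  };
}

const entry = buildAuditEntry({
  planId: "plan-20260126-001",
  action: "kubectl patch deployment",
  risk: "MEDIUM",
  status: "success",
  approver: "your-name",
  startedAt: new Date("2026-01-26T13:00:00Z"),
  finishedAt: new Date("2026-01-26T13:02:15Z"),
});

console.log(JSON.stringify(entry, null, 2));
```

Rejecting entries without an approver or with an unknown risk level is a design choice of this sketch: it keeps incomplete records out of the trail rather than cleaning them up later.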
You: Check cluster health
Clawd: [runs diagnostics]
- 3/3 nodes ready
- 2 pods in CrashLoopBackOff (api-service)
- Disk usage: worker-1 at 85%
You: Fix the crashloop
Clawd: 📋 EXECUTION PLAN: plan-001
Title: Fix CrashLoopBackOff in api-service
Risk: MEDIUM
Time: ~5min
Steps:
1. Increase memory 256Mi → 512Mi
2. Wait for rollout (5min)
3. Verify pods running
Approve? (yes/no)
You: yes
Clawd: ✅ Executing...
[runs steps with progress]
✅ Completed! All pods running.
- No auto-execution - always requires approval
- Risk assessment for every action
- Pre-flight validation
- Rollback plans included
- Complete audit trail
Kubernetes
- k8s-debug, k8s-deploy, argocd-gitops
Cloud
- aws-ops, cost-optimization
Infrastructure
- terraform-workflow, docker-ops
Operations
- incident-response, log-analysis, system-health, git-workflow
Every operation generates a YAML execution plan:
plan:
  title: "What I'm fixing"
  risk: MEDIUM
  estimated_time: 5min
  rollback: ["how to undo"]
  steps:
    - action: kubectl_patch
      command: "exact command"
      risk: MEDIUM
- Incident Response - Structured playbooks for outages
- Kubernetes Management - Debug and fix cluster issues
- Cost Optimization - Find and eliminate waste
- Safe Deployments - Deploy with confidence and rollback
- Infrastructure as Code - Terraform workflows
- Container Operations - Docker debugging and management
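The YAML execution plan format shown earlier (title, risk, estimated_time, rollback, steps) lends itself to a sanity check before a plan is presented for approval. The sketch below works on a plain JavaScript object with the same fields. `validatePlan` and its rules are assumptions for illustration, not the behavior of the package's plan generator.

```javascript
// Hypothetical validator for a plan object that mirrors the YAML plan format.
// Field names come from the example plan; the rules themselves are assumptions.
const PLAN_RISKS = ["LOW", "MEDIUM", "HIGH", "CRITICAL"];

function validatePlan(plan) {
  const errors = [];
  const planRiskValid = PLAN_RISKS.includes(plan.risk);

  if (!plan.title) errors.push("plan: missing title");
  if (!planRiskValid) errors.push("plan: invalid risk");
  if (!Array.isArray(plan.rollback) || plan.rollback.length === 0) {
    errors.push("plan: missing rollback");
  }
  if (!Array.isArray(plan.steps) || plan.steps.length === 0) {
    errors.push("plan: no steps");
    return errors;
  }

  plan.steps.forEach((step, index) => {
    const label = `step ${index + 1}`;
    if (!step.command) errors.push(`${label}: missing command`);
    if (!PLAN_RISKS.includes(step.risk)) {
      errors.push(`${label}: invalid risk`);
    } else if (planRiskValid && PLAN_RISKS.indexOf(step.risk) > PLAN_RISKS.indexOf(plan.risk)) {
      // Assumed rule: a plan is rated at least as risky as its riskiest step.
      errors.push(`${label}: riskier than the plan rating`);
    }
  });
  return errors;
}

const examplePlan = {
  title: "What I'm fixing",
  risk: "MEDIUM",
  estimated_time: "5min",
  rollback: ["how to undo"],
  steps: [{ action: "kubectl_patch", command: "exact command", risk: "MEDIUM" }],
};

const brokenPlan = {
  ...examplePlan,
  rollback: [],
  steps: [{ action: "kubectl_delete", risk: "HIGH" }],
};

console.log(validatePlan(examplePlan));
console.log(validatePlan(brokenPlan));
```

Returning a list of problems instead of throwing on the first one lets a reviewer see everything wrong with a plan in one pass.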
- 📖 SKILL.md - Complete skill documentation
- 🚀 docs/INSTALLATION.md - Installation guide
- 📚 docs/SKILLS.md - Skills reference
- 🔒 docs/SAFETY.md - Safety model
- 💡 docs/EXAMPLES.md - Usage examples
- 🔧 docs/API.md - API reference
"Debug pods in production"
"Why is api-service crashing?"
"Check node resource usage"
"We have a SEV1 - API down"
"High error rates in payment service"
"Check recent deployments"
"Analyze AWS costs"
"Find idle resources"
"Suggest optimizations"
"Deploy api v2.1.0 to prod"
"Rollback last deployment"
"Check ArgoCD status"
devops-execution-engine/
├── SKILL.md # Main documentation
├── core/ # Execution engine
│ ├── plan-generator.js
│ ├── executor.js
│ ├── approval.js
│ └── logger.js
├── templates/ # Plan templates
├── skills/ # 11 DevOps skills
├── examples/ # Example plans
└── docs/ # Documentation
- 🟢 LOW - Read-only, no impact
- 🟡 MEDIUM - Resource changes, reversible
- 🔴 HIGH - Production changes, potential downtime
- ⛔ CRITICAL - Data/security operations
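One way to picture how these four levels could drive the approval gate is the sketch below. The level names and descriptions are taken from the list above; the helper names and both rules are assumptions for illustration: a plan takes the highest risk among its steps, and only read-only LOW actions (such as the diagnosis phase) run without a prompt.

```javascript
// Risk levels in ascending order of severity, as listed in this README.
const RISK_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"];

const RISK_DESCRIPTIONS = {
  LOW: "Read-only, no impact",
  MEDIUM: "Resource changes, reversible",
  HIGH: "Production changes, potential downtime",
  CRITICAL: "Data/security operations",
};

function assertKnownRisk(level) {
  if (!RISK_ORDER.includes(level)) {
    throw new Error(`Unknown risk level: ${level}`);
  }
}

// Assumed rule: a plan is as risky as its riskiest step.
function highestRisk(levels) {
  return levels.reduce((worst, level) => {
    assertKnownRisk(level);
    return RISK_ORDER.indexOf(level) > RISK_ORDER.indexOf(worst) ? level : worst;
  }, "LOW");
}

// Assumed rule: anything beyond read-only needs a human decision.
function requiresApproval(level) {
  assertKnownRisk(level);
  return level !== "LOW";
}

const planRisk = highestRisk(["LOW", "MEDIUM", "LOW"]);
console.log(planRisk, "-", RISK_DESCRIPTIONS[planRisk]);
console.log("approval needed:", requiresApproval(planRisk));
```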
- Generate plan
- Present with risk assessment
- Wait for approval
- Execute with monitoring
- Validate results
- Log to audit trail
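The six steps above can be sketched as a single async function. This is a minimal illustration under stated assumptions: `runWorkflow` and the five injected callbacks are hypothetical names, not the real interface of the core modules, and the stubs in the usage example stand in for a real plan generator, human prompt, executor, and logger.

```javascript
// Hypothetical orchestration of the six workflow steps. All function and
// field names are illustrative; they are not the package's actual API.
async function runWorkflow({ generatePlan, askApproval, executeStep, validate, log }) {
  const plan = await generatePlan();            // 1. Generate plan
  const approved = await askApproval(plan);     // 2-3. Present with risk, wait for approval
  if (!approved) {
    // Rejections are logged too, so the trail shows every decision.
    await log({ plan_id: plan.id, status: "rejected" });
    return "rejected";
  }

  let status = "success";
  try {
    for (const step of plan.steps) {
      await executeStep(step);                  // 4. Execute, one step at a time
    }
    if (!(await validate(plan))) {              // 5. Validate results
      status = "validation_failed";
    }
  } catch (err) {
    status = "failed";                          // a failed step stops the run
  }

  await log({ plan_id: plan.id, status });      // 6. Log to audit trail
  return status;
}

// Usage with stubbed dependencies.
const auditLog = [];
const executed = [];
const demo = runWorkflow({
  generatePlan: async () => ({
    id: "plan-001",
    risk: "MEDIUM",
    steps: ["patch deployment", "wait for rollout", "verify pods"],
  }),
  askApproval: async () => true, // stub; a real implementation would prompt a human
  executeStep: async (step) => { executed.push(step); },
  validate: async () => true,
  log: async (logEntry) => { auditLog.push(logEntry); },
});

demo.then((status) => console.log(status, auditLog));
```

Passing the dependencies in as callbacks keeps the ordering logic testable without touching a cluster, and nothing in the function can reach `executeStep` before `askApproval` has returned true.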
We welcome contributions!
- 🐛 Bug reports - Open an issue
- 💡 Feature requests - Start a discussion
- 🔧 Pull requests - See CONTRIBUTING.md
- 📚 Documentation - Improvements welcome
- 🎓 Skills - Add new DevOps skills
- Clawdbot v1.0.0+
- kubectl (for Kubernetes operations)
- aws CLI (for AWS operations, optional)
- terraform (for IaC operations, optional)
- docker (for container operations, optional)
Apache 2.0 - See LICENSE
- GitHub Issues - Bug reports and features
- Discussions - Questions and community
- Discord - https://discord.com/invite/clawd
- Docs - https://docs.clawd.bot
DevOps teams love AI assistance but fear automation. This skill bridges that gap:
- AI does the diagnosis and planning
- Human reviews and approves
- AI executes safely with monitoring
- Everything is logged and reversible
The best of both worlds: AI speed + human oversight.
Clawdbot is different from other AI assistants:
| Feature | ChatGPT | GitHub Copilot | Clawdbot + DevOps Skills |
|---|---|---|---|
| Suggests commands | ✅ | ❌ | ✅ |
| Actually executes | ❌ | ❌ | ✅ |
| Domain expertise | ❌ | Code only | ✅ DevOps |
| Approval workflow | ❌ | ❌ | ✅ |
| Audit trail | ❌ | ❌ | ✅ |
| Rollback procedures | ❌ | ❌ | ✅ |
| Multi-platform | Web only | IDE only | ✅ Everywhere |
Clawdbot + DevOps Skills = The only AI that can safely manage your infrastructure
- 🌐 Website: clawd.bot
- 📖 Docs: docs.clawd.bot
- 💬 Discord: discord.com/invite/clawd
- 🐙 GitHub: github.com/clawdbot/clawdbot
Help make this the best DevOps skill package for Clawdbot! See CONTRIBUTING.md
Built with ❤️ by the Clawdbot community