Hefestus is a Go API that uses local Large Language Models (LLMs) via Ollama to analyze and resolve development errors across multiple domains, such as Kubernetes, GitHub Actions, and ArgoCD.
Hefestus was developed as a study project to explore:
- How to use Golang to create a development error troubleshooting API.
- Ideas for integrating error detection and solution automation in CI/CD pipelines and other tools (e.g., Teams, Slack, GitHub).
You can configure Hefestus to receive errors from endpoints or pipelines and get solutions directly in the console or other integrated systems.
I particularly enjoy the topic of observability and wanted to build something that could maximize the effect of open-source tools like Zabbix and Rundeck.
So I built Hefestus to be one part of a solution that monitors, detects, resolves, and communicates problems to teams using only open-source tools, with AI agents in the flow. Hefestus sits in the middle: it interprets the error and forwards the response to the next agent, and it can invoke Rundeck scripts based on its understanding of the error, so the first response to a problem is more targeted.
For pipeline integration, the idea is similar: capture the error and pass it through the API endpoint, ultimately obtaining a solution suggestion for the end user of the pipeline.
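As a rough illustration of that pipeline step, here is a minimal Go sketch that forwards a captured error to the API. The endpoint path and the JSON field names come from the request and response examples in this README; the helper names (`newRequest`, `submit`), the 30-second timeout, and the assumption that success is signaled with HTTP 200 are illustrative choices, not part of Hefestus itself.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// ErrorReport mirrors the request body used in this README's curl example.
type ErrorReport struct {
	ErrorDetails string `json:"error_details"`
	Context      string `json:"context"`
}

// Resolution mirrors the sample response shown in this README.
type Resolution struct {
	Error struct {
		Causa   string `json:"causa"`
		Solucao string `json:"solucao"`
	} `json:"error"`
	Message string `json:"message"`
}

// newRequest builds the POST for /api/errors/<domain> (hypothetical helper).
func newRequest(baseURL, domain string, rep ErrorReport) (*http.Request, error) {
	body, err := json.Marshal(rep)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/errors/"+domain, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// submit sends the request and decodes the suggestion (hypothetical helper).
// Assumption: the API answers a successful analysis with HTTP 200.
func submit(client *http.Client, req *http.Request) (Resolution, error) {
	var res Resolution
	resp, err := client.Do(req)
	if err != nil {
		return res, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return res, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&res)
	return res, err
}

func main() {
	req, err := newRequest("http://localhost:8080", "kubernetes", ErrorReport{
		ErrorDetails: "0/3 nodes are available: insufficient memory",
		Context:      "Deploying new pod in production cluster",
	})
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	res, err := submit(&http.Client{Timeout: 30 * time.Second}, req)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("cause:", res.Error.Causa)
	fmt.Println("suggestion:", res.Error.Solucao)
}
```

In a pipeline, the printed suggestion could be attached to the failed job's log or posted to a chat channel.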
The idea is to use the API as an intermediary between the log content and the self-healing automation.
Something like this:
flowchart LR
Zabbix(Zabbix) --> |POST /api/errors/domain| Hefestus(Hefestus)
Hefestus --> |Process error| LLM[LLM Processing]
LLM --> |Pattern Match| Decision{Known?}
Decision --> |Yes| Decision2{Specific pattern?}
Decision2 --> |Run self-healing| Rundeck(Rundeck)
Rundeck --> |Run| Healing[Self-Healing Script]
Healing --> |Status| Response[Return result]
Decision2 --> |Send message| SlackTeams[Slack & Teams]
SlackTeams --> |Notification| Response2[Return notification]
Decision --> |No| Response3[Message without error context]
| Feature | Description |
|---|---|
| Multi-Domain Architecture | Dynamic processing with specialized prompts for each domain (e.g., Kubernetes, GitHub). |
| LLM Integration | Uses Ollama to process open-source models locally with optimized prompts. |
| Error Dictionaries | Database with error patterns by domain, increasing solution accuracy. |
| Swagger UI | Interactive documentation for testing API endpoints. |
| Parameter Control | Fine-tuning by domain: temperature, tokens, etc. |
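To make the per-domain setup in the table more concrete, here is a minimal Go sketch of how such a configuration might be modeled and parsed. The JSON field names match the `domains.json` example in this README; the type names and the `parseConfig` helper are assumptions for illustration, not the project's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Parameters holds the per-domain tuning values.
type Parameters struct {
	Temperature float64 `json:"temperature"`
	MaxTokens   int     `json:"max_tokens"`
}

// Domain describes one entry of the "domains" array.
type Domain struct {
	Name           string     `json:"name"`
	PromptTemplate string     `json:"prompt_template"`
	Parameters     Parameters `json:"parameters"`
	DictionaryPath string     `json:"dictionary_path"`
}

// Config is the top-level document.
type Config struct {
	Domains []Domain `json:"domains"`
}

// parseConfig decodes the JSON and indexes domains by name (hypothetical helper).
func parseConfig(data []byte) (map[string]Domain, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parse domains config: %w", err)
	}
	byName := make(map[string]Domain, len(cfg.Domains))
	for i, d := range cfg.Domains {
		if d.Name == "" {
			return nil, fmt.Errorf("domain entry %d has no name", i)
		}
		byName[d.Name] = d
	}
	return byName, nil
}

func main() {
	raw := []byte(`{
	  "domains": [{
	    "name": "kubernetes",
	    "prompt_template": "Analyze the Kubernetes error and suggest solutions.",
	    "parameters": {"temperature": 0.7, "max_tokens": 150},
	    "dictionary_path": "data/patterns/kubernetes.json"
	  }]
	}`)
	domains, err := parseConfig(raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	k := domains["kubernetes"]
	fmt.Println(k.Name, k.Parameters.Temperature, k.Parameters.MaxTokens)
}
```

Indexing by name lets a handler resolve the domain taken from the request path with a single map lookup.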
- Go 1.21+: Main programming language.
- Gin: Web framework for API construction.
- Ollama: Runs language models locally, with prompts that keep responses specific to each domain.
- Swagger: Interactive API documentation for easy navigation (currently covering a single endpoint).
- Docker: Containerization for easy local deployment.
hefestus/
├── cmd/server/ # entrypoint
├── internal/
│ ├── models/ # data structures
│ └── services/ # business logic
├── pkg/
│ └── ollama/ # Ollama client
├── config/
│ └── domains.json # domain configs
└── data/patterns/ # error dictionaries
Make sure you have the following dependencies installed:
- Go (1.21+)
- Docker (optional, for running the application in a container)
git clone https://github.com/yourusername/hefestus.git
cd hefestus
cp .env.example .env
go run cmd/server/main.go

curl -X POST http://localhost:8080/api/errors/kubernetes \
-H "Content-Type: application/json" \
-d '{
"error_details": "0/3 nodes are available: insufficient memory",
"context": "Deploying new pod in production cluster"
}'

{
"error": {
"causa": "Nodes have no available memory",
"solucao": "kubectl describe nodes\nkubectl top nodes"
},
"message": "Resolution retrieved successfully"
}

Access the documentation at:
http://localhost:8080/swagger/index.html
docker build -t hefestus:latest .

docker run -d \
-p 8080:8080 \
-e OLLAMA_MODEL=qwen2.5:1.5b \
--name hefestus \
hefestus:latest

Example of domain configuration (domains.json):
{
"domains": [
{
"name": "kubernetes",
"prompt_template": "Analyze the Kubernetes error and suggest solutions.",
"parameters": {
"temperature": 0.7,
"max_tokens": 150
},
"dictionary_path": "data/patterns/kubernetes.json"
}
]
}

Example of an error dictionary (e.g., data/patterns/kubernetes.json):

{
"patterns": {
"insufficient_resources": {
"pattern": "\\b(insufficient|not enough)\\s+(cpu|memory|resources)\\b",
"category": "RESOURCE_LIMITS",
"solutions": [
"Check cluster resource usage.",
"Consider increasing allocated resources."
]
}
}
}

Although this is a study project, contributions are welcome! To contribute:
- Fork the repository.
- Create a branch for your feature or fix: git checkout -b my-feature.
- Commit your changes: git commit -m 'Add my new feature'.
- Push to the branch: git push origin my-feature.
- Open a pull request.
📝 License

This project is licensed under the MIT License.
If you have any questions, reach out via GitHub Issues and I'll be happy to help! ^^;
