OpsLens uses a Vision-Language Model (VLM) to analyze dashboard screenshots and extract insights. The VLM can identify errors, anomalies, key metrics, and other important information in images.
Check the VLM status:

```bash
curl http://localhost:8000/api/v1/test/vlm/status
```

Expected response:

```json
{
  "api_key_configured": true,
  "model": "Qwen/Qwen2-VL-2B-Instruct",
  "api_url": "https://api-inference.huggingface.co",
  "status": "ready"
}
```

Run a quick analysis against the test endpoint:

```bash
curl -X POST http://localhost:8000/api/v1/test/vlm \
  -F "file=@/path/to/your/screenshot.png"
```

This tests the VLM with multiple prompts and returns detailed results.
- Create or open an incident at http://localhost:3000
- Go to the Evidence tab
- Click "Upload Screenshot" (if available) or use the API
- Upload a dashboard screenshot
- Wait a few seconds for processing
- View the VLM analysis in the evidence content
```bash
# First, get an incident ID
INCIDENT_ID=$(curl -s http://localhost:8000/api/v1/incidents | jq -r '.[0].id')

# Upload a screenshot
curl -X POST http://localhost:8000/api/v1/evidence/incident/$INCIDENT_ID/upload-screenshot \
  -F "file=@/path/to/screenshot.png"
```

The screenshot will be:
- Saved to the artifacts directory
- Processed by VLM asynchronously
- Analysis added to evidence content
- Embedding generated for RAG search
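For orientation, the asynchronous path might look roughly like the sketch below. This is a hedged illustration, not OpsLens's actual code: the Celery app setup, task name, and the helper functions (`run_vlm_analysis`, `save_evidence_content`, `index_for_rag`) are all hypothetical.

```python
# Hypothetical sketch of the async screenshot pipeline; all names are assumed.
from celery import Celery

app = Celery("opslens", broker="redis://localhost:6379/0")

def run_vlm_analysis(image_path: str) -> str:
    """Hypothetical helper: send the image to the Hugging Face inference API."""
    ...

def save_evidence_content(evidence_id: str, analysis: str) -> None:
    """Hypothetical helper: store the analysis on the evidence item."""
    ...

def index_for_rag(evidence_id: str, analysis: str) -> None:
    """Hypothetical helper: embed the analysis text for RAG search."""
    ...

@app.task
def analyze_screenshot(evidence_id: str, image_path: str) -> None:
    analysis = run_vlm_analysis(image_path)       # VLM runs here, off the request path
    save_evidence_content(evidence_id, analysis)  # analysis added to evidence content
    index_for_rag(evidence_id, analysis)          # embedding generated for RAG search
```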
```bash
# Get evidence items for the incident
curl http://localhost:8000/api/v1/evidence/incident/$INCIDENT_ID
```

Look for evidence items with `evidence_type: "screenshot"` and check whether the `content` field contains the VLM analysis.
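If you prefer scripting the check, a small Python loop over the same endpoint works. This assumes the endpoint returns a JSON list as shown above; the incident ID placeholder is yours to fill in.

```python
import requests

incident_id = "your-incident-id"  # replace with a real incident ID
items = requests.get(
    f"http://localhost:8000/api/v1/evidence/incident/{incident_id}"
).json()

# Screenshots that have been analyzed carry the VLM output in `content`
for item in items:
    if item.get("evidence_type") == "screenshot":
        print(item["id"], "analyzed" if item.get("content") else "pending")
```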
The VLM is prompted to:
- Describe what it sees in the dashboard
- Identify errors or anomalies
- Extract key metrics and their values
- Summarize important information
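As a rough illustration, such a prompt could look like the string below. This is illustrative only; the actual prompt text lives in the backend and may differ.

```python
# Illustrative only: the real prompt wording used by OpsLens may differ.
DASHBOARD_PROMPT = """You are analyzing a screenshot of a monitoring dashboard.
1. Describe what the dashboard shows.
2. Identify any errors or anomalies.
3. Extract key metrics and their values.
4. Summarize the information most important to an on-call engineer."""
```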
Good test images:
- Grafana dashboards with metrics
- Datadog error graphs
- AWS CloudWatch dashboards
- Kubernetes dashboard screenshots
- Application error screens
- Any monitoring/observability dashboard
- Check API Key:

  ```bash
  curl http://localhost:8000/api/v1/test/vlm/status
  ```

  Ensure `api_key_configured` is `true`.

- Check Model Availability:
  - The model might be loading (the first request can take 30-60 seconds)
  - The Hugging Face free tier has rate limits
  - Check the Hugging Face model page: https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct

- Check Logs:

  ```bash
  docker-compose logs backend | grep -i vlm
  docker-compose logs celery-worker | grep -i vlm
  ```

- Test Directly:

  ```bash
  # Test the endpoint with a simple image
  curl -X POST http://localhost:8000/api/v1/test/vlm \
    -F "file=@test_image.png" | jq
  ```
"Model is loading" error:
- Wait 30-60 seconds and try again
- First request to a model loads it into memory
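A simple client-side workaround is to retry with a pause, as in this minimal sketch. It assumes only that the test endpoint returns a non-200 status while the model is still loading; the exact error body is not assumed.

```python
import time
import requests

def test_vlm_with_retry(path: str, retries: int = 5, wait: int = 15):
    """Retry the test endpoint while the model warms up.

    Sketch only: assumes a non-200 status while the model is loading.
    """
    for _ in range(retries):
        with open(path, "rb") as f:
            resp = requests.post(
                "http://localhost:8000/api/v1/test/vlm",
                files={"file": f},
            )
        if resp.ok:
            return resp.json()
        time.sleep(wait)  # give the model time to load before retrying
    resp.raise_for_status()
```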
"Rate limit exceeded":
- Hugging Face free tier has limits
- Wait a few minutes between requests
- Consider upgrading to paid tier for production
"Invalid image format":
- Ensure image is in supported format (PNG, JPEG, etc.)
- Check file size (should be < 10MB)
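A quick pre-upload check can catch both problems client-side. This sketch assumes Pillow is installed; the 10MB limit is the one noted above.

```python
import os
from PIL import Image  # pip install Pillow

MAX_BYTES = 10 * 1024 * 1024  # the 10MB limit noted above

def validate_screenshot(path: str) -> None:
    """Reject files that the upload endpoint would likely refuse."""
    if os.path.getsize(path) > MAX_BYTES:
        raise ValueError(f"{path} exceeds the 10MB size limit")
    with Image.open(path) as img:
        if img.format not in {"PNG", "JPEG"}:
            raise ValueError(f"unsupported image format: {img.format}")
```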
"No analysis generated":
- Check the Celery worker is running: `docker-compose ps celery-worker`
- Check the worker logs for errors
- Verify the image was saved correctly
- Use the API endpoint:

  ```python
  import requests

  incident_id = "your-incident-id"
  with open("screenshot.png", "rb") as f:
      response = requests.post(
          f"http://localhost:8000/api/v1/evidence/incident/{incident_id}/upload-screenshot",
          files={"file": f},
      )
  ```
- Check processing status:

  ```python
  import time

  evidence_id = response.json()["id"]

  # Poll until the asynchronous VLM analysis lands in `content`
  while True:
      evidence = requests.get(
          f"http://localhost:8000/api/v1/evidence/{evidence_id}"
      ).json()
      if evidence.get("content"):
          print("VLM analysis:", evidence["content"])
          break
      time.sleep(2)
  ```
- Webhook Integration:
  - Set up webhooks in your monitoring tools
  - Send screenshots automatically when alerts fire
  - OpsLens will process them asynchronously
- Manual Upload:
  - Use the UI to upload screenshots during incidents
  - VLM analysis appears automatically
- Automated Capture:
  - Configure monitoring tools to capture screenshots
  - Send them to the OpsLens API automatically
  - Get AI-powered insights without manual analysis
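For the webhook and automated-capture scenarios, the glue code on the monitoring side can be as small as this sketch. Only the upload endpoint comes from this guide; the incident ID and image path are placeholders you would wire up to your alerting tool.

```python
import requests

def forward_screenshot(incident_id: str, image_path: str) -> dict:
    """Send a captured dashboard image to OpsLens for VLM analysis."""
    with open(image_path, "rb") as f:
        resp = requests.post(
            f"http://localhost:8000/api/v1/evidence/incident/{incident_id}"
            "/upload-screenshot",
            files={"file": f},
        )
    resp.raise_for_status()
    return resp.json()  # evidence item; VLM analysis is filled in asynchronously
```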
Default model: `Qwen/Qwen2-VL-2B-Instruct`
To change the model, update `VLM_MODEL` in `backend/app/config.py` or set the `VLM_MODEL` environment variable.
Supported models:
- `Qwen/Qwen2-VL-2B-Instruct` (default, fast, good quality)
- `Qwen/Qwen2-VL-7B-Instruct` (better quality, slower)
- Other Hugging Face vision-language models
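As a concrete example, switching to the 7B model could look like the sketch below. This assumes the setting is read from the environment with a default; the actual structure of `backend/app/config.py` may differ.

```python
import os

# A sketch of how backend/app/config.py might read the model name;
# the surrounding settings structure is assumed, not confirmed.
VLM_MODEL = os.environ.get("VLM_MODEL", "Qwen/Qwen2-VL-2B-Instruct")

# To use the larger model, set the variable before starting the backend,
# e.g. VLM_MODEL=Qwen/Qwen2-VL-7B-Instruct
```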
- Processing time: 5-30 seconds per image (depends on model and image size)
- Concurrent processing: Handled by Celery workers
- Rate limits: Depends on Hugging Face API tier
- Cost: Free tier available, paid tiers for production use
- Test with your own dashboard screenshots
- Integrate with your monitoring tools
- Set up automated screenshot capture
- Fine-tune prompts for your use case
- Consider self-hosting models for better performance