Description
Refactor the SAM server to separate computer vision capabilities into a dedicated CV agent that can reason about image analysis tasks.
Current State
- sam_server.py couples SAM segmentation with visualization
- Detection pipeline: brightness → SAM → Claude verification
- No autonomous CV reasoning
Requirements
1. CV Agent Architecture
- Receives image + intent from orchestrator
- Autonomously selects appropriate CV tools
- Returns structured results with confidence
2. Tool Categories
- Classical CV: thresholding, morphology, edge detection, blob analysis
- SAM Tools: segmentation, mask refinement, point/box prompts
- VLM Tools: Claude Vision queries, object detection prompts
- Measurement: area, intensity, shape metrics
3. Agent Capabilities
- Multi-step analysis (e.g., "find dim embryos" → enhance contrast → detect → filter by brightness)
- Explain reasoning in results
- Suggest alternative approaches if initial method fails
4. Clean SAM Server
- Pure segmentation service (stateless)
- Remove visualization code
- Simple API: image + prompt → masks
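To make requirements 1 and 3 concrete, the multi-step "find dim embryos" example could be sketched as below. This is only an illustration under stated assumptions: the names `CVResult`, `Detection`, and `find_dim_objects`, the thresholds, and the confidence heuristic are all hypothetical and not part of the existing code. Images are plain nested lists so the sketch runs without dependencies.

```python
from dataclasses import dataclass, field
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


@dataclass
class Detection:
    row: int
    col: int
    brightness: float  # brightness after contrast enhancement


@dataclass
class CVResult:
    """Hypothetical structured result: detections, confidence, reasoning."""
    detections: List[Detection]
    confidence: float
    reasoning: List[str] = field(default_factory=list)


def enhance_contrast(image: Image) -> Image:
    """Linearly stretch pixel values so they span [0, 1]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing to stretch
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / (hi - lo) for p in row] for row in image]


def detect(image: Image, threshold: float) -> List[Detection]:
    """Toy detector: every pixel at or above the threshold is a candidate."""
    return [
        Detection(r, c, p)
        for r, row in enumerate(image)
        for c, p in enumerate(row)
        if p >= threshold
    ]


def find_dim_objects(image: Image, detect_threshold: float = 0.3,
                     dim_ceiling: float = 0.6) -> CVResult:
    """Enhance contrast, detect candidates, then keep only the dim ones."""
    reasoning = []
    stretched = enhance_contrast(image)
    reasoning.append("Enhanced contrast so dim objects clear the threshold.")
    candidates = detect(stretched, detect_threshold)
    reasoning.append(f"Detected {len(candidates)} candidate(s).")
    dim = [d for d in candidates if d.brightness <= dim_ceiling]
    reasoning.append(f"Kept {len(dim)} candidate(s) at or below {dim_ceiling}.")
    if not candidates:
        # Requirement 3: suggest an alternative when the first method fails.
        reasoning.append("No candidates found; consider a lower threshold "
                         "or SAM point prompts.")
    # Placeholder heuristic (an assumption): fraction of candidates kept.
    confidence = len(dim) / len(candidates) if candidates else 0.0
    return CVResult(dim, confidence, reasoning)


result = find_dim_objects([[0.0, 0.1, 0.2], [0.05, 0.4, 0.0]])
```

A real agent would replace the toy detector with the classical CV, SAM, or VLM tools listed above; the point is only that each step leaves a trace in `reasoning` and the caller gets a single structured object back.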
Technical Approach
- Create gently/agent/cv_agent.py with tool-using Claude instance
- Move classical CV to gently/cv/ module
- Simplify backend/sam_server.py to pure SAM inference
- CV agent uses SAM server as one of its tools
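One way the "SAM server as one of its tools" relationship might look is a small tool registry inside the agent. Everything in this sketch is an assumption for illustration: `Tool`, `ToolRegistry`, and `sam_segment_stub` are hypothetical names, and the stub stands in for whatever transport the simplified backend/sam_server.py ends up exposing. The Claude call that would choose a tool is deliberately left out; here the tool is selected by name.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Tool:
    name: str
    description: str  # shown to the model so it can pick a tool
    run: Callable[..., Any]


class ToolRegistry:
    """Hypothetical registry the CV agent could use to expose its tools."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def describe(self) -> List[Dict[str, str]]:
        """Tool names and descriptions, e.g. for a model's tool list."""
        return [{"name": t.name, "description": t.description}
                for t in self._tools.values()]

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name].run(**kwargs)


def sam_segment_stub(image: List[List[float]], prompt: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for the stateless SAM service: image + prompt -> masks.

    A real implementation would send the request to the SAM server; this
    stub returns an empty mask list so the sketch runs offline.
    """
    return {"masks": [], "prompt": prompt}


def threshold(image: List[List[float]], level: float) -> List[List[bool]]:
    """Classical CV tool: binary threshold."""
    return [[p >= level for p in row] for row in image]


registry = ToolRegistry()
registry.register(Tool("sam_segment", "Segment with SAM from a point or box prompt.",
                       sam_segment_stub))
registry.register(Tool("threshold", "Binary threshold on pixel intensity.", threshold))

masks = registry.call("sam_segment", image=[[0.0]], prompt={"point": (0, 0)})
binary = registry.call("threshold", image=[[0.2, 0.8]], level=0.5)
```

Keeping SAM behind the same interface as the classical tools means the agent does not need to know whether a tool runs in-process or on the server.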
Key Files
- backend/sam_server.py
- gently/agent/sam_detection.py
- New: gently/agent/cv_agent.py
- New: gently/cv/classical.py, gently/cv/measurement.py