Skip to content

Enhanced AI-Researcher: Improved Docker Support & Error Resilience#63

Open
zhutoutoutousan wants to merge 5 commits intoHKUDS:mainfrom
zhutoutoutousan:main
Open

Enhanced AI-Researcher: Improved Docker Support & Error Resilience#63
zhutoutoutousan wants to merge 5 commits intoHKUDS:mainfrom
zhutoutoutousan:main

Conversation

@zhutoutoutousan
Copy link
Copy Markdown

🚀 Enhanced AI-Researcher: Improved Docker Support & Error Resilience

📋 Overview

This PR addresses critical user experience issues that were causing users to abandon the project due to setup difficulties and network-related failures. The main focus is on making AI-Researcher actually runnable and preventing process halts due to common network/API issues.

🎯 Key Problems Solved

1. Docker Setup Complexity

  • Before: Users struggled with manual Docker setup, environment configuration, and service orchestration
  • After: One-command setup with docker-compose up

2. Network Error Failures

  • Before: Any network issue, API timeout, or service unavailability would halt the entire process
  • After: Graceful error handling with fallback responses and continued execution

3. Agent System Fragility

  • Before: Agent functions would crash on None values, missing context variables, or malformed responses
  • After: Robust error handling with safe defaults and comprehensive try-catch blocks

🔧 Major Improvements

🐳 Docker & Deployment

  • Added docker-compose.yml with pre-configured services for AI-Researcher and Web GUI
  • Added Dockerfile for web application with all dependencies
  • Added requirements-to-pyproject.py script for dependency management
  • Added comprehensive Docker setup guide in README

🛡️ Error Resilience

  • Enhanced all agent functions with proper exception handling
  • Added safe context variable access using .get() with defaults
  • Implemented graceful degradation when services are unavailable
  • Added comprehensive logging for debugging and monitoring

🔄 Process Continuity

  • Network timeout handling - continues execution even if external APIs fail
  • API error recovery - provides fallback responses instead of crashing
  • Context variable safety - prevents KeyError exceptions from halting execution
  • Result validation - handles None values and malformed responses gracefully

📁 Files Modified

Core Infrastructure

  • docker-compose.yml - Complete service orchestration
  • Dockerfile - Web application container
  • pyproject.toml - Updated project configuration
  • requirements-to-pyproject.py - Dependency conversion script

Agent System Enhancements

  • research_agent/inno/agents/inno_agent/*.py - All agent files enhanced with error handling
  • research_agent/inno/core.py - Improved result handling and error recovery
  • research_agent/inno/environment/docker_env.py - Added Docker container detection
  • research_agent/inno/tools/__init__.py - Fixed import paths

Application Logic

  • main_ai_researcher.py - Added comprehensive exception handling
  • research_agent/run_infer_idea.py - Enhanced error resilience
  • research_agent/run_infer_plan.py - Improved error handling
  • web_ai_researcher.py - Better configuration and error management

Configuration

  • research_agent/constant.py - Added platform configuration
  • fast_commit.ps1 - Resolved merge conflicts

🚀 Quick Start (New)

# Clone the repository
git clone https://github.com/HKUDS/AI-Researcher.git
cd AI-Researcher

# Start all services with one command
docker-compose up -d

# Access the web interface
open http://localhost:7860

🧪 Testing

Before This PR

# Users would typically encounter:
# 1. Complex manual Docker setup
# 2. Network timeouts halting execution
# 3. Agent crashes on API failures
# 4. Process termination on missing context variables

After This PR

# Users can now:
# 1. Start with docker-compose up -d
# 2. Continue execution even with network issues
# 3. Get meaningful error messages instead of crashes
# 4. Complete tasks even with partial API failures

🔍 Technical Details

Error Handling Improvements

  • Agent Functions: All agent functions now return Result objects with proper error handling
  • Context Variables: Safe access using .get() methods with sensible defaults
  • Network Calls: Timeout handling and retry logic for external API calls
  • Process Flow: Graceful degradation when components fail

Docker Enhancements

  • Multi-service orchestration with proper networking
  • Volume management for persistent data
  • Environment variable configuration for easy customization
  • Health checks and service dependencies

Configuration Management

  • Environment-based configuration with sensible defaults
  • Docker-aware execution (detects when running in containers)
  • Flexible port and host configuration

📊 Impact

User Experience

  • Reduced setup time from hours to minutes
  • Eliminated common failure points that caused user abandonment
  • Improved error messages for better debugging
  • Graceful degradation instead of complete failure

Developer Experience

  • Simplified deployment with Docker Compose
  • Better error handling for debugging
  • Consistent environment across different systems
  • Comprehensive logging for troubleshooting

🎯 Migration Guide

For Existing Users

  1. Backup your current setup (if any)
  2. Stop existing containers (if running)
  3. Pull the latest changes
  4. Run docker-compose up -d
  5. Access the new web interface

For New Users

  1. Clone the repository
  2. Run docker-compose up -d
  3. Start using AI-Researcher immediately

🔮 Future Improvements

This PR establishes a solid foundation for:

  • Kubernetes deployment support
  • Multi-node scaling capabilities
  • Advanced monitoring and metrics
  • Automated testing in containerized environments

📝 Breaking Changes

None - This PR is fully backward compatible and enhances existing functionality without breaking changes.

🧪 Testing Checklist

  • Docker Compose setup works on multiple platforms
  • Web interface is accessible and functional
  • Agent system handles network errors gracefully
  • Context variables are safely accessed
  • Error messages are informative and actionable
  • Process continues execution after API failures
  • Logging provides useful debugging information

🤝 Contributing

This PR demonstrates our commitment to:

  • User experience - Making AI-Researcher actually usable
  • Error resilience - Building robust systems that don't fail silently
  • Developer experience - Simplifying setup and deployment
  • Community engagement - Addressing real user pain points

This PR transforms AI-Researcher from a "works on my machine" project into a production-ready, user-friendly tool that anyone can run successfully.

@zhutoutoutousan
Copy link
Copy Markdown
Author

66443f8030ad7edb1808343386f80d32

@skylooop
Copy link
Copy Markdown

skylooop commented Sep 22, 2025

Hi! Are you sure that everything is working fine? For example from repo Im getting 'Processing completed, token usage: completion=0, prompt=0, total=0'. Then, after taking gnn example, im getting error:
for input_text, output_text in state_list: ValueError: not enough values to unpack (expected 2, got 1)

On my end it is still irreproducible. I don't get how is this spotlight paper, when you cant basically run it without pain & source code major changes

@KhuongDuy25
Copy link
Copy Markdown

can you give me a way to contact you?

@KhuongDuy25
Copy link
Copy Markdown

KhuongDuy25 commented Nov 10, 2025

lần đầu chạy nó sẽ báo như này ư?
nó cứ báo lỗi "Error in open_local_terminal_output: Not a regular file"
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants