@amehrjou
Contributor

This PR introduces a reasoning trace tracking and reporting system for Biomni that captures and visualizes agent reasoning processes in detail. It improves debugging, monitoring, and reporting for Biomni's reasoning workflows while maintaining full backward compatibility.

New Features

Enhanced Agent with Trace Tracking

  • A1WithTrace: Extends the A1 agent with integrated reasoning trace tracking while remaining compatible with its existing functionality
  • Real-time trace capture: Records reasoning steps, code executions, and tool interactions
  • Interactive HTML reports: Generates collapsible reports with full styling
  • Performance metrics: Tracks execution time, step counts, and code execution patterns
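
As a rough illustration of the capture and metrics described above, here is a minimal sketch of a step recorder. The names `TraceStep`, `TraceRecorder`, `record`, and `metrics` are hypothetical and chosen for this example; they are not taken from the PR's code, which may be structured differently.

```python
import time
from dataclasses import dataclass, field


@dataclass
class TraceStep:
    kind: str          # e.g. "reasoning", "code", or "tool"
    content: str       # text of the step (thought, source code, tool call)
    duration_s: float  # wall-clock time spent on the step


@dataclass
class TraceRecorder:
    """Hypothetical recorder for illustration; not the PR's actual class."""

    steps: list = field(default_factory=list)

    def record(self, kind, content, func=None):
        """Run func (if given), time it, and append a step to the trace."""
        start = time.perf_counter()
        result = func() if func is not None else None
        elapsed = time.perf_counter() - start
        self.steps.append(TraceStep(kind, content, elapsed))
        return result

    def metrics(self):
        """Summarize step counts per kind and total elapsed time."""
        counts = {}
        for step in self.steps:
            counts[step.kind] = counts.get(step.kind, 0) + 1
        return {
            "total_steps": len(self.steps),
            "steps_by_kind": counts,
            "total_time_s": sum(s.duration_s for s in self.steps),
        }


recorder = TraceRecorder()
recorder.record("reasoning", "Plan: fit a one-compartment PK model")
value = recorder.record("code", "sum(range(10))", lambda: sum(range(10)))
summary = recorder.metrics()
```

Keeping the timing inside `record` means every step type is measured the same way, which is one simple way to get comparable execution-time metrics.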

Interactive Agent Capability & Features

  • Command-line interactive mode: Real-time query processing from the terminal
  • Continuous session support: Run multiple queries in one session with stored reports
  • Live trace generation: Immediate HTML report creation for each query
  • Batch and interactive modes: Selected with a command-line flag
    • Interactive mode: python tutorials/reasoning_trace_demo.py --interactive
    • Non-interactive mode: python tutorials/reasoning_trace_demo.py (default)
  • Session management: Continuous interaction with quit/exit commands
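
The interactive session described above could be wired up roughly as follows. This is a sketch under assumptions: only the `--interactive` flag and the quit/exit commands come from this description, while `build_parser`, `run_session`, `main`, and the `query> ` prompt are hypothetical names for illustration. The agent call is replaced by a placeholder.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Reasoning trace demo (sketch)")
    parser.add_argument(
        "--interactive",
        action="store_true",
        help="read queries from the terminal until quit/exit",
    )
    return parser


def run_session(read_line, handle_query):
    """Process queries until quit/exit or end of input; return the number handled."""
    handled = 0
    while True:
        try:
            query = read_line("query> ").strip()
        except EOFError:
            break  # input closed, e.g. Ctrl-D
        if query.lower() in {"quit", "exit"}:
            break
        if not query:
            continue  # ignore empty lines
        handle_query(query)
        handled += 1
    return handled


def main(argv=None):
    args = build_parser().parse_args(argv)
    if args.interactive:
        # Placeholder handler; a real script would run the agent here.
        run_session(input, lambda q: print(f"(would run agent on: {q})"))
    else:
        print("(would run the built-in example queries)")
```

Passing the line reader in as an argument keeps the loop testable without a real terminal; a script would call `main()` from its entry point.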

Comprehensive Reporting System

  • ReasoningTraceReporter: Core class for trace collection and HTML generation
  • Terminal output capture: Logs all console interactions
  • Plot management: Organizes and embeds generated plots
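
For readers unfamiliar with terminal output capture, one common standard-library approach is sketched below. It is not necessarily how ReasoningTraceReporter does it; the helper `run_with_captured_output` and the `noisy_step` function are hypothetical names for this example.

```python
import contextlib
import io


def run_with_captured_output(func, *args, **kwargs):
    """Call func and return (result, stdout_text, stderr_text)."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        result = func(*args, **kwargs)
    return result, out.getvalue(), err.getvalue()


def noisy_step():
    print("loading dataset")
    return 3


result, stdout_text, stderr_text = run_with_captured_output(noisy_step)
```

Note that this technique only captures text written through Python's `sys.stdout` and `sys.stderr`; output written directly by child processes would need a different mechanism.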

Interactive Demo System

  • Reasoning trace demo: Example queries covering PK/PD modeling, gene regulatory networks, and cell population dynamics
  • Interactive exploration: Immediate trace generation and experimentation
  • Batch processing: Execute multiple queries with consolidated reports

🔧 Technical Implementation

Core Components

  • biomni/agent/a1_with_trace.py: Enhanced agent with trace tracking
  • biomni/evaluation/reasoning_trace_reporter.py: Trace reporter and HTML generator
  • biomni/evaluation/__init__.py: Evaluation module initialization
  • tutorials/reasoning_trace_demo.py: Demonstration script with example queries

Dependencies

  • Added jinja2 for HTML template rendering
  • Added pandas for analysis and export
  • Updated pyproject.toml

HTML Report Generation

  • Collapsible interactive reports
  • Performance metrics visualization
  • Terminal output logging
  • Plot organization and embedding
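
To make the idea of a collapsible report concrete, here is a minimal sketch that builds one with HTML `<details>` elements. The PR uses jinja2 templates; this example deliberately uses only the standard library so it stays self-contained, and the functions `render_step` and `render_report` are hypothetical, not the PR's API.

```python
import html


def render_step(title, body):
    """Render one trace step as a collapsible <details> element."""
    return (
        "<details>\n"
        f"  <summary>{html.escape(title)}</summary>\n"
        f"  <pre>{html.escape(body)}</pre>\n"
        "</details>"
    )


def render_report(query, steps):
    """Build a full HTML page; steps is a list of (title, body) pairs."""
    sections = "\n".join(render_step(title, body) for title, body in steps)
    return (
        "<!DOCTYPE html>\n<html>\n"
        f'<head><meta charset="utf-8"><title>{html.escape(query)}</title></head>\n'
        f"<body>\n<h1>{html.escape(query)}</h1>\n{sections}\n</body>\n</html>\n"
    )


report = render_report(
    "Simulate a PK/PD model",
    [
        ("Step 1: reasoning", "Choose a model"),
        ("Step 2: code", "if x < 1: print(x)"),
    ],
)
```

Escaping every piece of captured text matters here because code and terminal output routinely contain `<` and `&`, which would otherwise corrupt the page.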

Export & Analysis

  • Query-specific folder structure
  • Metadata tracking for reproducibility
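
One possible shape for a query-specific folder with reproducibility metadata is sketched below. The layout (a timestamped folder, a `plots/` subfolder, and a `metadata.json` file) and the helper names `slugify` and `create_query_folder` are assumptions for illustration; the PR's actual folder structure and metadata fields may differ.

```python
import json
import re
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def slugify(text, max_len=40):
    """Turn a query into a short, filesystem-safe name."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:max_len].strip("-") or "query"


def create_query_folder(base_dir, query, extra_metadata=None):
    """Create <base_dir>/<timestamp>_<slug>/ with plots/ and metadata.json."""
    now = datetime.now(timezone.utc)
    folder = Path(base_dir) / f"{now:%Y%m%dT%H%M%S}_{slugify(query)}"
    (folder / "plots").mkdir(parents=True, exist_ok=True)
    metadata = {
        "query": query,
        "created_utc": now.isoformat(),
        **(extra_metadata or {}),
    }
    (folder / "metadata.json").write_text(
        json.dumps(metadata, indent=2), encoding="utf-8"
    )
    return folder


# Demonstrate in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    folder = create_query_folder(
        tmp, "Gene regulatory network: toggle switch!", {"model": "example"}
    )
    saved = json.loads((folder / "metadata.json").read_text(encoding="utf-8"))
    has_plots_dir = (folder / "plots").is_dir()
```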

💡 Use Cases

  • Research monitoring: Detailed reasoning process tracking and analysis
  • Debugging: Step-by-step traces in a more readable format than raw terminal output
  • Performance monitoring: Timing and efficiency assessment
  • Education & development: Understanding reasoning patterns, testing new features
  • Interactive exploration: Real-time experimentation with agent queries

📎 Example Report

An example reasoning trace report is attached to this PR for easier review and to illustrate the system’s output and layout.
example_report.pdf

@amehrjou force-pushed the eval-reasoning-trace-local branch from d8995c6 to 17f1bd7 on August 29, 2025 19:55
@kexinhuang12345
Collaborator

This is a great feature! I am wondering if it is possible to simply add a parameter to the A1 agent, something like "trace_tracking = True", that makes it generate the trace? This way, future development on A1 does not need to be migrated to A1WithTrace as well.

… with interactive agent

- Add A1WithTrace agent with integrated trace tracking capabilities
- Implement ReasoningTraceReporter for detailed HTML report generation
- Add interactive agent mode for real-time command-line query processing
- Add reasoning trace demo with example queries
- Add performance metrics and timing tracking
- Add terminal output capture and complete logging
- Add plot management and visualization organization
- Update dependencies (jinja2, pandas) for HTML generation
- Maintain backward compatibility with existing functionality

The system enables detailed tracking and reporting of AI reasoning processes
with interactive HTML reports, comprehensive performance monitoring, and
real-time interactive query processing from command line for debugging
and analysis purposes.
- Remove A1WithTrace class and integrate trace tracking directly into A1 agent.
- Update initialization parameters to include trace tracking options.
- Modify demo scripts to utilize the updated A1 agent with trace tracking.
- Ensure backward compatibility with existing functionality while enhancing trace reporting capabilities.
@amehrjou force-pushed the eval-reasoning-trace-local branch from 17f1bd7 to a95506a on September 5, 2025 14:06
@amehrjou
Contributor Author

amehrjou commented Sep 5, 2025

This is a great feature! I am wondering if it is possible to simply add a parameter to the A1 agent, something like "trace_tracking = True", that makes it generate the trace? This way, future development on A1 does not need to be migrated to A1WithTrace as well.

Thanks, that makes sense. My initial thought was to keep a1.py untouched for now, but I’ve now refactored it (a95506a) so that reasoning trace tracking is available directly in A1 via a switch.
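
As an illustration of the opt-in switch pattern discussed here, the sketch below shows a toy agent whose tracing is off by default. `SketchAgent`, `enable_trace`, and `_log` are hypothetical names; this is not Biomni's A1 class, and the actual parameter name in the refactored code is not stated in this thread.

```python
class SketchAgent:
    """Toy stand-in for an agent with an opt-in trace switch; not Biomni's A1."""

    def __init__(self, enable_trace=False):
        # Tracing defaults to off, so existing callers see no change in behavior.
        self.enable_trace = enable_trace
        self.trace = [] if enable_trace else None

    def _log(self, kind, content):
        """Append a step only when tracing is enabled."""
        if self.trace is not None:
            self.trace.append({"kind": kind, "content": content})

    def go(self, query):
        self._log("query", query)
        answer = query.upper()  # placeholder for the real reasoning loop
        self._log("answer", answer)
        return answer


plain = SketchAgent()
traced = SketchAgent(enable_trace=True)
plain_answer = plain.go("hello")
traced_answer = traced.go("hello")
```

Both agents return the same answer; only the traced one accumulates steps, which is the property that keeps a single class backward compatible.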

@amehrjou changed the title from "Add comprehensive reasoning trace tracking and reporting system with batch and interactive agent" to "Feature: Add comprehensive reasoning trace tracking and reporting system with batch and interactive agent" on Sep 5, 2025
@amehrjou
Contributor Author

@kexinhuang12345 just a quick follow-up, I’ve incorporated the earlier feedback and updated this PR so that reasoning trace tracking is integrated directly into A1 as a parameter.

I also noticed the new PR #217 for downloading conversation history in PDF. To avoid any conflicts, I wanted to check if there’s anything I should adjust, or if I can help align the two.

@igor-sadalski
Contributor

igor-sadalski commented Sep 26, 2025

OK @amehrjou, I think you should merge your changes first (@kexinhuang12345), then I'll fix my code to work with yours (i.e. I'll add your thinking trace to the PDF generation), and then we will merge mine. Also happy to help at any stage :)

@kexinhuang12345
Collaborator

Great, sounds like a plan! Thanks Igor and Arash!

@amehrjou
Contributor Author

Thanks @igor-sadalski! I noticed #217 was merged ahead of this. @serena2z could you let me know if any updates are needed here?

3 participants