Feature: Add comprehensive reasoning trace tracking and reporting system with batch and interactive agent #177

amehrjou · 2025-08-29T19:28:29Z

This PR introduces a reasoning trace tracking and reporting system to Biomni that captures and visualizes the agent's reasoning process in detail. It improves debugging, monitoring, and reporting for Biomni's reasoning workflows while maintaining full backward compatibility.

✨ New Features

Enhanced Agent with Trace Tracking

A1WithTrace: Extends the A1 agent with integrated reasoning trace tracking while staying compatible with its existing functionality
Real-time trace capture: Records reasoning steps, code executions, and tool interactions
Interactive HTML reports: Generates collapsible reports with full styling
Performance metrics: Tracks execution time, step counts, and code execution patterns
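To make the capture and metrics items above concrete, here is a minimal sketch of a step recorder. It is not the code in this PR: `TraceStep`, `TraceRecorder`, and the step labels ("reasoning", "code", "tool") are hypothetical names chosen for illustration, and the actual implementation may structure its trace differently.

```python
import time
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TraceStep:
    kind: str         # hypothetical label, e.g. "reasoning", "code", or "tool"
    content: str      # text of the reasoning step, code cell, or tool call
    elapsed_s: float  # seconds since the recorder was created


@dataclass
class TraceRecorder:
    """Hypothetical recorder; collects steps in order and derives simple metrics."""

    steps: list = field(default_factory=list)
    _start: float = field(default_factory=time.perf_counter)

    def record(self, kind: str, content: str) -> TraceStep:
        # Timestamp each step relative to the start of the session.
        step = TraceStep(kind, content, time.perf_counter() - self._start)
        self.steps.append(step)
        return step

    def metrics(self) -> dict:
        # Step counts per kind plus the time at which the last step was recorded.
        counts = Counter(step.kind for step in self.steps)
        return {
            "total_steps": len(self.steps),
            "steps_by_kind": dict(counts),
            "elapsed_s": self.steps[-1].elapsed_s if self.steps else 0.0,
        }
```

A caller would invoke `record(...)` at each point where the agent reasons, executes code, or calls a tool, then pass `metrics()` to whatever renders the report.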

Interactive Agent Capability & Features

Command-line interactive mode: Real-time query processing from the terminal
Continuous session support: Run multiple queries in one session with stored reports
Live trace generation: Immediate HTML report creation for each query
Batch and interactive modes: Flexible switching
- Interactive mode: python tutorials/reasoning_trace_demo.py --interactive
- Non-interactive mode: python tutorials/reasoning_trace_demo.py (default)
Session management: Continuous interaction with quit/exit commands
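The mode switch and session loop described above could look roughly like the sketch below. Only the `--interactive` flag and the quit/exit commands come from this description; `build_parser`, `run_session`, the prompt text, and the callback arguments are assumptions made for illustration and do not mirror tutorials/reasoning_trace_demo.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Reasoning trace demo (sketch)")
    # Without the flag, the script is assumed to run its example queries in batch.
    parser.add_argument(
        "--interactive",
        action="store_true",
        help="read queries from the terminal until quit/exit",
    )
    return parser


def run_session(read_line, handle_query) -> int:
    """Read queries until 'quit' or 'exit'; return how many were handled.

    read_line: callable taking a prompt and returning one line (e.g. input).
    handle_query: callable that runs the agent on a single query.
    """
    handled = 0
    while True:
        try:
            query = read_line("query> ").strip()
        except EOFError:
            break  # end of input closes the session like quit/exit
        if query.lower() in {"quit", "exit"}:
            break
        if not query:
            continue  # ignore empty lines
        handle_query(query)
        handled += 1
    return handled


def main(argv=None) -> None:
    args = build_parser().parse_args(argv)
    if args.interactive:
        run_session(input, lambda q: print(f"(would run the agent on: {q})"))
    else:
        print("(would run the built-in example queries in batch mode)")
```

Passing the reader and the handler as callables keeps the loop independent of the terminal, so the same function can be driven by a list of canned queries.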

Comprehensive Reporting System

ReasoningTraceReporter: Core class for trace collection and HTML generation
Terminal output capture: Logs all console interactions
Plot management: Organizes and embeds generated plots
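As an illustration of the collapsible HTML output, the sketch below renders each step as a `<details>` element. The PR lists jinja2 for template rendering; this sketch deliberately swaps in plain string formatting from the standard library so that it runs without extra dependencies. `render_report` and its arguments are hypothetical and are not the `ReasoningTraceReporter` API.

```python
from html import escape


def render_report(query: str, steps: list[tuple[str, str]]) -> str:
    """Render (title, body) steps as collapsible sections of one HTML page."""
    sections = []
    for index, (title, body) in enumerate(steps, start=1):
        # <details>/<summary> gives a native collapsible block without JavaScript.
        sections.append(
            "<details>\n"
            f"  <summary>Step {index}: {escape(title)}</summary>\n"
            f"  <pre>{escape(body)}</pre>\n"
            "</details>"
        )
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{escape(query)}</title></head>\n"
        "<body>\n"
        f"<h1>{escape(query)}</h1>\n"
        + "\n".join(sections)
        + "\n</body></html>\n"
    )
```

Escaping every piece of captured text matters here, because reasoning steps and terminal output can contain `<`, `>`, or `&` that would otherwise break the page.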

Interactive Demo System

Reasoning trace demo: Example queries covering PK/PD modeling, gene regulatory networks, and cell population dynamics
Interactive exploration: Immediate trace generation and experimentation
Batch processing: Execute multiple queries with consolidated reports

🔧 Technical Implementation

Core Components

biomni/agent/a1_with_trace.py: Enhanced agent with trace tracking
biomni/evaluation/reasoning_trace_reporter.py: Trace reporter and HTML generator
biomni/evaluation/__init__.py: Evaluation module initialization
tutorials/reasoning_trace_demo.py: Demonstration script with example queries

Dependencies

Added jinja2 for HTML template rendering
Added pandas for analysis and export
Updated pyproject.toml

HTML Report Generation

Collapsible interactive reports
Performance metrics visualization
Terminal output logging
Plot organization and embedding

Export & Analysis

Query-specific folder structure
Metadata tracking for reproducibility
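One possible shape for a query-specific folder with a metadata file is sketched below. The folder naming scheme, the `metadata.json` file name, and the helper names `slugify` and `create_query_folder` are all assumptions for illustration; the PR description does not specify the actual layout.

```python
import json
import re
from datetime import datetime, timezone
from pathlib import Path


def slugify(text: str, max_len: int = 40) -> str:
    # Lowercase, collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:max_len].rstrip("-") or "query"


def create_query_folder(root: Path, query: str, when: datetime) -> Path:
    """Create <root>/<timestamp>_<slug>/ and write metadata.json inside it."""
    folder = root / f"{when:%Y%m%d_%H%M%S}_{slugify(query)}"
    folder.mkdir(parents=True, exist_ok=True)
    # Hypothetical metadata fields; a real report would likely record more.
    metadata = {"query": query, "created_utc": when.isoformat()}
    (folder / "metadata.json").write_text(
        json.dumps(metadata, indent=2), encoding="utf-8"
    )
    return folder


# Example call (writes to ./reports):
# create_query_folder(Path("reports"), "PK/PD modeling", datetime.now(timezone.utc))
```

Storing the original query text in the metadata keeps the folder name short while still allowing a run to be matched back to its exact input.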

💡 Use Cases

Research monitoring: Detailed reasoning process tracking and analysis
Debugging: Step-by-step traces in a more readable format than raw terminal output
Performance monitoring: Timing and efficiency assessment
Educational & development: Understanding reasoning patterns, testing new features
Interactive exploration: Real-time experimentation with agent queries

📎 Example Report

An example reasoning trace report is attached to this PR for easier review and to illustrate the system’s output and layout.
example_report.pdf

kexinhuang12345 · 2025-09-01T22:15:18Z

This is a great feature! I am wondering if it is possible to simply add a parameter to the A1 agent, such as "trace_tracking = True", that makes it generate the trace? That way, future development on A1 does not need to be migrated to A1WithTrace as well.

… with interactive agent

- Add A1WithTrace agent with integrated trace tracking capabilities
- Implement ReasoningTraceReporter for detailed HTML report generation
- Add interactive agent mode for real-time command-line query processing
- Add reasoning trace demo with example queries
- Add performance metrics and timing tracking
- Add terminal output capture and complete logging
- Add plot management and visualization organization
- Update dependencies (jinja2, pandas) for HTML generation
- Maintain backward compatibility with existing functionality

The system enables detailed tracking and reporting of AI reasoning processes with interactive HTML reports, comprehensive performance monitoring, and real-time interactive query processing from the command line for debugging and analysis purposes.

- Remove A1WithTrace class and integrate trace tracking directly into A1 agent.
- Update initialization parameters to include trace tracking options.
- Modify demo scripts to utilize the updated A1 agent with trace tracking.
- Ensure backward compatibility with existing functionality while enhancing trace reporting capabilities.

amehrjou · 2025-09-05T14:10:04Z

> This is a great feature! I am wondering if it is possible to simply add a parameter to the A1 agent, such as "trace_tracking = True", that makes it generate the trace? That way, future development on A1 does not need to be migrated to A1WithTrace as well.

Thanks, that makes sense. My initial thought was to keep a1.py untouched for now, but I’ve now refactored it (a95506a) so that reasoning trace tracking is available directly in A1 via a switch.
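For readers following the thread, a constructor switch of the kind discussed here might look like the sketch below. `SketchAgent` and its `run` method are stand-ins written for illustration, not the real A1 class, and the parameter name `trace_tracking` is taken from the suggestion above rather than from the refactored code.

```python
class SketchAgent:
    """Stand-in agent used only to illustrate an opt-in tracing switch."""

    def __init__(self, trace_tracking: bool = False):
        # Tracing is off by default, so existing callers see no change.
        self.trace_tracking = trace_tracking
        self.trace = [] if trace_tracking else None

    def _log(self, kind: str, content: str) -> None:
        # No-op unless tracing was requested at construction time.
        if self.trace is not None:
            self.trace.append((kind, content))

    def run(self, query: str) -> str:
        self._log("query", query)
        answer = query.upper()  # placeholder for the real reasoning loop
        self._log("answer", answer)
        return answer
```

Keeping the default at `False` is what preserves backward compatibility: the agent returns the same result either way, and only the opt-in path collects a trace.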

amehrjou · 2025-09-25T14:51:02Z

@kexinhuang12345 just a quick follow-up, I’ve incorporated the earlier feedback and updated this PR so that reasoning trace tracking is integrated directly into A1 as a parameter.

I also noticed the new PR #217 for downloading conversation history in PDF. To avoid any conflicts, I wanted to check if there’s anything I should adjust, or if I can help align the two.

igor-sadalski · 2025-09-26T00:19:36Z

OK @amehrjou, I think you should merge your changes first (@kexinhuang12345), then I'll fix my code to work with yours (i.e. I'll add your thinking trace to the PDF generation), and then we will merge mine. Also happy to help at any stage :)

kexinhuang12345 · 2025-09-26T04:57:12Z

Great - sounds like a plan! thanks Igor and Arash!

amehrjou · 2025-09-30T18:52:19Z

Thanks @igor-sadalski! I noticed #217 was merged ahead of this. @serena2z could you let me know if any updates are needed here?

amehrjou force-pushed the eval-reasoning-trace-local branch from d8995c6 to 17f1bd7 on August 29, 2025 19:55

amehrjou added 3 commits September 5, 2025 15:34

chore: apply pre-commit fixes (formatting, imports, whitespace)

42e8ec2

amehrjou force-pushed the eval-reasoning-trace-local branch from 17f1bd7 to a95506a on September 5, 2025 14:06

[pre-commit.ci] auto fixes from pre-commit.com hooks

56e6272

for more information, see https://pre-commit.ci

amehrjou changed the title ~~Add comprehensive reasoning trace tracking and reporting system with batch and interactive agent~~ Feature: Add comprehensive reasoning trace tracking and reporting system with batch and interactive agent Sep 5, 2025

amehrjou mentioned this pull request Sep 25, 2025

🧰 New Function: Download conversation with your agent as PDF! #217

Merged
