From aea3671e634732f949e6defa571b7656e9817a90 Mon Sep 17 00:00:00 2001
From: Joseph Pollack

## Agents API

This page documents the API for DeepCritical agents.

- **Knowledge gap agent**: evaluates research completeness and identifies outstanding knowledge gaps.
- **Tool selector agent**: selects appropriate tools for addressing a knowledge gap.
- **Writer agent**: generates a markdown report from research findings. Returns a markdown string with numbered citations.
- **Long writer agent**: long-form report generation with section-by-section writing. Writes the next section of a long-form report, then generates the final report from the draft. Returns the final markdown report string.
- **Proofreader agent**: proofreads and polishes a report draft. Returns a polished markdown string.
- **Observations agent**: generates observations from conversation history. Returns an observation string.
- **Query parser agent**: parses and improves a user query and detects the research mode.

All agents have factory functions that return a configured agent instance.
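As a rough sketch of the factory pattern described above, the snippet below models a factory that returns an agent bound to a structured output type. The names `KnowledgeGapOutput`, `Agent`, and `create_knowledge_gap_agent` are illustrative stand-ins, not DeepCritical's actual identifiers.

```python
from dataclasses import dataclass, field

# Hypothetical structured output for the knowledge gap agent; the real
# model lives in DeepCritical's models module.
@dataclass
class KnowledgeGapOutput:
    research_complete: bool
    outstanding_gaps: list = field(default_factory=list)

@dataclass
class Agent:
    """Stand-in for a configured Pydantic AI agent."""
    name: str
    output_type: type
    model: str

def create_knowledge_gap_agent(model: str = "default-model") -> Agent:
    # Factory: each agent module exposes one of these, returning a
    # ready-to-use agent bound to a structured output type.
    return Agent(name="knowledge_gap", output_type=KnowledgeGapOutput, model=model)

agent = create_knowledge_gap_agent()
```

The same shape repeats for every agent on this page: one factory per agent, each bound to its own output type.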
## Models API

This page documents the Pydantic models used throughout DeepCritical.

- **Evidence**: represents evidence from search results.
- **Citation**: citation information for evidence.
- **Knowledge gap output**: output from knowledge gap evaluation.
- **Agent selection plan**: plan for tool/agent selection.
- **Agent task**: an individual agent task.
- **Report draft**: draft structure for long-form reports.
- **Report section**: an individual section in a report draft.
- **Parsed query**: a parsed and improved query.
- **Conversation**: conversation history with iterations.
- **Iteration data**: data for a single iteration.
- **Research event**: an event emitted during research execution.
- **Budget status**: current budget status.
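To make the model descriptions concrete, here is a minimal sketch of how evidence and citation data might fit together. The field names are assumptions for illustration, not the project's actual schema.

```python
from dataclasses import dataclass

# Hypothetical field names; the real Pydantic models define the canonical schema.
@dataclass
class Citation:
    title: str
    url: str
    source: str  # e.g. "pubmed" or "clinicaltrials"

@dataclass
class Evidence:
    content: str
    citation: Citation
    relevance: float = 0.0

ev = Evidence(
    content="Example finding from a search result.",
    citation=Citation(title="Example article", url="https://example.org/a", source="pubmed"),
)
```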
## Orchestrators API

This page documents the API for DeepCritical orchestrators.

- **Iterative orchestrator**: single-loop research with search-judge-synthesize cycles. Runs the iterative research flow, yielding events and returning the final report string.
- **Deep orchestrator**: multi-section parallel research with planning and synthesis. Runs the deep research flow, yielding events and returning the final report string.
- **Graph orchestrator**: graph-based execution using Pydantic AI agents as nodes. Runs graph-based research orchestration, yielding events.
- **Orchestrator factory**: creates an orchestrator instance for the requested mode and raises on an unknown mode.
- **Magentic orchestrator**: multi-agent coordination using Microsoft Agent Framework. Runs Magentic orchestration, yielding events. Requires an OpenAI API key.
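The factory's mode resolution can be sketched as below. The mode names (simple/advanced/auto) follow the docs, but the signature is an assumption, and the sketch returns the resolved mode name rather than a real orchestrator instance.

```python
from typing import Optional

def create_orchestrator(mode: str = "auto", openai_api_key: Optional[str] = None) -> str:
    """Sketch of mode resolution; returns the resolved mode name."""
    if mode == "auto":
        # Auto-detect: the advanced (Magentic) mode needs an OpenAI API key.
        mode = "advanced" if openai_api_key else "simple"
    if mode not in {"simple", "advanced"}:
        raise ValueError(f"Unknown orchestrator mode: {mode!r}")
    return mode
```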
## Services API

This page documents the API for DeepCritical services.

### Embedding service

Purpose: local sentence-transformers for semantic search and deduplication.

- Generates an embedding for a text string. Returns an embedding vector as a list of floats.
- Generates embeddings for multiple texts. Returns a list of embedding vectors.
- Calculates the similarity between two texts. Returns a similarity score (0.0-1.0).
- Finds duplicate texts based on a similarity threshold. Returns a list of (index1, index2) tuples for duplicate pairs.
- Adds evidence to the vector store for semantic search.
- Finds semantically similar evidence. Returns a list of result dictionaries.
- Removes semantically duplicate evidence. Returns a list of unique evidence items (not already in the vector store).
- A singleton accessor returns the shared EmbeddingService instance.

### RAG service

Purpose: Retrieval-Augmented Generation using LlamaIndex.

- Ingests evidence into the RAG service. Supports multiple embedding providers (OpenAI, local sentence-transformers, Hugging Face).
- Retrieves relevant documents for a query. Returns a list of result dictionaries.
- Queries the RAG service and returns a synthesized response string.
- Ingests raw LlamaIndex Documents.
- Clears all documents from the collection.
- A factory gets or creates a RAG service instance. Returns a configured LlamaIndexRAGService instance. By default it uses local embeddings (sentence-transformers), which require no API keys.

### Statistical analysis service

Purpose: secure execution of AI-generated statistical code. Analyzes a research question using statistical methods. Requires Modal credentials for sandbox execution.

## Search Tools API

This page documents the API for DeepCritical search tools. All tools implement a common search-tool protocol and expose their tool name.

- **PubMed**: searches peer-reviewed biomedical literature via NCBI E-utilities (ESearch → EFetch), with a rate limit of 0.34 s between requests.
- **ClinicalTrials.gov**: searches ClinicalTrials.gov for trials. Only returns interventional studies with status COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, or ENROLLING_BY_INVITATION.
- **Europe PMC**: searches Europe PMC for articles and preprints; results include both preprints and peer-reviewed articles.
- **RAG search**: semantic search within collected evidence; wraps the RAG service. Requires evidence to be ingested into the RAG service first.
- **Parallel search handler**: orchestrates parallel searches across multiple tools.
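A minimal sketch of the shared tool protocol and the parallel fan-out, using `typing.Protocol` and `asyncio.gather`. The member names (`name`, `search`) and the fake tools are assumptions for illustration, not DeepCritical's actual interface.

```python
import asyncio
from typing import Protocol

class SearchTool(Protocol):
    # Hypothetical protocol shape: a tool name plus an async search method.
    name: str
    async def search(self, query: str, max_results: int = 10) -> list: ...

class FakePubMed:
    name = "pubmed"
    async def search(self, query, max_results=10):
        return [f"pubmed:{query}"]

class FakeEuropePMC:
    name = "europepmc"
    async def search(self, query, max_results=10):
        return [f"europepmc:{query}"]

async def search_all(tools, query):
    # One gather call fans the query out to every registered tool.
    results = await asyncio.gather(*(t.search(query) for t in tools))
    return {t.name: r for t, r in zip(tools, results)}

hits = asyncio.run(search_all([FakePubMed(), FakeEuropePMC()], "metformin"))
```

Because each tool satisfies the same protocol, the parallel handler never needs to know which backing API it is talking to.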
## Agents Architecture

DeepCritical uses Pydantic AI agents for all AI-powered operations. All agents follow a consistent pattern and use structured output types. Factory functions accept an optional model override, and model selection is based on the configured provider. Agents return fallback values on failure rather than raising exceptions; all errors are logged with context using structlog. All agents validate their inputs. Structured output types come from the shared models; writer agents return plain text.

### Pydantic AI agents

- **Knowledge gap agent**: evaluates research state and identifies knowledge gaps.
- **Tool selector agent**: selects appropriate tools for addressing knowledge gaps.
- **Writer agent**: generates final reports from research findings. Output: markdown string with numbered citations. Features: validates inputs, truncates very long findings (max 50000 chars) with a warning, retries transient failures (3 retries), and validates citations before returning.
- **Long writer agent**: long-form report generation with section-by-section writing. Features: writes sections iteratively, aggregates references across sections, reformats section headings and references, and deduplicates and renumbers references.
- **Proofreader agent**: proofreads and polishes report drafts. Features: removes duplicate content across sections, adds an executive summary if there are multiple sections, preserves all references and citations, and improves flow and readability.
- **Observations agent**: generates observations from conversation history. Output: observation string.
- **Query parser agent**: parses and improves user queries and detects the research mode.

### Magentic agents

The following agents are used exclusively with the Magentic orchestrator:

- **Hypothesis agent**: generates mechanistic hypotheses based on evidence, using an internal Pydantic AI agent.
- **Search agent**: wraps the search tools and executes searches.
- **Analysis agent**: performs statistical analysis using the Modal sandbox.
- **Report agent**: generates structured scientific reports from evidence and hypotheses, using an internal Pydantic AI agent.
- **Judge agent**: evaluates evidence quality and determines whether it is sufficient for synthesis.

DeepCritical thus uses two distinct agent patterns: Pydantic AI agents for the core research loops, and Magentic agents for Magentic orchestration. All agents have factory functions.
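The fallback-on-failure behavior described above can be sketched as a decorator. The function name and fallback text are illustrative, and the real agents log via structlog rather than the stdlib logger used here.

```python
import logging

logger = logging.getLogger("sketch")

def with_fallback(fallback):
    # On any exception: log the error with context, then return the safe
    # default instead of raising.
    def deco(fn):
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                logger.warning("agent call failed: %s", exc)
                return fallback
        return wrapper
    return deco

@with_fallback("No observations generated.")
def generate_observations(history):
    if not history:
        raise ValueError("empty history")
    return f"{len(history)} messages reviewed"

ok = generate_observations(["q1", "a1"])
fb = generate_observations([])
```

This keeps one failing agent call from aborting an entire research loop: callers always receive a usable value.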
## Graph Orchestration

DeepCritical implements a graph-based orchestration system for research workflows using Pydantic AI agents as nodes. This enables better parallel execution, conditional routing, and state management than simple agent chains. The iterative research graph and the deep research graph each follow this pattern, with their own node IDs and special node handling.

Graph nodes represent different stages in the research workflow:

- **Agent nodes**: execute Pydantic AI agents.
- **State nodes**: update or read workflow state (for example, updating evidence or conversation history).
- **Decision nodes**: make routing decisions based on conditions (for example, continue research vs. complete research).
- **Parallel nodes**: execute multiple nodes concurrently.

Edges define transitions between nodes:

- **Sequential edges**: always traversed (condition: none, always true).
- **Conditional edges**: traversed based on a condition (for example, if research is complete, go to the writer; otherwise continue the loop).
- **Parallel edges**: used for parallel execution branches.

State transitions occur at state nodes, which update the global workflow state. Decision nodes evaluate conditions and return the IDs of the next nodes. Parallel nodes execute multiple nodes concurrently. Budget constraints are enforced at decision nodes: if any budget is exceeded, execution routes to the exit node. Errors are handled at multiple levels; they are logged and yield error events for the UI. Graph execution is optional via a feature flag, which allows gradual migration and fallback if needed.
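A decision node with budget enforcement might look like the sketch below. The node IDs and budget fields are hypothetical, but the routing logic follows the description: exhausted budget routes to exit, completed research routes to the writer, anything else loops.

```python
from dataclasses import dataclass

@dataclass
class Budget:
    tokens_used: int
    token_limit: int
    iterations: int
    iteration_limit: int

    def exceeded(self) -> bool:
        return (self.tokens_used >= self.token_limit
                or self.iterations >= self.iteration_limit)

def decide_next(research_complete: bool, budget: Budget) -> str:
    # Decision node: budget check first, then the completion condition.
    if budget.exceeded():
        return "exit"
    return "writer" if research_complete else "research_loop"
```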
## Middleware

DeepCritical uses middleware for state management, budget tracking, and workflow coordination.

- **Workflow state**: thread-safe state management for research workflows. State must be initialized before a workflow runs and is then accessed globally.
- **Research coordinator**: coordinates parallel research loops.
- **Budget tracker**: tracks and enforces resource limits. Budget components: tokens (LLM token usage), time (elapsed time in seconds), and iterations (number of iterations). Includes token estimation.

All middleware models are defined in a shared module.
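The budget tracker's three components can be sketched as one small class. The 4-characters-per-token estimate below is an assumption for illustration, not necessarily DeepCritical's actual heuristic.

```python
import time

class BudgetTracker:
    """Sketch of the budget middleware: tokens, wall-clock time, iterations."""

    def __init__(self, token_limit, time_limit_s, iteration_limit):
        self.token_limit = token_limit
        self.time_limit_s = time_limit_s
        self.iteration_limit = iteration_limit
        self.tokens_used = 0
        self.iterations = 0
        self._start = time.monotonic()

    def estimate_tokens(self, text):
        # Rough heuristic: about 4 characters per token (an assumption).
        return max(1, len(text) // 4)

    def charge(self, text):
        self.tokens_used += self.estimate_tokens(text)

    def within_budget(self):
        return (self.tokens_used < self.token_limit
                and self.iterations < self.iteration_limit
                and time.monotonic() - self._start < self.time_limit_s)

tracker = BudgetTracker(token_limit=1000, time_limit_s=60.0, iteration_limit=5)
tracker.charge("x" * 400)  # roughly 100 estimated tokens
```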
## Orchestration Patterns

DeepCritical supports multiple orchestration patterns for research workflows.

### Iterative orchestrator

Pattern: Generate observations → Evaluate gaps → Select tools → Execute → Judge → Continue/Complete.

Features: tracks iterations, time, and budget; supports graph execution behind a feature flag.

### Deep orchestrator

Pattern: Planner → Parallel iterative loops per section → Synthesizer.

### Graph orchestrator

Purpose: graph-based execution using Pydantic AI agents as nodes.

Features: uses Pydantic AI Graphs when available, with agent chains as a fallback; routes based on research mode (iterative/deep/auto); streams events. The orchestrator maintains a graph execution context while traversing nodes.

Node types: agent nodes (execute Pydantic AI agents), state nodes (update or read workflow state), decision nodes (make routing decisions), and parallel nodes (execute multiple nodes concurrently). Edge types: sequential edges (always traversed), conditional edges (traversed based on a condition), and parallel edges (used for parallel execution branches).

### Orchestrator factory

Purpose: factory for creating orchestrators. Modes:

- **Simple**: legacy orchestrator (backward compatible)
- **Advanced**: Magentic orchestrator (requires an OpenAI API key)
- **Auto-detect**: chooses based on API key availability

### Magentic orchestrator

Purpose: multi-agent coordination using Microsoft Agent Framework. The orchestrator processes Magentic events and converts them into the shared research event stream.

### Hierarchical orchestrator

Purpose: hierarchical orchestration using middleware and sub-teams.

### Simple orchestrator

Purpose: linear search-judge-synthesize loop.

All orchestrators must initialize workflow state, and all orchestrators yield events of the documented event types with a common event structure.
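Since every orchestrator yields events, a consumer can be sketched as an async generator pipeline. The event fields and type names here are illustrative, not the project's actual event schema.

```python
import asyncio
from dataclasses import dataclass

@dataclass
class ResearchEvent:
    # Hypothetical event shape; the real event model defines the canonical fields.
    type: str
    message: str

async def run_research(query):
    # Stand-in for an orchestrator: yields progress events, then completion.
    yield ResearchEvent("started", f"researching: {query}")
    yield ResearchEvent("iteration", "searching sources")
    yield ResearchEvent("complete", "final report ready")

async def collect_event_types(query):
    return [event.type async for event in run_research(query)]

types = asyncio.run(collect_event_types("metformin"))
```

A UI can render each event as it arrives instead of waiting for the final report, which is the point of the event-stream contract.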
## Services

DeepCritical provides several services for embeddings, RAG, and statistical analysis.

### Embedding service

Purpose: local sentence-transformers for semantic search and deduplication.

Features:

- **No API key required**: uses local sentence-transformers models
- **Async-safe**: operations avoid blocking the event loop
- **Configurable model**: the model name is set in configuration

### RAG service

Purpose: Retrieval-Augmented Generation using LlamaIndex. Supports multiple embedding providers, including OpenAI embeddings (which require an API key) and local alternatives.

### Statistical analysis service

Purpose: secure execution of AI-generated statistical code.

Features:

- **Modal sandbox**: secure, isolated execution environment
- **Code generation**: generates Python code via LLM
- **Library pinning**: version-pinned libraries in the sandbox image

Libraries available: pandas, numpy, scipy, matplotlib, scikit-learn, statsmodels.

Services use singleton patterns for lazy initialization: EmbeddingService uses a global variable pattern, while LlamaIndexRAGService is instantiated directly (no caching). This ensures a single instance per process, lazy initialization, and no dependencies required at import time. Services check availability before use.

## Search Tools Architecture

DeepCritical implements a protocol-based search tool system for retrieving evidence from multiple sources. All tools implement a common protocol. Tools with API rate limits implement request throttling. Tools raise custom exceptions for critical failures, handle HTTP errors (429, 500, timeout), and return empty lists on non-critical errors (with warning logs). All tools convert API responses to Evidence objects; missing fields are handled gracefully with defaults.

### PubMed tool

API: NCBI E-utilities (ESearch → EFetch).

Rate limiting: 0.34 s between requests (3 req/sec without an API key), or 0.1 s between requests (10 req/sec with an NCBI API key).

Features: XML parsing of E-utilities responses; handles single vs. multiple articles.

### ClinicalTrials.gov tool

API: ClinicalTrials.gov API v2. Execution runs in a thread pool. Filtering: only interventional studies with status COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, or ENROLLING_BY_INVITATION. Features: parses the nested JSON structure, extracts trial metadata, and converts results to Evidence.

### Europe PMC tool

API: Europe PMC REST API. Handles preprint markers in results.

### RAG search tool

Purpose: semantic search within collected evidence. Wraps the RAG service. Features: returns Evidence from RAG results, handles evidence ingestion, performs semantic similarity search, and preserves metadata.

### Parallel search handler

Purpose: orchestrates parallel searches across multiple tools. Tools are registered in the search handler.
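The throttling described for PubMed (0.34 s between requests without an NCBI key) can be sketched as a small limiter. The class name and structure are assumptions, not DeepCritical's actual implementation.

```python
import time

class RateLimiter:
    # Enforces a minimum interval between consecutive requests,
    # e.g. 0.34 s for PubMed without an NCBI API key.
    def __init__(self, min_interval_s=0.34):
        self.min_interval_s = min_interval_s
        self._last = float("-inf")  # first call never waits

    def wait(self):
        now = time.monotonic()
        remaining = self.min_interval_s - (now - self._last)
        if remaining > 0:
            time.sleep(remaining)
        self._last = time.monotonic()

limiter = RateLimiter(min_interval_s=0.05)
t0 = time.monotonic()
limiter.wait()  # first call: no sleep
limiter.wait()  # second call: sleeps until the interval has passed
elapsed = time.monotonic() - t0
```

An async variant would use `asyncio.sleep` instead of `time.sleep` so other tools keep running while one backs off.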
## Magentic Research Architecture

Architecture pattern: Microsoft Magentic Orchestration.

Design philosophy: simple, dynamic, manager-driven coordination. Key innovation: an intelligent manager replaces rigid sequential phases.

Simple 4-agent setup. The manager handles quality assessment in its instructions:

- Checks hypothesis quality (testable, novel, clear)
- Validates search results (relevant, authoritative, recent)
- Assesses analysis soundness (methodology, evidence, conclusions)
- Ensures report completeness (all sections, proper citations)

No separate Judge Agent is needed; the manager does it all.
Document Version: 2.0 (Magentic Simplified)
Last Updated: 2025-11-24
Architecture: Microsoft Magentic Orchestration Pattern
Agents: 4 (Hypothesis, Search, Analysis, Report) + 1 Manager
License: MIT

## Configuration

DeepCritical uses Pydantic Settings for centralized configuration management. All settings are defined in a single settings class, and a global settings instance provides access to configuration throughout the codebase. You must configure at least one LLM provider.
### LLM providers

The system supports multiple LLM providers; the default model is defined in the settings. HuggingFace can work without an API key for public models, but an API key provides higher rate limits. The HuggingFace token can be set via either of its supported environment variables.

### Embedding providers

DeepCritical supports multiple embedding providers for semantic search and RAG. Note: OpenAI embeddings require an OpenAI API key.

### Web search providers

DeepCritical supports multiple web search providers. Note: DuckDuckGo is the default and requires no API key, making it ideal for development and testing.

### PubMed

PubMed search supports an optional NCBI API key for higher rate limits; the PubMed tool reads this from configuration.

### Agent and budget settings

Agent configuration controls agent behavior and research loop execution. Budget configuration controls resource limits for research loops and is validated on load.

### RAG, ChromaDB, Modal, and logging

Further settings configure the Retrieval-Augmented Generation service, the ChromaDB vector database for embeddings and RAG, Modal (used for secure sandbox execution of statistical analysis), and structured logging.

### Helper methods

The settings object exposes helpers to check which API keys are available, to check whether external services are configured, and to get the API key for the configured provider, including OpenAI-specific operations such as Magentic mode.

### Usage

The configuration system is used throughout the codebase: the LLM factory uses settings to create appropriate models, the embedding service uses the local embedding model configuration, and the orchestrator factory uses settings to determine the orchestration mode.

### Validation

Settings are validated on load using Pydantic validation. Configuration errors raise `ConfigurationError`:

```python
# src/utils/exceptions.py, lines 22-25
class ConfigurationError(DeepCriticalError):
    """Raised when configuration is invalid."""
```

The following configurations are planned for future phases.
This document outlines code quality standards and documentation requirements for The DETERMINATOR. Pre-commit hooks run automatically on commit to ensure code quality.
Configuration is in the pre-commit configuration file. The following hooks run automatically on commit:
- ruff-format: formats code with ruff (auto-fixes: yes)
- mypy: type checking (additional dependencies: pydantic, pydantic-settings, tenacity, pydantic-ai)
- pytest-unit: runs unit tests, excluding OpenAI and embedding_provider tests (always runs, not just on changed files)
- pytest-local-embeddings: runs local embedding tests

Pre-commit hooks can also be run manually, without committing.

Documentation is built using MkDocs, and the documentation site is published at https://deepcritical.github.io/GradioDemo/

Code style: conventions for The DETERMINATOR are documented separately. All development commands should be run through the project's environment manager so that they execute in the correct virtual environment.

Error handling: use the custom exception hierarchy and always preserve exception context.

Implementation patterns: all tools implement a common interface.

Thank you for your interest in contributing to The DETERMINATOR! This guide will help you get started. Note on Project Names: "The DETERMINATOR" is the product name, "DeepCritical" is the organization/project name, and "determinator" is the Python package name.
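The error-handling convention above (a custom hierarchy rooted in a base error class, with exception context always preserved) can be sketched as follows. `ConfigurationError` mirrors the class shown in `src/utils/exceptions.py`; `load_config` is a hypothetical caller added only to show the `raise ... from` pattern:

```python
import json


class DeepCriticalError(Exception):
    """Base class for all application errors."""


class ConfigurationError(DeepCriticalError):
    """Raised when configuration is invalid."""


def load_config(raw: str) -> dict:
    """Hypothetical loader demonstrating context preservation."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as e:
        # `from e` keeps the original traceback attached (__cause__),
        # so the root cause is never lost when re-raising.
        raise ConfigurationError(f"Invalid configuration: {e}") from e
```

Catching `DeepCriticalError` then handles every application error, while `err.__cause__` still points at the original failure.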
This project uses a dual repository setup; when cloning, set up remotes accordingly. Important: never push directly to the upstream repository.

The project uses pytest markers to categorize tests; see the Testing Guidelines for details. To contribute:
1. Fork the repository on GitHub
2. Clone your fork
3. Make your changes following the guidelines below
4. Run checks

Thank you for contributing to The DETERMINATOR!

Prompt engineering guidelines and citation validation rules are documented on their own page.

Testing: the project uses pytest markers to categorize tests. Coverage can be shown with missing lines highlighted in the terminal output, or generated as an HTML coverage report.

Examples

This page provides examples of using The DETERMINATOR for various research tasks.

For one example query, The DETERMINATOR:
1. Searches PubMed for recent papers
2. Searches ClinicalTrials.gov for active trials
3. Evaluates evidence quality
4. Synthesizes findings into a comprehensive report

For a clinical-trials query, it:
1. Searches ClinicalTrials.gov for relevant trials
2. Searches PubMed for supporting literature
3. Provides trial details and status
4. Summarizes findings

For a broad review query, it:
1. Uses deep research mode (multi-section)
2. Searches multiple sources in parallel
3. Generates sections on clinical trials, mechanisms of action, and safety profile
4. Synthesizes a comprehensive report

For a hypothesis-testing query, it:
1. Generates testable hypotheses
2. Searches for supporting/contradicting evidence
3. Performs statistical analysis (if Modal is configured)
4. Provides a verdict: SUPPORTED, REFUTED, or INCONCLUSIVE

Iterative mode performs single-loop research with search-judge-synthesize cycles; deep mode performs multi-section parallel research.

Installation

This guide will help you install and set up The DETERMINATOR on your system.
Installation is supported on Unix/macOS/Linux and on Windows (PowerShell), with alternative methods available; after installation, restart your terminal or add the tool to your PATH.

Optional extras can be installed for embeddings support (local sentence-transformers), Modal sandbox execution, and Magentic orchestration, or all extras at once. Create a configuration file (see the Configuration Guide for all available options), run the application, and open your browser to the local address.

For development, install the dev dependencies and the pre-commit hooks.

Troubleshooting:
- Import Errors: ensure you have installed all required dependencies and that Python 3.11+ is being used
- API Key Errors: verify your API keys are set correctly
- Module Not Found: re-run the installation command
- Port Already in Use: change the port in the application configuration
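A minimal environment file for the options above might look like this. Every variable name here is illustrative, not the project's confirmed schema - consult the Configuration Guide for the actual names:

```ini
# Hypothetical .env sketch - see the Configuration Guide for real variable names
LLM_PROVIDER=huggingface
HF_TOKEN=hf_...                  # optional for public models; raises rate limits
WEB_SEARCH_PROVIDER=duckduckgo   # default, no API key required
NCBI_API_KEY=...                 # optional, higher PubMed rate limits
```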
MCP Server

The DETERMINATOR exposes a Model Context Protocol (MCP) server, allowing you to use its search tools directly from Claude Desktop or other MCP clients. MCP is a standard for connecting AI assistants to external tools and data sources; The DETERMINATOR implements an MCP server that exposes its search capabilities as MCP tools.

Edit the Claude Desktop configuration file for your platform (macOS, Windows, or Linux), then close and restart Claude Desktop for the changes to take effect. You should then see The DETERMINATOR tools available:
- PubMed search: peer-reviewed biomedical literature
- ClinicalTrials.gov search: interventional studies
- Europe PMC search: bioRxiv/medRxiv preprints
- Combined search: all sources simultaneously (PubMed, ClinicalTrials.gov, Europe PMC)
- Statistical analysis: secure execution using Modal sandboxes

Once configured, you can ask Claude to use The DETERMINATOR tools. Claude will automatically:
1. Call the appropriate tool
2. Retrieve results
3.
Use the results in its response.

Troubleshooting:
- Server Not Found: ensure The DETERMINATOR is running
- Tools Not Appearing: restart Claude Desktop after configuration changes, check the Claude Desktop logs for errors, and verify the MCP server is accessible at the configured URL
- Authentication: configure API keys in the settings, use the HuggingFace OAuth login, and ensure your API keys are valid

If running on a different port, update the URL. You can also configure multiple instances.

Quick Start

Get up and running with The DETERMINATOR in minutes, or deploy instantly with Docker using a single command. Open your browser and type your research question in the chat interface, for example:
- "What are the latest treatments for Alzheimer's disease?"
- "Review the evidence for metformin in cancer prevention"
- "What clinical trials are investigating COVID-19 vaccines?"

Click "Submit" or press Enter.
The system will:
- Generate observations about your query
- Identify knowledge gaps
- Search multiple sources (PubMed, ClinicalTrials.gov, Europe PMC)
- Evaluate evidence quality
- Synthesize findings into a report

Watch the real-time progress in the chat interface: search operations and results, evidence evaluation, report generation, and the final research report with citations.

Overview

AI-Native Drug Repurposing Research Agent: DeepCritical is a deep research agent system that uses iterative search-and-judge loops to comprehensively answer research questions. The system supports multiple orchestration patterns, graph-based execution, parallel research workflows, and long-running task management with real-time streaming.

Generalist Deep Research Agent - Stops at Nothing Until Finding Precise Answers: The DETERMINATOR is a powerful generalist deep research agent system that uses iterative search-and-judge loops to comprehensively investigate any research question. It stops at nothing until finding precise answers, stopping only at configured limits (budget, time, iterations).

Key Features:
- Generalist: handles queries from any domain (medical, technical, business, scientific, etc.)
- Automatic Source Selection: automatically determines whether medical knowledge sources (PubMed, ClinicalTrials.gov) are needed
- Multi-Source Search: web search, PubMed, ClinicalTrials.gov, Europe PMC, RAG
- Iterative Refinement: continues searching and refining until precise answers are found
- Evidence Synthesis: comprehensive reports with proper citations

Important: The DETERMINATOR is a research tool that synthesizes evidence. It cannot provide medical advice or answer medical questions directly.
For detailed installation and setup instructions, see the Getting Started Guide. The DETERMINATOR uses a Vertical Slice Architecture, and the system supports three main research patterns; learn more on the Architecture page.

License

DeepCritical is licensed under the MIT License.

Copyright (c) 2024 DeepCritical Team

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Architecture

The DETERMINATOR is a powerful generalist deep research agent system
that uses iterative search-and-judge loops to comprehensively investigate any research question. It stops at nothing until finding precise answers, stopping only at configured limits (budget, time, iterations). The system automatically determines if medical knowledge sources are needed and adapts its search strategy accordingly. It supports multiple orchestration patterns, graph-based execution, parallel research workflows, and long-running task management with real-time streaming, with fallback to agent chains when graph execution is disabled.

Research flows and orchestrators:
- Deep Research Flow: state synchronization across parallel loops
- Iterative Research Flow: supports graph execution and agent chains
- Magentic Orchestrator: supports long-running workflows with max rounds and stall/reset handling
- Hierarchical Orchestrator: supports sub-iteration patterns for complex research tasks
- Legacy Simple Mode

The system is designed for long-running research tasks with comprehensive state management and streaming. Metadata includes iteration numbers, tool names, result counts, and durations.
- Budget Tracking: budget summaries for monitoring
- Workflow Manager: evidence deduplication across parallel loops
- State Management: supports both iterative and deep research patterns
- Gradio UI

The graph orchestrator defines node types, edge types, graph patterns, and an execution flow. The system supports complex research workflows: it handles loop failures gracefully; the Deep Research Pattern breaks complex queries into sections, with a final synthesis combining all section results; state synchronization provides thread-safe evidence sharing; and lazy imports keep optional dependencies optional.

Orchestrator Modes (selected in the UI or via the factory), Graph Research Modes (used within the graph orchestrator, separate from the orchestrator mode), and Execution Modes are configured independently. Note: The UI provides separate controls for orchestrator mode and graph research mode.
When using graph-based orchestrators (iterative/deep/auto), the graph research mode determines the specific pattern used within the graph execution.

Features

The DETERMINATOR provides a comprehensive set of features for AI-assisted research, organized by Orchestrator Modes, Graph Research Modes (used within the graph orchestrator), and Execution Modes.

Quick Start

Get started with The DETERMINATOR in minutes. Open your browser to the app. Authentication is mandatory - you must authenticate before using the application, and the app will display an error message if you try to use it without authentication.

HuggingFace OAuth Login (Recommended):
- Click the "Sign in with HuggingFace" button at the top of the app
- Your HuggingFace API token will be automatically used for AI inference
- No need to manually enter API keys when logged in

Manual API Key (Alternative):
- Provide your own API key in the Settings accordion, or set the corresponding environment variable
- Supports HuggingFace, OpenAI, or Anthropic API keys
- Manual keys take priority over OAuth tokens

Multimodal Features:
- Configure image/audio input and output in the sidebar settings
- Image OCR and audio STT/TTS can be enabled/disabled independently
- TTS voice and speed can be customized in the Audio Output settings

Connect to Claude Desktop: add the MCP server to your Claude Desktop configuration, then restart Claude Desktop.

Note: The application automatically uses all available search tools (Neo4j, PubMed, ClinicalTrials.gov, Europe PMC, Web search, RAG) based on query analysis. Neo4j knowledge graph search is included by default for biomedical queries.
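The Claude Desktop connection described above lives in `claude_desktop_config.json`. One common pattern bridges a remote MCP endpoint through the `mcp-remote` npm package; the server name, port, and path below are assumptions for illustration, not the project's published values:

```json
{
  "mcpServers": {
    "deepcritical": {
      "command": "npx",
      "args": ["mcp-remote", "http://localhost:7860/gradio_api/mcp/sse"]
    }
  }
}
```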
Note: Additional sections (Code Style, Error Handling, Testing, Implementation Patterns, Code Quality, and Prompt Engineering) are available as separate pages in the navigation sidebar.
Agents API Reference

KnowledgeGapAgent

Module: src.agents.knowledge_gap

Methods
evaluate

```python
async def evaluate(
    self,
    query: str,
    background_context: str,
    conversation_history: Conversation,
    iteration: int,
    time_elapsed_minutes: float,
    max_time_minutes: float,
) -> KnowledgeGapOutput
```

Parameters:
- query: Research query string
- background_context: Background context for the query
- conversation_history: Conversation history with previous iterations
- iteration: Current iteration number
- time_elapsed_minutes: Elapsed time in minutes
- max_time_minutes: Maximum time limit in minutes

Returns: KnowledgeGapOutput with:
- research_complete: Boolean indicating if research is complete
- outstanding_gaps: List of remaining knowledge gaps

ToolSelectorAgent

Module: src.agents.tool_selector

Methods
select_tools

```python
async def select_tools(
    self,
    query: str,
    knowledge_gaps: list[str],
    available_tools: list[str],
) -> AgentSelectionPlan
```

Parameters:
- query: Research query string
- knowledge_gaps: List of knowledge gaps to address
- available_tools: List of available tool names

Returns: AgentSelectionPlan with a list of AgentTask objects.

WriterAgent

Module: src.agents.writer

Methods
write_report

```python
async def write_report(
    self,
    query: str,
    findings: str,
    output_length: str = "medium",
    output_instructions: str | None = None,
) -> str
```

Parameters:
- query: Research query string
- findings: Research findings to include in the report
- output_length: Desired output length ("short", "medium", "long")
- output_instructions: Additional instructions for report generation

Returns: Markdown string with numbered citations.

LongWriterAgent

Module: src.agents.long_writer

Methods
write_next_section

```python
async def write_next_section(
    self,
    query: str,
    draft: ReportDraft,
    section_title: str,
    section_content: str,
) -> LongWriterOutput
```

Parameters:
- query: Research query string
- draft: Current report draft
- section_title: Title of the section to write
- section_content: Content/guidance for the section

Returns: LongWriterOutput with updated draft.

write_report

```python
async def write_report(
    self,
    query: str,
    report_title: str,
    report_draft: ReportDraft,
) -> str
```

Parameters:
- query: Research query string
- report_title: Title of the report
- report_draft: Complete report draft

Returns: Final markdown report string.

ProofreaderAgent

Module: src.agents.proofreader

Methods
proofread

```python
async def proofread(
    self,
    query: str,
    report_title: str,
    report_draft: ReportDraft,
) -> str
```

Parameters:
- query: Research query string
- report_title: Title of the report
- report_draft: Report draft to proofread

Returns: Polished markdown string.

ThinkingAgent

Module: src.agents.thinking

Methods
generate_observations

```python
async def generate_observations(
    self,
    query: str,
    background_context: str,
    conversation_history: Conversation,
) -> str
```

Parameters:
- query: Research query string
- background_context: Background context
- conversation_history: Conversation history

Returns: Observation string.

InputParserAgent

Module: src.agents.input_parser

Methods
parse_query

Parameters:
- query: Original query string

Returns: ParsedQuery with:
- original_query: Original query string
- improved_query: Refined query string
- research_mode: "iterative" or "deep"
- key_entities: List of key entities
- research_questions: List of research questions

Factory Functions

All agents have factory functions in src.agent_factory.agents.

Parameters:
- model: Optional Pydantic AI model. If None, uses get_model() from settings.

Returns: Agent instance.

See Also
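The calling convention for the agents documented above can be sketched with a self-contained stub. `KnowledgeGapOutput` mirrors the documented return fields; the agent body here is a stand-in, since the real implementation lives in `src.agents.knowledge_gap` and is created via the factory with a model:

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class KnowledgeGapOutput:
    """Mirrors the documented return type."""

    research_complete: bool
    outstanding_gaps: list[str] = field(default_factory=list)


class StubKnowledgeGapAgent:
    """Stand-in showing the evaluate() calling convention."""

    async def evaluate(
        self,
        query: str,
        background_context: str = "",
        conversation_history: str = "",
        iteration: int = 0,
        time_elapsed_minutes: float = 0.0,
        max_time_minutes: float = 10,
    ) -> KnowledgeGapOutput:
        # The real agent asks an LLM; the stub just stops when time runs out.
        if time_elapsed_minutes >= max_time_minutes:
            return KnowledgeGapOutput(research_complete=True)
        return KnowledgeGapOutput(
            research_complete=False,
            outstanding_gaps=[f"unanswered aspects of: {query}"],
        )


async def main() -> KnowledgeGapOutput:
    agent = StubKnowledgeGapAgent()
    return await agent.evaluate(query="metformin repurposing", iteration=1)


result = asyncio.run(main())
```

A research loop would call `evaluate` once per iteration and stop as soon as `research_complete` is true or a budget limit is hit.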
Agents API Reference

KnowledgeGapAgent

Module: src.agents.knowledge_gap

Methods

evaluate

Parameters:
- query: Research query string
- background_context: Background context for the query (default: "")
- conversation_history: History of actions, findings, and thoughts as string (default: "")
- iteration: Current iteration number (default: 0)
- time_elapsed_minutes: Elapsed time in minutes (default: 0.0)
- max_time_minutes: Maximum time limit in minutes (default: 10)

Returns: KnowledgeGapOutput with:
- research_complete: Boolean indicating if research is complete
- outstanding_gaps: List of remaining knowledge gaps

ToolSelectorAgent

Module: src.agents.tool_selector

Methods
select_tools

Parameters:
- gap: The knowledge gap to address
- query: Research query string
- background_context: Optional background context (default: "")
- conversation_history: History of actions, findings, and thoughts as string (default: "")

Returns: AgentSelectionPlan with a list of AgentTask objects.

WriterAgent

Module: src.agents.writer

Methods
write_report

Parameters:
- query: Research query string
- findings: Research findings to include in the report
- output_length: Optional description of desired output length (default: "")
- output_instructions: Optional additional instructions for report generation (default: "")

Returns: Markdown string with numbered citations.

LongWriterAgent

Module: src.agents.long_writer

Methods
write_next_section

Parameters:
- original_query: The original research query
- report_draft: Current report draft as string (all sections written so far)
- next_section_title: Title of the section to write
- next_section_draft: Draft content for the next section

Returns: LongWriterOutput with formatted section and references.

write_report

Parameters:
- query: Research query string
- report_title: Title of the report
- report_draft: Complete report draft

Returns: Final markdown report string.

ProofreaderAgent

Module: src.agents.proofreader

Methods
proofread

Parameters:
- query: Research query string
- report_title: Title of the report
- report_draft: Report draft to proofread

Returns: Polished markdown string.

ThinkingAgent

Module: src.agents.thinking

Methods
generate_observations

Parameters:
- query: Research query string
- background_context: Optional background context (default: "")
- conversation_history: History of actions, findings, and thoughts as string (default: "")
- iteration: Current iteration number (default: 1)

Returns: Observation string.

InputParserAgent

Module: src.agents.input_parser

Methods
parse

Parameters:
- query: Original query string

Returns: ParsedQuery with:
- original_query: Original query string
- improved_query: Refined query string
- research_mode: "iterative" or "deep"
- key_entities: List of key entities
- research_questions: List of research questions

Factory Functions

All agents have factory functions in src.agent_factory.agents.

Parameters:
- model: Optional Pydantic AI model. If None, uses get_model() from settings.
- oauth_token: Optional OAuth token from HuggingFace login (takes priority over env vars)

Returns: Agent instance.

See Also
Models API Reference

Evidence

Module: src.utils.models

Fields:
- citation: Citation information (title, URL, date, authors)
- content: Evidence text content
- relevance_score: Relevance score (0.0-1.0)
- metadata: Additional metadata dictionary

Citation

Module: src.utils.models

Fields:
- title: Article/trial title
- url: Source URL
- date: Publication date (optional)
- authors: List of authors (optional)

KnowledgeGapOutput
Module: src.utils.models

Fields:
- research_complete: Boolean indicating if research is complete
- outstanding_gaps: List of remaining knowledge gaps

AgentSelectionPlan

Module: src.utils.models

Fields:
- tasks: List of agent tasks to execute

AgentTask

Module: src.utils.models

Fields:
- agent_name: Name of agent to use
- query: Task query
- context: Additional context dictionary

ReportDraft
src.utils.modelstitle: Report title - sections: List of report sections - references: List of citationsReportSection¶
src.utils.modelstitle: Section title - content: Section content - order: Section order numberParsedQuery¶
src.utils.modelsoriginal_query: Original query string - improved_query: Refined query string - research_mode: Research mode ("iterative" or "deep") - key_entities: List of key entities - research_questions: List of research questionsConversation¶
src.utils.modelsiterations: List of iteration dataIterationData¶
src.utils.modelsiteration: Iteration number - observations: Generated observations - knowledge_gaps: Identified knowledge gaps - tool_calls: Tool calls made - findings: Findings from tools - thoughts: Agent thoughtsAgentEvent¶
src.utils.modelstype: Event type (e.g., "started", "search_complete", "complete") - iteration: Iteration number (optional) - data: Event data dictionaryBudgetStatus¶
src.utils.modelstokens_used: Tokens used so far - tokens_limit: Token limit - time_elapsed_seconds: Elapsed time in seconds - time_limit_seconds: Time limit in seconds - iterations: Current iteration count - iterations_limit: Iteration limitSee Also¶
Models API Reference¶
Evidence¶
Module: src.utils.models
Fields: - citation: Citation information (title, URL, date, authors) - content: Evidence text content - relevance: Relevance score (0.0-1.0) - metadata: Additional metadata dictionary
Citation¶
Module: src.utils.models
Fields: - source: Source name (e.g., "pubmed", "clinicaltrials", "europepmc", "web", "rag") - title: Article/trial title - url: Source URL - date: Publication date (YYYY-MM-DD or "Unknown") - authors: List of authors (optional)
KnowledgeGapOutput¶
Module: src.utils.models
Fields: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps
AgentSelectionPlan¶
Module: src.utils.models
Fields: - tasks: List of agent tasks to execute
AgentTask¶
Module: src.utils.models
Fields: - gap: The knowledge gap being addressed (optional) - agent: Name of agent to use - query: The specific query for the agent - entity_website: The website of the entity being researched, if known (optional)
ReportDraft¶
Module: src.utils.models
Fields: - sections: List of report sections
ReportSection¶
Module: src.utils.models
Fields: - section_title: The title of the section - section_content: The content of the section
ParsedQuery¶
Module: src.utils.models
Fields: - original_query: Original query string - improved_query: Refined query string - research_mode: Research mode ("iterative" or "deep") - key_entities: List of key entities - research_questions: List of research questions
Conversation¶
Module: src.utils.models
Fields: - history: List of iteration data
IterationData¶
Module: src.utils.models
Fields: - gap: The gap addressed in the iteration - tool_calls: The tool calls made - findings: The findings collected from tool calls - thought: The thinking done to reflect on the success of the iteration and next steps
AgentEvent¶
Module: src.utils.models
Fields: - type: Event type (e.g., "started", "search_complete", "complete") - iteration: Iteration number (optional) - data: Event data dictionary
BudgetStatus¶
Module: src.utils.models
Fields: - tokens_used: Total tokens used - tokens_limit: Token budget limit - time_elapsed_seconds: Time elapsed in seconds - time_limit_seconds: Time budget limit (default: 600.0 seconds / 10 minutes) - iterations: Number of iterations completed - iterations_limit: Maximum iterations (default: 10) - iteration_tokens: Tokens used per iteration (iteration number -> token count)
See Also¶
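To make the BudgetStatus shape concrete, here is an illustrative stand-in dataclass (field names follow the reference above; the `exceeded()` helper and its any-limit-hit rule are assumptions, not the library's API):

```python
from dataclasses import dataclass, field

# Illustrative stand-in for src.utils.models.BudgetStatus; defaults
# mirror the documented 600s / 10-iteration budget.
@dataclass
class BudgetStatus:
    tokens_used: int = 0
    tokens_limit: int = 100000
    time_elapsed_seconds: float = 0.0
    time_limit_seconds: float = 600.0       # 10 minutes
    iterations: int = 0
    iterations_limit: int = 10
    iteration_tokens: dict[int, int] = field(default_factory=dict)

    def exceeded(self) -> bool:
        # Treat the budget as exhausted when any single limit is hit.
        return (
            self.tokens_used >= self.tokens_limit
            or self.time_elapsed_seconds >= self.time_limit_seconds
            or self.iterations >= self.iterations_limit
        )

status = BudgetStatus(tokens_used=120000, iteration_tokens={1: 120000})
print(status.exceeded())  # True
```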
Orchestrators API Reference¶
IterativeResearchFlow¶
src.orchestrator.research_flowMethods¶
run¶async def run(
- self,
- query: str,
- background_context: str = "",
- max_iterations: int | None = None,
- max_time_minutes: float | None = None,
- token_budget: int | None = None
-) -> AsyncGenerator[AgentEvent, None]
-Parameters: - query: Research query string - background_context: Background context (default: "") - max_iterations: Maximum iterations (default: from settings) - max_time_minutes: Maximum time in minutes (default: from settings) - token_budget: Token budget (default: from settings)
-Yields: AgentEvent objects for: - started: Research started - search_complete: Search completed - judge_complete: Evidence evaluation completed - synthesizing: Generating report - complete: Research completed - error: Error occurred
DeepResearchFlow¶
src.orchestrator.research_flowMethods¶
run¶async def run(
- self,
- query: str,
- background_context: str = "",
- max_iterations_per_section: int | None = None,
- max_time_minutes: float | None = None,
- token_budget: int | None = None
-) -> AsyncGenerator[AgentEvent, None]
-Parameters: - query: Research query string - background_context: Background context (default: "") - max_iterations_per_section: Maximum iterations per section (default: from settings) - max_time_minutes: Maximum time in minutes (default: from settings) - token_budget: Token budget (default: from settings)
-Yields: AgentEvent objects for: - started: Research started - planning: Creating research plan - looping: Running parallel research loops - synthesizing: Synthesizing results - complete: Research completed - error: Error occurred
GraphOrchestrator¶
src.orchestrator.graph_orchestratorMethods¶
run¶async def run(
- self,
- query: str,
- research_mode: str = "auto",
- use_graph: bool = True
-) -> AsyncGenerator[AgentEvent, None]
-Parameters: - query: Research query string - research_mode: Research mode ("iterative", "deep", or "auto") - use_graph: Whether to use graph execution (default: True)
-Yields: AgentEvent objects during graph execution.
Orchestrator Factory¶
src.orchestrator_factoryFunctions¶
create_orchestrator¶def create_orchestrator(
- search_handler: SearchHandlerProtocol,
- judge_handler: JudgeHandlerProtocol,
- config: dict[str, Any],
- mode: str | None = None
-) -> Any
-Parameters: - search_handler: Search handler protocol implementation - judge_handler: Judge handler protocol implementation - config: Configuration dictionary - mode: Orchestrator mode ("simple", "advanced", "magentic", or None for auto-detect)
-Raises: ValueError: If requirements not met
-Modes: - "simple": Legacy orchestrator - "advanced" or "magentic": Magentic orchestrator (requires OpenAI API key) - None: Auto-detect based on API key availability
MagenticOrchestrator¶
src.orchestrator_magenticMethods¶
run¶async def run(
- self,
- query: str,
- max_rounds: int = 15,
- max_stalls: int = 3
-) -> AsyncGenerator[AgentEvent, None]
-Parameters: - query: Research query string - max_rounds: Maximum rounds (default: 15) - max_stalls: Maximum stalls before reset (default: 3)
-Yields: AgentEvent objects converted from Magentic events.
-Requires: agent-framework-core package - OpenAI API key
See Also¶
Orchestrators API Reference¶
IterativeResearchFlow¶
src.orchestrator.research_flowMethods¶
run¶
Parameters: - query: Research query string - background_context: Background context (default: "") - output_length: Optional description of desired output length (default: "") - output_instructions: Optional additional instructions for report generation (default: "")
Note: max_iterations, max_time_minutes, and token_budget are constructor parameters, not run() parameters.
DeepResearchFlow¶
src.orchestrator.research_flowMethods¶
run¶
Parameters: - query: Research query string
Note: max_iterations_per_section, max_time_minutes, and token_budget are constructor parameters, not run() parameters.
GraphOrchestrator¶
src.orchestrator.graph_orchestratorMethods¶
run¶
Parameters: - query: Research query string
Yields: AgentEvent objects during graph execution.
Note: research_mode and use_graph are constructor parameters, not run() parameters.
Orchestrator Factory¶
src.orchestrator_factoryFunctions¶
create_orchestrator¶
Parameters: - search_handler: Search handler protocol implementation (optional, required for simple mode) - judge_handler: Judge handler protocol implementation (optional, required for simple mode) - config: Configuration object (optional) - mode: Orchestrator mode ("simple", "advanced", "magentic", "iterative", "deep", "auto", or None for auto-detect) - oauth_token: Optional OAuth token from HuggingFace login (takes priority over env vars)
Raises: ValueError: If requirements not met
Modes: - "simple": Legacy orchestrator - "advanced" or "magentic": Magentic orchestrator (requires OpenAI API key) - None: Auto-detect based on API key availability
MagenticOrchestrator¶
src.orchestrator_magenticMethods¶
run¶
Parameters: - query: Research query string
Yields: AgentEvent objects converted from Magentic events.
Note: max_rounds and max_stalls are constructor parameters, not run() parameters.
Requires: agent-framework-core package - OpenAI API key
See Also¶
Services API Reference¶
EmbeddingService¶
src.services.embeddingsMethods¶
embed¶
Parameters: - text: Text to embed
embed_batch¶
Parameters: - texts: List of texts to embed
similarity¶
Parameters: - text1: First text - text2: Second text
find_duplicates¶
async def find_duplicates(
- self,
- texts: list[str],
- threshold: float = 0.85
-) -> list[tuple[int, int]]
-Parameters: - texts: List of texts to check - threshold: Similarity threshold (default: 0.85)
Factory Function¶
get_embedding_service¶LlamaIndexRAGService¶
src.services.ragMethods¶
ingest_evidence¶
Parameters: - evidence: List of Evidence objects to ingest
retrieve¶
Parameters: - query: Search query string - top_k: Number of top results to return (default: 5)
query¶
Parameters: - query: Search query string - top_k: Number of top results to return (default: 5)
Factory Function¶
get_rag_service¶StatisticalAnalyzer¶
src.services.statistical_analyzerMethods¶
analyze¶async def analyze(
- self,
- hypothesis: str,
- evidence: list[Evidence],
- data_description: str | None = None
-) -> AnalysisResult
-Parameters: - hypothesis: Hypothesis to analyze - evidence: List of Evidence objects - data_description: Optional data description
-Returns: AnalysisResult with: - verdict: SUPPORTED, REFUTED, or INCONCLUSIVE - code: Generated analysis code - output: Execution output - error: Error message if execution failed
See Also¶
Services API Reference¶
EmbeddingService¶
src.services.embeddingsMethods¶
embed¶
Parameters: - text: Text to embed
embed_batch¶
Parameters: - texts: List of texts to embed
similarity¶
Parameters: - text1: First text - text2: Second text
find_duplicates¶
async def find_duplicates(
+ self,
+ texts: list[str],
+ threshold: float = 0.85
+) -> list[tuple[int, int]]
+Parameters: - texts: List of texts to check - threshold: Similarity threshold (default: 0.85)
add_evidence¶
+async def add_evidence(
+ self,
+ evidence_id: str,
+ content: str,
+ metadata: dict[str, Any]
+) -> None
+Parameters: - evidence_id: Unique identifier for the evidence - content: Evidence text content - metadata: Additional metadata dictionary
search_similar¶
Parameters: - query: Search query string - n_results: Number of results to return (default: 5)
Returns: Results with id, content, metadata, and distance keys.
deduplicate¶
+async def deduplicate(
+ self,
+ new_evidence: list[Evidence],
+ threshold: float = 0.9
+) -> list[Evidence]
+Parameters: - new_evidence: List of evidence items to deduplicate - threshold: Similarity threshold (default: 0.9, where 0.9 = 90% similar is duplicate)
Factory Function¶
get_embedding_service¶LlamaIndexRAGService¶
src.services.ragMethods¶
ingest_evidence¶
Parameters: - evidence_list: List of Evidence objects to ingest
retrieve¶
Parameters: - query: Search query string - top_k: Number of top results to return (defaults to similarity_top_k from constructor)
Returns: Results with text, score, and metadata keys.
query¶
Parameters: - query_str: Query string - top_k: Number of results to use (defaults to similarity_top_k from constructor)
Raises: ConfigurationError: If no LLM API key is available for query synthesis
ingest_documents¶
Parameters: - documents: List of LlamaIndex Document objects
clear_collection¶
Factory Function¶
get_rag_service¶def get_rag_service(
+ collection_name: str = "deepcritical_evidence",
+ oauth_token: str | None = None,
+ **kwargs: Any
+) -> LlamaIndexRAGService
+Parameters: - collection_name: Name of the ChromaDB collection (default: "deepcritical_evidence") - oauth_token: Optional OAuth token from HuggingFace login (takes priority over env vars) - **kwargs: Additional arguments for LlamaIndexRAGService (e.g., use_openai_embeddings=False)
StatisticalAnalyzer¶
src.services.statistical_analyzerMethods¶
analyze¶async def analyze(
+ self,
+ query: str,
+ evidence: list[Evidence],
+ hypothesis: dict[str, Any] | None = None
+) -> AnalysisResult
+Parameters: - query: The research question - evidence: List of Evidence objects to analyze - hypothesis: Optional hypothesis dict with drug, target, pathway, effect, confidence keys
+Returns: AnalysisResult with: - verdict: SUPPORTED, REFUTED, or INCONCLUSIVE - confidence: Confidence in verdict (0.0-1.0) - statistical_evidence: Summary of statistical findings - code_generated: Python code that was executed - execution_output: Output from code execution - key_takeaways: Key takeaways from analysis - limitations: List of limitations
See Also¶
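To show what consuming an AnalysisResult could look like, here is an illustrative stand-in dataclass (fields follow the reference above; the `summarize()` helper is a hypothetical addition, not part of the library):

```python
from dataclasses import dataclass, field

# Illustrative stand-in for the AnalysisResult described above;
# verdict values come from the reference text.
@dataclass
class AnalysisResult:
    verdict: str                      # "SUPPORTED" | "REFUTED" | "INCONCLUSIVE"
    confidence: float                 # 0.0-1.0
    statistical_evidence: str = ""
    code_generated: str = ""
    execution_output: str = ""
    key_takeaways: list[str] = field(default_factory=list)
    limitations: list[str] = field(default_factory=list)

def summarize(result: AnalysisResult) -> str:
    # Collapse a result into a one-line summary for logs or a UI.
    return f"{result.verdict} ({result.confidence:.0%} confidence)"

r = AnalysisResult(verdict="SUPPORTED", confidence=0.82,
                   key_takeaways=["Effect size consistent across trials"])
print(summarize(r))  # SUPPORTED (82% confidence)
```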
Tools API Reference¶
SearchTool Protocol¶
All search tools implement the SearchTool protocol:
class SearchTool(Protocol):
- @property
- def name(self) -> str: ...
-
- async def search(
- self,
- query: str,
- max_results: int = 10
- ) -> list[Evidence]: ...
-PubMedTool¶
src.tools.pubmedProperties¶
name¶"pubmed"Methods¶
search¶
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: Evidence objects with PubMed articles.
Raises: - SearchError: If search fails - RateLimitError: If rate limit is exceeded
ClinicalTrialsTool¶
src.tools.clinicaltrialsProperties¶
name¶"clinicaltrials"Methods¶
search¶
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: Evidence objects with clinical trials.
Raises: SearchError: If search fails
EuropePMCTool¶
src.tools.europepmcProperties¶
name¶"europepmc"Methods¶
search¶
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: Evidence objects with articles/preprints. Covers both preprints (marked [PREPRINT - Not peer-reviewed]) and peer-reviewed articles.
Raises: SearchError: If search fails
RAGTool¶
src.tools.rag_toolProperties¶
name¶"rag"Methods¶
search¶
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: Evidence objects from collected evidence.
SearchHandler¶
src.tools.search_handlerMethods¶
search¶async def search(
- self,
- query: str,
- tools: list[SearchTool] | None = None,
- max_results_per_tool: int = 10
-) -> SearchResult
-Parameters: - query: Search query string - tools: List of tools to use (default: all available tools) - max_results_per_tool: Maximum results per tool (default: 10)
-Returns: SearchResult with: - evidence: Aggregated list of evidence - tool_results: Results per tool - total_count: Total number of results
-Note: Uses asyncio.gather() for parallel execution. Handles tool failures gracefully.
See Also¶
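A structural `Protocol` like SearchTool lets any class with a `name` property and an async `search` method act as a tool. A self-contained sketch (the `InMemoryTool` and its toy corpus are hypothetical; only the protocol shape comes from the reference above):

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol, runtime_checkable

# Simplified Evidence stand-in for this sketch.
@dataclass
class Evidence:
    content: str
    source: str

@runtime_checkable
class SearchTool(Protocol):
    @property
    def name(self) -> str: ...
    async def search(self, query: str, max_results: int = 10) -> list[Evidence]: ...

class InMemoryTool:
    """A toy tool that satisfies the SearchTool protocol structurally."""

    @property
    def name(self) -> str:
        return "memory"

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        docs = ["metformin reduces ...", "unrelated note"]
        hits = [d for d in docs if query.split()[0] in d]
        return [Evidence(content=d, source=self.name) for d in hits[:max_results]]

tool = InMemoryTool()
print(isinstance(tool, SearchTool))                     # True
print(asyncio.run(tool.search("metformin"))[0].source)  # memory
```

No inheritance is required: the `runtime_checkable` decorator makes the `isinstance` check verify member presence at runtime.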
Tools API Reference¶
SearchTool Protocol¶
All search tools implement the SearchTool protocol:
class SearchTool(Protocol):
+ @property
+ def name(self) -> str: ...
+
+ async def search(
+ self,
+ query: str,
+ max_results: int = 10
+ ) -> list[Evidence]: ...
+PubMedTool¶
src.tools.pubmedProperties¶
name¶"pubmed"Methods¶
search¶
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: Evidence objects with PubMed articles.
Raises: - SearchError: If search fails (timeout, HTTP error, XML parsing error) - RateLimitError: If rate limit is exceeded (429 status code)
ClinicalTrialsTool¶
src.tools.clinicaltrialsProperties¶
name¶"clinicaltrials"Methods¶
search¶
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: Evidence objects with clinical trials.
Note: Uses the requests library (NOT httpx - the site's WAF blocks httpx). Runs in a thread pool for async compatibility.
Raises: SearchError: If search fails (HTTP error, request exception)
EuropePMCTool¶
src.tools.europepmcProperties¶
name¶"europepmc"Methods¶
search¶
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: Evidence objects with articles/preprints. Covers both preprints (marked [PREPRINT - Not peer-reviewed]) and peer-reviewed articles. Builds URLs from DOI or PMID.
Raises: SearchError: If search fails (HTTP error, connection error)
RAGTool¶
src.tools.rag_toolInitialization¶
def __init__(
+ self,
+ rag_service: LlamaIndexRAGService | None = None,
+ oauth_token: str | None = None
+) -> None
+Parameters: - rag_service: Optional RAG service instance. If None, will be lazy-initialized. - oauth_token: Optional OAuth token from HuggingFace login (for RAG LLM)
Properties¶
name¶"rag"Methods¶
search¶
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: Evidence objects from collected evidence.
Raises: ConfigurationError: If RAG service is unavailable
Note: Wraps LlamaIndexRAGService. Returns Evidence from RAG results.
SearchHandler¶
src.tools.search_handlerInitialization¶
def __init__(
+ self,
+ tools: list[SearchTool],
+ timeout: float = 30.0,
+ include_rag: bool = False,
+ auto_ingest_to_rag: bool = True,
+ oauth_token: str | None = None
+) -> None
+Parameters: - tools: List of search tools to use - timeout: Timeout for each search in seconds (default: 30.0) - include_rag: Whether to include RAG tool in searches (default: False) - auto_ingest_to_rag: Whether to automatically ingest results into RAG (default: True) - oauth_token: Optional OAuth token from HuggingFace login (for RAG LLM)
Methods¶
execute¶
Parameters: - query: Search query string - max_results_per_tool: Maximum results per tool (default: 10)
Returns: SearchResult with: - query: The search query - evidence: Aggregated list of evidence - sources_searched: List of source names searched - total_found: Total number of results - errors: List of error messages from failed tools
Raises: SearchError: If search times out
Note: Uses asyncio.gather() for parallel execution. Handles tool failures gracefully (returns errors in SearchResult.errors). Automatically ingests evidence into RAG if enabled.
See Also¶
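The fan-out-and-tolerate-failures behavior described above can be sketched with `asyncio.gather(return_exceptions=True)` (the tool classes here are plain stand-ins, not the real `src.tools` types):

```python
import asyncio

# Sketch of the parallel fan-out: run every tool's search concurrently
# and convert failures into error strings instead of failing the whole search.
async def execute(query: str, tools: list, max_results_per_tool: int = 10):
    results = await asyncio.gather(
        *(t.search(query, max_results_per_tool) for t in tools),
        return_exceptions=True,   # keep going when one tool fails
    )
    evidence, errors = [], []
    for tool, result in zip(tools, results):
        if isinstance(result, Exception):
            errors.append(f"{tool.name}: {result}")
        else:
            evidence.extend(result)
    return {"evidence": evidence, "errors": errors}

class OkTool:
    name = "ok"
    async def search(self, q, n): return [f"hit for {q}"]

class BadTool:
    name = "bad"
    async def search(self, q, n): raise RuntimeError("timeout")

out = asyncio.run(execute("aspirin", [OkTool(), BadTool()]))
print(out["evidence"], out["errors"])  # ['hit for aspirin'] ['bad: timeout']
```

`return_exceptions=True` is the key design choice: one slow or broken source degrades the result set instead of aborting the whole search round.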
Agents Architecture¶
Agent Pattern¶
Each agent is built on the Pydantic AI Agent class with the following structure:
- Constructor: __init__(model: Any | None = None)
- Async methods (e.g., async def evaluate(), async def write_report())
- Factory function: def create_agent_name(model: Any | None = None) -> AgentName
Model Initialization¶
Agents fall back to get_model() from src/agent_factory/judges.py if no model is provided. This supports:
- Provider selection via LLM_PROVIDER in settings.
Error Handling¶
On failure, agents return safe defaults, e.g. KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...]).
Input Validation¶
Output Types¶
Structured output types are defined in src/utils/models.py:
- KnowledgeGapOutput: Research completeness evaluation
- AgentSelectionPlan: Tool selection plan
- ReportDraft: Long-form report structure
- ParsedQuery: Query parsing and mode detection
Writer-style agents return str directly.
Agent Types¶
Knowledge Gap Agent¶
Module: src/agents/knowledge_gap.py
Output: KnowledgeGapOutput with: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps
Method: async def evaluate(query, background_context, conversation_history, iteration, time_elapsed_minutes, max_time_minutes) -> KnowledgeGapOutput
Tool Selector Agent¶
Module: src/agents/tool_selector.py
Output: AgentSelectionPlan with list of AgentTask objects.
Available agents: - WebSearchAgent: General web search for fresh information - SiteCrawlerAgent: Research specific entities/companies - RAGAgent: Semantic search within collected evidence
Writer Agent¶
Module: src/agents/writer.py
Method: async def write_report(query, findings, output_length, output_instructions) -> str
Long Writer Agent¶
Module: src/agents/long_writer.py
Works with ReportDraft models.
Methods: - async def write_next_section(query, draft, section_title, section_content) -> LongWriterOutput - async def write_report(query, report_title, report_draft) -> str
Proofreader Agent¶
Module: src/agents/proofreader.py
Input: ReportDraft Output: Polished markdown string
Method: async def proofread(query, report_title, report_draft) -> str
Thinking Agent¶
Module: src/agents/thinking.py
Method: async def generate_observations(query, background_context, conversation_history) -> str
Input Parser Agent¶
Module: src/agents/input_parser.py
Output: ParsedQuery with: - original_query: Original query string - improved_query: Refined query string - research_mode: "iterative" or "deep" - key_entities: List of key entities - research_questions: List of research questions
Factory Functions¶
Defined in src/agent_factory/agents.py. All factories: - Use get_model() if no model provided - Raise ConfigurationError if creation fails - Log agent creation
See Also¶
Agents Architecture¶
Agent Pattern¶
Pydantic AI Agents¶
Each agent is built on the Pydantic AI Agent class with the following structure:
- Constructor: __init__(model: Any | None = None)
- Async methods (e.g., async def evaluate(), async def write_report())
- Factory function: def create_agent_name(model: Any | None = None, oauth_token: str | None = None) -> AgentName
Factories accept an oauth_token parameter for HuggingFace authentication, which takes priority over environment variables.
Model Initialization¶
Agents fall back to get_model() from src/agent_factory/judges.py if no model is provided. This supports:
- Provider selection via LLM_PROVIDER in settings.
Error Handling¶
On failure, agents return safe defaults, e.g. KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...]).
Input Validation¶
Output Types¶
Structured output types are defined in src/utils/models.py:
- KnowledgeGapOutput: Research completeness evaluation
- AgentSelectionPlan: Tool selection plan
- ReportDraft: Long-form report structure
- ParsedQuery: Query parsing and mode detection
Writer-style agents return str directly.
Agent Types¶
Knowledge Gap Agent¶
src/agents/knowledge_gap.pyKnowledgeGapOutput with: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gapsasync def evaluate(query, background_context, conversation_history, iteration, time_elapsed_minutes, max_time_minutes) -> KnowledgeGapOutputTool Selector Agent¶
src/agents/tool_selector.pyAgentSelectionPlan with list of AgentTask objects.WebSearchAgent: General web search for fresh information - SiteCrawlerAgent: Research specific entities/companies - RAGAgent: Semantic search within collected evidenceWriter Agent¶
src/agents/writer.pyasync def write_report(query, findings, output_length, output_instructions) -> strLong Writer Agent¶
src/agents/long_writer.pyReportDraft models.async def write_next_section(query, draft, section_title, section_content) -> LongWriterOutput - async def write_report(query, report_title, report_draft) -> strProofreader Agent¶
src/agents/proofreader.pyReportDraft Output: Polished markdown stringasync def proofread(query, report_title, report_draft) -> strThinking Agent¶
src/agents/thinking.pyasync def generate_observations(query, background_context, conversation_history) -> strInput Parser Agent¶
src/agents/input_parser.pyParsedQuery with: - original_query: Original query string - improved_query: Refined query string - research_mode: "iterative" or "deep" - key_entities: List of key entities - research_questions: List of research questionsMagentic Agents¶
These agents extend the BaseAgent pattern from agent-framework and are used exclusively with MagenticOrchestrator:
Hypothesis Agent¶
Module: src/agents/hypothesis_agent.py
Extends BaseAgent from agent-framework.
Method: async def run(messages, thread, **kwargs) -> AgentRunResponse
Features: - Uses an Agent with HypothesisAssessment output type - Accesses shared evidence_store for evidence - Uses embedding service for diverse evidence selection (MMR algorithm) - Stores hypotheses in shared context
Search Agent¶
Module: src/agents/search_agent.py
Wraps SearchHandler as an agent for the Magentic orchestrator.
Extends BaseAgent from agent-framework.
Method: async def run(messages, thread, **kwargs) -> AgentRunResponse
Features: - Uses SearchHandlerProtocol - Deduplicates evidence using embedding service - Searches for semantically related evidence - Updates shared evidence store
Analysis Agent¶
Module: src/agents/analysis_agent.py
Extends BaseAgent from agent-framework.
Method: async def run(messages, thread, **kwargs) -> AgentRunResponse
Features: - Uses the StatisticalAnalyzer service - Analyzes evidence and hypotheses - Returns verdict (SUPPORTED/REFUTED/INCONCLUSIVE) - Stores analysis results in shared context
Report Agent (Magentic)¶
Module: src/agents/report_agent.py
Extends BaseAgent from agent-framework.
Method: async def run(messages, thread, **kwargs) -> AgentRunResponse
Features: - Uses an Agent with ResearchReport output type - Accesses shared evidence store and hypotheses - Validates citations before returning - Formats report as markdown
Judge Agent¶
Module: src/agents/judge_agent.py
Extends BaseAgent from agent-framework.
Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse - async def run_stream(messages, thread, **kwargs) -> AsyncIterable[AgentRunResponseUpdate]
Features: - Uses JudgeHandlerProtocol - Accesses shared evidence store - Returns JudgeAssessment with sufficient flag, confidence, and recommendation
Agent Patterns¶
1. Pydantic AI Agents (Traditional Pattern)¶
These agents use the Pydantic AI Agent class directly and are used in iterative and deep research flows:
- Core: Agent(model, output_type, system_prompt)
- Constructor: __init__(model: Any | None = None)
- Async methods (e.g., async def evaluate(), async def write_report())
- Agents: KnowledgeGapAgent, ToolSelectorAgent, WriterAgent, LongWriterAgent, ProofreaderAgent, ThinkingAgent, InputParserAgent
2. Magentic Agents (Agent-Framework Pattern)¶
These agents extend the BaseAgent class from agent-framework and are used in the Magentic orchestrator:
- Core: BaseAgent from agent-framework with async def run() method
- Constructor: __init__(evidence_store, embedding_service, ...)
- Method: async def run(messages, thread, **kwargs) -> AgentRunResponse
- Agents: HypothesisAgent, SearchAgent, AnalysisAgent, ReportAgent, JudgeAgent
These agents are used by MagenticOrchestrator and follow the agent-framework protocol for multi-agent coordination.
Factory Functions¶
Defined in src/agent_factory/agents.py. All factories: - Use get_model() if no model provided - Accept oauth_token parameter for HuggingFace authentication - Raise ConfigurationError if creation fails - Log agent creation
See Also¶
Graph Orchestration Architecture¶
Overview¶
Graph Structure¶
Nodes¶
Agent nodes: KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent
Edges¶
Graph Patterns¶
Iterative Research Graph¶
[Input] → [Thinking] → [Knowledge Gap] → [Decision: Complete?]
- ↓ No ↓ Yes
- [Tool Selector] [Writer]
- ↓
- [Execute Tools] → [Loop Back]
-Deep Research Graph¶
[Input] → [Planner] → [Parallel Iterative Loops] → [Synthesizer]
- ↓ ↓ ↓
- [Loop1] [Loop2] [Loop3]
-State Management¶
WorkflowState using ContextVar for thread-safe isolation:
Execution Flow¶
Uses asyncio.gather() for parallel nodes.
Conditional Routing¶
Example: research_complete → writer, else → tool selector
Parallel Execution¶
Budget Enforcement¶
Error Handling¶
Backward Compatibility¶
- USE_GRAPH_EXECUTION=true: Use graph-based execution
- USE_GRAPH_EXECUTION=false: Use agent chain execution (existing)
Graph Orchestration Architecture¶
Graph Patterns¶
Iterative Research Graph¶
[Input] → [Thinking] → [Knowledge Gap] → [Decision: Complete?]
- ↓ No ↓ Yes
- [Tool Selector] [Writer]
- ↓
- [Execute Tools] → [Loop Back]
-Deep Research Graph¶
[Input] → [Planner] → [Parallel Iterative Loops] → [Synthesizer]
- ↓ ↓ ↓
- [Loop1] [Loop2] [Loop3]
-Deep Research¶
+
Graph Orchestration Architecture¶
Overview¶
Graph Patterns¶
Iterative Research Graph¶
[Input] → [Thinking] → [Knowledge Gap] → [Decision: Complete?]
+ ↓ No ↓ Yes
+ [Tool Selector] [Writer]
+ ↓
+ [Execute Tools] → [Loop Back]
+Flow: thinking → knowledge_gap → continue_decision → tool_selector/writer → execute_tools → (loop back to thinking)
+Nodes: - execute_tools: State node that uses search_handler to execute searches and add evidence to workflow state - continue_decision: Decision node that routes based on research_complete flag from KnowledgeGapOutput
Deep Research Graph¶
[Input] → [Planner] → [Store Plan] → [Parallel Loops] → [Collect Drafts] → [Synthesizer]
+ ↓ ↓ ↓
+ [Loop1] [Loop2] [Loop3]
+Flow: planner → store_plan → parallel_loops → collect_drafts → synthesizer
+Nodes: - planner: Agent node that creates ReportPlan with report outline - store_plan: State node that stores ReportPlan in context for parallel loops - parallel_loops: Parallel node that executes IterativeResearchFlow instances for each section - collect_drafts: State node that collects section drafts from parallel loops - synthesizer: Agent node that calls LongWriterAgent.write_report() directly with ReportDraft
Deep Research¶
sequenceDiagram
actor User
participant GraphOrchestrator
@@ -41,7 +41,7 @@
end
GraphOrchestrator->>User: AsyncGenerator[AgentEvent]
-Iterative Research¶
sequenceDiagram
+Iterative Research¶
sequenceDiagram
participant IterativeFlow
participant ThinkingAgent
participant KnowledgeGapAgent
@@ -72,4 +72,4 @@
IterativeFlow->>JudgeHandler: assess_evidence()
JudgeHandler-->>IterativeFlow: should_continue
end
- endGraph Structure¶
Nodes¶
Agent nodes: KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent
Edges¶
State Management¶
WorkflowState using ContextVar for thread-safe isolation:
Execution Flow¶
Uses asyncio.gather() for parallel nodes.
Conditional Routing¶
Example: research_complete → writer, else → tool selector
Parallel Execution¶
Budget Enforcement¶
Error Handling¶
Backward Compatibility¶
- USE_GRAPH_EXECUTION=true: Use graph-based execution
- USE_GRAPH_EXECUTION=false: Use agent chain execution (existing)
See Also¶
Graph Structure¶
Nodes¶
Agent nodes: KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent
Edges¶
State Management¶
WorkflowState using ContextVar for thread-safe isolation:
Execution Flow¶
1. Build the graph via create_iterative_graph() or create_deep_graph()
2. Validate it with ResearchGraph.validate_structure()
3. Execute via GraphOrchestrator._execute_graph()
4. Agent nodes call agent.run() with transformed input
5. State nodes apply their state_updater function
6. Decision nodes call their decision_function to get the next node ID
7. Parallel nodes run with asyncio.gather()
8. State updates go through GraphExecutionContext.update_state()
9. AgentEvent objects are emitted during execution for the UI
GraphExecutionContext¶
The GraphExecutionContext class manages execution state during graph traversal:
- Holds the WorkflowState instance
- Holds a BudgetTracker instance for budget enforcement
Methods: - set_node_result(node_id, result): Store result from node execution - get_node_result(node_id): Retrieve stored result - has_visited(node_id): Check if node was visited - mark_visited(node_id): Mark node as visited - update_state(updater, data): Update workflow state
Conditional Routing¶
Example: research_complete → writer, else → tool selector
Parallel Execution¶
Budget Enforcement¶
Error Handling¶
Backward Compatibility¶
- USE_GRAPH_EXECUTION=true: Use graph-based execution
- USE_GRAPH_EXECUTION=false: Use agent chain execution (existing)
See Also¶
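The conditional-routing rule above (`research_complete → writer, else → tool selector`) amounts to a decision function plus a static edge table. A minimal sketch, where the dict-based graph is illustrative rather than the real ResearchGraph type:

```python
# Decision node: map the KnowledgeGapOutput.research_complete flag
# to the next node ID, exactly as the routing rule states.
def continue_decision(research_complete: bool) -> str:
    return "writer" if research_complete else "tool_selector"

# Illustrative static edges of the iterative graph.
graph = {
    "thinking": "knowledge_gap",
    "knowledge_gap": "continue_decision",
    "tool_selector": "execute_tools",
    "execute_tools": "thinking",   # loop back
}

# Walk one pass of the loop where research is not yet complete.
node, path = "thinking", []
while node != "continue_decision":
    path.append(node)
    node = graph[node]
path.append(continue_decision(research_complete=False))
print(" -> ".join(path))  # thinking -> knowledge_gap -> tool_selector
```

Keeping routing in a pure function like this makes the branch trivially unit-testable, independent of any agent or LLM call.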
Middleware Architecture¶
State Management¶
WorkflowState¶
Module: src/middleware/state_machine.py
Uses ContextVar for thread-safe isolation.
Fields: - evidence: list[Evidence]: Collected evidence from searches - conversation: Conversation: Iteration history (gaps, tool calls, findings, thoughts) - embedding_service: Any: Embedding service for semantic search
Methods: - add_evidence(evidence: Evidence): Adds evidence with URL-based deduplication - async search_related(query: str, top_k: int = 5) -> list[Evidence]: Semantic search
Workflow Manager¶
Module: src/middleware/workflow_manager.py
Methods: - add_loop(loop: ResearchLoop): Add a research loop to manage - async run_loops_parallel() -> list[ResearchLoop]: Run all loops in parallel - update_loop_status(loop_id: str, status: str): Update loop status - sync_loop_evidence_to_state(): Synchronize evidence from loops to global state
Features: - Uses asyncio.gather() for parallel execution - Handles errors per loop (doesn't fail all if one fails) - Tracks loop status: pending, running, completed, failed, cancelled - Evidence deduplication across parallel loops
from src.middleware.workflow_manager import WorkflowManager
-
-manager = WorkflowManager()
-manager.add_loop(loop1)
-manager.add_loop(loop2)
-completed_loops = await manager.run_loops_parallel()
-Budget Tracker¶
Module: src/middleware/budget_tracker.py
Methods: - create_budget(token_limit, time_limit_seconds, iterations_limit) -> BudgetStatus - add_tokens(tokens: int): Add token usage - start_timer(): Start time tracking - update_timer(): Update elapsed time - increment_iteration(): Increment iteration count - check_budget() -> BudgetStatus: Check current budget status - can_continue() -> bool: Check if research can continue
Token estimation: - estimate_tokens(text: str) -> int: ~4 chars per token - estimate_llm_call_tokens(prompt: str, response: str) -> int: Estimate LLM call tokens
from src.middleware.budget_tracker import BudgetTracker
-
-tracker = BudgetTracker()
-budget = tracker.create_budget(
- token_limit=100000,
- time_limit_seconds=600,
- iterations_limit=10
-)
-tracker.start_timer()
-# ... research operations ...
-if not tracker.can_continue():
- # Budget exceeded, stop research
- pass
-Models¶
src/utils/models.py:
- IterationData: Data for a single iteration
- Conversation: Conversation history with iterations
- ResearchLoop: Research loop state and configuration
- BudgetStatus: Current budget status
Thread Safety¶
ContextVar for thread-safe isolation:
See Also¶
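The ContextVar-based isolation described above can be demonstrated with the stdlib alone: each asyncio task runs in a copy of the context, so parallel loops never see each other's state. The variable and function names here are illustrative, not the real middleware API:

```python
import asyncio
from contextvars import ContextVar

# Task-local stand-in for the WorkflowState held by the middleware.
workflow_state: ContextVar[dict] = ContextVar("workflow_state")

async def research_loop(loop_id: str) -> int:
    workflow_state.set({"evidence": []})          # task-local state
    workflow_state.get()["evidence"].append(loop_id)
    await asyncio.sleep(0)                        # yield to sibling loops
    return len(workflow_state.get()["evidence"])  # unaffected by siblings

async def main() -> list[int]:
    # Each gathered task gets its own context copy, hence its own state.
    return await asyncio.gather(*(research_loop(f"loop{i}") for i in range(3)))

print(asyncio.run(main()))  # [1, 1, 1]
```

A module-level global dict instead of the ContextVar would make every loop append into the same list; the ContextVar is what keeps parallel research loops isolated without locks.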
Middleware Architecture¶
State Management¶
WorkflowState¶
Module: src/middleware/state_machine.py
Uses ContextVar for thread-safe isolation.
Fields: - evidence: list[Evidence]: Collected evidence from searches - conversation: Conversation: Iteration history (gaps, tool calls, findings, thoughts) - embedding_service: Any: Embedding service for semantic search
Methods: - add_evidence(new_evidence: list[Evidence]) -> int: Adds evidence with URL-based deduplication. Returns the number of new items added (excluding duplicates). - async search_related(query: str, n_results: int = 5) -> list[Evidence]: Semantic search for related evidence using embedding service
Workflow Manager¶
Module: src/middleware/workflow_manager.py

Methods:
- async add_loop(loop_id: str, query: str) -> ResearchLoop: Add a new research loop to manage
- async run_loops_parallel(loop_configs: list[dict], loop_func: Callable, judge_handler: Any | None = None, budget_tracker: Any | None = None) -> list[Any]: Run multiple research loops in parallel, given configuration dicts and a loop function
- async update_loop_status(loop_id: str, status: LoopStatus, error: str | None = None): Update loop status
- async sync_loop_evidence_to_state(loop_id: str): Synchronize evidence from a specific loop to global state

Features:
- asyncio.gather() for parallel execution
- Handles errors per loop (doesn't fail all if one fails)
- Tracks loop status: pending, running, completed, failed, cancelled
- Evidence deduplication across parallel loops

Example:
+from src.middleware.workflow_manager import WorkflowManager
+
+manager = WorkflowManager()
+await manager.add_loop("loop1", "Research query 1")
+await manager.add_loop("loop2", "Research query 2")
+
+async def run_research(config: dict) -> str:
+ loop_id = config["loop_id"]
+ query = config["query"]
+ # ... research logic ...
+ return "report"
+
+results = await manager.run_loops_parallel(
+ loop_configs=[
+ {"loop_id": "loop1", "query": "Research query 1"},
+ {"loop_id": "loop2", "query": "Research query 2"},
+ ],
+ loop_func=run_research,
+)
+Budget Tracker¶
Module: src/middleware/budget_tracker.py

Methods:
- create_budget(loop_id: str, tokens_limit: int = 100000, time_limit_seconds: float = 600.0, iterations_limit: int = 10) -> BudgetStatus: Create a budget for a specific loop
- add_tokens(loop_id: str, tokens: int): Add token usage to a loop's budget
- start_timer(loop_id: str): Start time tracking for a loop
- update_timer(loop_id: str): Update elapsed time for a loop
- increment_iteration(loop_id: str): Increment iteration count for a loop
- check_budget(loop_id: str) -> tuple[bool, str]: Check if a loop's budget has been exceeded. Returns (exceeded: bool, reason: str)
- can_continue(loop_id: str) -> bool: Check if a loop can continue based on budget

Helpers:
- estimate_tokens(text: str) -> int: ~4 chars per token
- estimate_llm_call_tokens(prompt: str, response: str) -> int: Estimate LLM call tokens

Example:
+from src.middleware.budget_tracker import BudgetTracker
+
+tracker = BudgetTracker()
+budget = tracker.create_budget(
+ loop_id="research_loop",
+ tokens_limit=100000,
+ time_limit_seconds=600,
+ iterations_limit=10
+)
+tracker.start_timer("research_loop")
+# ... research operations ...
+tracker.add_tokens("research_loop", 5000)
+tracker.update_timer("research_loop")
+exceeded, reason = tracker.check_budget("research_loop")
+if exceeded:
+ # Budget exceeded, stop research
+ pass
+if not tracker.can_continue("research_loop"):
+ # Budget exceeded, stop research
+ pass
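The token-estimation helpers are simple arithmetic on string length; a self-contained sketch of the ~4 characters per token heuristic (the real helpers may differ in rounding and edge cases):

```python
def estimate_tokens(text: str) -> int:
    # Heuristic from the docs above: roughly 4 characters per token.
    return max(1, len(text) // 4)

def estimate_llm_call_tokens(prompt: str, response: str) -> int:
    # Total estimate for one LLM call: prompt tokens plus response tokens.
    return estimate_tokens(prompt) + estimate_tokens(response)

print(estimate_tokens("a" * 100))  # → 25
```

These estimates feed add_tokens() calls between iterations, so a budget can be enforced without exact tokenizer counts.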
+Models¶
src/utils/models.py:
- IterationData: Data for a single iteration
- Conversation: Conversation history with iterations
- ResearchLoop: Research loop state and configuration
- BudgetStatus: Current budget status
Thread Safety¶
State is held in a ContextVar for thread-safe isolation.
See Also¶
Orchestrators Architecture¶
Research Flows¶
IterativeResearchFlow¶
Module: src/orchestrator/research_flow.py

Agents:
- KnowledgeGapAgent: Evaluates research completeness
- ToolSelectorAgent: Selects tools for addressing gaps
- ThinkingAgent: Generates observations
- WriterAgent: Creates final report
- JudgeHandler: Assesses evidence sufficiency

Features:
- Supports graph execution (use_graph=True) and agent chains (use_graph=False)
- Iterates until research is complete or constraints are met
DeepResearchFlow¶
Module: src/orchestrator/research_flow.py

Components:
- PlannerAgent: Breaks the query into report sections
- IterativeResearchFlow: Per-section research (parallel)
- LongWriterAgent or ProofreaderAgent: Final synthesis

Features:
- Uses WorkflowManager for parallel execution
- Budget tracking per section and globally
- State synchronization across parallel loops
- Supports graph execution and agent chains
Graph Orchestrator¶
Module: src/orchestrator/graph_orchestrator.py. Streams AgentEvent objects for the UI.
Orchestrator Factory¶
Module: src/orchestrator_factory.py
Magentic Orchestrator¶
Module: src/orchestrator_magentic.py

- Built on agent-framework-core
- ChatAgent pattern with internal LLMs per agent
- MagenticBuilder with participants: searcher, hypothesizer, judge, reporter
- Manager orchestrates agents via OpenAIChatClient
- Requires OpenAI API key (function calling support)
- Event-driven: converts Magentic events to AgentEvent for UI streaming

Requirements: agent-framework-core package, OpenAI API key
Hierarchical Orchestrator¶
Module: src/orchestrator_hierarchical.py

- SubIterationMiddleware with ResearchTeam and LLMSubIterationJudge
- Adapts Magentic ChatAgent to the SubIterationTeam protocol
- Event-driven coordination via asyncio.Queue
- Supports sub-iteration patterns for complex research tasks
Legacy Simple Mode¶
Module: src/legacy_orchestrator.py

- Uses SearchHandlerProtocol and JudgeHandlerProtocol
- Generator-based design yielding AgentEvent objects
- Backward compatibility for simple use cases
State Initialization¶
Event Streaming¶
The orchestrator streams AgentEvent objects:
- started: Research started
- search_complete: Search completed
- judge_complete: Evidence evaluation completed
- hypothesizing: Generating hypotheses
- synthesizing: Synthesizing results
- complete: Research completed
- error: Error occurred
See Also¶
Orchestrators Architecture¶
Research Flows¶
IterativeResearchFlow¶
Module: src/orchestrator/research_flow.py

Agents:
- KnowledgeGapAgent: Evaluates research completeness
- ToolSelectorAgent: Selects tools for addressing gaps
- ThinkingAgent: Generates observations
- WriterAgent: Creates final report
- JudgeHandler: Assesses evidence sufficiency

Features:
- Supports graph execution (use_graph=True) and agent chains (use_graph=False)
- Iterates until research is complete or constraints are met
DeepResearchFlow¶
Module: src/orchestrator/research_flow.py

Components:
- PlannerAgent: Breaks the query into report sections
- IterativeResearchFlow: Per-section research (parallel)
- LongWriterAgent or ProofreaderAgent: Final synthesis

Features:
- Uses WorkflowManager for parallel execution
- Budget tracking per section and globally
- State synchronization across parallel loops
- Supports graph execution and agent chains
Graph Orchestrator¶
Module: src/orchestrator/graph_orchestrator.py

- Uses graph execution (use_graph=True) or agent chains (use_graph=False) as fallback
- Routes based on research mode (iterative/deep/auto)
- Streams AgentEvent objects for UI
- Uses GraphExecutionContext to manage execution state

GraphOrchestrator has special handling for certain nodes:
- execute_tools node: State node that uses search_handler to execute searches and add evidence to workflow state
- parallel_loops node: Parallel node that executes IterativeResearchFlow instances for each section in deep research mode
- synthesizer node: Agent node that calls LongWriterAgent.write_report() directly with ReportDraft instead of using agent.run()
- writer node: Agent node that calls WriterAgent.write_report() directly with findings instead of using agent.run()

GraphExecutionContext manages execution state:
- Tracks current node, visited nodes, and node results
- Manages workflow state and budget tracker
- Provides methods to store and retrieve node execution results
Orchestrator Factory¶
Module: src/orchestrator_factory.py
Magentic Orchestrator¶
Module: src/orchestrator_magentic.py

- Built on agent-framework-core
- ChatAgent pattern with internal LLMs per agent
- MagenticBuilder with participants:
  - searcher: SearchAgent (wraps SearchHandler)
  - hypothesizer: HypothesisAgent (generates hypotheses)
  - judge: JudgeAgent (evaluates evidence)
  - reporter: ReportAgent (generates final report)
- Manager orchestrates agents via chat client (OpenAI or HuggingFace)
- Event-driven: converts Magentic events to AgentEvent for UI streaming via the _process_event() method
- Supports max rounds, stall detection, and reset handling

Event mapping to AgentEvent:
- MagenticOrchestratorMessageEvent → AgentEvent with type based on message content
- MagenticAgentMessageEvent → AgentEvent with type based on agent name
- MagenticAgentDeltaEvent → AgentEvent for streaming updates
- MagenticFinalResultEvent → AgentEvent with type "complete"

Requirements: agent-framework-core package, OpenAI API key or HuggingFace authentication
Hierarchical Orchestrator¶
Module: src/orchestrator_hierarchical.py

- SubIterationMiddleware with ResearchTeam and LLMSubIterationJudge
- Adapts Magentic ChatAgent to the SubIterationTeam protocol
- Event-driven coordination via asyncio.Queue
- Supports sub-iteration patterns for complex research tasks
Legacy Simple Mode¶
Module: src/legacy_orchestrator.py

- Uses SearchHandlerProtocol and JudgeHandlerProtocol
- Generator-based design yielding AgentEvent objects
- Backward compatibility for simple use cases
State Initialization¶
Event Streaming¶
The orchestrator streams AgentEvent objects:
- started: Research started
- searching: Search in progress
- search_complete: Search completed
- judging: Evidence evaluation in progress
- judge_complete: Evidence evaluation completed
- looping: Iteration in progress
- hypothesizing: Generating hypotheses
- analyzing: Statistical analysis in progress
- analysis_complete: Statistical analysis completed
- synthesizing: Synthesizing results
- complete: Research completed
- error: Error occurred
- streaming: Streaming update (delta events)
See Also¶
Services Architecture¶
Embedding Service¶
Module: src/services/embeddings.py

- Async API: wraps calls in run_in_executor() to avoid blocking
- ChromaDB Storage: Vector storage for embeddings
- Deduplication: 0.85 similarity threshold (85% similarity = duplicate)
- Model: settings.local_embedding_model (default: all-MiniLM-L6-v2)

Methods:
- async def embed(text: str) -> list[float]: Generate embeddings
- async def embed_batch(texts: list[str]) -> list[list[float]]: Batch embedding
- async def similarity(text1: str, text2: str) -> float: Calculate similarity
- async def find_duplicates(texts: list[str], threshold: float = 0.85) -> list[tuple[int, int]]: Find duplicates

Example:
-from src.services.embeddings import get_embedding_service
-
-service = get_embedding_service()
-embedding = await service.embed("text to embed")
-LlamaIndex RAG Service¶
Module: src/services/rag.py

- OpenAI Embeddings: requires OPENAI_API_KEY
- ChromaDB Storage: Vector database for document storage
- Metadata Preservation: Preserves source, title, URL, date, authors
- Lazy Initialization: Graceful fallback if OpenAI key not available

Methods:
- async def ingest_evidence(evidence: list[Evidence]) -> None: Ingest evidence into RAG
- async def retrieve(query: str, top_k: int = 5) -> list[Document]: Retrieve relevant documents
- async def query(query: str, top_k: int = 5) -> str: Query with RAG

Example:
-from src.services.rag import get_rag_service
-
-service = get_rag_service()
-if service:
- documents = await service.retrieve("query", top_k=5)
-Statistical Analyzer¶
Module: src/services/statistical_analyzer.py

- Allowed libraries: SANDBOX_LIBRARIES
- Network Isolation: block_network=True by default

Returns AnalysisResult with:
- verdict: SUPPORTED, REFUTED, or INCONCLUSIVE
- code: Generated analysis code
- output: Execution output
- error: Error message if execution failed

Example:
-from src.services.statistical_analyzer import StatisticalAnalyzer
-
-analyzer = StatisticalAnalyzer()
-result = await analyzer.analyze(
- hypothesis="Metformin reduces cancer risk",
- evidence=evidence_list
-)
-Singleton Pattern¶
Service accessors use @lru_cache(maxsize=1):
-@lru_cache(maxsize=1)
-def get_embedding_service() -> EmbeddingService:
- return EmbeddingService()
-Service Availability¶
from src.utils.config import settings
-
-if settings.modal_available:
- # Use Modal sandbox
- pass
-
-if settings.has_openai_key:
- # Use OpenAI embeddings for RAG
- pass
-See Also¶
Services Architecture¶
Embedding Service¶
Module: src/services/embeddings.py

- Async API: wraps calls in run_in_executor() to avoid blocking the event loop
- ChromaDB Storage: In-memory vector storage for embeddings
- Deduplication: 0.9 similarity threshold by default (90% similarity = duplicate, configurable)
- Model: settings.local_embedding_model (default: all-MiniLM-L6-v2)

Methods:
- async def embed(text: str) -> list[float]: Generate embeddings (async-safe via run_in_executor())
- async def embed_batch(texts: list[str]) -> list[list[float]]: Batch embedding (more efficient)
- async def add_evidence(evidence_id: str, content: str, metadata: dict[str, Any]) -> None: Add evidence to the vector store
- async def search_similar(query: str, n_results: int = 5) -> list[dict[str, Any]]: Find semantically similar evidence
- async def deduplicate(new_evidence: list[Evidence], threshold: float = 0.9) -> list[Evidence]: Remove semantically duplicate evidence

Example:
+from src.services.embeddings import get_embedding_service
+
+service = get_embedding_service()
+embedding = await service.embed("text to embed")
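Deduplication compares embedding vectors by cosine similarity against the threshold; a minimal, self-contained illustration of that check (not the service's actual implementation):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product over the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def is_duplicate(a: list[float], b: list[float], threshold: float = 0.9) -> bool:
    # Two texts count as duplicates when their embeddings meet the threshold.
    return cosine_similarity(a, b) >= threshold

assert is_duplicate([1.0, 0.0], [1.0, 0.0])       # identical vectors → duplicate
assert not is_duplicate([1.0, 0.0], [0.0, 1.0])   # orthogonal vectors → distinct
```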
+LlamaIndex RAG Service¶
Module: src/services/llamaindex_rag.py

- Embedding Providers: OpenAI (requires OPENAI_API_KEY) or local sentence-transformers (no API key)
- Multiple LLM Providers: HuggingFace LLM (preferred) or OpenAI LLM (fallback) for query synthesis
- ChromaDB Storage: Vector database for document storage (supports in-memory mode)
- Metadata Preservation: Preserves source, title, URL, date, authors
- Lazy Initialization: Graceful fallback if dependencies not available

Parameters:
- use_openai_embeddings: bool | None: Force OpenAI embeddings (None = auto-detect)
- use_in_memory: bool: Use in-memory ChromaDB client (useful for tests)
- oauth_token: str | None: Optional OAuth token from HuggingFace login (takes priority over env vars)

Methods:
- async def ingest_evidence(evidence: list[Evidence]) -> None: Ingest evidence into RAG
- async def retrieve(query: str, top_k: int = 5) -> list[Document]: Retrieve relevant documents
- async def query(query: str, top_k: int = 5) -> str: Query with RAG

Example:
+from src.services.llamaindex_rag import get_rag_service
+
+service = get_rag_service(
+ use_openai_embeddings=False, # Use local embeddings
+ use_in_memory=True, # Use in-memory ChromaDB
+ oauth_token=token # Optional HuggingFace token
+)
+if service:
+ documents = await service.retrieve("query", top_k=5)
+Statistical Analyzer¶
Module: src/services/statistical_analyzer.py

- Allowed libraries: SANDBOX_LIBRARIES
- Network Isolation: block_network=True by default

Returns AnalysisResult with:
- verdict: SUPPORTED, REFUTED, or INCONCLUSIVE
- code: Generated analysis code
- output: Execution output
- error: Error message if execution failed

Example:
+from src.services.statistical_analyzer import StatisticalAnalyzer
+
+analyzer = StatisticalAnalyzer()
+result = await analyzer.analyze(
+ hypothesis="Metformin reduces cancer risk",
+ evidence=evidence_list
+)
+Singleton Pattern¶
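Service accessors are cached with @lru_cache(maxsize=1), so repeated calls return the same instance per process; a self-contained sketch (the stand-in class is illustrative):

```python
from functools import lru_cache

class EmbeddingService:
    # Stand-in for the real service; assume construction is expensive
    # (loading a model, opening a vector store).
    pass

@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService:
    # First call constructs the service; later calls reuse the cached instance.
    return EmbeddingService()

assert get_embedding_service() is get_embedding_service()
```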
Service Availability¶
from src.utils.config import settings
+
+if settings.modal_available:
+ # Use Modal sandbox
+ pass
+
+if settings.has_openai_key:
+ # Use OpenAI embeddings for RAG
+ pass
+See Also¶
Tools Architecture¶
SearchTool Protocol¶
Search tools implement the SearchTool protocol from src/tools/base.py.
Rate Limiting¶
Tools use the @retry decorator from tenacity:
-@retry(
- stop=stop_after_attempt(3),
- wait=wait_exponential(...)
-)
-async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
- # Implementation
Tools also implement a _rate_limit() method and use shared rate limiters from src/tools/rate_limiter.py.
Error Handling¶
- SearchError: General search failures
- RateLimitError: Rate limit exceeded
Query Preprocessing¶
Tools call preprocess_query() from src/tools/query_utils.py to preprocess queries.
Evidence Conversion¶
Search results are converted to Evidence objects with:
- Citation: Title, URL, date, authors
- content: Evidence text
- relevance_score: 0.0-1.0 relevance score
- metadata: Additional metadata
Tool Implementations¶
PubMed Tool¶
Module: src/tools/pubmed.py

- XML parsing via xmltodict
- Handles single vs. multiple articles
- Query preprocessing
- Evidence conversion with metadata extraction
ClinicalTrials Tool¶
Module: src/tools/clinicaltrials.py

- Uses the requests library (NOT httpx) because the WAF blocks the httpx TLS fingerprint
- Async-safe via await asyncio.to_thread(requests.get, ...)
- Statuses: COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, ENROLLING_BY_INVITATION
Europe PMC Tool¶
Module: src/tools/europepmc.py

- Marks preprints with [PREPRINT - Not peer-reviewed]
- Builds URLs from DOI or PMID
- Checks pubTypeList for preprint detection
- Includes both preprints and peer-reviewed articles
RAG Tool¶
Module: src/tools/rag_tool.py. Uses LlamaIndexRAGService.
Search Handler¶
Module: src/tools/search_handler.py

- Parallel search via asyncio.gather() with return_exceptions=True
- Aggregates results into SearchResult
- Handles tool failures gracefully
- Deduplicates results by URL
Tool Registration¶
from src.tools.pubmed import PubMedTool
-from src.tools.clinicaltrials import ClinicalTrialsTool
-from src.tools.europepmc import EuropePMCTool
-
-search_handler = SearchHandler(
- tools=[
- PubMedTool(),
- ClinicalTrialsTool(),
- EuropePMCTool(),
- ]
-)
-See Also¶
Tools Architecture¶
SearchTool Protocol¶
Search tools implement the SearchTool protocol from src/tools/base.py.
Rate Limiting¶
Tools use the @retry decorator from tenacity, implement a _rate_limit() method, and use shared rate limiters from src/tools/rate_limiter.py.
Error Handling¶
- SearchError: General search failures
- RateLimitError: Rate limit exceeded
Query Preprocessing¶
Tools call preprocess_query() from src/tools/query_utils.py to preprocess queries.
Evidence Conversion¶
Search results are converted to Evidence objects with:
- Citation: Title, URL, date, authors
- content: Evidence text
- relevance_score: 0.0-1.0 relevance score
- metadata: Additional metadata
Tool Implementations¶
PubMed Tool¶
Module: src/tools/pubmed.py

- XML parsing via xmltodict
- Handles single vs. multiple articles
- Query preprocessing
- Evidence conversion with metadata extraction
ClinicalTrials Tool¶
Module: src/tools/clinicaltrials.py

- Uses the requests library (NOT httpx) because the WAF blocks the httpx TLS fingerprint
- Async-safe via await asyncio.to_thread(requests.get, ...)
- Statuses: COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, ENROLLING_BY_INVITATION
Europe PMC Tool¶
Module: src/tools/europepmc.py

- Marks preprints with [PREPRINT - Not peer-reviewed]
- Builds URLs from DOI or PMID
- Checks pubTypeList for preprint detection
- Includes both preprints and peer-reviewed articles
RAG Tool¶
Module: src/tools/rag_tool.py. Uses LlamaIndexRAGService.
Search Handler¶
Module: src/tools/search_handler.py

Parameters:
- tools: list[SearchTool]: List of search tools to use
- timeout: float = 30.0: Timeout for each search in seconds
- include_rag: bool = False: Whether to include the RAG tool in searches
- auto_ingest_to_rag: bool = True: Whether to automatically ingest results into RAG
- oauth_token: str | None = None: Optional OAuth token from HuggingFace login (for the RAG LLM)

Methods:
- async def execute(query: str, max_results_per_tool: int = 10) -> SearchResult: Execute search across all tools in parallel

Features:
- asyncio.gather() with return_exceptions=True for parallel execution
- Aggregates results into SearchResult with evidence and metadata
- Handles tool failures gracefully (continues with other tools)
- Deduplicates results by URL
- Automatically ingests results into RAG if auto_ingest_to_rag=True
- Can add the RAG tool dynamically via the add_rag_tool() method
Tool Registration¶
from src.tools.pubmed import PubMedTool
+from src.tools.clinicaltrials import ClinicalTrialsTool
+from src.tools.europepmc import EuropePMCTool
+from src.tools.search_handler import SearchHandler
+
+search_handler = SearchHandler(
+ tools=[
+ PubMedTool(),
+ ClinicalTrialsTool(),
+ EuropePMCTool(),
+ ],
+ include_rag=True, # Include RAG tool for semantic search
+ auto_ingest_to_rag=True, # Automatically ingest results into RAG
+ oauth_token=token # Optional HuggingFace token for RAG LLM
+)
+
+# Execute search
+result = await search_handler.execute("query", max_results_per_tool=10)
+See Also¶
DeepCritical Workflow - Simplified Magentic Architecture¶
1. High-Level Magentic Workflow¶
flowchart TD
+
DeepCritical Workflow - Simplified Magentic Architecture¶
1. High-Level Magentic Workflow¶
flowchart TD
Start([User Query]) --> Manager[Magentic Manager<br/>Plan • Select • Assess • Adapt]
Manager -->|Plans| Task1[Task Decomposition]
@@ -31,7 +31,7 @@
style ReportAgent fill:#fff4e6
style Decision fill:#ffd6d6
style Synthesis fill:#d4edda
- style Output fill:#e1f5e1
2. Magentic Manager: The 6-Phase Cycle¶
flowchart LR
+ style Output fill:#e1f5e1
2. Magentic Manager: The 6-Phase Cycle¶
flowchart LR
P1[1. Planning<br/>Analyze task<br/>Create strategy] --> P2[2. Agent Selection<br/>Pick best agent<br/>for subtask]
P2 --> P3[3. Execution<br/>Run selected<br/>agent with tools]
P3 --> P4[4. Assessment<br/>Evaluate quality<br/>Check progress]
@@ -47,7 +47,7 @@
style P4 fill:#ffd6d6
style P5 fill:#fff3cd
style P6 fill:#d4edda
- style Done fill:#e1f5e1
3. Simplified Agent Architecture¶
graph TB
+ style Done fill:#e1f5e1
3. Simplified Agent Architecture¶
graph TB
subgraph "Orchestration Layer"
Manager[Magentic Manager<br/>• Plans workflow<br/>• Selects agents<br/>• Assesses quality<br/>• Adapts strategy]
SharedContext[(Shared Context<br/>• Hypotheses<br/>• Search Results<br/>• Analysis<br/>• Progress)]
@@ -93,7 +93,7 @@
style WebSearch fill:#e6f3ff
style CodeExec fill:#e6f3ff
style RAG fill:#e6f3ff
- style Viz fill:#e6f3ff
4. Dynamic Workflow Example¶
sequenceDiagram
+ style Viz fill:#e6f3ff
4. Dynamic Workflow Example¶
sequenceDiagram
participant User
participant Manager
participant HypAgent
@@ -129,7 +129,7 @@
ReportAgent-->>Manager: Returns formatted report
Note over Manager: SYNTHESIZE: Combine all results
- Manager->>User: Final Research Report
5. Manager Decision Logic¶
flowchart TD
+ Manager->>User: Final Research Report
5. Manager Decision Logic¶
flowchart TD
Start([Manager Receives Task]) --> Plan[Create Initial Plan]
Plan --> Select[Select Agent for Next Subtask]
@@ -164,7 +164,7 @@
style Q3 fill:#ffe6e6
style Q4 fill:#ffe6e6
style Synth fill:#d4edda
- style Done fill:#e1f5e1
6. Hypothesis Agent Workflow¶
flowchart LR
+ style Done fill:#e1f5e1
6. Hypothesis Agent Workflow¶
flowchart LR
Input[Research Query] --> Domain[Identify Domain<br/>& Key Concepts]
Domain --> Context[Retrieve Background<br/>Knowledge]
Context --> Generate[Generate 3-5<br/>Initial Hypotheses]
@@ -176,7 +176,7 @@
style Input fill:#e1f5e1
style Output fill:#fff4e6
- style Struct fill:#e6f3ff
7. Search Agent Workflow¶
flowchart TD
+ style Struct fill:#e6f3ff
7. Search Agent Workflow¶
flowchart TD
Input[Hypotheses] --> Strategy[Formulate Search<br/>Strategy per Hypothesis]
Strategy --> Multi[Multi-Source Search]
@@ -199,7 +199,7 @@
style Input fill:#fff4e6
style Multi fill:#ffe6e6
style Vector fill:#ffe6f0
- style Output fill:#e6f3ff
8. Analysis Agent Workflow¶
flowchart TD
+ style Output fill:#e6f3ff
8. Analysis Agent Workflow¶
flowchart TD
Input1[Hypotheses] --> Extract
Input2[Search Results] --> Extract[Extract Evidence<br/>per Hypothesis]
@@ -224,7 +224,7 @@
style Input1 fill:#fff4e6
style Input2 fill:#e6f3ff
style Execute fill:#ffe6e6
- style Output fill:#e6ffe6
9. Report Agent Workflow¶
flowchart TD
+ style Output fill:#e6ffe6
9. Report Agent Workflow¶
flowchart TD
Input1[Query] --> Assemble
Input2[Hypotheses] --> Assemble
Input3[Search Results] --> Assemble
@@ -264,7 +264,7 @@
style Input2 fill:#fff4e6
style Input3 fill:#e6f3ff
style Input4 fill:#e6ffe6
- style Output fill:#d4edda
10. Data Flow & Event Streaming¶
flowchart TD
+ style Output fill:#d4edda
10. Data Flow & Event Streaming¶
flowchart TD
User[👤 User] -->|Research Query| UI[Gradio UI]
UI -->|Submit| Manager[Magentic Manager]
@@ -303,7 +303,7 @@
style Context fill:#ffe6f0
style VectorDB fill:#ffe6f0
style WebSearch fill:#f0f0f0
- style CodeExec fill:#f0f0f0
11. MCP Tool Architecture¶
graph TB
+ style CodeExec fill:#f0f0f0
11. MCP Tool Architecture¶
graph TB
subgraph "Agent Layer"
Manager[Magentic Manager]
HypAgent[Hypothesis Agent]
@@ -351,7 +351,7 @@
style Server1 fill:#e6f3ff
style Server2 fill:#e6f3ff
style Server3 fill:#e6f3ff
- style Server4 fill:#e6f3ff
12. Progress Tracking & Stall Detection¶
stateDiagram-v2
+ style Server4 fill:#e6f3ff
12. Progress Tracking & Stall Detection¶
stateDiagram-v2
[*] --> Initialization: User Query
Initialization --> Planning: Manager starts
@@ -391,7 +391,7 @@
Stall = no new progress
after agent execution
Triggers plan reset
- end note
13. Gradio UI Integration¶
graph TD
+ end note
13. Gradio UI Integration¶
graph TD
App[Gradio App<br/>DeepCritical Research Agent]
App --> Input[Input Section]
@@ -424,7 +424,7 @@
style Input fill:#fff4e6
style Status fill:#e6f3ff
style Output fill:#e6ffe6
- style Workflow fill:#ffe6e6
14. Complete System Context¶
graph LR
+ style Workflow fill:#ffe6e6
14. Complete System Context¶
graph LR
User[👤 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepCritical<br/>Magentic Workflow]
DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]
@@ -453,7 +453,7 @@
style Claude fill:#ffd6d6
style Modal fill:#f0f0f0
style Chroma fill:#ffe6f0
- style HF fill:#d4edda
15. Workflow Timeline (Simplified)¶
gantt
+ style HF fill:#d4edda
15. Workflow Timeline (Simplified)¶
gantt
title DeepCritical Magentic Workflow - Typical Execution
dateFormat mm:ss
axisFormat %M:%S
@@ -485,19 +485,4 @@
Formatting :r3, after r2, 10s
section Manager Synthesis
- Final synthesis :f1, after r3, 10s
Key Differences from Original Design¶
| Aspect | Original (Judge-in-Loop) | New (Magentic) |
| --- | --- | --- |
| Control Flow | Fixed sequential phases | Dynamic agent selection |
| Quality Control | Separate Judge Agent | Manager assessment built-in |
| Retry Logic | Phase-level with feedback | Agent-level with adaptation |
| Flexibility | Rigid 4-phase pipeline | Adaptive workflow |
| Complexity | 5 agents (including Judge) | 4 agents (no Judge) |
| Progress Tracking | Manual state management | Built-in round/stall detection |
| Agent Coordination | Sequential handoff | Manager-driven dynamic selection |
| Error Recovery | Retry same phase | Try different agent or replan |
Simplified Design Principles¶
Legend¶
Implementation Highlights¶
workflow = (
- MagenticBuilder()
- .participants(
- hypothesis=HypothesisAgent(tools=[background_tool]),
- search=SearchAgent(tools=[web_search, rag_tool]),
- analysis=AnalysisAgent(tools=[code_execution]),
- report=ReportAgent(tools=[code_execution, visualization])
- )
- .with_standard_manager(
- chat_client=AnthropicClient(model="claude-sonnet-4"),
- max_round_count=15, # Prevent infinite loops
- max_stall_count=3 # Detect stuck workflows
- )
- .build()
-)
-
See Also¶
Key Differences from Original Design¶
| Aspect | Original (Judge-in-Loop) | New (Magentic) |
| --- | --- | --- |
| Control Flow | Fixed sequential phases | Dynamic agent selection |
| Quality Control | Separate Judge Agent | Manager assessment built-in |
| Retry Logic | Phase-level with feedback | Agent-level with adaptation |
| Flexibility | Rigid 4-phase pipeline | Adaptive workflow |
| Complexity | 5 agents (including Judge) | 4 agents (no Judge) |
| Progress Tracking | Manual state management | Built-in round/stall detection |
| Agent Coordination | Sequential handoff | Manager-driven dynamic selection |
| Error Recovery | Retry same phase | Try different agent or replan |
Simplified Design Principles¶
Legend¶
Implementation Highlights¶
See Also¶
DeepCritical Workflow - Simplified Magentic Architecture¶
1. High-Level Magentic Workflow¶
flowchart TD
- Start([User Query]) --> Manager[Magentic Manager<br/>Plan • Select • Assess • Adapt]
-
- Manager -->|Plans| Task1[Task Decomposition]
- Task1 --> Manager
-
- Manager -->|Selects & Executes| HypAgent[Hypothesis Agent]
- Manager -->|Selects & Executes| SearchAgent[Search Agent]
- Manager -->|Selects & Executes| AnalysisAgent[Analysis Agent]
- Manager -->|Selects & Executes| ReportAgent[Report Agent]
-
- HypAgent -->|Results| Manager
- SearchAgent -->|Results| Manager
- AnalysisAgent -->|Results| Manager
- ReportAgent -->|Results| Manager
-
- Manager -->|Assesses Quality| Decision{Good Enough?}
- Decision -->|No - Refine| Manager
- Decision -->|No - Different Agent| Manager
- Decision -->|No - Stalled| Replan[Reset Plan]
- Replan --> Manager
-
- Decision -->|Yes| Synthesis[Synthesize Final Result]
- Synthesis --> Output([Research Report])
-
- style Start fill:#e1f5e1
- style Manager fill:#ffe6e6
- style HypAgent fill:#fff4e6
- style SearchAgent fill:#fff4e6
- style AnalysisAgent fill:#fff4e6
- style ReportAgent fill:#fff4e6
- style Decision fill:#ffd6d6
- style Synthesis fill:#d4edda
- style Output fill:#e1f5e1
2. Magentic Manager: The 6-Phase Cycle¶
flowchart LR
- P1[1. Planning<br/>Analyze task<br/>Create strategy] --> P2[2. Agent Selection<br/>Pick best agent<br/>for subtask]
- P2 --> P3[3. Execution<br/>Run selected<br/>agent with tools]
- P3 --> P4[4. Assessment<br/>Evaluate quality<br/>Check progress]
- P4 --> Decision{Quality OK?<br/>Progress made?}
- Decision -->|Yes| P6[6. Synthesis<br/>Combine results<br/>Generate report]
- Decision -->|No| P5[5. Iteration<br/>Adjust plan<br/>Try again]
- P5 --> P2
- P6 --> Done([Complete])
-
- style P1 fill:#fff4e6
- style P2 fill:#ffe6e6
- style P3 fill:#e6f3ff
- style P4 fill:#ffd6d6
- style P5 fill:#fff3cd
- style P6 fill:#d4edda
- style Done fill:#e1f5e1
3. Simplified Agent Architecture¶
graph TB
- subgraph "Orchestration Layer"
- Manager[Magentic Manager<br/>• Plans workflow<br/>• Selects agents<br/>• Assesses quality<br/>• Adapts strategy]
- SharedContext[(Shared Context<br/>• Hypotheses<br/>• Search Results<br/>• Analysis<br/>• Progress)]
- Manager <--> SharedContext
- end
-
- subgraph "Specialist Agents"
- HypAgent[Hypothesis Agent<br/>• Domain understanding<br/>• Hypothesis generation<br/>• Testability refinement]
- SearchAgent[Search Agent<br/>• Multi-source search<br/>• RAG retrieval<br/>• Result ranking]
- AnalysisAgent[Analysis Agent<br/>• Evidence extraction<br/>• Statistical analysis<br/>• Code execution]
- ReportAgent[Report Agent<br/>• Report assembly<br/>• Visualization<br/>• Citation formatting]
- end
-
- subgraph "MCP Tools"
- WebSearch[Web Search<br/>PubMed • arXiv • bioRxiv]
- CodeExec[Code Execution<br/>Sandboxed Python]
- RAG[RAG Retrieval<br/>Vector DB • Embeddings]
- Viz[Visualization<br/>Charts • Graphs]
- end
-
- Manager -->|Selects & Directs| HypAgent
- Manager -->|Selects & Directs| SearchAgent
- Manager -->|Selects & Directs| AnalysisAgent
- Manager -->|Selects & Directs| ReportAgent
-
- HypAgent --> SharedContext
- SearchAgent --> SharedContext
- AnalysisAgent --> SharedContext
- ReportAgent --> SharedContext
-
- SearchAgent --> WebSearch
- SearchAgent --> RAG
- AnalysisAgent --> CodeExec
- ReportAgent --> CodeExec
- ReportAgent --> Viz
-
- style Manager fill:#ffe6e6
- style SharedContext fill:#ffe6f0
- style HypAgent fill:#fff4e6
- style SearchAgent fill:#fff4e6
- style AnalysisAgent fill:#fff4e6
- style ReportAgent fill:#fff4e6
- style WebSearch fill:#e6f3ff
- style CodeExec fill:#e6f3ff
- style RAG fill:#e6f3ff
- style Viz fill:#e6f3ff
4. Dynamic Workflow Example¶
sequenceDiagram
- participant User
- participant Manager
- participant HypAgent
- participant SearchAgent
- participant AnalysisAgent
- participant ReportAgent
-
- User->>Manager: "Research protein folding in Alzheimer's"
-
- Note over Manager: PLAN: Generate hypotheses → Search → Analyze → Report
-
- Manager->>HypAgent: Generate 3 hypotheses
- HypAgent-->>Manager: Returns 3 hypotheses
- Note over Manager: ASSESS: Good quality, proceed
-
- Manager->>SearchAgent: Search literature for hypothesis 1
- SearchAgent-->>Manager: Returns 15 papers
- Note over Manager: ASSESS: Good results, continue
-
- Manager->>SearchAgent: Search for hypothesis 2
- SearchAgent-->>Manager: Only 2 papers found
- Note over Manager: ASSESS: Insufficient, refine search
-
- Manager->>SearchAgent: Refined query for hypothesis 2
- SearchAgent-->>Manager: Returns 12 papers
- Note over Manager: ASSESS: Better, proceed
-
- Manager->>AnalysisAgent: Analyze evidence for all hypotheses
- AnalysisAgent-->>Manager: Returns analysis with code
- Note over Manager: ASSESS: Complete, generate report
-
- Manager->>ReportAgent: Create comprehensive report
- ReportAgent-->>Manager: Returns formatted report
- Note over Manager: SYNTHESIZE: Combine all results
-
- Manager->>User: Final Research Report
5. Manager Decision Logic¶
flowchart TD
- Start([Manager Receives Task]) --> Plan[Create Initial Plan]
-
- Plan --> Select[Select Agent for Next Subtask]
- Select --> Execute[Execute Agent]
- Execute --> Collect[Collect Results]
-
- Collect --> Assess[Assess Quality & Progress]
-
- Assess --> Q1{Quality Sufficient?}
- Q1 -->|No| Q2{Same Agent Can Fix?}
- Q2 -->|Yes| Feedback[Provide Specific Feedback]
- Feedback --> Execute
- Q2 -->|No| Different[Try Different Agent]
- Different --> Select
-
- Q1 -->|Yes| Q3{Task Complete?}
- Q3 -->|No| Q4{Making Progress?}
- Q4 -->|Yes| Select
- Q4 -->|No - Stalled| Replan[Reset Plan & Approach]
- Replan --> Plan
-
- Q3 -->|Yes| Synth[Synthesize Final Result]
- Synth --> Done([Return Report])
-
- style Start fill:#e1f5e1
- style Plan fill:#fff4e6
- style Select fill:#ffe6e6
- style Execute fill:#e6f3ff
- style Assess fill:#ffd6d6
- style Q1 fill:#ffe6e6
- style Q2 fill:#ffe6e6
- style Q3 fill:#ffe6e6
- style Q4 fill:#ffe6e6
- style Synth fill:#d4edda
- style Done fill:#e1f5e1

6. Hypothesis Agent Workflow¶
flowchart LR
- Input[Research Query] --> Domain[Identify Domain<br/>& Key Concepts]
- Domain --> Context[Retrieve Background<br/>Knowledge]
- Context --> Generate[Generate 3-5<br/>Initial Hypotheses]
- Generate --> Refine[Refine for<br/>Testability]
- Refine --> Rank[Rank by<br/>Quality Score]
- Rank --> Output[Return Top<br/>Hypotheses]
-
- Output --> Struct[Hypothesis Structure:<br/>• Statement<br/>• Rationale<br/>• Testability Score<br/>• Data Requirements<br/>• Expected Outcomes]
-
- style Input fill:#e1f5e1
- style Output fill:#fff4e6
- style Struct fill:#e6f3ff

7. Search Agent Workflow¶
flowchart TD
- Input[Hypotheses] --> Strategy[Formulate Search<br/>Strategy per Hypothesis]
-
- Strategy --> Multi[Multi-Source Search]
-
- Multi --> PubMed[PubMed Search<br/>via MCP]
- Multi --> ArXiv[arXiv Search<br/>via MCP]
- Multi --> BioRxiv[bioRxiv Search<br/>via MCP]
-
- PubMed --> Aggregate[Aggregate Results]
- ArXiv --> Aggregate
- BioRxiv --> Aggregate
-
- Aggregate --> Filter[Filter & Rank<br/>by Relevance]
- Filter --> Dedup[Deduplicate<br/>Cross-Reference]
- Dedup --> Embed[Embed Documents<br/>via MCP]
- Embed --> Vector[(Vector DB)]
- Vector --> RAGRetrieval[RAG Retrieval<br/>Top-K per Hypothesis]
- RAGRetrieval --> Output[Return Contextualized<br/>Search Results]
-
- style Input fill:#fff4e6
- style Multi fill:#ffe6e6
- style Vector fill:#ffe6f0
- style Output fill:#e6f3ff

8. Analysis Agent Workflow¶
flowchart TD
- Input1[Hypotheses] --> Extract
- Input2[Search Results] --> Extract[Extract Evidence<br/>per Hypothesis]
-
- Extract --> Methods[Determine Analysis<br/>Methods Needed]
-
- Methods --> Branch{Requires<br/>Computation?}
- Branch -->|Yes| GenCode[Generate Python<br/>Analysis Code]
- Branch -->|No| Qual[Qualitative<br/>Synthesis]
-
- GenCode --> Execute[Execute Code<br/>via MCP Sandbox]
- Execute --> Interpret1[Interpret<br/>Results]
- Qual --> Interpret2[Interpret<br/>Findings]
-
- Interpret1 --> Synthesize[Synthesize Evidence<br/>Across Sources]
- Interpret2 --> Synthesize
-
- Synthesize --> Verdict[Determine Verdict<br/>per Hypothesis]
- Verdict --> Support[• Supported<br/>• Refuted<br/>• Inconclusive]
- Support --> Gaps[Identify Knowledge<br/>Gaps & Limitations]
- Gaps --> Output[Return Analysis<br/>Report]
-
- style Input1 fill:#fff4e6
- style Input2 fill:#e6f3ff
- style Execute fill:#ffe6e6
- style Output fill:#e6ffe6

9. Report Agent Workflow¶
flowchart TD
- Input1[Query] --> Assemble
- Input2[Hypotheses] --> Assemble
- Input3[Search Results] --> Assemble
- Input4[Analysis] --> Assemble[Assemble Report<br/>Sections]
-
- Assemble --> Exec[Executive Summary]
- Assemble --> Intro[Introduction]
- Assemble --> Methods[Methods]
- Assemble --> Results[Results per<br/>Hypothesis]
- Assemble --> Discussion[Discussion]
- Assemble --> Future[Future Directions]
- Assemble --> Refs[References]
-
- Results --> VizCheck{Needs<br/>Visualization?}
- VizCheck -->|Yes| GenViz[Generate Viz Code]
- GenViz --> ExecViz[Execute via MCP<br/>Create Charts]
- ExecViz --> Combine
- VizCheck -->|No| Combine[Combine All<br/>Sections]
-
- Exec --> Combine
- Intro --> Combine
- Methods --> Combine
- Discussion --> Combine
- Future --> Combine
- Refs --> Combine
-
- Combine --> Format[Format Output]
- Format --> MD[Markdown]
- Format --> PDF[PDF]
- Format --> JSON[JSON]
-
- MD --> Output[Return Final<br/>Report]
- PDF --> Output
- JSON --> Output
-
- style Input1 fill:#e1f5e1
- style Input2 fill:#fff4e6
- style Input3 fill:#e6f3ff
- style Input4 fill:#e6ffe6
- style Output fill:#d4edda

10. Data Flow & Event Streaming¶
flowchart TD
- User[👤 User] -->|Research Query| UI[Gradio UI]
- UI -->|Submit| Manager[Magentic Manager]
-
- Manager -->|Event: Planning| UI
- Manager -->|Select Agent| HypAgent[Hypothesis Agent]
- HypAgent -->|Event: Delta/Message| UI
- HypAgent -->|Hypotheses| Context[(Shared Context)]
-
- Context -->|Retrieved by| Manager
- Manager -->|Select Agent| SearchAgent[Search Agent]
- SearchAgent -->|MCP Request| WebSearch[Web Search Tool]
- WebSearch -->|Results| SearchAgent
- SearchAgent -->|Event: Delta/Message| UI
- SearchAgent -->|Documents| Context
- SearchAgent -->|Embeddings| VectorDB[(Vector DB)]
-
- Context -->|Retrieved by| Manager
- Manager -->|Select Agent| AnalysisAgent[Analysis Agent]
- AnalysisAgent -->|MCP Request| CodeExec[Code Execution Tool]
- CodeExec -->|Results| AnalysisAgent
- AnalysisAgent -->|Event: Delta/Message| UI
- AnalysisAgent -->|Analysis| Context
-
- Context -->|Retrieved by| Manager
- Manager -->|Select Agent| ReportAgent[Report Agent]
- ReportAgent -->|MCP Request| CodeExec
- ReportAgent -->|Event: Delta/Message| UI
- ReportAgent -->|Report| Context
-
- Manager -->|Event: Final Result| UI
- UI -->|Display| User
-
- style User fill:#e1f5e1
- style UI fill:#e6f3ff
- style Manager fill:#ffe6e6
- style Context fill:#ffe6f0
- style VectorDB fill:#ffe6f0
- style WebSearch fill:#f0f0f0
- style CodeExec fill:#f0f0f0

11. MCP Tool Architecture¶
graph TB
- subgraph "Agent Layer"
- Manager[Magentic Manager]
- HypAgent[Hypothesis Agent]
- SearchAgent[Search Agent]
- AnalysisAgent[Analysis Agent]
- ReportAgent[Report Agent]
- end
-
- subgraph "MCP Protocol Layer"
- Registry[MCP Tool Registry<br/>• Discovers tools<br/>• Routes requests<br/>• Manages connections]
- end
-
- subgraph "MCP Servers"
- Server1[Web Search Server<br/>localhost:8001<br/>• PubMed<br/>• arXiv<br/>• bioRxiv]
- Server2[Code Execution Server<br/>localhost:8002<br/>• Sandboxed Python<br/>• Package management]
- Server3[RAG Server<br/>localhost:8003<br/>• Vector embeddings<br/>• Similarity search]
- Server4[Visualization Server<br/>localhost:8004<br/>• Chart generation<br/>• Plot rendering]
- end
-
- subgraph "External Services"
- PubMed[PubMed API]
- ArXiv[arXiv API]
- BioRxiv[bioRxiv API]
- Modal[Modal Sandbox]
- ChromaDB[(ChromaDB)]
- end
-
- SearchAgent -->|Request| Registry
- AnalysisAgent -->|Request| Registry
- ReportAgent -->|Request| Registry
-
- Registry --> Server1
- Registry --> Server2
- Registry --> Server3
- Registry --> Server4
-
- Server1 --> PubMed
- Server1 --> ArXiv
- Server1 --> BioRxiv
- Server2 --> Modal
- Server3 --> ChromaDB
-
- style Manager fill:#ffe6e6
- style Registry fill:#fff4e6
- style Server1 fill:#e6f3ff
- style Server2 fill:#e6f3ff
- style Server3 fill:#e6f3ff
- style Server4 fill:#e6f3ff

12. Progress Tracking & Stall Detection¶
stateDiagram-v2
- [*] --> Initialization: User Query
-
- Initialization --> Planning: Manager starts
-
- Planning --> AgentExecution: Select agent
-
- AgentExecution --> Assessment: Collect results
-
- Assessment --> QualityCheck: Evaluate output
-
- QualityCheck --> AgentExecution: Poor quality<br/>(retry < max_rounds)
- QualityCheck --> Planning: Poor quality<br/>(try different agent)
- QualityCheck --> NextAgent: Good quality<br/>(task incomplete)
- QualityCheck --> Synthesis: Good quality<br/>(task complete)
-
- NextAgent --> AgentExecution: Select next agent
-
- state StallDetection <<choice>>
- Assessment --> StallDetection: Check progress
- StallDetection --> Planning: No progress<br/>(stall count < max)
- StallDetection --> ErrorRecovery: No progress<br/>(max stalls reached)
-
- ErrorRecovery --> PartialReport: Generate partial results
- PartialReport --> [*]
-
- Synthesis --> FinalReport: Combine all outputs
- FinalReport --> [*]
-
- note right of QualityCheck
- Manager assesses:
- • Output completeness
- • Quality metrics
- • Progress made
- end note
-
- note right of StallDetection
- Stall = no new progress
- after agent execution
- Triggers plan reset
- end note

13. Gradio UI Integration¶
graph TD
- App[Gradio App<br/>DeepCritical Research Agent]
-
- App --> Input[Input Section]
- App --> Status[Status Section]
- App --> Output[Output Section]
-
- Input --> Query[Research Question<br/>Text Area]
- Input --> Controls[Controls]
- Controls --> MaxHyp[Max Hypotheses: 1-10]
- Controls --> MaxRounds[Max Rounds: 5-20]
- Controls --> Submit[Start Research Button]
-
- Status --> Log[Real-time Event Log<br/>• Manager planning<br/>• Agent selection<br/>• Execution updates<br/>• Quality assessment]
- Status --> Progress[Progress Tracker<br/>• Current agent<br/>• Round count<br/>• Stall count]
-
- Output --> Tabs[Tabbed Results]
- Tabs --> Tab1[Hypotheses Tab<br/>Generated hypotheses with scores]
- Tabs --> Tab2[Search Results Tab<br/>Papers & sources found]
- Tabs --> Tab3[Analysis Tab<br/>Evidence & verdicts]
- Tabs --> Tab4[Report Tab<br/>Final research report]
- Tab4 --> Download[Download Report<br/>MD / PDF / JSON]
-
- Submit -.->|Triggers| Workflow[Magentic Workflow]
- Workflow -.->|MagenticOrchestratorMessageEvent| Log
- Workflow -.->|MagenticAgentDeltaEvent| Log
- Workflow -.->|MagenticAgentMessageEvent| Log
- Workflow -.->|MagenticFinalResultEvent| Tab4
-
- style App fill:#e1f5e1
- style Input fill:#fff4e6
- style Status fill:#e6f3ff
- style Output fill:#e6ffe6
- style Workflow fill:#ffe6e6

14. Complete System Context¶
graph LR
- User[👤 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepCritical<br/>Magentic Workflow]
-
- DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]
- DC -->|Preprint search| ArXiv[arXiv API<br/>Scientific preprints]
- DC -->|Biology search| BioRxiv[bioRxiv API<br/>Biology preprints]
- DC -->|Agent reasoning| Claude[Claude API<br/>Sonnet 4 / Opus]
- DC -->|Code execution| Modal[Modal Sandbox<br/>Safe Python env]
- DC -->|Vector storage| Chroma[ChromaDB<br/>Embeddings & RAG]
-
- DC -->|Deployed on| HF[HuggingFace Spaces<br/>Gradio 6.0]
-
- PubMed -->|Results| DC
- ArXiv -->|Results| DC
- BioRxiv -->|Results| DC
- Claude -->|Responses| DC
- Modal -->|Output| DC
- Chroma -->|Context| DC
-
- DC -->|Research report| User
-
- style User fill:#e1f5e1
- style DC fill:#ffe6e6
- style PubMed fill:#e6f3ff
- style ArXiv fill:#e6f3ff
- style BioRxiv fill:#e6f3ff
- style Claude fill:#ffd6d6
- style Modal fill:#f0f0f0
- style Chroma fill:#ffe6f0
- style HF fill:#d4edda

15. Workflow Timeline (Simplified)¶
gantt
- title DeepCritical Magentic Workflow - Typical Execution
- dateFormat mm:ss
- axisFormat %M:%S
-
- section Manager Planning
- Initial planning :p1, 00:00, 10s
-
- section Hypothesis Agent
- Generate hypotheses :h1, after p1, 30s
- Manager assessment :h2, after h1, 5s
-
- section Search Agent
- Search hypothesis 1 :s1, after h2, 20s
- Search hypothesis 2 :s2, after s1, 20s
- Search hypothesis 3 :s3, after s2, 20s
- RAG processing :s4, after s3, 15s
- Manager assessment :s5, after s4, 5s
-
- section Analysis Agent
- Evidence extraction :a1, after s5, 15s
- Code generation :a2, after a1, 20s
- Code execution :a3, after a2, 25s
- Synthesis :a4, after a3, 20s
- Manager assessment :a5, after a4, 5s
-
- section Report Agent
- Report assembly :r1, after a5, 30s
- Visualization :r2, after r1, 15s
- Formatting :r3, after r2, 10s
-
- section Manager Synthesis
- Final synthesis :f1, after r3, 10s
Key Differences from Original Design¶
| Aspect | Original (Judge-in-Loop) | New (Magentic) |
|---|---|---|
| Control Flow | Fixed sequential phases | Dynamic agent selection |
| Quality Control | Separate Judge Agent | Manager assessment built-in |
| Retry Logic | Phase-level with feedback | Agent-level with adaptation |
| Flexibility | Rigid 4-phase pipeline | Adaptive workflow |
| Complexity | 5 agents (including Judge) | 4 agents (no Judge) |
| Progress Tracking | Manual state management | Built-in round/stall detection |
| Agent Coordination | Sequential handoff | Manager-driven dynamic selection |
| Error Recovery | Retry same phase | Try different agent or replan |
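The manager's plan → select → execute → assess loop with round and stall limits can be sketched in plain Python. This is an illustrative stand-in, not the framework's API: the agent-selection and completion checks are stubbed, and all names here are hypothetical.

```python
def run_manager(agents, task, max_rounds=15, max_stalls=3):
    """Hypothetical manager loop: select -> execute -> assess, with stall detection."""
    results, stalls = [], 0
    for _ in range(max_rounds):
        agent = agents[len(results) % len(agents)]  # stand-in for dynamic agent selection
        ok, output = agent(task)                    # each agent returns (quality_ok, output)
        if ok:
            results.append(output)
            stalls = 0                              # progress resets the stall counter
            if len(results) == len(agents):         # stand-in for "task complete" check
                break
        else:
            stalls += 1
            if stalls >= max_stalls:                # stalled: stop with partial results
                break
    return results                                  # partial or complete results
```

The real manager instead assesses output quality with the LLM and may replan or switch agents, but the round and stall budget behaves like the loop above.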
Simplified Design Principles¶
Legend¶
Implementation Highlights¶
workflow = (
- MagenticBuilder()
- .participants(
- hypothesis=HypothesisAgent(tools=[background_tool]),
- search=SearchAgent(tools=[web_search, rag_tool]),
- analysis=AnalysisAgent(tools=[code_execution]),
- report=ReportAgent(tools=[code_execution, visualization])
- )
- .with_standard_manager(
- chat_client=AnthropicClient(model="claude-sonnet-4"),
- max_round_count=15, # Prevent infinite loops
- max_stall_count=3 # Detect stuck workflows
- )
- .build()
-)
-
Configuration Guide¶
Overview¶
All application configuration is managed by the Settings class in src/utils/config.py and can be configured via environment variables or a .env file.
The configuration system:
- Reads values from environment variables and a .env file (if present)
- Exposes a singleton settings instance for easy access throughout the codebase

Quick Start¶
1. Create a .env file in the project root
2. Set at least one LLM API key (OPENAI_API_KEY, ANTHROPIC_API_KEY, or HF_TOKEN)

Configuration System Architecture¶
Settings Class¶
The Settings class extends BaseSettings from pydantic_settings and defines all application configuration.

Singleton Instance¶
A module-level singleton settings instance is available for import.

Usage Pattern¶
from src.utils.config import settings
-
-# Check if API keys are available
-if settings.has_openai_key:
- # Use OpenAI
- pass
-
-# Access configuration values
-max_iterations = settings.max_iterations
-web_search_provider = settings.web_search_provider
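The pattern above relies on a module-level singleton. A stripped-down, stdlib-only sketch of the idea follows; the real class uses pydantic_settings and defines many more fields, so everything here is illustrative only.

```python
import os
from dataclasses import dataclass, field

@dataclass
class MiniSettings:
    """Illustrative subset of the Settings fields; the real class has many more."""
    llm_provider: str = field(
        default_factory=lambda: os.environ.get("LLM_PROVIDER", "huggingface"))
    max_iterations: int = field(
        default_factory=lambda: int(os.environ.get("MAX_ITERATIONS", "10")))

    @property
    def has_openai_key(self) -> bool:
        # Availability properties simply check whether the key is present
        return bool(os.environ.get("OPENAI_API_KEY"))

# Module-level singleton, mirroring `from src.utils.config import settings`
settings = MiniSettings()
```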
-Required Configuration¶
LLM Provider¶
At least one of the following is required:
- OPENAI_API_KEY
- ANTHROPIC_API_KEY
- HF_TOKEN or HUGGINGFACE_API_KEY (can work without a key for public models)

OpenAI Configuration¶
For OpenAI, set OPENAI_API_KEY and optionally OPENAI_MODEL; these map to fields in the Settings class.

Anthropic Configuration¶
LLM_PROVIDER=anthropic
-ANTHROPIC_API_KEY=your_anthropic_api_key_here
-ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
-These values map to fields in the Settings class.

HuggingFace Configuration¶
# Option 1: Using HF_TOKEN (preferred)
-HF_TOKEN=your_huggingface_token_here
-
-# Option 2: Using HUGGINGFACE_API_KEY (alternative)
-HUGGINGFACE_API_KEY=your_huggingface_api_key_here
-
-# Default model
-HUGGINGFACE_MODEL=meta-llama/Llama-3.1-8B-Instruct
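Since HF_TOKEN and HUGGINGFACE_API_KEY are interchangeable, credential lookup presumably checks both, preferring HF_TOKEN. A minimal sketch of that fallback (the helper name is an assumption):

```python
import os

def huggingface_token():
    """Return the HuggingFace credential, preferring HF_TOKEN over HUGGINGFACE_API_KEY."""
    return os.environ.get("HF_TOKEN") or os.environ.get("HUGGINGFACE_API_KEY")
```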
-Optional Configuration¶
Embedding Configuration¶
# Embedding Provider: "openai", "local", or "huggingface"
-EMBEDDING_PROVIDER=local
-
-# OpenAI Embedding Model (used by LlamaIndex RAG)
-OPENAI_EMBEDDING_MODEL=text-embedding-3-small
-
-# Local Embedding Model (sentence-transformers, used by EmbeddingService)
-LOCAL_EMBEDDING_MODEL=all-MiniLM-L6-v2
-
-# HuggingFace Embedding Model
-HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
-Note: the "openai" embedding provider requires OPENAI_API_KEY. The local provider (default) uses sentence-transformers and requires no API key.

Web Search Configuration¶
# Web Search Provider: "serper", "searchxng", "brave", "tavily", or "duckduckgo"
-# Default: "duckduckgo" (no API key required)
-WEB_SEARCH_PROVIDER=duckduckgo
-
-# Serper API Key (for Google search via Serper)
-SERPER_API_KEY=your_serper_api_key_here
-
-# SearchXNG Host URL (for self-hosted search)
-SEARCHXNG_HOST=http://localhost:8080
-
-# Brave Search API Key
-BRAVE_API_KEY=your_brave_api_key_here
-
-# Tavily API Key
-TAVILY_API_KEY=your_tavily_api_key_here
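Each provider except DuckDuckGo needs one credential, so availability checking reduces to a lookup table. A hedged sketch (the function name mirrors the web_search_available property described later, but this mapping is an illustration, not the actual implementation):

```python
# Credential required by each web search provider; None means no key needed.
REQUIRED_CREDENTIAL = {
    "serper": "SERPER_API_KEY",
    "searchxng": "SEARCHXNG_HOST",
    "brave": "BRAVE_API_KEY",
    "tavily": "TAVILY_API_KEY",
    "duckduckgo": None,  # works out of the box
}

def web_search_available(provider: str, env: dict) -> bool:
    """True when the chosen provider has its credential set (or needs none)."""
    needed = REQUIRED_CREDENTIAL.get(provider)
    return needed is None or bool(env.get(needed))
```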
-PubMed Configuration¶
# NCBI API Key (optional, for higher rate limits: 10 req/sec vs 3 req/sec)
-NCBI_API_KEY=your_ncbi_api_key_here
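The rate limits above (3 req/sec without a key, 10 req/sec with one) translate directly into a minimum delay between requests; a client would sleep at least this long between calls. The helper name is illustrative:

```python
def pubmed_min_interval(has_ncbi_key: bool) -> float:
    """Minimum seconds between NCBI requests: 10 req/sec with a key, 3 req/sec without."""
    return 1 / 10 if has_ncbi_key else 1 / 3
```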
-Agent Configuration¶
# Maximum iterations per research loop (1-50, default: 10)
-MAX_ITERATIONS=10
-
-# Search timeout in seconds
-SEARCH_TIMEOUT=30
-
-# Use graph-based execution for research flows
-USE_GRAPH_EXECUTION=false
-Budget & Rate Limiting Configuration¶
# Default token budget per research loop (1000-1000000, default: 100000)
-DEFAULT_TOKEN_LIMIT=100000
-
-# Default time limit per research loop in minutes (1-120, default: 10)
-DEFAULT_TIME_LIMIT_MINUTES=10
-
-# Default iterations limit per research loop (1-50, default: 10)
-DEFAULT_ITERATIONS_LIMIT=10
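The three budget limits above combine into a single "am I exhausted?" check per research loop. A self-contained sketch of such a tracker, with illustrative names; the real budget enforcement lives inside the research loop:

```python
import time
from dataclasses import dataclass, field

@dataclass
class Budget:
    """Hypothetical budget tracker mirroring DEFAULT_TOKEN_LIMIT and friends."""
    token_limit: int = 100_000
    time_limit_minutes: int = 10
    iterations_limit: int = 10
    tokens_used: int = 0
    iterations: int = 0
    started: float = field(default_factory=time.monotonic)

    def charge(self, tokens: int) -> None:
        """Record one iteration's token usage."""
        self.tokens_used += tokens
        self.iterations += 1

    def exhausted(self) -> bool:
        """True once any of the three limits is hit."""
        elapsed_min = (time.monotonic() - self.started) / 60
        return (self.tokens_used >= self.token_limit
                or self.iterations >= self.iterations_limit
                or elapsed_min >= self.time_limit_minutes)
```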
-RAG Service Configuration¶
# ChromaDB collection name for RAG
-RAG_COLLECTION_NAME=deepcritical_evidence
-
-# Number of top results to retrieve from RAG (1-50, default: 5)
-RAG_SIMILARITY_TOP_K=5
-
-# Automatically ingest evidence into RAG
-RAG_AUTO_INGEST=true
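RAG_SIMILARITY_TOP_K controls how many nearest documents the retriever keeps. The core operation, stated as a stdlib-only sketch (the actual retrieval goes through LlamaIndex and ChromaDB, not this code):

```python
import math

def top_k(query_vec, doc_vecs, k=5):
    """Rank documents by cosine similarity to the query and keep the top k indices."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cos(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]
```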
-ChromaDB Configuration¶
# ChromaDB storage path
-CHROMA_DB_PATH=./chroma_db
-
-# Whether to persist ChromaDB to disk
-CHROMA_DB_PERSIST=true
-
-# ChromaDB server host (for remote ChromaDB, optional)
-CHROMA_DB_HOST=localhost
-
-# ChromaDB server port (for remote ChromaDB, optional)
-CHROMA_DB_PORT=8000
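Because both a local path and an optional host/port pair are accepted, client construction plausibly prefers the remote server when both host and port are set, falling back to on-disk persistence otherwise. A hedged sketch of that selection logic (the function is hypothetical):

```python
def chroma_target(env: dict) -> str:
    """Prefer a remote ChromaDB server when host/port are set, else the local path."""
    host, port = env.get("CHROMA_DB_HOST"), env.get("CHROMA_DB_PORT")
    if host and port:
        return f"http://{host}:{port}"
    return env.get("CHROMA_DB_PATH", "./chroma_db")
```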
-External Services¶
Modal Configuration¶
# Modal Token ID (for Modal sandbox execution)
-MODAL_TOKEN_ID=your_modal_token_id_here
-
-# Modal Token Secret
-MODAL_TOKEN_SECRET=your_modal_token_secret_here
-Logging Configuration¶
Logging is set up via the configure_logging() function and controlled by LOG_LEVEL.

Configuration Properties¶
The Settings class provides helpful properties for checking configuration state.

API Key Availability¶
from src.utils.config import settings
-
-# Check API key availability
-if settings.has_openai_key:
- # Use OpenAI
- pass
-
-if settings.has_anthropic_key:
- # Use Anthropic
- pass
-
-if settings.has_huggingface_key:
- # Use HuggingFace
- pass
-
-if settings.has_any_llm_key:
- # At least one LLM is available
- pass
-Service Availability¶
from src.utils.config import settings
-
-# Check service availability
-if settings.modal_available:
- # Use Modal sandbox
- pass
-
-if settings.web_search_available:
- # Web search is configured
- pass
-API Key Retrieval¶
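This section's code sample did not survive extraction. Based on the surrounding text (get_api_key() raises ConfigurationError when the active provider's key is missing), a plausible self-contained sketch; the dispatch shape and parameter names are assumptions:

```python
class ConfigurationError(Exception):
    """Stand-in for src.utils.exceptions.ConfigurationError."""

def get_api_key(provider: str, keys: dict) -> str:
    """Hypothetical dispatch: return the key for the active provider or raise."""
    key = keys.get(provider)
    if not key:
        raise ConfigurationError(f"No API key configured for provider {provider!r}")
    return key
```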
Configuration Usage in Codebase¶
LLM Factory¶
Embedding Service¶
Orchestrator Factory¶
Environment Variables Reference¶
Required (at least one LLM)¶
- OPENAI_API_KEY - OpenAI API key (required for OpenAI provider)
- ANTHROPIC_API_KEY - Anthropic API key (required for Anthropic provider)
- HF_TOKEN or HUGGINGFACE_API_KEY - HuggingFace API token (optional, can work without for public models)

LLM Configuration Variables¶
- LLM_PROVIDER - Provider to use: "openai", "anthropic", or "huggingface" (default: "huggingface")
- OPENAI_MODEL - OpenAI model name (default: "gpt-5.1")
- ANTHROPIC_MODEL - Anthropic model name (default: "claude-sonnet-4-5-20250929")
- HUGGINGFACE_MODEL - HuggingFace model ID (default: "meta-llama/Llama-3.1-8B-Instruct")

Embedding Configuration Variables¶
- EMBEDDING_PROVIDER - Provider: "openai", "local", or "huggingface" (default: "local")
- OPENAI_EMBEDDING_MODEL - OpenAI embedding model (default: "text-embedding-3-small")
- LOCAL_EMBEDDING_MODEL - Local sentence-transformers model (default: "all-MiniLM-L6-v2")
- HUGGINGFACE_EMBEDDING_MODEL - HuggingFace embedding model (default: "sentence-transformers/all-MiniLM-L6-v2")

Web Search Configuration Variables¶
- WEB_SEARCH_PROVIDER - Provider: "serper", "searchxng", "brave", "tavily", or "duckduckgo" (default: "duckduckgo")
- SERPER_API_KEY - Serper API key (required for Serper provider)
- SEARCHXNG_HOST - SearchXNG host URL (required for SearchXNG provider)
- BRAVE_API_KEY - Brave Search API key (required for Brave provider)
- TAVILY_API_KEY - Tavily API key (required for Tavily provider)

PubMed Configuration Variables¶
- NCBI_API_KEY - NCBI API key (optional, increases rate limit from 3 to 10 req/sec)

Agent Configuration Variables¶
- MAX_ITERATIONS - Maximum iterations per research loop (1-50, default: 10)
- SEARCH_TIMEOUT - Search timeout in seconds (default: 30)
- USE_GRAPH_EXECUTION - Use graph-based execution (default: false)

Budget Configuration Variables¶
- DEFAULT_TOKEN_LIMIT - Default token budget per research loop (1000-1000000, default: 100000)
- DEFAULT_TIME_LIMIT_MINUTES - Default time limit in minutes (1-120, default: 10)
- DEFAULT_ITERATIONS_LIMIT - Default iterations limit (1-50, default: 10)

RAG Configuration Variables¶
- RAG_COLLECTION_NAME - ChromaDB collection name (default: "deepcritical_evidence")
- RAG_SIMILARITY_TOP_K - Number of top results to retrieve (1-50, default: 5)
- RAG_AUTO_INGEST - Automatically ingest evidence into RAG (default: true)

ChromaDB Configuration Variables¶
- CHROMA_DB_PATH - ChromaDB storage path (default: "./chroma_db")
- CHROMA_DB_PERSIST - Whether to persist ChromaDB to disk (default: true)
- CHROMA_DB_HOST - ChromaDB server host (optional, for remote ChromaDB)
- CHROMA_DB_PORT - ChromaDB server port (optional, for remote ChromaDB)

External Services Variables¶
- MODAL_TOKEN_ID - Modal token ID (optional, for Modal sandbox execution)
- MODAL_TOKEN_SECRET - Modal token secret (optional, for Modal sandbox execution)

Logging Configuration Variables¶
- LOG_LEVEL - Log level: "DEBUG", "INFO", "WARNING", or "ERROR" (default: "INFO")

Validation¶
Configuration values are validated by pydantic:
- Range constraints (e.g. ge=1, le=50 for max_iterations)
- Literal constraints (e.g. Literal["openai", "anthropic", "huggingface"])
- API keys are checked on retrieval via get_api_key() or get_openai_api_key()

Validation Examples¶
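The real Settings class expresses these constraints with pydantic's Field(ge=1, le=50) and Literal[...] annotations. To keep this sketch dependency-free, the same two rules are reproduced with a plain dataclass; the class name is illustrative:

```python
from dataclasses import dataclass

VALID_PROVIDERS = ("openai", "anthropic", "huggingface")

@dataclass
class ValidatedSettings:
    """Stdlib stand-in for pydantic's Field(ge=1, le=50) and Literal[...] checks."""
    max_iterations: int = 10
    llm_provider: str = "huggingface"

    def __post_init__(self):
        if not 1 <= self.max_iterations <= 50:
            raise ValueError("max_iterations must be between 1 and 50")
        if self.llm_provider not in VALID_PROVIDERS:
            raise ValueError(f"llm_provider must be one of {VALID_PROVIDERS}")
```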
In summary, max_iterations is range-validated (1-50) and llm_provider is restricted to the three literal provider names.

Error Handling¶
Configuration errors raise ConfigurationError from src/utils/exceptions.py.

Error Handling Example¶
from src.utils.config import settings
-from src.utils.exceptions import ConfigurationError
-
-try:
- api_key = settings.get_api_key()
-except ConfigurationError as e:
- print(f"Configuration error: {e}")
-Common Configuration Errors¶
- get_api_key() is called but the required API key is not set
- llm_provider is set to an unsupported value

Configuration Best Practices¶
- .env File: store sensitive keys in a .env file (add it to .gitignore)
- Check availability properties such as has_openai_key before accessing API keys
- Handle ConfigurationError when calling get_api_key()

Future Enhancements¶
Configuration Guide¶
Overview¶
Settings class in src/utils/config.py and can be configured via environment variables or a .env file.
.env file (if present)settings instance for easy access throughout the codebaseQuick Start¶
.env file in the project rootOPENAI_API_KEY, ANTHROPIC_API_KEY, or HF_TOKEN)Configuration System Architecture¶
Settings Class¶
Settings][settings-class] class extends BaseSettings from pydantic_settings and defines all application configuration:Singleton Instance¶
settings instance is available for import:Usage Pattern¶
from src.utils.config import settings
-
-# Check if API keys are available
-if settings.has_openai_key:
- # Use OpenAI
- pass
-
-# Access configuration values
-max_iterations = settings.max_iterations
-web_search_provider = settings.web_search_provider
-Required Configuration¶
LLM Provider¶
OPENAI_API_KEYANTHROPIC_API_KEYHF_TOKEN or HUGGINGFACE_API_KEY (can work without key for public models)OpenAI Configuration¶
Settings class:Anthropic Configuration¶
LLM_PROVIDER=anthropic
-ANTHROPIC_API_KEY=your_anthropic_api_key_here
-ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
-Settings class:HuggingFace Configuration¶
# Option 1: Using HF_TOKEN (preferred)
-HF_TOKEN=your_huggingface_token_here
-
-# Option 2: Using HUGGINGFACE_API_KEY (alternative)
-HUGGINGFACE_API_KEY=your_huggingface_api_key_here
-
-# Default model
-HUGGINGFACE_MODEL=meta-llama/Llama-3.1-8B-Instruct
-Optional Configuration¶
Embedding Configuration¶
# Embedding Provider: "openai", "local", or "huggingface"
-EMBEDDING_PROVIDER=local
-
-# OpenAI Embedding Model (used by LlamaIndex RAG)
-OPENAI_EMBEDDING_MODEL=text-embedding-3-small
-
-# Local Embedding Model (sentence-transformers, used by EmbeddingService)
-LOCAL_EMBEDDING_MODEL=all-MiniLM-L6-v2
-
-# HuggingFace Embedding Model
-HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
-OPENAI_API_KEY. The local provider (default) uses sentence-transformers and requires no API key.Web Search Configuration¶
# Web Search Provider: "serper", "searchxng", "brave", "tavily", or "duckduckgo"
-# Default: "duckduckgo" (no API key required)
-WEB_SEARCH_PROVIDER=duckduckgo
-
-# Serper API Key (for Google search via Serper)
-SERPER_API_KEY=your_serper_api_key_here
-
-# SearchXNG Host URL (for self-hosted search)
-SEARCHXNG_HOST=http://localhost:8080
-
-# Brave Search API Key
-BRAVE_API_KEY=your_brave_api_key_here
-
-# Tavily API Key
-TAVILY_API_KEY=your_tavily_api_key_here
-PubMed Configuration¶
# NCBI API Key (optional, for higher rate limits: 10 req/sec vs 3 req/sec)
-NCBI_API_KEY=your_ncbi_api_key_here
-Agent Configuration¶
# Maximum iterations per research loop (1-50, default: 10)
-MAX_ITERATIONS=10
-
-# Search timeout in seconds
-SEARCH_TIMEOUT=30
-
-# Use graph-based execution for research flows
-USE_GRAPH_EXECUTION=false
-Budget & Rate Limiting Configuration¶
# Default token budget per research loop (1000-1000000, default: 100000)
-DEFAULT_TOKEN_LIMIT=100000
-
-# Default time limit per research loop in minutes (1-120, default: 10)
-DEFAULT_TIME_LIMIT_MINUTES=10
-
-# Default iterations limit per research loop (1-50, default: 10)
-DEFAULT_ITERATIONS_LIMIT=10
-RAG Service Configuration¶
# ChromaDB collection name for RAG
-RAG_COLLECTION_NAME=deepcritical_evidence
-
-# Number of top results to retrieve from RAG (1-50, default: 5)
-RAG_SIMILARITY_TOP_K=5
-
-# Automatically ingest evidence into RAG
-RAG_AUTO_INGEST=true
-ChromaDB Configuration¶
# ChromaDB storage path
-CHROMA_DB_PATH=./chroma_db
-
-# Whether to persist ChromaDB to disk
-CHROMA_DB_PERSIST=true
-
-# ChromaDB server host (for remote ChromaDB, optional)
-CHROMA_DB_HOST=localhost
-
-# ChromaDB server port (for remote ChromaDB, optional)
-CHROMA_DB_PORT=8000
-External Services¶
Modal Configuration¶
# Modal Token ID (for Modal sandbox execution)
-MODAL_TOKEN_ID=your_modal_token_id_here
-
-# Modal Token Secret
-MODAL_TOKEN_SECRET=your_modal_token_secret_here
-Logging Configuration¶
configure_logging() function:Configuration Properties¶
Settings class provides helpful properties for checking configuration state:API Key Availability¶
from src.utils.config import settings
-
-# Check API key availability
-if settings.has_openai_key:
- # Use OpenAI
- pass
-
-if settings.has_anthropic_key:
- # Use Anthropic
- pass
-
-if settings.has_huggingface_key:
- # Use HuggingFace
- pass
-
-if settings.has_any_llm_key:
- # At least one LLM is available
- pass
-Service Availability¶
from src.utils.config import settings
-
-# Check service availability
-if settings.modal_available:
- # Use Modal sandbox
- pass
-
-if settings.web_search_available:
- # Web search is configured
- pass
-API Key Retrieval¶
Configuration Usage in Codebase¶
LLM Factory¶
Embedding Service¶
Orchestrator Factory¶
Environment Variables Reference¶
Required (at least one LLM)¶
OPENAI_API_KEY - OpenAI API key (required for OpenAI provider)ANTHROPIC_API_KEY - Anthropic API key (required for Anthropic provider)HF_TOKEN or HUGGINGFACE_API_KEY - HuggingFace API token (optional, can work without for public models)LLM Configuration Variables¶
LLM_PROVIDER - Provider to use: "openai", "anthropic", or "huggingface" (default: "huggingface")OPENAI_MODEL - OpenAI model name (default: "gpt-5.1")ANTHROPIC_MODEL - Anthropic model name (default: "claude-sonnet-4-5-20250929")HUGGINGFACE_MODEL - HuggingFace model ID (default: "meta-llama/Llama-3.1-8B-Instruct")Embedding Configuration Variables¶
EMBEDDING_PROVIDER - Provider: "openai", "local", or "huggingface" (default: "local")OPENAI_EMBEDDING_MODEL - OpenAI embedding model (default: "text-embedding-3-small")LOCAL_EMBEDDING_MODEL - Local sentence-transformers model (default: "all-MiniLM-L6-v2")HUGGINGFACE_EMBEDDING_MODEL - HuggingFace embedding model (default: "sentence-transformers/all-MiniLM-L6-v2")Web Search Configuration Variables¶
WEB_SEARCH_PROVIDER - Provider: "serper", "searchxng", "brave", "tavily", or "duckduckgo" (default: "duckduckgo")SERPER_API_KEY - Serper API key (required for Serper provider)SEARCHXNG_HOST - SearchXNG host URL (required for SearchXNG provider)BRAVE_API_KEY - Brave Search API key (required for Brave provider)TAVILY_API_KEY - Tavily API key (required for Tavily provider)PubMed Configuration Variables¶
NCBI_API_KEY - NCBI API key (optional, increases rate limit from 3 to 10 req/sec)Agent Configuration Variables¶
MAX_ITERATIONS - Maximum iterations per research loop (1-50, default: 10)SEARCH_TIMEOUT - Search timeout in seconds (default: 30)USE_GRAPH_EXECUTION - Use graph-based execution (default: false)Budget Configuration Variables¶
DEFAULT_TOKEN_LIMIT - Default token budget per research loop (1000-1000000, default: 100000)DEFAULT_TIME_LIMIT_MINUTES - Default time limit in minutes (1-120, default: 10)DEFAULT_ITERATIONS_LIMIT - Default iterations limit (1-50, default: 10)RAG Configuration Variables¶
RAG_COLLECTION_NAME - ChromaDB collection name (default: "deepcritical_evidence")RAG_SIMILARITY_TOP_K - Number of top results to retrieve (1-50, default: 5)RAG_AUTO_INGEST - Automatically ingest evidence into RAG (default: true)ChromaDB Configuration Variables¶
CHROMA_DB_PATH - ChromaDB storage path (default: "./chroma_db")CHROMA_DB_PERSIST - Whether to persist ChromaDB to disk (default: true)CHROMA_DB_HOST - ChromaDB server host (optional, for remote ChromaDB)CHROMA_DB_PORT - ChromaDB server port (optional, for remote ChromaDB)External Services Variables¶
MODAL_TOKEN_ID - Modal token ID (optional, for Modal sandbox execution)MODAL_TOKEN_SECRET - Modal token secret (optional, for Modal sandbox execution)Logging Configuration Variables¶
LOG_LEVEL - Log level: "DEBUG", "INFO", "WARNING", or "ERROR" (default: "INFO")Validation¶
ge=1, le=50 for max_iterations)Literal["openai", "anthropic", "huggingface"])get_api_key() or get_openai_api_key()Validation Examples¶
max_iterations field has range validation:llm_provider field has literal validation:Error Handling¶
ConfigurationError from src/utils/exceptions.py:pass
-Error Handling Example¶
python from src.utils.config import settings from src.utils.exceptions import ConfigurationError try: api_key = settings.get_api_key() except ConfigurationError as e: print(f"Configuration error: {e}")Common Configuration Errors¶
get_api_key() is called but the required API key is not setllm_provider is set to an unsupported valueConfiguration Best Practices¶
.env File: Store sensitive keys in .env file (add to .gitignore)has_openai_key before accessing API keysConfigurationError when calling get_api_key()Future Enhancements¶
Configuration Guide¶
Overview¶
Settings class in src/utils/config.py and can be configured via environment variables or a .env file.
.env file (if present)settings instance for easy access throughout the codebaseQuick Start¶
.env file in the project rootOPENAI_API_KEY, ANTHROPIC_API_KEY, or HF_TOKEN)Configuration System Architecture¶
Settings Class¶
Settings][settings-class] class extends BaseSettings from pydantic_settings and defines all application configuration:Singleton Instance¶
settings instance is available for import:Usage Pattern¶
from src.utils.config import settings
+
+# Check if API keys are available
+if settings.has_openai_key:
+ # Use OpenAI
+ pass
+
+# Access configuration values
+max_iterations = settings.max_iterations
+web_search_provider = settings.web_search_provider
+Required Configuration¶
LLM Provider¶
OPENAI_API_KEYANTHROPIC_API_KEYHF_TOKEN or HUGGINGFACE_API_KEY (can work without key for public models)OpenAI Configuration¶
Settings class:Anthropic Configuration¶
LLM_PROVIDER=anthropic
+ANTHROPIC_API_KEY=your_anthropic_api_key_here
+ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
+Settings class:HuggingFace Configuration¶
# Option 1: Using HF_TOKEN (preferred)
+HF_TOKEN=your_huggingface_token_here
+
+# Option 2: Using HUGGINGFACE_API_KEY (alternative)
+HUGGINGFACE_API_KEY=your_huggingface_api_key_here
+
+# Default model
+HUGGINGFACE_MODEL=meta-llama/Llama-3.1-8B-Instruct
+Optional Configuration¶
Embedding Configuration¶
# Embedding Provider: "openai", "local", or "huggingface"
+EMBEDDING_PROVIDER=local
+
+# OpenAI Embedding Model (used by LlamaIndex RAG)
+OPENAI_EMBEDDING_MODEL=text-embedding-3-small
+
+# Local Embedding Model (sentence-transformers, used by EmbeddingService)
+LOCAL_EMBEDDING_MODEL=all-MiniLM-L6-v2
+
+# HuggingFace Embedding Model
+HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
+OPENAI_API_KEY. The local provider (default) uses sentence-transformers and requires no API key.Web Search Configuration¶
# Web Search Provider: "serper", "searchxng", "brave", "tavily", or "duckduckgo"
+# Default: "duckduckgo" (no API key required)
+WEB_SEARCH_PROVIDER=duckduckgo
+
+# Serper API Key (for Google search via Serper)
+SERPER_API_KEY=your_serper_api_key_here
+
+# SearchXNG Host URL (for self-hosted search)
+SEARCHXNG_HOST=http://localhost:8080
+
+# Brave Search API Key
+BRAVE_API_KEY=your_brave_api_key_here
+
+# Tavily API Key
+TAVILY_API_KEY=your_tavily_api_key_here
+PubMed Configuration¶
# NCBI API Key (optional, for higher rate limits: 10 req/sec vs 3 req/sec)
+NCBI_API_KEY=your_ncbi_api_key_here
+Agent Configuration¶
# Maximum iterations per research loop (1-50, default: 10)
+MAX_ITERATIONS=10
+
+# Search timeout in seconds
+SEARCH_TIMEOUT=30
+
+# Use graph-based execution for research flows
+USE_GRAPH_EXECUTION=false
+Budget & Rate Limiting Configuration¶
# Default token budget per research loop (1000-1000000, default: 100000)
DEFAULT_TOKEN_LIMIT=100000

# Default time limit per research loop in minutes (1-120, default: 10)
DEFAULT_TIME_LIMIT_MINUTES=10

# Default iterations limit per research loop (1-50, default: 10)
DEFAULT_ITERATIONS_LIMIT=10

RAG Service Configuration¶
# ChromaDB collection name for RAG
RAG_COLLECTION_NAME=deepcritical_evidence

# Number of top results to retrieve from RAG (1-50, default: 5)
RAG_SIMILARITY_TOP_K=5

# Automatically ingest evidence into RAG
RAG_AUTO_INGEST=true

ChromaDB Configuration¶
# ChromaDB storage path
CHROMA_DB_PATH=./chroma_db

# Whether to persist ChromaDB to disk
CHROMA_DB_PERSIST=true

# ChromaDB server host (for remote ChromaDB, optional)
CHROMA_DB_HOST=localhost

# ChromaDB server port (for remote ChromaDB, optional)
CHROMA_DB_PORT=8000

External Services¶
Modal Configuration¶
# Modal Token ID (for Modal sandbox execution)
MODAL_TOKEN_ID=your_modal_token_id_here

# Modal Token Secret
MODAL_TOKEN_SECRET=your_modal_token_secret_here

Logging Configuration¶
# Log level: "DEBUG", "INFO", "WARNING", or "ERROR" (default: "INFO")
LOG_LEVEL=INFO

The log level is applied by the configure_logging() function.

Configuration Properties¶

The Settings class provides helpful properties for checking configuration state:

API Key Availability¶
from src.utils.config import settings

# Check API key availability
if settings.has_openai_key:
    # Use OpenAI
    pass

if settings.has_anthropic_key:
    # Use Anthropic
    pass

if settings.has_huggingface_key:
    # Use HuggingFace
    pass

if settings.has_any_llm_key:
    # At least one LLM is available
    pass

Service Availability¶
from src.utils.config import settings

# Check service availability
if settings.modal_available:
    # Use Modal sandbox
    pass

if settings.web_search_available:
    # Web search is configured
    pass

API Key Retrieval¶
Configuration Usage in Codebase¶
LLM Factory¶
Embedding Service¶
Orchestrator Factory¶
Environment Variables Reference¶
Required (at least one LLM)¶
- OPENAI_API_KEY - OpenAI API key (required for OpenAI provider)
- ANTHROPIC_API_KEY - Anthropic API key (required for Anthropic provider)
- HF_TOKEN or HUGGINGFACE_API_KEY - HuggingFace API token (optional, can work without for public models)

LLM Configuration Variables¶

- LLM_PROVIDER - Provider to use: "openai", "anthropic", or "huggingface" (default: "huggingface")
- OPENAI_MODEL - OpenAI model name (default: "gpt-5.1")
- ANTHROPIC_MODEL - Anthropic model name (default: "claude-sonnet-4-5-20250929")
- HUGGINGFACE_MODEL - HuggingFace model ID (default: "meta-llama/Llama-3.1-8B-Instruct")

Embedding Configuration Variables¶

- EMBEDDING_PROVIDER - Provider: "openai", "local", or "huggingface" (default: "local")
- OPENAI_EMBEDDING_MODEL - OpenAI embedding model (default: "text-embedding-3-small")
- LOCAL_EMBEDDING_MODEL - Local sentence-transformers model (default: "all-MiniLM-L6-v2")
- HUGGINGFACE_EMBEDDING_MODEL - HuggingFace embedding model (default: "sentence-transformers/all-MiniLM-L6-v2")

Web Search Configuration Variables¶

- WEB_SEARCH_PROVIDER - Provider: "serper", "searchxng", "brave", "tavily", or "duckduckgo" (default: "duckduckgo")
- SERPER_API_KEY - Serper API key (required for Serper provider)
- SEARCHXNG_HOST - SearchXNG host URL (required for SearchXNG provider)
- BRAVE_API_KEY - Brave Search API key (required for Brave provider)
- TAVILY_API_KEY - Tavily API key (required for Tavily provider)

PubMed Configuration Variables¶

- NCBI_API_KEY - NCBI API key (optional, increases rate limit from 3 to 10 req/sec)

Agent Configuration Variables¶

- MAX_ITERATIONS - Maximum iterations per research loop (1-50, default: 10)
- SEARCH_TIMEOUT - Search timeout in seconds (default: 30)
- USE_GRAPH_EXECUTION - Use graph-based execution (default: false)

Budget Configuration Variables¶

- DEFAULT_TOKEN_LIMIT - Default token budget per research loop (1000-1000000, default: 100000)
- DEFAULT_TIME_LIMIT_MINUTES - Default time limit in minutes (1-120, default: 10)
- DEFAULT_ITERATIONS_LIMIT - Default iterations limit (1-50, default: 10)

RAG Configuration Variables¶

- RAG_COLLECTION_NAME - ChromaDB collection name (default: "deepcritical_evidence")
- RAG_SIMILARITY_TOP_K - Number of top results to retrieve (1-50, default: 5)
- RAG_AUTO_INGEST - Automatically ingest evidence into RAG (default: true)

ChromaDB Configuration Variables¶

- CHROMA_DB_PATH - ChromaDB storage path (default: "./chroma_db")
- CHROMA_DB_PERSIST - Whether to persist ChromaDB to disk (default: true)
- CHROMA_DB_HOST - ChromaDB server host (optional, for remote ChromaDB)
- CHROMA_DB_PORT - ChromaDB server port (optional, for remote ChromaDB)

External Services Variables¶

- MODAL_TOKEN_ID - Modal token ID (optional, for Modal sandbox execution)
- MODAL_TOKEN_SECRET - Modal token secret (optional, for Modal sandbox execution)

Logging Configuration Variables¶

- LOG_LEVEL - Log level: "DEBUG", "INFO", "WARNING", or "ERROR" (default: "INFO")

Validation¶
- Numeric fields use range constraints (e.g., ge=1, le=50 for max_iterations)
- Provider fields use literal types (e.g., Literal["openai", "anthropic", "huggingface"])
- API keys are retrieved via get_api_key() or get_openai_api_key()

Validation Examples¶
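The constraints described above can be illustrated in plain Python. The real project uses pydantic Field(ge=1, le=50) and Literal[...] types; SettingsSketch below is a hypothetical stand-in that enforces the same rules by hand.

```python
from dataclasses import dataclass

VALID_PROVIDERS = {"openai", "anthropic", "huggingface"}


@dataclass
class SettingsSketch:
    # Illustrative mirror of Field(ge=1, le=50) and Literal[...] constraints.
    max_iterations: int = 10
    llm_provider: str = "huggingface"

    def __post_init__(self) -> None:
        if not 1 <= self.max_iterations <= 50:
            raise ValueError("max_iterations must be between 1 and 50")
        if self.llm_provider not in VALID_PROVIDERS:
            raise ValueError(f"llm_provider must be one of {sorted(VALID_PROVIDERS)}")


print(SettingsSketch(max_iterations=20).max_iterations)
```

With pydantic, invalid values raise ValidationError at model construction time instead of ValueError.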
The max_iterations field has range validation; the llm_provider field has literal validation.

Error Handling¶

Configuration errors raise ConfigurationError from src/utils/exceptions.py.

Error Handling Example¶

from src.utils.config import settings
from src.utils.exceptions import ConfigurationError

try:
    api_key = settings.get_api_key()
except ConfigurationError as e:
    print(f"Configuration error: {e}")

Common Configuration Errors¶

- get_api_key() is called but the required API key is not set
- llm_provider is set to an unsupported value

Configuration Best Practices¶

- .env File: Store sensitive keys in a .env file (add it to .gitignore)
- Check availability properties such as has_openai_key before accessing API keys
- Handle ConfigurationError when calling get_api_key()

Future Enhancements¶
Code Quality & Documentation¶
Linting¶
Ruff ignore rules are configured in pyproject.toml:

- PLR0913: Too many arguments (agents need many params)
- PLR0912: Too many branches (complex orchestrator logic)
- PLR0911: Too many return statements (complex agent logic)
- PLR2004: Magic values (statistical constants)
- PLW0603: Global statement (singleton pattern)
- PLC0415: Lazy imports for optional dependencies
- E402: Module level import not at top (needed for pytest.importorskip)
- E501: Line too long (ignore line length violations)
- RUF100: Unused noqa (version differences between local/CI)

Type Checking¶

- mypy --strict compliance
- ignore_missing_imports = true (for optional dependencies)
- Excluded paths: reference_repos/, examples/

Pre-commit¶

Pre-commit hooks are configured in .pre-commit-config.yaml.

Installation¶
# Install dependencies (includes pre-commit package)
uv sync --all-extras

# Set up git hooks (must be run separately)
uv run pre-commit install

Note: uv sync --all-extras installs the pre-commit package, but you must run uv run pre-commit install separately to set up the git hooks.

Pre-commit Hooks¶
- Ruff lint: src/ (excludes tests/, reference_repos/)
- Ruff format: src/ (excludes tests/, reference_repos/)
- Mypy: src/ (excludes folder/)
- Unit tests: tests/unit/ with -m "not openai and not embedding_provider"
- Local embedding tests: tests/ with -m "local_embeddings"

Manual Pre-commit Run¶
Troubleshooting¶
- Skip hooks with git commit --no-verify (not recommended)
- If hooks are missing, run uv run pre-commit install
- If dependencies are missing, run uv sync --all-extras

Documentation¶
Building Documentation¶
Documentation source lives in docs/, and the configuration is in mkdocs.yml.

# Build documentation
uv run mkdocs build

# Serve documentation locally (http://127.0.0.1:8000)
uv run mkdocs serve

Docstrings¶
Code Comments¶
Comments should explain why, not just what (e.g., requests not httpx for ClinicalTrials). Mark must-not-change logic with # CRITICAL: ... comments.

See Also¶
Code Style & Conventions¶
Package Manager¶
This project uses uv as the package manager. All commands should be prefixed with uv run to ensure they run in the correct environment.

Installation¶
# Install uv if you haven't already (recommended: standalone installer)
# Unix/macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Alternative: pipx install uv
# Or: pip install uv

# Sync all dependencies including dev extras
uv sync --all-extras

Running Commands¶
Always use the uv run prefix:

# Instead of: pytest tests/
uv run pytest tests/

# Instead of: ruff check src
uv run ruff check src

# Instead of: mypy src
uv run mypy src

This ensures commands run in the environment managed by uv.

Type Safety¶
- mypy --strict compliance (no Any unless absolutely necessary)
- Use TYPE_CHECKING imports for circular dependencies

Pydantic Models¶

- All models live in one module (src/utils/models.py)
- Frozen models (model_config = {"frozen": True}) for immutability
- Field() with descriptions for all model fields
- Use ge=, le=, min_length=, max_length= constraints

Async Patterns¶
- Async-first design (async def, await)
- asyncio.gather() for parallel operations
- Offload CPU-bound work with run_in_executor():

loop = asyncio.get_running_loop()
result = await loop.run_in_executor(None, cpu_bound_function, args)
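A runnable sketch of combining these two patterns; cpu_bound_function here is a stand-in for any blocking work.

```python
import asyncio
import hashlib


def cpu_bound_function(data: bytes) -> str:
    # Stand-in for CPU-heavy work that would otherwise block the event loop.
    return hashlib.sha256(data).hexdigest()


async def main() -> list[str]:
    loop = asyncio.get_running_loop()
    # Offload blocking calls to the default thread pool executor,
    # and fan out several of them in parallel with asyncio.gather().
    digests = await asyncio.gather(
        *(loop.run_in_executor(None, cpu_bound_function, s.encode()) for s in ("alpha", "beta"))
    )
    return list(digests)


print(asyncio.run(main()))
```

asyncio.gather() preserves input order, so results line up with the submitted arguments.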
Common Pitfalls¶
See Also¶
Error Handling & Logging¶
Exception Hierarchy¶
All custom exceptions live in src/utils/exceptions.py.

Error Handling Rules¶

- Always chain exceptions: raise SearchError(...) from e
- Log failures with structlog
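A self-contained sketch of the chaining rule. Only SearchError is named in these docs; the DeepCriticalError base class and call_api() are hypothetical stand-ins for the project's hierarchy and an httpx call.

```python
class DeepCriticalError(Exception):
    """Hypothetical base class for project exceptions."""


class SearchError(DeepCriticalError):
    """Raised when a search tool fails."""


def call_api() -> None:
    # Stand-in for a network call that fails.
    raise TimeoutError("upstream timed out")


try:
    try:
        call_api()
    except TimeoutError as e:
        # `from e` preserves the original exception as __cause__,
        # so the full traceback chain survives.
        raise SearchError(f"API call failed: {e}") from e
except SearchError as err:
    print(err, "| caused by:", type(err.__cause__).__name__)
```

Because the cause is preserved, logs and debuggers show both the domain-level failure and the underlying transport error.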
Logging¶
- Use structlog for all logging (NOT print or logging)
- import structlog; logger = structlog.get_logger()
- Structured events: logger.info("event", key=value)

Logging Examples¶
logger.info("Starting search", query=query, tools=[t.name for t in tools])
logger.warning("Search tool failed", tool=tool.name, error=str(result))
logger.error("Assessment failed", error=str(e))

Error Chaining¶
try:
    result = await api_call()
except httpx.HTTPError as e:
    raise SearchError(f"API call failed: {e}") from e

See Also¶
Implementation Patterns¶
Search Tools¶
Search tools implement the SearchTool protocol (src/tools/base.py):
- A name property
- async def search(query, max_results) -> list[Evidence]
- @retry decorator from tenacity for resilience
- _rate_limit() for APIs with limits (e.g., PubMed)
- Raise SearchError or RateLimitError on failures

class MySearchTool:
    @property
    def name(self) -> str:
        return "mytool"

    @retry(stop=stop_after_attempt(3), wait=wait_exponential(...))
    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        # Implementation
        return evidence_list
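A minimal, runnable implementation of the same shape, without retries or network calls. Evidence here is a stripped-down stand-in for the project's model, and InMemorySearchTool is purely illustrative.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Evidence:
    # Minimal stand-in for the project's Evidence model.
    source: str
    text: str


class SearchTool(Protocol):
    @property
    def name(self) -> str: ...

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]: ...


class InMemorySearchTool:
    """Toy tool that satisfies the protocol without network calls."""

    _docs = ["metformin lowers glucose", "aspirin thins blood"]

    @property
    def name(self) -> str:
        return "memory"

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        hits = [d for d in self._docs if query in d]
        return [Evidence(source=self.name, text=d) for d in hits[:max_results]]


print(asyncio.run(InMemorySearchTool().search("metformin")))
```

Because the protocol is structural, any class with a matching name property and search coroutine can be registered alongside the real tools in tests.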
Judge Handlers¶
- Implement JudgeHandlerProtocol (async def assess(question, evidence) -> JudgeAssessment)
- Use an Agent with output_type=JudgeAssessment
- Prompts live in src/prompts/judge.py
- Test handlers: MockJudgeHandler, HFInferenceJudgeHandler
- Always return a JudgeAssessment (never raise exceptions)

Agent Factory Pattern¶

- Agents are created via factory functions (src/agent_factory/)

State Management¶

- ContextVar for thread-safe state (src/agents/state.py)
- Cached singletons (@lru_cache)

Singleton Pattern¶
Use @lru_cache(maxsize=1) for singletons:
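A runnable sketch of the pattern; EmbeddingService here is a hypothetical stand-in for any expensive-to-construct client.

```python
from functools import lru_cache


class EmbeddingService:
    """Hypothetical service; stands in for an expensive-to-build client."""

    def __init__(self) -> None:
        self.loaded = True


@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService:
    # The first call constructs the service; every later call
    # returns the same cached instance.
    return EmbeddingService()


assert get_embedding_service() is get_embedding_service()
```

Unlike a module-level global, the cached factory stays lazy and can be reset in tests with get_embedding_service.cache_clear().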
See Also¶
Contributing to The DETERMINATOR¶
Git Workflow¶
- main: Production-ready (GitHub)
- dev: Development integration (GitHub)
- Feature branches: yourname-dev
- Never push directly to main or dev on HuggingFace

Repository Information¶

- GitHub: DeepCritical/GradioDemo (source of truth, PRs, code review)
- HuggingFace: DataQuests/DeepCritical (deployment/demo)
- Package: determinator (Python package name in pyproject.toml)

Dual Repository Setup¶

- GitHub (DeepCritical/GradioDemo): Source of truth for code, PRs, and code review
- HuggingFace (DataQuests/DeepCritical): Deployment target for the Gradio demo

Remote Configuration¶
# Clone from GitHub
git clone https://github.com/DeepCritical/GradioDemo.git
cd GradioDemo

# Add HuggingFace remote (optional, for deployment)
git remote add huggingface-upstream https://huggingface.co/spaces/DataQuests/DeepCritical

Never push directly to main or dev on HuggingFace. Always work through GitHub PRs. GitHub is the source of truth; HuggingFace is for deployment/demo only.

Package Manager¶
This project uses uv as the package manager. All commands should be prefixed with uv run to ensure they run in the correct environment.

Installation¶
# Install uv if you haven't already (recommended: standalone installer)
# Unix/macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Alternative: pipx install uv
# Or: pip install uv

# Sync all dependencies including dev extras
uv sync --all-extras

# Install pre-commit hooks
uv run pre-commit install

Development Commands¶
# Installation
uv sync --all-extras          # Install all dependencies including dev
uv run pre-commit install     # Install pre-commit hooks

# Code Quality Checks (run all before committing)
uv run ruff check src tests   # Lint with ruff
uv run ruff format src tests  # Format with ruff
uv run mypy src               # Type checking
uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire  # Tests with coverage

# Testing Commands
uv run pytest tests/unit/ -v -m "not openai" -p no:logfire  # Run unit tests (excludes OpenAI tests)
uv run pytest tests/ -v -m "huggingface" -p no:logfire      # Run HuggingFace tests
uv run pytest tests/ -v -p no:logfire                       # Run all tests
uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire  # Tests with terminal coverage
uv run pytest --cov=src --cov-report=html -p no:logfire     # Generate HTML coverage report (opens htmlcov/index.html)

# Documentation Commands
uv run mkdocs build           # Build documentation
uv run mkdocs serve           # Serve documentation locally (http://127.0.0.1:8000)

Test Markers¶
- unit: Unit tests (mocked, fast)
- integration: Integration tests (real APIs)
- slow: Slow tests
- openai: Tests requiring OpenAI API key
- huggingface: Tests requiring HuggingFace API key
- embedding_provider: Tests requiring API-based embedding providers
- local_embeddings: Tests using local embeddings

The -p no:logfire flag disables the logfire plugin to avoid conflicts during testing.

Getting Started¶
Clone DeepCritical/GradioDemo, then run the quality checks:
uv run ruff check src tests
uv run mypy src
uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire
Development Guidelines¶
Code Style¶
- mypy --strict
- ruff for linting and formatting

Error Handling¶

- Chain exceptions: raise SearchError(...) from e
- Log with structlog

Testing¶

- Use markers: unit, integration, slow

Implementation Patterns¶

- Singletons via @lru_cache(maxsize=1)

Prompt Engineering¶

Code Quality¶

- Mark must-not-change logic with # CRITICAL: ... comments

MCP Integration¶

MCP Tools¶

- MCP tools live in src/mcp_tools.py for Claude Desktop

Gradio MCP Server¶

- Enable with mcp_server=True in demo.launch()
- Server is exposed at /gradio_api/mcp/
- Set ssr_mode=False to fix hydration issues in HF Spaces

Common Pitfalls¶

- Always use from e when raising exceptions

Key Principles¶

- Keep mypy --strict passing

Pull Request Process¶

Before opening a PR, run: uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire

Project Structure¶

- src/: Main source code
- tests/: Test files (unit/ and integration/)
- docs/: Documentation source files (MkDocs)
- examples/: Example usage scripts
- pyproject.toml: Project configuration and dependencies
- .pre-commit-config.yaml: Pre-commit hook configuration

Questions?¶
Prompt Engineering & Citation Validation¶
Judge Prompts¶
- Judge prompts live in src/prompts/judge.py
- Use the format_user_prompt() and format_empty_evidence_prompt() helpers

Hypothesis Prompts¶

- Truncate evidence at sentence boundaries (truncate_at_sentence())
- format_hypothesis_prompt() with embeddings for diversity

Report Prompts¶

Citation Validation¶

- Validate citations with validate_references() from src/utils/citation_validator.py

Citation Validation Rules¶
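The idea behind citation validation can be illustrated with a toy checker. This is not the project's validate_references() implementation; find_undefined_citations below is a hypothetical sketch that flags numbered citations with no matching reference entry.

```python
import re


def find_undefined_citations(report_md: str) -> list[int]:
    """Return citation numbers used in the body with no matching reference entry."""
    body, _, refs = report_md.partition("## References")
    # Citations in the body look like [1], [2], ...
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", body)}
    # Reference entries look like "1. Author Year" on their own lines.
    defined = {int(n) for n in re.findall(r"^(\d+)\.", refs, flags=re.M)}
    return sorted(cited - defined)


report = "Metformin works [1][2].\n## References\n1. Smith 2020\n"
print(find_undefined_citations(report))  # citation [2] has no reference entry
```

A real validator would also catch the reverse case (references never cited) and malformed numbering.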
Evidence Selection¶
- Use select_diverse_evidence() for MMR-based selection

See Also¶
Testing Requirements¶
Test Structure¶
- tests/unit/ (mocked, fast)
- tests/integration/ (real APIs, marked @pytest.mark.integration)
- Markers: unit, integration, slow, openai, huggingface, embedding_provider, local_embeddings

Test Markers¶

Markers are registered in pyproject.toml:

- @pytest.mark.unit: Unit tests (mocked, fast) - Run with -m "unit"
- @pytest.mark.integration: Integration tests (real APIs) - Run with -m "integration"
- @pytest.mark.slow: Slow tests - Run with -m "slow"
- @pytest.mark.openai: Tests requiring OpenAI API key - Run with -m "openai" or exclude with -m "not openai"
- @pytest.mark.huggingface: Tests requiring HuggingFace API key or using HuggingFace models - Run with -m "huggingface"
- @pytest.mark.embedding_provider: Tests requiring API-based embedding providers (OpenAI, etc.) - Run with -m "embedding_provider"
- @pytest.mark.local_embeddings: Tests using local embeddings (sentence-transformers, ChromaDB) - Run with -m "local_embeddings"

Running Tests by Marker¶
# Run only unit tests (excludes OpenAI tests by default)
uv run pytest tests/unit/ -v -m "not openai" -p no:logfire

# Run HuggingFace tests
uv run pytest tests/ -v -m "huggingface" -p no:logfire

# Run all tests
uv run pytest tests/ -v -p no:logfire

# Run only local embedding tests
uv run pytest tests/ -v -m "local_embeddings" -p no:logfire

# Exclude slow tests
uv run pytest tests/ -v -m "not slow" -p no:logfire

The -p no:logfire flag disables the logfire plugin to avoid conflicts during testing.

Mocking¶
- respx for httpx mocking
- pytest-mock for general mocking
- Mock handlers (e.g., MockJudgeHandler)
- Fixtures in tests/conftest.py: mock_httpx_client, mock_llm_response

TDD Workflow¶

1. Write a failing test in tests/unit/
2. Implement in src/
3. Verify with: uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire

Test Command Examples¶
# Run unit tests (default, excludes OpenAI tests)
uv run pytest tests/unit/ -v -m "not openai" -p no:logfire

# Run HuggingFace tests
uv run pytest tests/ -v -m "huggingface" -p no:logfire

# Run all tests
uv run pytest tests/ -v -p no:logfire

Test Examples¶
@pytest.mark.unit
async def test_pubmed_search(mock_httpx_client):
    tool = PubMedTool()
    results = await tool.search("metformin", max_results=5)
    assert len(results) > 0
    assert all(isinstance(r, Evidence) for r in results)

@pytest.mark.integration
async def test_real_pubmed_search():
    tool = PubMedTool()
    results = await tool.search("metformin", max_results=3)
    assert len(results) <= 3
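A MockJudgeHandler-style double can also be built with the standard library's unittest.mock, without pytest. This is an illustrative sketch: the real handler returns a JudgeAssessment model, not a dict.

```python
import asyncio
from unittest.mock import AsyncMock

# AsyncMock makes judge.assess awaitable and records each await.
judge = AsyncMock()
judge.assess.return_value = {"sufficient": True, "gaps": []}


async def run_assessment() -> dict:
    # Code under test awaits the handler exactly like the real one.
    return await judge.assess("Does metformin slow aging?", evidence=[])


result = asyncio.run(run_assessment())
print(result["sufficient"])
```

The mock's call record (judge.assess.assert_awaited(), judge.assess.call_args) lets tests verify how the orchestrator invoked the handler.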
Test Coverage¶
Terminal Coverage Report¶
HTML Coverage Report¶
The HTML report is written to htmlcov/index.html. Open this file in your browser to see detailed coverage information.

Coverage Goals¶

- Coverage excludes __init__.py and TYPE_CHECKING blocks
- Coverage settings live in pyproject.toml under [tool.coverage.*]

See Also¶
Examples¶
Basic Research Query¶
Example 1: Drug Information¶
Example 2: Clinical Trial Search¶
Advanced Research Queries¶
Example 3: Comprehensive Review¶
Review the evidence for using metformin as an anti-aging intervention,
including clinical trials, mechanisms of action, and safety profile.

Example 4: Hypothesis Testing¶
MCP Tool Examples¶
Using search_pubmed¶
Using search_clinical_trials¶
Using search_all¶
Using analyze_hypothesis¶
Code Examples¶
Python API Usage¶
from src.orchestrator_factory import create_orchestrator
from src.tools.search_handler import SearchHandler
from src.agent_factory.judges import create_judge_handler

# Create orchestrator
search_handler = SearchHandler()
judge_handler = create_judge_handler()
# Argument names are illustrative; see src/orchestrator_factory for the actual signature.
orchestrator = create_orchestrator(search_handler=search_handler, judge_handler=judge_handler)

# Run research query
query = "What are the latest treatments for Alzheimer's disease?"
async for event in orchestrator.run(query):
    print(f"Event: {event.type} - {event.data}")

Gradio UI Integration¶
import gradio as gr
from src.app import create_research_interface

# Create interface
interface = create_research_interface()

# Launch
interface.launch(server_name="0.0.0.0", server_port=7860)

Research Patterns¶
Iterative Research¶
Deep Research¶
Configuration Examples¶
Basic Configuration¶
Advanced Configuration¶
# .env file
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=your_key_here
EMBEDDING_PROVIDER=local
WEB_SEARCH_PROVIDER=duckduckgo
MAX_ITERATIONS=20
DEFAULT_TOKEN_LIMIT=200000
USE_GRAPH_EXECUTION=true

Next Steps¶
Installation¶
Prerequisites¶
uv package manager (recommended) or pipInstallation Steps¶
1. Install uv (Recommended)¶
uv is a fast Python package installer and resolver. Install it using the standalone installer (recommended):

# Unix/macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh

# Using pipx (recommended if you have pipx installed)
pipx install uv

# Or using pip
pip install uv

If you use the standalone installer, you may need to add ~/.cargo/bin to your PATH.

2. Clone the Repository¶
3. Install Dependencies¶
With uv (recommended): uv sync --all-extras. With pip: pip install -e .

4. Install Optional Dependencies¶
5. Configure Environment Variables¶
Create a .env file in the project root:

# Required: At least one LLM provider
LLM_PROVIDER=openai  # or "anthropic" or "huggingface"
OPENAI_API_KEY=your_openai_api_key_here

# Optional: Other services
NCBI_API_KEY=your_ncbi_api_key_here  # For higher PubMed rate limits
MODAL_TOKEN_ID=your_modal_token_id
MODAL_TOKEN_SECRET=your_modal_token_secret

6. Verify Installation¶
Start the app and open http://localhost:7860 to verify the installation.

Development Setup¶
Troubleshooting¶
Common Issues¶
- "API key not found": Verify the .env file is in the project root, check that API keys are correctly formatted, and ensure at least one LLM provider is configured
- Import errors: Run uv sync or pip install -e . again, and check that you're in the correct virtual environment
- Port 7860 in use: Change the port in src/app.py or use an environment variable, or kill the process using port 7860

Next Steps¶
MCP Integration¶
What is MCP?¶
MCP Server URL¶
Claude Desktop Configuration¶
1. Locate Configuration File¶
2. Add The DETERMINATOR Server¶
Open claude_desktop_config.json and add:

{
  "mcpServers": {
    "determinator": {
      "url": "http://localhost:7860/gradio_api/mcp/"
    }
  }
}

3. Restart Claude Desktop¶
4. Verify Connection¶
Once connected, you should see these tools: search_pubmed, search_clinical_trials, search_biorxiv, search_all, analyze_hypothesis.

Available Tools¶
search_pubmed¶
- query (string): Search query
- max_results (integer, optional): Maximum number of results (default: 10)

search_clinical_trials¶

- query (string): Search query
- max_results (integer, optional): Maximum number of results (default: 10)

search_biorxiv¶

- query (string): Search query
- max_results (integer, optional): Maximum number of results (default: 10)

search_all¶

- query (string): Search query
- max_results (integer, optional): Maximum number of results per source (default: 10)

analyze_hypothesis¶

- hypothesis (string): Hypothesis to analyze
- data (string, optional): Data description or code

Using Tools in Claude Desktop¶
Troubleshooting¶
Connection Issues¶
- Ensure the server is running (uv run gradio run src/app.py)
- Verify the URL in claude_desktop_config.json is correct
- Check that port 7860 is not blocked by firewall

Authentication¶
Advanced Configuration¶
Custom Port¶
{
  "mcpServers": {
    "deepcritical": {
      "url": "http://localhost:8080/gradio_api/mcp/"
    }
  }
}

Multiple Instances¶
{
  "mcpServers": {
    "deepcritical-local": {
      "url": "http://localhost:7860/gradio_api/mcp/"
    },
    "deepcritical-remote": {
      "url": "https://your-server.com/gradio_api/mcp/"
    }
  }
}

Next Steps¶
Single Command Deploy¶
```bash
docker run -it -p 7860:7860 --platform=linux/amd64 \
  -e DB_KEY="YOUR_VALUE_HERE" \
  -e SERP_API="YOUR_VALUE_HERE" \
  -e INFERENCE_API="YOUR_VALUE_HERE" \
  -e MODAL_TOKEN_ID="YOUR_VALUE_HERE" \
  -e MODAL_TOKEN_SECRET="YOUR_VALUE_HERE" \
  -e NCBI_API_KEY="YOUR_VALUE_HERE" \
  -e SERPER_API_KEY="YOUR_VALUE_HERE" \
  -e CHROMA_DB_PATH="./chroma_db" \
  -e CHROMA_DB_HOST="localhost" \
  -e CHROMA_DB_PORT="8000" \
  -e RAG_COLLECTION_NAME="deepcritical_evidence" \
  -e RAG_SIMILARITY_TOP_K="5" \
  -e RAG_AUTO_INGEST="true" \
  -e USE_GRAPH_EXECUTION="false" \
  -e DEFAULT_TOKEN_LIMIT="100000" \
  -e DEFAULT_TIME_LIMIT_MINUTES="10" \
  -e DEFAULT_ITERATIONS_LIMIT="10" \
  -e WEB_SEARCH_PROVIDER="duckduckgo" \
  -e MAX_ITERATIONS="10" \
  -e SEARCH_TIMEOUT="30" \
  -e LOG_LEVEL="DEBUG" \
  -e EMBEDDING_PROVIDER="local" \
  -e OPENAI_EMBEDDING_MODEL="text-embedding-3-small" \
  -e LOCAL_EMBEDDING_MODEL="BAAI/bge-small-en-v1.5" \
  -e HUGGINGFACE_EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2" \
  -e HF_FALLBACK_MODELS="Qwen/Qwen3-Next-80B-A3B-Thinking,Qwen/Qwen3-Next-80B-A3B-Instruct,meta-llama/Llama-3.3-70B-Instruct,meta-llama/Llama-3.1-8B-Instruct,HuggingFaceH4/zephyr-7b-beta,Qwen/Qwen2-7B-Instruct" \
  -e HUGGINGFACE_MODEL="Qwen/Qwen3-Next-80B-A3B-Thinking" \
  registry.hf.space/dataquests-deepcritical:latest python src/app.py
```
## Quick start guide

Get up and running with The DETERMINATOR in minutes.

## Start the Application

```bash
gradio src/app.py
```

The app will be available at http://localhost:7860.

First Research Query¶
Authentication¶
HuggingFace OAuth (Recommended)¶
Manual API Key¶
Understanding the Interface¶
Chat Interface¶
Status Indicators¶
Settings¶
Example Queries¶
Simple Query¶
Complex Query¶
Review the evidence for using metformin as an anti-aging intervention,
including clinical trials, mechanisms of action, and safety profile.

Clinical Trial Query¶
Next Steps¶
DeepCritical¶
Features¶
Quick Start¶
```bash
# Install uv if you haven't already
pip install uv

# Sync dependencies
uv sync

# Start the Gradio app
uv run gradio run src/app.py
```

The app will be available at http://localhost:7860.

Architecture¶
Documentation¶
Links¶
The DETERMINATOR¶
Features¶
Quick Start¶
```bash
# Install uv if you haven't already (recommended: standalone installer)
# Unix/macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Alternative: pipx install uv
# Or: pip install uv

# Sync dependencies
uv sync

# Start the Gradio app
uv run gradio run src/app.py
```

The app will be available at http://localhost:7860.

Architecture¶
Documentation¶
Links¶
License¶
MIT License¶
Architecture Overview¶
Core Architecture¶
Orchestration Patterns¶
Graph-based orchestrator (src/orchestrator/graph_orchestrator.py):
- Streams AsyncGenerator[AgentEvent] for real-time UI updates

Deep research flow (src/orchestrator/research_flow.py):
- Uses PlannerAgent to break the query into report sections
- Runs IterativeResearchFlow instances in parallel per section via WorkflowManager
- Synthesizes with LongWriterAgent or ProofreaderAgent
- Supports graph execution (use_graph=True) and agent chains (use_graph=False)

Iterative research flow (src/orchestrator/research_flow.py):
- Agents: KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent, WriterAgent
- JudgeHandler assesses evidence sufficiency

Magentic orchestrator (src/orchestrator_magentic.py):
- Built on agent-framework-core
- MagenticBuilder with participants: searcher, hypothesizer, judge, reporter
- Uses OpenAIChatClient
- Emits AgentEvent for UI streaming

Hierarchical orchestrator (src/orchestrator_hierarchical.py):
- SubIterationMiddleware with ResearchTeam and LLMSubIterationJudge
- SubIterationTeam protocol
- asyncio.Queue for coordination

Legacy orchestrator (src/legacy_orchestrator.py):
- SearchHandlerProtocol and JudgeHandlerProtocol
- Emits AgentEvent objects

Long-Running Task Support¶
- Streams AgentEvent objects via AsyncGenerator
- Event types: started, searching, search_complete, judging, judge_complete, looping, synthesizing, hypothesizing, complete, error
- Budget tracking (src/middleware/budget_tracker.py)
- Workflow manager (src/middleware/workflow_manager.py): task states pending, running, completed, failed, cancelled
- State machine (src/middleware/state_machine.py): ContextVar for concurrent requests; WorkflowState tracks evidence, conversation history, embedding service
- Gradio entry point (src/app.py)

Graph Architecture¶
The graph orchestrator (src/orchestrator/graph_orchestrator.py) implements a flexible graph-based execution model:
- Agent nodes wrap Pydantic AI agents (e.g., KnowledgeGapAgent, ToolSelectorAgent)
- Iterative flow: [Input] → [Thinking] → [Knowledge Gap] → [Decision: Complete?] → [Tool Selector] or [Writer]
- Deep flow: [Input] → [Planner] → [Parallel Iterative Loops] → [Synthesizer]
- Parallel execution via asyncio.gather()

Key Components¶
- Orchestrators (src/orchestrator/, src/orchestrator_*.py)
- Research flows (src/orchestrator/research_flow.py)
- Graph builder (src/agent_factory/graph_builder.py)
- Agents (src/agents/, src/agent_factory/agents.py)
- Search tools (src/tools/)
- Judges (src/agent_factory/judges.py)
- Embedding service (src/services/embeddings.py)
- Statistical analyzer (src/services/statistical_analyzer.py)
- Middleware (src/middleware/)
- MCP tools (src/mcp_tools.py)
- Gradio app (src/app.py)

Research Team & Parallel Execution¶
- Parallel ResearchLoop instances coordinated via asyncio.gather()

Configuration & Modes¶
Orchestrator factory (src/orchestrator_factory.py):
- iterative: Single research loop
- deep: Multi-section parallel research
- auto: Auto-detect based on query complexity
- use_graph=True: Graph-based execution (parallel, conditional routing)
- use_graph=False: Agent chains (sequential, backward compatible)
Architecture Overview¶
Core Architecture¶
Orchestration Patterns¶
Graph-based orchestrator (src/orchestrator/graph_orchestrator.py):
- Streams AsyncGenerator[AgentEvent] for real-time UI updates

Deep research flow (src/orchestrator/research_flow.py):
- Uses PlannerAgent to break the query into report sections
- Runs IterativeResearchFlow instances in parallel per section via WorkflowManager
- Synthesizes with LongWriterAgent or ProofreaderAgent
- Supports graph execution (use_graph=True) and agent chains (use_graph=False)

Iterative research flow (src/orchestrator/research_flow.py):
- Agents: KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent, WriterAgent
- JudgeHandler assesses evidence sufficiency

Magentic orchestrator (src/orchestrator_magentic.py):
- Built on agent-framework-core
- MagenticBuilder with participants: searcher, hypothesizer, judge, reporter
- Uses OpenAIChatClient
- Emits AgentEvent for UI streaming

Hierarchical orchestrator (src/orchestrator_hierarchical.py):
- SubIterationMiddleware with ResearchTeam and LLMSubIterationJudge
- SubIterationTeam protocol
- asyncio.Queue for coordination

Legacy orchestrator (src/legacy_orchestrator.py):
- SearchHandlerProtocol and JudgeHandlerProtocol
- Emits AgentEvent objects

Long-Running Task Support¶
- Streams AgentEvent objects via AsyncGenerator
- Event types: started, searching, search_complete, judging, judge_complete, looping, synthesizing, hypothesizing, complete, error
- Budget tracking (src/middleware/budget_tracker.py)
- Workflow manager (src/middleware/workflow_manager.py): task states pending, running, completed, failed, cancelled
- State machine (src/middleware/state_machine.py): ContextVar for concurrent requests; WorkflowState tracks evidence, conversation history, embedding service
- Gradio entry point (src/app.py)

Graph Architecture¶
The graph orchestrator (src/orchestrator/graph_orchestrator.py) implements a flexible graph-based execution model:
- Agent nodes wrap Pydantic AI agents (e.g., KnowledgeGapAgent, ToolSelectorAgent)
- Iterative flow: [Input] → [Thinking] → [Knowledge Gap] → [Decision: Complete?] → [Tool Selector] or [Writer]
- Deep flow: [Input] → [Planner] → [Parallel Iterative Loops] → [Synthesizer]
- Parallel execution via asyncio.gather()

Key Components¶
- Orchestrators (src/orchestrator/, src/orchestrator_*.py)
- Research flows (src/orchestrator/research_flow.py)
- Graph builder (src/agent_factory/graph_builder.py)
- Agents (src/agents/, src/agent_factory/agents.py)
- Search tools (src/tools/)
- Judges (src/agent_factory/judges.py)
- Embedding service (src/services/embeddings.py)
- Statistical analyzer (src/services/statistical_analyzer.py)
- Multimodal processing (src/services/multimodal_processing.py, src/services/audio_processing.py)
- Middleware (src/middleware/)
- MCP tools (src/mcp_tools.py)
- Gradio app (src/app.py)

Research Team & Parallel Execution¶
- Parallel ResearchLoop instances coordinated via asyncio.gather()

Configuration & Modes¶
Orchestrator factory (src/orchestrator_factory.py):
- simple: Legacy linear search-judge loop (Free Tier)
- advanced or magentic: Multi-agent coordination using Microsoft Agent Framework (requires OpenAI API key)
- iterative: Knowledge-gap-driven research with single loop (Free Tier)
- deep: Parallel section-based research with planning (Free Tier)
- auto: Intelligent mode detection based on query complexity (Free Tier)

Research flow patterns:
- iterative: Single research loop pattern
- deep: Multi-section parallel research pattern
- auto: Auto-detect pattern based on query complexity

Execution:
- use_graph=True: Graph-based execution (parallel, conditional routing)
- use_graph=False: Agent chains (sequential, backward compatible)
Features¶
Core Features¶
Multi-Source Search¶
MCP Integration¶
Authentication¶
Secure Code Execution¶
Semantic Search & RAG¶
Orchestration Patterns¶
Real-Time Streaming¶
- Events streamed as AsyncGenerator[AgentEvent]

Budget Management¶
State Management¶
Advanced Features¶
Agent System¶
Search Tools¶
Error Handling¶
Configuration¶
- Configuration via .env files

Testing¶
UI Features¶
Gradio Interface¶
MCP Server¶
Development Features¶
Code Quality¶
Documentation¶
Features¶
Core Features¶
Multi-Source Search¶
MCP Integration¶
Authentication¶
- Supports environment variables (HF_TOKEN or HUGGINGFACE_API_KEY)

Secure Code Execution¶
Semantic Search & RAG¶
Orchestration Patterns¶
Orchestrator modes:
- simple: Legacy linear search-judge loop
- advanced (or magentic): Multi-agent coordination (requires OpenAI API key)
- iterative: Knowledge-gap-driven research with single loop
- deep: Parallel section-based research with planning
- auto: Intelligent mode detection based on query complexity

Research flow patterns:
- iterative: Single research loop pattern
- deep: Multi-section parallel research pattern
- auto: Auto-detect pattern based on query complexity

Graph execution:
- use_graph=True: Graph-based execution with parallel and conditional routing
- use_graph=False: Agent chains with sequential execution (backward compatible)

Real-Time Streaming¶
- Events streamed as AsyncGenerator[AgentEvent]

Budget Management¶
State Management¶
Multimodal Input & Output¶
Advanced Features¶
Agent System¶
Search Tools¶
Error Handling¶
Configuration¶
- Configuration via .env files

Testing¶
UI Features¶
Gradio Interface¶
MCP Server¶
Development Features¶
Code Quality¶
Documentation¶
Quick Start¶
Installation¶
Run the UI¶
The app will be available at http://localhost:7860.

Basic Usage¶
1. Authentication (Optional)¶
2. Start a Research Query¶
3. MCP Integration (Optional)¶
Configure the server in claude_desktop_config.json.

Available Tools¶
- search_pubmed: Search peer-reviewed biomedical literature
- search_clinical_trials: Search ClinicalTrials.gov
- search_biorxiv: Search bioRxiv/medRxiv preprints
- search_all: Search all sources simultaneously
- analyze_hypothesis: Secure statistical analysis using Modal sandboxes

Next Steps¶
Quick Start¶
Installation¶
```bash
# Install uv if you haven't already (recommended: standalone installer)
# Unix/macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Alternative: pipx install uv
# Or: pip install uv

# Sync dependencies
uv sync
```

Run the UI¶
The app will be available at http://localhost:7860.

Basic Usage¶
1. Authentication (REQUIRED)¶
- Set HF_TOKEN or HUGGINGFACE_API_KEY before starting the app
- The app will automatically use these tokens if OAuth login is not available
- Supports HuggingFace API keys only (OpenAI/Anthropic keys are not used in the current implementation)

2. Start a Research Query¶
3. MCP Integration (Optional)¶
Configure the server in claude_desktop_config.json.

Available Tools¶
- search_pubmed: Search peer-reviewed biomedical literature
- search_clinical_trials: Search ClinicalTrials.gov
- search_biorxiv: Search bioRxiv/medRxiv preprints
- search_neo4j: Search Neo4j knowledge graph for papers and disease relationships
- search_all: Search all sources simultaneously
- analyze_hypothesis: Secure statistical analysis using Modal sandboxes

Next Steps¶
Contributing to DeepCritical¶
Getting Started¶
Branches:
- main: Production-ready (GitHub)
- dev: Development integration (GitHub)
- yourname-dev
- main or dev on HuggingFace
```bash
git clone https://github.com/yourusername/GradioDemo.git
cd GradioDemo

make install

git checkout -b yourname-feature-name

make check

git commit -m "Description of changes"
git push origin yourname-feature-name
```

8. Create a pull request on GitHub
Development commands:

```bash
make install     # Install dependencies + pre-commit
make check       # Lint + typecheck + test (MUST PASS)
make test        # Run unit tests
make lint        # Run ruff
make format      # Format with ruff
make typecheck   # Run mypy
make test-cov    # Test with coverage
make docs-build  # Build documentation
make docs-serve  # Serve documentation locally
```

Code Style & Conventions¶

Type Safety¶
- mypy --strict compliance (no Any unless absolutely necessary)
- TYPE_CHECKING imports for circular dependencies
- Pydantic models (src/utils/models.py): frozen (model_config = {"frozen": True}) for immutability; Field() with descriptions for all model fields; ge=, le=, min_length=, max_length= constraints
- Async everywhere (async def, await); asyncio.gather() for parallel operations; CPU-bound work via run_in_executor():

```python
loop = asyncio.get_running_loop()
result = await loop.run_in_executor(None, cpu_bound_function, args)
```

Lint rules ignored in pyproject.toml:
- PLR0913: Too many arguments (agents need many params)
- PLR0912: Too many branches (complex orchestrator logic)
- PLR0911: Too many return statements (complex agent logic)
- PLR2004: Magic values (statistical constants)
- PLW0603: Global statement (singleton pattern)
- PLC0415: Lazy imports for optional dependencies

Run make check before committing; pre-commit hooks are set up by make install.

Error Handling¶
Use the custom exception hierarchy (src/utils/exceptions.py) and chain with raise SearchError(...) from e.

Logging¶
Use structlog for all logging (NOT print or logging):
- import structlog; logger = structlog.get_logger()
- Structured events: logger.info("event", key=value)

```python
logger.info("Starting search", query=query, tools=[t.name for t in tools])
logger.warning("Search tool failed", tool=tool.name, error=str(result))
logger.error("Assessment failed", error=str(e))
```

Error Chaining¶
Always preserve exception context:

```python
try:
    result = await api_call()
except httpx.HTTPError as e:
    raise SearchError(f"API call failed: {e}") from e
```

Testing Requirements¶

Test Structure¶
- Unit tests: tests/unit/ (mocked, fast)
- Integration tests: tests/integration/ (real APIs, marked @pytest.mark.integration)
- Markers: unit, integration, slow
- respx for httpx mocking; pytest-mock for general mocking
- Mock judges (MockJudgeHandler)
- Fixtures in tests/conftest.py: mock_httpx_client, mock_llm_response
- New unit tests go under tests/unit/src/
- Run make check (lint + typecheck + test)

```python
@pytest.mark.unit
async def test_pubmed_search(mock_httpx_client):
    tool = PubMedTool()
    results = await tool.search("metformin", max_results=5)
    assert len(results) > 0
    assert all(isinstance(r, Evidence) for r in results)

@pytest.mark.integration
async def test_real_pubmed_search():
    tool = PubMedTool()
    results = await tool.search("metformin", max_results=3)
    assert len(results) <= 3
```

Test Coverage¶
- Run make test-cov for a coverage report
- Excluded: __init__.py, TYPE_CHECKING blocks

Search Tools¶
All tools implement the SearchTool protocol (src/tools/base.py):
- name property
- async def search(query, max_results) -> list[Evidence]
- @retry decorator from tenacity for resilience
- _rate_limit() for APIs with limits (e.g., PubMed)
- Raise SearchError or RateLimitError on failures

Example pattern:

```python
class MySearchTool:
    @property
    def name(self) -> str:
        return "mytool"

    @retry(stop=stop_after_attempt(3), wait=wait_exponential(...))
    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        # Implementation
        return evidence_list
```

Judge Handlers¶
- Implement JudgeHandlerProtocol (async def assess(question, evidence) -> JudgeAssessment)
- Use Agent with output_type=JudgeAssessment
- Prompts live in src/prompts/judge.py
- Implementations: MockJudgeHandler, HFInferenceJudgeHandler
- Always return JudgeAssessment (never raise exceptions)
- Agent factories in src/agent_factory/
- ContextVar for thread-safe state (src/agents/state.py)
- Singletons via @lru_cache

Use @lru_cache(maxsize=1) for singletons:
Example:
Code Comments¶
- Explain non-obvious decisions (e.g., requests not httpx for ClinicalTrials)
- Mark critical sections with # CRITICAL: ...

Prompts¶
- Judge prompts in src/prompts/judge.py
- Use format_user_prompt() and format_empty_evidence_prompt() helpers
- Truncate evidence at sentence boundaries (truncate_at_sentence())
- format_hypothesis_prompt() with embeddings for diversity
- Validate citations with validate_references() from src/utils/citation_validator.py
- select_diverse_evidence() for MMR-based selection

MCP¶
- MCP tools in src/mcp_tools.py for Claude Desktop
- Enabled via mcp_server=True in demo.launch()
- Endpoint: /gradio_api/mcp/
- ssr_mode=False to fix hydration issues in HF Spaces

Checklist¶
- Use from e when raising exceptions
- Pass mypy --strict
- Run make check

Thank you for contributing to DeepCritical!
License¶
DeepCritical is licensed under the MIT License.

MIT License¶
Copyright (c) 2024 DeepCritical Team
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Team¶
DeepCritical is developed by a team of researchers and developers working on AI-assisted research.

Team Members¶

ZJ¶
The DeepCritical team met online in the Alzheimer's Critical Literature Review Group in the Hugging Science initiative. We're building the agent framework we want to use for AI-assisted research to turn the vast amounts of clinical data into cures.

Contributing¶
We welcome contributions! See the Contributing Guide for details.
Links¶

Agents API¶
This page documents the API for DeepCritical agents.
KnowledgeGapAgent¶
Module: src.agents.knowledge_gap
Purpose: Evaluates research state and identifies knowledge gaps.
Methods¶

evaluate¶
```python
async def evaluate(
    self,
    query: str,
    background_context: str,
    conversation_history: Conversation,
    iteration: int,
    time_elapsed_minutes: float,
    max_time_minutes: float
) -> KnowledgeGapOutput
```

Evaluates research completeness and identifies outstanding knowledge gaps.
Parameters: - query: Research query string - background_context: Background context for the query - conversation_history: Conversation history with previous iterations - iteration: Current iteration number - time_elapsed_minutes: Elapsed time in minutes - max_time_minutes: Maximum time limit in minutes
Returns: KnowledgeGapOutput with: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps
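A minimal sketch of how a caller might act on this output shape. The dataclass below is a stand-in for the project's Pydantic model, and the dispatch helper is illustrative, not the orchestrator's actual code.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeGapOutput:
    # Mirrors the documented fields: completion flag plus remaining gaps.
    research_complete: bool
    outstanding_gaps: list[str] = field(default_factory=list)

def next_action(result: KnowledgeGapOutput) -> str:
    # If research is complete (or nothing is left), hand off to the writer;
    # otherwise address the first remaining gap.
    if result.research_complete or not result.outstanding_gaps:
        return "write_report"
    return f"address_gap: {result.outstanding_gaps[0]}"

status = KnowledgeGapOutput(False, ["long-term safety data"])
action = next_action(status)
```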
Module: src.agents.tool_selector
Purpose: Selects appropriate tools for addressing knowledge gaps.
Methods¶

select_tools¶
```python
async def select_tools(
    self,
    query: str,
    knowledge_gaps: list[str],
    available_tools: list[str]
) -> AgentSelectionPlan
```

Selects tools for addressing knowledge gaps.
Parameters: - query: Research query string - knowledge_gaps: List of knowledge gaps to address - available_tools: List of available tool names
Returns: AgentSelectionPlan with list of AgentTask objects.
Module: src.agents.writer
Purpose: Generates final reports from research findings.
Methods¶

write_report¶
```python
async def write_report(
    self,
    query: str,
    findings: str,
    output_length: str = "medium",
    output_instructions: str | None = None
) -> str
```

Generates a markdown report from research findings.
Parameters: - query: Research query string - findings: Research findings to include in report - output_length: Desired output length (\"short\", \"medium\", \"long\") - output_instructions: Additional instructions for report generation
Returns: Markdown string with numbered citations.
LongWriterAgent¶
Module: src.agents.long_writer
Purpose: Long-form report generation with section-by-section writing.
Methods¶

write_next_section¶
```python
async def write_next_section(
    self,
    query: str,
    draft: ReportDraft,
    section_title: str,
    section_content: str
) -> LongWriterOutput
```

Writes the next section of a long-form report.
Parameters: - query: Research query string - draft: Current report draft - section_title: Title of the section to write - section_content: Content/guidance for the section
Returns: LongWriterOutput with updated draft.
write_report¶
```python
async def write_report(
    self,
    query: str,
    report_title: str,
    report_draft: ReportDraft
) -> str
```

Generates final report from draft.
Parameters: - query: Research query string - report_title: Title of the report - report_draft: Complete report draft
Returns: Final markdown report string.
ProofreaderAgent¶
Module: src.agents.proofreader
Purpose: Proofreads and polishes report drafts.
Methods¶

proofread¶
```python
async def proofread(
    self,
    query: str,
    report_title: str,
    report_draft: ReportDraft
) -> str
```

Proofreads and polishes a report draft.
Parameters: - query: Research query string - report_title: Title of the report - report_draft: Report draft to proofread
Returns: Polished markdown string.
ThinkingAgent¶
Module: src.agents.thinking
Purpose: Generates observations from conversation history.
Methods¶

generate_observations¶
```python
async def generate_observations(
    self,
    query: str,
    background_context: str,
    conversation_history: Conversation
) -> str
```

Generates observations from conversation history.
Parameters: - query: Research query string - background_context: Background context - conversation_history: Conversation history
Returns: Observation string.
InputParserAgent¶
Module: src.agents.input_parser
Purpose: Parses and improves user queries, detects research mode.
Methods¶

parse_query¶
```python
async def parse_query(
    self,
    query: str
) -> ParsedQuery
```

Parses and improves a user query.
Parameters: - query: Original query string
Returns: ParsedQuery with: - original_query: Original query string - improved_query: Refined query string - research_mode: \"iterative\" or \"deep\" - key_entities: List of key entities - research_questions: List of research questions
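A sketch of routing on the detected research_mode. The dataclass mirrors the documented fields (as a stand-in for the Pydantic model), and the choose_flow helper is illustrative, not the project's actual dispatch code.

```python
from dataclasses import dataclass, field

@dataclass
class ParsedQuery:
    original_query: str
    improved_query: str
    research_mode: str  # "iterative" or "deep"
    key_entities: list[str] = field(default_factory=list)
    research_questions: list[str] = field(default_factory=list)

def choose_flow(parsed: ParsedQuery) -> str:
    # Route deep queries to the multi-section flow, everything else
    # to the single-loop flow.
    if parsed.research_mode == "deep":
        return "DeepResearchFlow"
    return "IterativeResearchFlow"

parsed = ParsedQuery(
    original_query="metformin aging",
    improved_query="metformin as an anti-aging intervention",
    research_mode="deep",
)
flow = choose_flow(parsed)
```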
All agents have factory functions in src.agent_factory.agents:
Parameters: - model: Optional Pydantic AI model. If None, uses get_model() from settings.
Returns: Agent instance.
See Also¶

Models API¶
This page documents the Pydantic models used throughout DeepCritical.
Evidence¶
Module: src.utils.models
Purpose: Represents evidence from search results.
Fields: - citation: Citation information (title, URL, date, authors) - content: Evidence text content - relevance_score: Relevance score (0.0-1.0) - metadata: Additional metadata dictionary
Citation¶
Module: src.utils.models
Purpose: Citation information for evidence.
Fields: - title: Article/trial title - url: Source URL - date: Publication date (optional) - authors: List of authors (optional)
KnowledgeGapOutput¶
Module: src.utils.models
Purpose: Output from knowledge gap evaluation.
Fields: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps
AgentSelectionPlan¶
Module: src.utils.models
Purpose: Plan for tool/agent selection.
Fields: - tasks: List of agent tasks to execute
AgentTask¶
Module: src.utils.models
Purpose: Individual agent task.
Fields: - agent_name: Name of agent to use - query: Task query - context: Additional context dictionary
ReportDraft¶
Module: src.utils.models
Purpose: Draft structure for long-form reports.
Fields: - title: Report title - sections: List of report sections - references: List of citations
ReportSection¶
Module: src.utils.models
Purpose: Individual section in a report draft.
Fields: - title: Section title - content: Section content - order: Section order number
ParsedQuery¶
Module: src.utils.models
Purpose: Parsed and improved query.
Fields: - original_query: Original query string - improved_query: Refined query string - research_mode: Research mode (\"iterative\" or \"deep\") - key_entities: List of key entities - research_questions: List of research questions
Conversation¶
Module: src.utils.models
Purpose: Conversation history with iterations.
Fields: - iterations: List of iteration data
IterationData¶
Module: src.utils.models
Purpose: Data for a single iteration.
Fields: - iteration: Iteration number - observations: Generated observations - knowledge_gaps: Identified knowledge gaps - tool_calls: Tool calls made - findings: Findings from tools - thoughts: Agent thoughts
AgentEvent¶
Module: src.utils.models
Purpose: Event emitted during research execution.
Fields: - type: Event type (e.g., \"started\", \"search_complete\", \"complete\") - iteration: Iteration number (optional) - data: Event data dictionary
BudgetStatus¶
Module: src.utils.models
Purpose: Current budget status.
Fields: - tokens_used: Tokens used so far - tokens_limit: Token limit - time_elapsed_seconds: Elapsed time in seconds - time_limit_seconds: Time limit in seconds - iterations: Current iteration count - iterations_limit: Iteration limit
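A sketch of how these fields might drive a stop condition. The dataclass mirrors the documented fields (as a stand-in for the Pydantic model); the any-budget-spent rule is an illustrative assumption, not the project's implementation.

```python
from dataclasses import dataclass

@dataclass
class BudgetStatus:
    tokens_used: int
    tokens_limit: int
    time_elapsed_seconds: float
    time_limit_seconds: float
    iterations: int
    iterations_limit: int

def budget_exhausted(b: BudgetStatus) -> bool:
    # Stop as soon as any one of the three budgets is spent.
    return (
        b.tokens_used >= b.tokens_limit
        or b.time_elapsed_seconds >= b.time_limit_seconds
        or b.iterations >= b.iterations_limit
    )

# Iteration budget hit, even though tokens and time remain.
status = BudgetStatus(90_000, 100_000, 120.0, 600.0, 10, 10)
```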
This page documents the API for DeepCritical orchestrators.
IterativeResearchFlow¶
Module: src.orchestrator.research_flow
Purpose: Single-loop research with search-judge-synthesize cycles.
Methods¶

run¶
```python
async def run(
    self,
    query: str,
    background_context: str = "",
    max_iterations: int | None = None,
    max_time_minutes: float | None = None,
    token_budget: int | None = None
) -> AsyncGenerator[AgentEvent, None]
```

Runs iterative research flow.
Parameters: - query: Research query string - background_context: Background context (default: \"\") - max_iterations: Maximum iterations (default: from settings) - max_time_minutes: Maximum time in minutes (default: from settings) - token_budget: Token budget (default: from settings)
Yields: AgentEvent objects for: - started: Research started - search_complete: Search completed - judge_complete: Evidence evaluation completed - synthesizing: Generating report - complete: Research completed - error: Error occurred
DeepResearchFlow¶
Module: src.orchestrator.research_flow
Purpose: Multi-section parallel research with planning and synthesis.
Methods¶

run¶
```python
async def run(
    self,
    query: str,
    background_context: str = "",
    max_iterations_per_section: int | None = None,
    max_time_minutes: float | None = None,
    token_budget: int | None = None
) -> AsyncGenerator[AgentEvent, None]
```

Runs deep research flow.
Parameters: - query: Research query string - background_context: Background context (default: \"\") - max_iterations_per_section: Maximum iterations per section (default: from settings) - max_time_minutes: Maximum time in minutes (default: from settings) - token_budget: Token budget (default: from settings)
Yields: AgentEvent objects for: - started: Research started - planning: Creating research plan - looping: Running parallel research loops - synthesizing: Synthesizing results - complete: Research completed - error: Error occurred
GraphOrchestrator¶
Module: src.orchestrator.graph_orchestrator
Purpose: Graph-based execution using Pydantic AI agents as nodes.
Methods¶

run¶
```python
async def run(
    self,
    query: str,
    research_mode: str = "auto",
    use_graph: bool = True
) -> AsyncGenerator[AgentEvent, None]
```

Runs graph-based research orchestration.
Parameters: - query: Research query string - research_mode: Research mode (\"iterative\", \"deep\", or \"auto\") - use_graph: Whether to use graph execution (default: True)
Yields: AgentEvent objects during graph execution.
Module: src.orchestrator_factory
Purpose: Factory for creating orchestrators.
Functions¶

create_orchestrator¶
```python
def create_orchestrator(
    search_handler: SearchHandlerProtocol,
    judge_handler: JudgeHandlerProtocol,
    config: dict[str, Any],
    mode: str | None = None
) -> Any
```

Creates an orchestrator instance.
Parameters: - search_handler: Search handler protocol implementation - judge_handler: Judge handler protocol implementation - config: Configuration dictionary - mode: Orchestrator mode (\"simple\", \"advanced\", \"magentic\", or None for auto-detect)
Returns: Orchestrator instance.
Raises: - ValueError: If requirements not met
Modes: - \"simple\": Legacy orchestrator - \"advanced\" or \"magentic\": Magentic orchestrator (requires OpenAI API key) - None: Auto-detect based on API key availability
MagenticOrchestrator¶
Module: src.orchestrator_magentic
Purpose: Multi-agent coordination using Microsoft Agent Framework.
Methods¶

run¶
```python
async def run(
    self,
    query: str,
    max_rounds: int = 15,
    max_stalls: int = 3
) -> AsyncGenerator[AgentEvent, None]
```

Runs Magentic orchestration.
Parameters: - query: Research query string - max_rounds: Maximum rounds (default: 15) - max_stalls: Maximum stalls before reset (default: 3)
Yields: AgentEvent objects converted from Magentic events.
Requirements: - agent-framework-core package - OpenAI API key
This page documents the API for DeepCritical services.
EmbeddingService¶
Module: src.services.embeddings
Purpose: Local sentence-transformers for semantic search and deduplication.
Methods¶

embed¶
```python
async def embed(self, text: str) -> list[float]
```

Generates embedding for a text string.
Parameters: - text: Text to embed
Returns: Embedding vector as list of floats.
embed_batch¶
```python
async def embed_batch(self, texts: list[str]) -> list[list[float]]
```

Generates embeddings for multiple texts.
Parameters: - texts: List of texts to embed
Returns: List of embedding vectors.
similarity¶
```python
async def similarity(self, text1: str, text2: str) -> float
```

Calculates similarity between two texts.
Parameters: - text1: First text - text2: Second text
Returns: Similarity score (0.0-1.0).
find_duplicates¶
```python
async def find_duplicates(
    self,
    texts: list[str],
    threshold: float = 0.85
) -> list[tuple[int, int]]
```

Finds duplicate texts based on similarity threshold.
Parameters: - texts: List of texts to check - threshold: Similarity threshold (default: 0.85)
Returns: List of (index1, index2) tuples for duplicate pairs.
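The pairwise thresholding behind find_duplicates can be sketched as follows. This is a simplified stand-in operating directly on embedding vectors (the real service embeds the texts first via sentence-transformers):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two non-zero vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def find_duplicates(
    vectors: list[list[float]], threshold: float = 0.85
) -> list[tuple[int, int]]:
    # Every pair at or above the threshold is reported as a duplicate.
    pairs: list[tuple[int, int]] = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```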
"},{"location":"api/services/#factory-function","title":"Factory Function","text":""},{"location":"api/services/#get_embedding_service","title":"get_embedding_service","text":"@lru_cache(maxsize=1)\ndef get_embedding_service() -> EmbeddingService\n Returns singleton EmbeddingService instance.
"},{"location":"api/services/#llamaindexragservice","title":"LlamaIndexRAGService","text":"Module: src.services.rag
Purpose: Retrieval-Augmented Generation using LlamaIndex.
"},{"location":"api/services/#methods_1","title":"Methods","text":""},{"location":"api/services/#ingest_evidence","title":"ingest_evidence","text":"async def ingest_evidence(self, evidence: list[Evidence]) -> None\n Ingests evidence into RAG service.
Parameters: - evidence: List of Evidence objects to ingest
Note: Requires OpenAI API key for embeddings.
"},{"location":"api/services/#retrieve","title":"retrieve","text":"async def retrieve(\n self,\n query: str,\n top_k: int = 5\n) -> list[Document]\n Retrieves relevant documents for a query.
Parameters: - query: Search query string - top_k: Number of top results to return (default: 5)
Returns: List of Document objects with metadata.
"},{"location":"api/services/#query","title":"query","text":"async def query(\n self,\n query: str,\n top_k: int = 5\n) -> str\n Queries RAG service and returns formatted results.
Parameters: - query: Search query string - top_k: Number of top results to return (default: 5)
Returns: Formatted query results as string.
"},{"location":"api/services/#factory-function_1","title":"Factory Function","text":""},{"location":"api/services/#get_rag_service","title":"get_rag_service","text":"@lru_cache(maxsize=1)\ndef get_rag_service() -> LlamaIndexRAGService | None\n Returns singleton LlamaIndexRAGService instance, or None if OpenAI key not available.
"},{"location":"api/services/#statisticalanalyzer","title":"StatisticalAnalyzer","text":"Module: src.services.statistical_analyzer
Purpose: Secure execution of AI-generated statistical code.
"},{"location":"api/services/#methods_2","title":"Methods","text":""},{"location":"api/services/#analyze","title":"analyze","text":"async def analyze(\n self,\n hypothesis: str,\n evidence: list[Evidence],\n data_description: str | None = None\n) -> AnalysisResult\n Analyzes a hypothesis using statistical methods.
Parameters: - hypothesis: Hypothesis to analyze - evidence: List of Evidence objects - data_description: Optional data description
Returns: AnalysisResult with: - verdict: SUPPORTED, REFUTED, or INCONCLUSIVE - code: Generated analysis code - output: Execution output - error: Error message if execution failed
Note: Requires Modal credentials for sandbox execution.
"},{"location":"api/services/#see-also","title":"See Also","text":"This page documents the API for DeepCritical search tools.
"},{"location":"api/tools/#searchtool-protocol","title":"SearchTool Protocol","text":"All tools implement the SearchTool protocol:
class SearchTool(Protocol):\n @property\n def name(self) -> str: ...\n \n async def search(\n self, \n query: str, \n max_results: int = 10\n ) -> list[Evidence]: ...\n"},{"location":"api/tools/#pubmedtool","title":"PubMedTool","text":"Module: src.tools.pubmed
Purpose: Search peer-reviewed biomedical literature from PubMed.
"},{"location":"api/tools/#properties","title":"Properties","text":""},{"location":"api/tools/#name","title":"name","text":"@property\ndef name(self) -> str\n Returns tool name: \"pubmed\"
search","text":"async def search(\n self,\n query: str,\n max_results: int = 10\n) -> list[Evidence]\n Searches PubMed for articles.
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: List of Evidence objects with PubMed articles.
Raises: - SearchError: If search fails - RateLimitError: If rate limit is exceeded
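Any object with a matching name property and search() coroutine satisfies the SearchTool protocol structurally; no inheritance is needed. A toy conforming tool, with a deliberately minimal Evidence stand-in (the real model also carries citation metadata):

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol

@dataclass
class Evidence:
    # Minimal stand-in for the project's Evidence model.
    content: str
    url: str

class SearchTool(Protocol):
    @property
    def name(self) -> str: ...
    async def search(self, query: str, max_results: int = 10) -> list[Evidence]: ...

class StaticTool:
    # Toy tool: satisfies the protocol without subclassing anything.
    @property
    def name(self) -> str:
        return "static"

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        return [Evidence(content=f"result for {query}", url="https://example.org/1")]

results = asyncio.run(StaticTool().search("aspirin"))
```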
Module: src.tools.clinicaltrials
Purpose: Search ClinicalTrials.gov for interventional studies.
"},{"location":"api/tools/#properties_1","title":"Properties","text":""},{"location":"api/tools/#name_1","title":"name","text":"@property\ndef name(self) -> str\n Returns tool name: \"clinicaltrials\"
search","text":"async def search(\n self,\n query: str,\n max_results: int = 10\n) -> list[Evidence]\n Searches ClinicalTrials.gov for trials.
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: List of Evidence objects with clinical trials.
Note: Only returns interventional studies with status: COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, ENROLLING_BY_INVITATION
Raises: - SearchError: If search fails
Module: src.tools.europepmc
Purpose: Search Europe PMC for preprints and peer-reviewed articles.
"},{"location":"api/tools/#properties_2","title":"Properties","text":""},{"location":"api/tools/#name_2","title":"name","text":"@property\ndef name(self) -> str\n Returns tool name: \"europepmc\"
search","text":"async def search(\n self,\n query: str,\n max_results: int = 10\n) -> list[Evidence]\n Searches Europe PMC for articles and preprints.
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: List of Evidence objects with articles/preprints.
Note: Includes both preprints (marked with [PREPRINT - Not peer-reviewed]) and peer-reviewed articles.
Raises: - SearchError: If search fails
Module: src.tools.rag_tool
Purpose: Semantic search within collected evidence.
"},{"location":"api/tools/#properties_3","title":"Properties","text":""},{"location":"api/tools/#name_3","title":"name","text":"@property\ndef name(self) -> str\n Returns tool name: \"rag\"
search","text":"async def search(\n self,\n query: str,\n max_results: int = 10\n) -> list[Evidence]\n Searches collected evidence using semantic similarity.
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: List of Evidence objects from collected evidence.
Note: Requires evidence to be ingested into RAG service first.
"},{"location":"api/tools/#searchhandler","title":"SearchHandler","text":"Module: src.tools.search_handler
Purpose: Orchestrates parallel searches across multiple tools.
"},{"location":"api/tools/#methods_4","title":"Methods","text":""},{"location":"api/tools/#search_4","title":"search","text":"async def search(\n self,\n query: str,\n tools: list[SearchTool] | None = None,\n max_results_per_tool: int = 10\n) -> SearchResult\n Searches multiple tools in parallel.
Parameters: - query: Search query string - tools: List of tools to use (default: all available tools) - max_results_per_tool: Maximum results per tool (default: 10)
Returns: SearchResult with: - evidence: Aggregated list of evidence - tool_results: Results per tool - total_count: Total number of results
Note: Uses asyncio.gather() for parallel execution. Handles tool failures gracefully.
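The graceful-failure behavior rests on asyncio.gather(return_exceptions=True): a failing tool yields an exception object in the results list instead of aborting the batch. A minimal sketch with stand-in tools (the real handler aggregates Evidence objects and logs warnings):

```python
import asyncio

async def tool_ok(query: str) -> list[str]:
    return [f"{query}:hit"]

async def tool_broken(query: str) -> list[str]:
    raise RuntimeError("upstream 500")

async def search_all(query: str) -> list[str]:
    # return_exceptions=True keeps one failing tool from sinking the batch.
    outcomes = await asyncio.gather(
        tool_ok(query), tool_broken(query), return_exceptions=True
    )
    evidence: list[str] = []
    for outcome in outcomes:
        if isinstance(outcome, Exception):
            continue  # the real handler logs a warning here
        evidence.extend(outcome)
    return evidence
```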
DeepCritical uses Pydantic AI agents for all AI-powered operations. All agents follow a consistent pattern and use structured output types.
"},{"location":"architecture/agents/#agent-pattern","title":"Agent Pattern","text":"All agents use the Pydantic AI Agent class with the following structure:
- Constructor: __init__(model: Any | None = None) - Primary async methods (e.g., async def evaluate(), async def write_report()) - Factory function: def create_agent_name(model: Any | None = None) -> AgentName Agents use get_model() from src/agent_factory/judges.py if no model is provided. This supports:
The model selection is based on the configured LLM_PROVIDER in settings.
Agents return fallback values on failure rather than raising exceptions:
Example fallback: KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...]). All errors are logged with context using structlog.
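The fallback-instead-of-raise pattern can be sketched as below. The KnowledgeGapOutput stand-in and the evaluate() wrapper are simplified; the real agents use Pydantic models and structlog-based logging:

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeGapOutput:
    # Simplified stand-in for the project's Pydantic model.
    research_complete: bool = False
    outstanding_gaps: list[str] = field(default_factory=list)

def evaluate(run_model) -> KnowledgeGapOutput:
    # run_model is the underlying LLM call, which may fail transiently.
    try:
        return run_model()
    except Exception:
        # Degrade gracefully instead of raising; the flow keeps iterating.
        return KnowledgeGapOutput(
            research_complete=False,
            outstanding_gaps=["evaluation failed; retry next iteration"],
        )
```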
"},{"location":"architecture/agents/#input-validation","title":"Input Validation","text":"All agents validate inputs:
Agents use structured output types from src/utils/models.py:
- KnowledgeGapOutput: Research completeness evaluation - AgentSelectionPlan: Tool selection plan - ReportDraft: Long-form report structure - ParsedQuery: Query parsing and mode detection For text output (writer agents), agents return str directly.
File: src/agents/knowledge_gap.py
Purpose: Evaluates research state and identifies knowledge gaps.
Output: KnowledgeGapOutput with: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps
Methods: - async def evaluate(query, background_context, conversation_history, iteration, time_elapsed_minutes, max_time_minutes) -> KnowledgeGapOutput
File: src/agents/tool_selector.py
Purpose: Selects appropriate tools for addressing knowledge gaps.
Output: AgentSelectionPlan with list of AgentTask objects.
Available Agents: - WebSearchAgent: General web search for fresh information - SiteCrawlerAgent: Research specific entities/companies - RAGAgent: Semantic search within collected evidence
File: src/agents/writer.py
Purpose: Generates final reports from research findings.
Output: Markdown string with numbered citations.
Methods: - async def write_report(query, findings, output_length, output_instructions) -> str
Features: - Validates inputs - Truncates very long findings (max 50000 chars) with warning - Retry logic for transient failures (3 retries) - Citation validation before returning
"},{"location":"architecture/agents/#long-writer-agent","title":"Long Writer Agent","text":"File: src/agents/long_writer.py
Purpose: Long-form report generation with section-by-section writing.
Input/Output: Uses ReportDraft models.
Methods: - async def write_next_section(query, draft, section_title, section_content) -> LongWriterOutput - async def write_report(query, report_title, report_draft) -> str
Features: - Writes sections iteratively - Aggregates references across sections - Reformats section headings and references - Deduplicates and renumbers references
"},{"location":"architecture/agents/#proofreader-agent","title":"Proofreader Agent","text":"File: src/agents/proofreader.py
Purpose: Proofreads and polishes report drafts.
Input: ReportDraft Output: Polished markdown string
Methods: - async def proofread(query, report_title, report_draft) -> str
Features: - Removes duplicate content across sections - Adds executive summary if multiple sections - Preserves all references and citations - Improves flow and readability
"},{"location":"architecture/agents/#thinking-agent","title":"Thinking Agent","text":"File: src/agents/thinking.py
Purpose: Generates observations from conversation history.
Output: Observation string
Methods: - async def generate_observations(query, background_context, conversation_history) -> str
File: src/agents/input_parser.py
Purpose: Parses and improves user queries, detects research mode.
Output: ParsedQuery with: - original_query: Original query string - improved_query: Refined query string - research_mode: \"iterative\" or \"deep\" - key_entities: List of key entities - research_questions: List of research questions
All agents have factory functions in src/agent_factory/agents.py:
Factory functions: - Use get_model() if no model provided - Raise ConfigurationError if creation fails - Log agent creation
Phase 4 implements a graph-based orchestration system for research workflows using Pydantic AI agents as nodes. This enables better parallel execution, conditional routing, and state management compared to simple agent chains.
"},{"location":"architecture/graph-orchestration/#graph-structure","title":"Graph Structure","text":""},{"location":"architecture/graph-orchestration/#nodes","title":"Nodes","text":"Graph nodes represent different stages in the research workflow:
Examples: KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent
State Nodes: Update or read workflow state
Examples: Update evidence, update conversation history
Decision Nodes: Make routing decisions based on conditions
Examples: Continue research vs. complete research
Parallel Nodes: Execute multiple nodes concurrently
Edges define transitions between nodes:
Condition: None (always True)
Conditional Edges: Traversed based on condition
Example: If research complete \u2192 go to writer, else \u2192 continue loop
Parallel Edges: Used for parallel execution branches
[Input] \u2192 [Thinking] \u2192 [Knowledge Gap] \u2192 [Decision: Complete?]\n \u2193 No \u2193 Yes\n [Tool Selector] [Writer]\n \u2193\n [Execute Tools] \u2192 [Loop Back]\n"},{"location":"architecture/graph-orchestration/#deep-research-graph","title":"Deep Research Graph","text":"[Input] \u2192 [Planner] \u2192 [Parallel Iterative Loops] \u2192 [Synthesizer]\n \u2193 \u2193 \u2193\n [Loop1] [Loop2] [Loop3]\n"},{"location":"architecture/graph-orchestration/#state-management","title":"State Management","text":"State is managed via WorkflowState using ContextVar for thread-safe isolation:
State transitions occur at state nodes, which update the global workflow state.
"},{"location":"architecture/graph-orchestration/#execution-flow","title":"Execution Flow","text":"asyncio.gather() for parallel nodesDecision nodes evaluate conditions and return next node IDs:
Example: research_complete \u2192 writer, else \u2192 tool selector. Parallel nodes execute multiple nodes concurrently:
Budget constraints are enforced at decision nodes:
If any budget is exceeded, execution routes to exit node.
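A decision node's routing logic can be sketched as a pure function from budget state to the next node ID. The Budget fields and node names here are illustrative, not the project's exact schema:

```python
from dataclasses import dataclass

@dataclass
class Budget:
    # Hypothetical budget snapshot; the real tracker also counts iterations.
    tokens_used: int
    token_limit: int
    elapsed_s: float
    time_limit_s: float

def next_node(budget: Budget, research_complete: bool) -> str:
    # Any exceeded budget routes straight to the exit node.
    if budget.tokens_used >= budget.token_limit or budget.elapsed_s >= budget.time_limit_s:
        return "exit"
    # Otherwise route on the knowledge-gap verdict.
    return "writer" if research_complete else "tool_selector"
```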
"},{"location":"architecture/graph-orchestration/#error-handling","title":"Error Handling","text":"Errors are handled at multiple levels:
Errors are logged and yield error events for UI.
"},{"location":"architecture/graph-orchestration/#backward-compatibility","title":"Backward Compatibility","text":"Graph execution is optional via feature flag:
- USE_GRAPH_EXECUTION=true: Use graph-based execution - USE_GRAPH_EXECUTION=false: Use agent chain execution (existing) This allows gradual migration and fallback if needed.
"},{"location":"architecture/graph_orchestration/","title":"Graph Orchestration Architecture","text":""},{"location":"architecture/graph_orchestration/#graph-patterns","title":"Graph Patterns","text":""},{"location":"architecture/graph_orchestration/#iterative-research-graph","title":"Iterative Research Graph","text":"[Input] \u2192 [Thinking] \u2192 [Knowledge Gap] \u2192 [Decision: Complete?]\n \u2193 No \u2193 Yes\n [Tool Selector] [Writer]\n \u2193\n [Execute Tools] \u2192 [Loop Back]\n"},{"location":"architecture/graph_orchestration/#deep-research-graph","title":"Deep Research Graph","text":"[Input] \u2192 [Planner] \u2192 [Parallel Iterative Loops] \u2192 [Synthesizer]\n \u2193 \u2193 \u2193\n [Loop1] [Loop2] [Loop3]\n"},{"location":"architecture/graph_orchestration/#deep-research","title":"Deep Research","text":"\nsequenceDiagram\n actor User\n participant GraphOrchestrator\n participant InputParser\n participant GraphBuilder\n participant GraphExecutor\n participant Agent\n participant BudgetTracker\n participant WorkflowState\n\n User->>GraphOrchestrator: run(query)\n GraphOrchestrator->>InputParser: detect_research_mode(query)\n InputParser-->>GraphOrchestrator: mode (iterative/deep)\n GraphOrchestrator->>GraphBuilder: build_graph(mode)\n GraphBuilder-->>GraphOrchestrator: ResearchGraph\n GraphOrchestrator->>WorkflowState: init_workflow_state()\n GraphOrchestrator->>BudgetTracker: create_budget()\n GraphOrchestrator->>GraphExecutor: _execute_graph(graph)\n \n loop For each node in graph\n GraphExecutor->>Agent: execute_node(agent_node)\n Agent->>Agent: process_input\n Agent-->>GraphExecutor: result\n GraphExecutor->>WorkflowState: update_state(result)\n GraphExecutor->>BudgetTracker: add_tokens(used)\n GraphExecutor->>BudgetTracker: check_budget()\n alt Budget exceeded\n GraphExecutor->>GraphOrchestrator: emit(error_event)\n else Continue\n GraphExecutor->>GraphOrchestrator: emit(progress_event)\n end\n end\n \n GraphOrchestrator->>User: 
AsyncGenerator[AgentEvent]\n"},{"location":"architecture/graph_orchestration/#iterative-research","title":"Iterative Research","text":"sequenceDiagram\n participant IterativeFlow\n participant ThinkingAgent\n participant KnowledgeGapAgent\n participant ToolSelector\n participant ToolExecutor\n participant JudgeHandler\n participant WriterAgent\n\n IterativeFlow->>IterativeFlow: run(query)\n \n loop Until complete or max_iterations\n IterativeFlow->>ThinkingAgent: generate_observations()\n ThinkingAgent-->>IterativeFlow: observations\n \n IterativeFlow->>KnowledgeGapAgent: evaluate_gaps()\n KnowledgeGapAgent-->>IterativeFlow: KnowledgeGapOutput\n \n alt Research complete\n IterativeFlow->>WriterAgent: create_final_report()\n WriterAgent-->>IterativeFlow: final_report\n else Gaps remain\n IterativeFlow->>ToolSelector: select_agents(gap)\n ToolSelector-->>IterativeFlow: AgentSelectionPlan\n \n IterativeFlow->>ToolExecutor: execute_tool_tasks()\n ToolExecutor-->>IterativeFlow: ToolAgentOutput[]\n \n IterativeFlow->>JudgeHandler: assess_evidence()\n JudgeHandler-->>IterativeFlow: should_continue\n end\n end"},{"location":"architecture/graph_orchestration/#graph-structure","title":"Graph Structure","text":""},{"location":"architecture/graph_orchestration/#nodes","title":"Nodes","text":"Graph nodes represent different stages in the research workflow:
Examples: KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent
State Nodes: Update or read workflow state
Examples: Update evidence, update conversation history
Decision Nodes: Make routing decisions based on conditions
Examples: Continue research vs. complete research
Parallel Nodes: Execute multiple nodes concurrently
Edges define transitions between nodes:
Condition: None (always True)
Conditional Edges: Traversed based on condition
Example: If research complete \u2192 go to writer, else \u2192 continue loop
Parallel Edges: Used for parallel execution branches
State is managed via WorkflowState using ContextVar for thread-safe isolation:
State transitions occur at state nodes, which update the global workflow state.
"},{"location":"architecture/graph_orchestration/#execution-flow","title":"Execution Flow","text":"asyncio.gather() for parallel nodesDecision nodes evaluate conditions and return next node IDs:
Example: research_complete \u2192 writer, else \u2192 tool selector. Parallel nodes execute multiple nodes concurrently:
Budget constraints are enforced at decision nodes:
If any budget is exceeded, execution routes to exit node.
"},{"location":"architecture/graph_orchestration/#error-handling","title":"Error Handling","text":"Errors are handled at multiple levels:
Errors are logged and yield error events for UI.
"},{"location":"architecture/graph_orchestration/#backward-compatibility","title":"Backward Compatibility","text":"Graph execution is optional via feature flag:
- USE_GRAPH_EXECUTION=true: Use graph-based execution - USE_GRAPH_EXECUTION=false: Use agent chain execution (existing) This allows gradual migration and fallback if needed.
"},{"location":"architecture/graph_orchestration/#see-also","title":"See Also","text":"DeepCritical uses middleware for state management, budget tracking, and workflow coordination.
"},{"location":"architecture/middleware/#state-management","title":"State Management","text":""},{"location":"architecture/middleware/#workflowstate","title":"WorkflowState","text":"File: src/middleware/state_machine.py
Purpose: Thread-safe state management for research workflows
Implementation: Uses ContextVar for thread-safe isolation
State Components: - evidence: list[Evidence]: Collected evidence from searches - conversation: Conversation: Iteration history (gaps, tool calls, findings, thoughts) - embedding_service: Any: Embedding service for semantic search
Methods: - add_evidence(evidence: Evidence): Adds evidence with URL-based deduplication - async search_related(query: str, top_k: int = 5) -> list[Evidence]: Semantic search
Initialization:
Access:
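The initialization and access pattern can be sketched with ContextVar directly. The names init_workflow_state/get_workflow_state and the Evidence fields are assumptions for illustration; the real module also wires in the conversation history and embedding service:

```python
from contextvars import ContextVar
from dataclasses import dataclass, field

@dataclass
class Evidence:
    url: str
    content: str

@dataclass
class WorkflowState:
    evidence: list = field(default_factory=list)

    def add_evidence(self, item: Evidence) -> None:
        # URL-based deduplication, as described above.
        if all(existing.url != item.url for existing in self.evidence):
            self.evidence.append(item)

# ContextVar gives each task/thread its own isolated state.
_state = ContextVar("workflow_state", default=None)

def init_workflow_state() -> WorkflowState:
    state = WorkflowState()
    _state.set(state)
    return state

def get_workflow_state() -> WorkflowState:
    state = _state.get()
    if state is None:
        raise RuntimeError("workflow state not initialized")
    return state
```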
"},{"location":"architecture/middleware/#workflow-manager","title":"Workflow Manager","text":"File: src/middleware/workflow_manager.py
Purpose: Coordinates parallel research loops
Methods: - add_loop(loop: ResearchLoop): Add a research loop to manage - async run_loops_parallel() -> list[ResearchLoop]: Run all loops in parallel - update_loop_status(loop_id: str, status: str): Update loop status - sync_loop_evidence_to_state(): Synchronize evidence from loops to global state
Features: - Uses asyncio.gather() for parallel execution - Handles errors per loop (doesn't fail all if one fails) - Tracks loop status: pending, running, completed, failed, cancelled - Evidence deduplication across parallel loops
Usage:
from src.middleware.workflow_manager import WorkflowManager\n\nmanager = WorkflowManager()\nmanager.add_loop(loop1)\nmanager.add_loop(loop2)\ncompleted_loops = await manager.run_loops_parallel()\n"},{"location":"architecture/middleware/#budget-tracker","title":"Budget Tracker","text":"File: src/middleware/budget_tracker.py
Purpose: Tracks and enforces resource limits
Budget Components: - Tokens: LLM token usage - Time: Elapsed time in seconds - Iterations: Number of iterations
Methods: - create_budget(token_limit, time_limit_seconds, iterations_limit) -> BudgetStatus - add_tokens(tokens: int): Add token usage - start_timer(): Start time tracking - update_timer(): Update elapsed time - increment_iteration(): Increment iteration count - check_budget() -> BudgetStatus: Check current budget status - can_continue() -> bool: Check if research can continue
Token Estimation: - estimate_tokens(text: str) -> int: ~4 chars per token - estimate_llm_call_tokens(prompt: str, response: str) -> int: Estimate LLM call tokens
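The ~4 characters-per-token heuristic amounts to an integer division; a minimal sketch (the exact rounding in the project may differ):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic from the docs: ~4 characters per token.
    return max(1, len(text) // 4)

def estimate_llm_call_tokens(prompt: str, response: str) -> int:
    # An LLM call is billed for both the prompt and the completion.
    return estimate_tokens(prompt) + estimate_tokens(response)
```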
Usage:
from src.middleware.budget_tracker import BudgetTracker\n\ntracker = BudgetTracker()\nbudget = tracker.create_budget(\n token_limit=100000,\n time_limit_seconds=600,\n iterations_limit=10\n)\ntracker.start_timer()\n# ... research operations ...\nif not tracker.can_continue():\n # Budget exceeded, stop research\n pass\n"},{"location":"architecture/middleware/#models","title":"Models","text":"All middleware models are defined in src/utils/models.py:
- IterationData: Data for a single iteration - Conversation: Conversation history with iterations - ResearchLoop: Research loop state and configuration - BudgetStatus: Current budget status All middleware components use ContextVar for thread-safe isolation:
DeepCritical supports multiple orchestration patterns for research workflows.
"},{"location":"architecture/orchestrators/#research-flows","title":"Research Flows","text":""},{"location":"architecture/orchestrators/#iterativeresearchflow","title":"IterativeResearchFlow","text":"File: src/orchestrator/research_flow.py
Pattern: Generate observations \u2192 Evaluate gaps \u2192 Select tools \u2192 Execute \u2192 Judge \u2192 Continue/Complete
Agents Used: - KnowledgeGapAgent: Evaluates research completeness - ToolSelectorAgent: Selects tools for addressing gaps - ThinkingAgent: Generates observations - WriterAgent: Creates final report - JudgeHandler: Assesses evidence sufficiency
Features: - Tracks iterations, time, budget - Supports graph execution (use_graph=True) and agent chains (use_graph=False) - Iterates until research complete or constraints met
Usage:
"},{"location":"architecture/orchestrators/#deepresearchflow","title":"DeepResearchFlow","text":"File: src/orchestrator/research_flow.py
Pattern: Planner \u2192 Parallel iterative loops per section \u2192 Synthesizer
Agents Used: - PlannerAgent: Breaks query into report sections - IterativeResearchFlow: Per-section research (parallel) - LongWriterAgent or ProofreaderAgent: Final synthesis
Features: - Uses WorkflowManager for parallel execution - Budget tracking per section and globally - State synchronization across parallel loops - Supports graph execution and agent chains
Usage:
"},{"location":"architecture/orchestrators/#graph-orchestrator","title":"Graph Orchestrator","text":"File: src/orchestrator/graph_orchestrator.py
Purpose: Graph-based execution using Pydantic AI agents as nodes
Features: - Uses Pydantic AI Graphs (when available) or agent chains (fallback) - Routes based on research mode (iterative/deep/auto) - Streams AgentEvent objects for UI
Node Types: - Agent Nodes: Execute Pydantic AI agents - State Nodes: Update or read workflow state - Decision Nodes: Make routing decisions - Parallel Nodes: Execute multiple nodes concurrently
Edge Types: - Sequential Edges: Always traversed - Conditional Edges: Traversed based on condition - Parallel Edges: Used for parallel execution branches
"},{"location":"architecture/orchestrators/#orchestrator-factory","title":"Orchestrator Factory","text":"File: src/orchestrator_factory.py
Purpose: Factory for creating orchestrators
Modes: - Simple: Legacy orchestrator (backward compatible) - Advanced: Magentic orchestrator (requires OpenAI API key) - Auto-detect: Chooses based on API key availability
Usage:
"},{"location":"architecture/orchestrators/#magentic-orchestrator","title":"Magentic Orchestrator","text":"File: src/orchestrator_magentic.py
Purpose: Multi-agent coordination using Microsoft Agent Framework
Features: - Uses agent-framework-core - ChatAgent pattern with internal LLMs per agent - MagenticBuilder with participants: searcher, hypothesizer, judge, reporter - Manager orchestrates agents via OpenAIChatClient - Requires OpenAI API key (function calling support) - Event-driven: converts Magentic events to AgentEvent for UI streaming
Requirements: - agent-framework-core package - OpenAI API key
File: src/orchestrator_hierarchical.py
Purpose: Hierarchical orchestrator using middleware and sub-teams
Features: - Uses SubIterationMiddleware with ResearchTeam and LLMSubIterationJudge - Adapts Magentic ChatAgent to SubIterationTeam protocol - Event-driven via asyncio.Queue for coordination - Supports sub-iteration patterns for complex research tasks
File: src/legacy_orchestrator.py
Purpose: Linear search-judge-synthesize loop
Features: - Uses SearchHandlerProtocol and JudgeHandlerProtocol - Generator-based design yielding AgentEvent objects - Backward compatibility for simple use cases
All orchestrators must initialize workflow state:
"},{"location":"architecture/orchestrators/#event-streaming","title":"Event Streaming","text":"All orchestrators yield AgentEvent objects:
Event Types: - started: Research started - search_complete: Search completed - judge_complete: Evidence evaluation completed - hypothesizing: Generating hypotheses - synthesizing: Synthesizing results - complete: Research completed - error: Error occurred
Event Structure:
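A simplified sketch of the event structure, covering the types listed above (field names here are assumptions; the real AgentEvent model is defined in src/utils/models.py):

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class AgentEvent:
    # Hypothetical shape: a type tag plus free-form payload.
    type: str                 # e.g. "started", "search_complete", "error"
    message: str = ""
    data: dict = field(default_factory=dict)

EVENT_TYPES = {
    "started", "search_complete", "judge_complete",
    "hypothesizing", "synthesizing", "complete", "error",
}

def is_terminal(event: AgentEvent) -> bool:
    # A UI consumer stops streaming on completion or error.
    return event.type in {"complete", "error"}
```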
"},{"location":"architecture/orchestrators/#see-also","title":"See Also","text":"DeepCritical provides several services for embeddings, RAG, and statistical analysis.
"},{"location":"architecture/services/#embedding-service","title":"Embedding Service","text":"File: src/services/embeddings.py
Purpose: Local sentence-transformers for semantic search and deduplication
Features: - No API Key Required: Uses local sentence-transformers models - Async-Safe: All operations use run_in_executor() to avoid blocking - ChromaDB Storage: Vector storage for embeddings - Deduplication: 0.85 similarity threshold (85% similarity = duplicate)
Model: Configurable via settings.local_embedding_model (default: all-MiniLM-L6-v2)
Methods: - async def embed(text: str) -> list[float]: Generate embeddings - async def embed_batch(texts: list[str]) -> list[list[float]]: Batch embedding - async def similarity(text1: str, text2: str) -> float: Calculate similarity - async def find_duplicates(texts: list[str], threshold: float = 0.85) -> list[tuple[int, int]]: Find duplicates
Usage:
from src.services.embeddings import get_embedding_service\n\nservice = get_embedding_service()\nembedding = await service.embed(\"text to embed\")\n"},{"location":"architecture/services/#llamaindex-rag-service","title":"LlamaIndex RAG Service","text":"File: src/services/rag.py
Purpose: Retrieval-Augmented Generation using LlamaIndex
Features: - OpenAI Embeddings: Requires OPENAI_API_KEY - ChromaDB Storage: Vector database for document storage - Metadata Preservation: Preserves source, title, URL, date, authors - Lazy Initialization: Graceful fallback if OpenAI key not available
Methods: - async def ingest_evidence(evidence: list[Evidence]) -> None: Ingest evidence into RAG - async def retrieve(query: str, top_k: int = 5) -> list[Document]: Retrieve relevant documents - async def query(query: str, top_k: int = 5) -> str: Query with RAG
Usage:
from src.services.rag import get_rag_service\n\nservice = get_rag_service()\nif service:\n documents = await service.retrieve(\"query\", top_k=5)\n"},{"location":"architecture/services/#statistical-analyzer","title":"Statistical Analyzer","text":"File: src/services/statistical_analyzer.py
Purpose: Secure execution of AI-generated statistical code
Features: - Modal Sandbox: Secure, isolated execution environment - Code Generation: Generates Python code via LLM - Library Pinning: Version-pinned libraries in SANDBOX_LIBRARIES - Network Isolation: block_network=True by default
Libraries Available: - pandas, numpy, scipy - matplotlib, scikit-learn - statsmodels
Output: AnalysisResult with: - verdict: SUPPORTED, REFUTED, or INCONCLUSIVE - code: Generated analysis code - output: Execution output - error: Error message if execution failed
Usage:
from src.services.statistical_analyzer import StatisticalAnalyzer\n\nanalyzer = StatisticalAnalyzer()\nresult = await analyzer.analyze(\n hypothesis=\"Metformin reduces cancer risk\",\n evidence=evidence_list\n)\n"},{"location":"architecture/services/#singleton-pattern","title":"Singleton Pattern","text":"All services use the singleton pattern with @lru_cache(maxsize=1):
@lru_cache(maxsize=1)\ndef get_embedding_service() -> EmbeddingService:\n return EmbeddingService()\n This ensures: - Single instance per process - Lazy initialization - No dependencies required at import time
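The singleton guarantee follows directly from lru_cache: the factory body runs once and every later call returns the cached instance. A self-contained check (EmbeddingServiceStub stands in for the real service):

```python
from functools import lru_cache

class EmbeddingServiceStub:
    # Stand-in for the real EmbeddingService.
    pass

@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingServiceStub:
    # First call constructs; every later call returns the cached instance.
    return EmbeddingServiceStub()
```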
"},{"location":"architecture/services/#service-availability","title":"Service Availability","text":"Services check availability before use:
from src.utils.config import settings\n\nif settings.modal_available:\n # Use Modal sandbox\n pass\n\nif settings.has_openai_key:\n # Use OpenAI embeddings for RAG\n pass\n"},{"location":"architecture/services/#see-also","title":"See Also","text":"DeepCritical implements a protocol-based search tool system for retrieving evidence from multiple sources.
"},{"location":"architecture/tools/#searchtool-protocol","title":"SearchTool Protocol","text":"All tools implement the SearchTool protocol from src/tools/base.py:
All tools use the @retry decorator from tenacity:
@retry(\n stop=stop_after_attempt(3), \n wait=wait_exponential(...)\n)\nasync def search(self, query: str, max_results: int = 10) -> list[Evidence]:\n # Implementation\n Tools with API rate limits implement _rate_limit() method and use shared rate limiters from src/tools/rate_limiter.py.
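A rate limiter of the kind described (enforce a minimum interval between requests, shared across callers) can be sketched as below. This is illustrative and not the actual src/tools/rate_limiter.py implementation:

```python
import asyncio
import time

class MinIntervalLimiter:
    # Serializes callers and enforces a minimum gap between requests.
    def __init__(self, min_interval: float) -> None:
        self._min_interval = min_interval
        self._last = 0.0
        self._lock = asyncio.Lock()

    async def wait(self) -> None:
        async with self._lock:
            now = time.monotonic()
            delay = self._min_interval - (now - self._last)
            if delay > 0:
                await asyncio.sleep(delay)
            self._last = time.monotonic()
```

For PubMed, min_interval would be 0.34s without an NCBI key and 0.1s with one, per the rate limits documented below.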
Tools raise custom exceptions:
- SearchError: General search failures - RateLimitError: Rate limit exceeded Tools handle HTTP errors (429, 500, timeout) and return empty lists on non-critical errors (with warning logs).
"},{"location":"architecture/tools/#query-preprocessing","title":"Query Preprocessing","text":"Tools use preprocess_query() from src/tools/query_utils.py to:
All tools convert API responses to Evidence objects with:
Citation: Title, URL, date, authorscontent: Evidence textrelevance_score: 0.0-1.0 relevance scoremetadata: Additional metadataMissing fields are handled gracefully with defaults.
"},{"location":"architecture/tools/#tool-implementations","title":"Tool Implementations","text":""},{"location":"architecture/tools/#pubmed-tool","title":"PubMed Tool","text":"File: src/tools/pubmed.py
API: NCBI E-utilities (ESearch \u2192 EFetch)
Rate Limiting: - 0.34s between requests (3 req/sec without API key) - 0.1s between requests (10 req/sec with NCBI API key)
Features: - XML parsing with xmltodict - Handles single vs. multiple articles - Query preprocessing - Evidence conversion with metadata extraction
File: src/tools/clinicaltrials.py
API: ClinicalTrials.gov API v2
Important: Uses the requests library (NOT httpx) because the site's WAF blocks the httpx TLS fingerprint.
Execution: Runs in thread pool: await asyncio.to_thread(requests.get, ...)
Filtering: - Only interventional studies - Status: COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, ENROLLING_BY_INVITATION
Features: - Parses nested JSON structure - Extracts trial metadata - Evidence conversion
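The thread-pool pattern keeps the blocking `requests` call off the event loop. A sketch under stated assumptions (the endpoint path and query parameter names here are illustrative, not confirmed against the v2 API):

```python
import asyncio

import requests

# Assumed endpoint path for illustration
CTGOV_URL = "https://clinicaltrials.gov/api/v2/studies"


async def fetch_trials(query: str, max_results: int = 10) -> dict:
    """Fetch trials without blocking the event loop.

    requests (rather than httpx) is used deliberately: the WAF blocks the
    httpx TLS fingerprint. asyncio.to_thread runs the sync call in a worker.
    """
    def _get() -> dict:
        resp = requests.get(
            CTGOV_URL,
            params={"query.term": query, "pageSize": max_results},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()

    return await asyncio.to_thread(_get)
```

From async code this is awaited like any coroutine: `data = await fetch_trials("aspirin")`.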
"},{"location":"architecture/tools/#europe-pmc-tool","title":"Europe PMC Tool","text":"File: src/tools/europepmc.py
API: Europe PMC REST API
Features: - Handles preprint markers: [PREPRINT - Not peer-reviewed] - Builds URLs from DOI or PMID - Checks pubTypeList for preprint detection - Includes both preprints and peer-reviewed articles
File: src/tools/rag_tool.py
Purpose: Semantic search within collected evidence
Implementation: Wraps LlamaIndexRAGService
Features: - Returns Evidence from RAG results - Handles evidence ingestion - Semantic similarity search - Metadata preservation
"},{"location":"architecture/tools/#search-handler","title":"Search Handler","text":"File: src/tools/search_handler.py
Purpose: Orchestrates parallel searches across multiple tools
Features: - Uses asyncio.gather() with return_exceptions=True - Aggregates results into SearchResult - Handles tool failures gracefully - Deduplicates results by URL
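The gather-with-exceptions and URL-deduplication steps can be sketched as follows (a simplified stand-in for the real SearchHandler, duck-typed over any tool with an async search method):

```python
import asyncio


async def run_searches(tools, query: str, max_results: int = 10):
    """Run all tools in parallel; a failed tool yields an exception object
    instead of aborting the batch, and results are deduplicated by URL."""
    outcomes = await asyncio.gather(
        *(tool.search(query, max_results) for tool in tools),
        return_exceptions=True,
    )
    seen: set[str] = set()
    evidence = []
    for outcome in outcomes:
        if isinstance(outcome, Exception):
            continue  # tool failed; the real handler logs a warning here
        for item in outcome:
            if item.url not in seen:
                seen.add(item.url)
                evidence.append(item)
    return evidence
```

Because `return_exceptions=True` turns failures into return values, one flaky API never costs the results from the other tools.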
Tools are registered in the search handler:
from src.tools.pubmed import PubMedTool\nfrom src.tools.clinicaltrials import ClinicalTrialsTool\nfrom src.tools.europepmc import EuropePMCTool\n\nsearch_handler = SearchHandler(\n tools=[\n PubMedTool(),\n ClinicalTrialsTool(),\n EuropePMCTool(),\n ]\n)\n"},{"location":"architecture/tools/#see-also","title":"See Also","text":"Architecture Pattern: Microsoft Magentic Orchestration Design Philosophy: Simple, dynamic, manager-driven coordination Key Innovation: Intelligent manager replaces rigid sequential phases
"},{"location":"architecture/workflow-diagrams/#1-high-level-magentic-workflow","title":"1. High-Level Magentic Workflow","text":"flowchart TD\n Start([User Query]) --> Manager[Magentic Manager<br/>Plan \u2022 Select \u2022 Assess \u2022 Adapt]\n\n Manager -->|Plans| Task1[Task Decomposition]\n Task1 --> Manager\n\n Manager -->|Selects & Executes| HypAgent[Hypothesis Agent]\n Manager -->|Selects & Executes| SearchAgent[Search Agent]\n Manager -->|Selects & Executes| AnalysisAgent[Analysis Agent]\n Manager -->|Selects & Executes| ReportAgent[Report Agent]\n\n HypAgent -->|Results| Manager\n SearchAgent -->|Results| Manager\n AnalysisAgent -->|Results| Manager\n ReportAgent -->|Results| Manager\n\n Manager -->|Assesses Quality| Decision{Good Enough?}\n Decision -->|No - Refine| Manager\n Decision -->|No - Different Agent| Manager\n Decision -->|No - Stalled| Replan[Reset Plan]\n Replan --> Manager\n\n Decision -->|Yes| Synthesis[Synthesize Final Result]\n Synthesis --> Output([Research Report])\n\n style Start fill:#e1f5e1\n style Manager fill:#ffe6e6\n style HypAgent fill:#fff4e6\n style SearchAgent fill:#fff4e6\n style AnalysisAgent fill:#fff4e6\n style ReportAgent fill:#fff4e6\n style Decision fill:#ffd6d6\n style Synthesis fill:#d4edda\n style Output fill:#e1f5e1"},{"location":"architecture/workflow-diagrams/#2-magentic-manager-the-6-phase-cycle","title":"2. Magentic Manager: The 6-Phase Cycle","text":"flowchart LR\n P1[1. Planning<br/>Analyze task<br/>Create strategy] --> P2[2. Agent Selection<br/>Pick best agent<br/>for subtask]\n P2 --> P3[3. Execution<br/>Run selected<br/>agent with tools]\n P3 --> P4[4. Assessment<br/>Evaluate quality<br/>Check progress]\n P4 --> Decision{Quality OK?<br/>Progress made?}\n Decision -->|Yes| P6[6. Synthesis<br/>Combine results<br/>Generate report]\n Decision -->|No| P5[5. 
Iteration<br/>Adjust plan<br/>Try again]\n P5 --> P2\n P6 --> Done([Complete])\n\n style P1 fill:#fff4e6\n style P2 fill:#ffe6e6\n style P3 fill:#e6f3ff\n style P4 fill:#ffd6d6\n style P5 fill:#fff3cd\n style P6 fill:#d4edda\n style Done fill:#e1f5e1"},{"location":"architecture/workflow-diagrams/#3-simplified-agent-architecture","title":"3. Simplified Agent Architecture","text":"graph TB\n subgraph \"Orchestration Layer\"\n Manager[Magentic Manager<br/>\u2022 Plans workflow<br/>\u2022 Selects agents<br/>\u2022 Assesses quality<br/>\u2022 Adapts strategy]\n SharedContext[(Shared Context<br/>\u2022 Hypotheses<br/>\u2022 Search Results<br/>\u2022 Analysis<br/>\u2022 Progress)]\n Manager <--> SharedContext\n end\n\n subgraph \"Specialist Agents\"\n HypAgent[Hypothesis Agent<br/>\u2022 Domain understanding<br/>\u2022 Hypothesis generation<br/>\u2022 Testability refinement]\n SearchAgent[Search Agent<br/>\u2022 Multi-source search<br/>\u2022 RAG retrieval<br/>\u2022 Result ranking]\n AnalysisAgent[Analysis Agent<br/>\u2022 Evidence extraction<br/>\u2022 Statistical analysis<br/>\u2022 Code execution]\n ReportAgent[Report Agent<br/>\u2022 Report assembly<br/>\u2022 Visualization<br/>\u2022 Citation formatting]\n end\n\n subgraph \"MCP Tools\"\n WebSearch[Web Search<br/>PubMed \u2022 arXiv \u2022 bioRxiv]\n CodeExec[Code Execution<br/>Sandboxed Python]\n RAG[RAG Retrieval<br/>Vector DB \u2022 Embeddings]\n Viz[Visualization<br/>Charts \u2022 Graphs]\n end\n\n Manager -->|Selects & Directs| HypAgent\n Manager -->|Selects & Directs| SearchAgent\n Manager -->|Selects & Directs| AnalysisAgent\n Manager -->|Selects & Directs| ReportAgent\n\n HypAgent --> SharedContext\n SearchAgent --> SharedContext\n AnalysisAgent --> SharedContext\n ReportAgent --> SharedContext\n\n SearchAgent --> WebSearch\n SearchAgent --> RAG\n AnalysisAgent --> CodeExec\n ReportAgent --> CodeExec\n ReportAgent --> Viz\n\n style Manager fill:#ffe6e6\n style SharedContext fill:#ffe6f0\n style HypAgent 
fill:#fff4e6\n style SearchAgent fill:#fff4e6\n style AnalysisAgent fill:#fff4e6\n style ReportAgent fill:#fff4e6\n style WebSearch fill:#e6f3ff\n style CodeExec fill:#e6f3ff\n style RAG fill:#e6f3ff\n style Viz fill:#e6f3ff"},{"location":"architecture/workflow-diagrams/#4-dynamic-workflow-example","title":"4. Dynamic Workflow Example","text":"sequenceDiagram\n participant User\n participant Manager\n participant HypAgent\n participant SearchAgent\n participant AnalysisAgent\n participant ReportAgent\n\n User->>Manager: \"Research protein folding in Alzheimer's\"\n\n Note over Manager: PLAN: Generate hypotheses \u2192 Search \u2192 Analyze \u2192 Report\n\n Manager->>HypAgent: Generate 3 hypotheses\n HypAgent-->>Manager: Returns 3 hypotheses\n Note over Manager: ASSESS: Good quality, proceed\n\n Manager->>SearchAgent: Search literature for hypothesis 1\n SearchAgent-->>Manager: Returns 15 papers\n Note over Manager: ASSESS: Good results, continue\n\n Manager->>SearchAgent: Search for hypothesis 2\n SearchAgent-->>Manager: Only 2 papers found\n Note over Manager: ASSESS: Insufficient, refine search\n\n Manager->>SearchAgent: Refined query for hypothesis 2\n SearchAgent-->>Manager: Returns 12 papers\n Note over Manager: ASSESS: Better, proceed\n\n Manager->>AnalysisAgent: Analyze evidence for all hypotheses\n AnalysisAgent-->>Manager: Returns analysis with code\n Note over Manager: ASSESS: Complete, generate report\n\n Manager->>ReportAgent: Create comprehensive report\n ReportAgent-->>Manager: Returns formatted report\n Note over Manager: SYNTHESIZE: Combine all results\n\n Manager->>User: Final Research Report"},{"location":"architecture/workflow-diagrams/#5-manager-decision-logic","title":"5. 
Manager Decision Logic","text":"flowchart TD\n Start([Manager Receives Task]) --> Plan[Create Initial Plan]\n\n Plan --> Select[Select Agent for Next Subtask]\n Select --> Execute[Execute Agent]\n Execute --> Collect[Collect Results]\n\n Collect --> Assess[Assess Quality & Progress]\n\n Assess --> Q1{Quality Sufficient?}\n Q1 -->|No| Q2{Same Agent Can Fix?}\n Q2 -->|Yes| Feedback[Provide Specific Feedback]\n Feedback --> Execute\n Q2 -->|No| Different[Try Different Agent]\n Different --> Select\n\n Q1 -->|Yes| Q3{Task Complete?}\n Q3 -->|No| Q4{Making Progress?}\n Q4 -->|Yes| Select\n Q4 -->|No - Stalled| Replan[Reset Plan & Approach]\n Replan --> Plan\n\n Q3 -->|Yes| Synth[Synthesize Final Result]\n Synth --> Done([Return Report])\n\n style Start fill:#e1f5e1\n style Plan fill:#fff4e6\n style Select fill:#ffe6e6\n style Execute fill:#e6f3ff\n style Assess fill:#ffd6d6\n style Q1 fill:#ffe6e6\n style Q2 fill:#ffe6e6\n style Q3 fill:#ffe6e6\n style Q4 fill:#ffe6e6\n style Synth fill:#d4edda\n style Done fill:#e1f5e1"},{"location":"architecture/workflow-diagrams/#6-hypothesis-agent-workflow","title":"6. Hypothesis Agent Workflow","text":"flowchart LR\n Input[Research Query] --> Domain[Identify Domain<br/>& Key Concepts]\n Domain --> Context[Retrieve Background<br/>Knowledge]\n Context --> Generate[Generate 3-5<br/>Initial Hypotheses]\n Generate --> Refine[Refine for<br/>Testability]\n Refine --> Rank[Rank by<br/>Quality Score]\n Rank --> Output[Return Top<br/>Hypotheses]\n\n Output --> Struct[Hypothesis Structure:<br/>\u2022 Statement<br/>\u2022 Rationale<br/>\u2022 Testability Score<br/>\u2022 Data Requirements<br/>\u2022 Expected Outcomes]\n\n style Input fill:#e1f5e1\n style Output fill:#fff4e6\n style Struct fill:#e6f3ff"},{"location":"architecture/workflow-diagrams/#7-search-agent-workflow","title":"7. 
Search Agent Workflow","text":"flowchart TD\n Input[Hypotheses] --> Strategy[Formulate Search<br/>Strategy per Hypothesis]\n\n Strategy --> Multi[Multi-Source Search]\n\n Multi --> PubMed[PubMed Search<br/>via MCP]\n Multi --> ArXiv[arXiv Search<br/>via MCP]\n Multi --> BioRxiv[bioRxiv Search<br/>via MCP]\n\n PubMed --> Aggregate[Aggregate Results]\n ArXiv --> Aggregate\n BioRxiv --> Aggregate\n\n Aggregate --> Filter[Filter & Rank<br/>by Relevance]\n Filter --> Dedup[Deduplicate<br/>Cross-Reference]\n Dedup --> Embed[Embed Documents<br/>via MCP]\n Embed --> Vector[(Vector DB)]\n Vector --> RAGRetrieval[RAG Retrieval<br/>Top-K per Hypothesis]\n RAGRetrieval --> Output[Return Contextualized<br/>Search Results]\n\n style Input fill:#fff4e6\n style Multi fill:#ffe6e6\n style Vector fill:#ffe6f0\n style Output fill:#e6f3ff"},{"location":"architecture/workflow-diagrams/#8-analysis-agent-workflow","title":"8. Analysis Agent Workflow","text":"flowchart TD\n Input1[Hypotheses] --> Extract\n Input2[Search Results] --> Extract[Extract Evidence<br/>per Hypothesis]\n\n Extract --> Methods[Determine Analysis<br/>Methods Needed]\n\n Methods --> Branch{Requires<br/>Computation?}\n Branch -->|Yes| GenCode[Generate Python<br/>Analysis Code]\n Branch -->|No| Qual[Qualitative<br/>Synthesis]\n\n GenCode --> Execute[Execute Code<br/>via MCP Sandbox]\n Execute --> Interpret1[Interpret<br/>Results]\n Qual --> Interpret2[Interpret<br/>Findings]\n\n Interpret1 --> Synthesize[Synthesize Evidence<br/>Across Sources]\n Interpret2 --> Synthesize\n\n Synthesize --> Verdict[Determine Verdict<br/>per Hypothesis]\n Verdict --> Support[\u2022 Supported<br/>\u2022 Refuted<br/>\u2022 Inconclusive]\n Support --> Gaps[Identify Knowledge<br/>Gaps & Limitations]\n Gaps --> Output[Return Analysis<br/>Report]\n\n style Input1 fill:#fff4e6\n style Input2 fill:#e6f3ff\n style Execute fill:#ffe6e6\n style Output fill:#e6ffe6"},{"location":"architecture/workflow-diagrams/#9-report-agent-workflow","title":"9. 
Report Agent Workflow","text":"flowchart TD\n Input1[Query] --> Assemble\n Input2[Hypotheses] --> Assemble\n Input3[Search Results] --> Assemble\n Input4[Analysis] --> Assemble[Assemble Report<br/>Sections]\n\n Assemble --> Exec[Executive Summary]\n Assemble --> Intro[Introduction]\n Assemble --> Methods[Methods]\n Assemble --> Results[Results per<br/>Hypothesis]\n Assemble --> Discussion[Discussion]\n Assemble --> Future[Future Directions]\n Assemble --> Refs[References]\n\n Results --> VizCheck{Needs<br/>Visualization?}\n VizCheck -->|Yes| GenViz[Generate Viz Code]\n GenViz --> ExecViz[Execute via MCP<br/>Create Charts]\n ExecViz --> Combine\n VizCheck -->|No| Combine[Combine All<br/>Sections]\n\n Exec --> Combine\n Intro --> Combine\n Methods --> Combine\n Discussion --> Combine\n Future --> Combine\n Refs --> Combine\n\n Combine --> Format[Format Output]\n Format --> MD[Markdown]\n Format --> PDF[PDF]\n Format --> JSON[JSON]\n\n MD --> Output[Return Final<br/>Report]\n PDF --> Output\n JSON --> Output\n\n style Input1 fill:#e1f5e1\n style Input2 fill:#fff4e6\n style Input3 fill:#e6f3ff\n style Input4 fill:#e6ffe6\n style Output fill:#d4edda"},{"location":"architecture/workflow-diagrams/#10-data-flow-event-streaming","title":"10. 
Data Flow & Event Streaming","text":"flowchart TD\n User[\ud83d\udc64 User] -->|Research Query| UI[Gradio UI]\n UI -->|Submit| Manager[Magentic Manager]\n\n Manager -->|Event: Planning| UI\n Manager -->|Select Agent| HypAgent[Hypothesis Agent]\n HypAgent -->|Event: Delta/Message| UI\n HypAgent -->|Hypotheses| Context[(Shared Context)]\n\n Context -->|Retrieved by| Manager\n Manager -->|Select Agent| SearchAgent[Search Agent]\n SearchAgent -->|MCP Request| WebSearch[Web Search Tool]\n WebSearch -->|Results| SearchAgent\n SearchAgent -->|Event: Delta/Message| UI\n SearchAgent -->|Documents| Context\n SearchAgent -->|Embeddings| VectorDB[(Vector DB)]\n\n Context -->|Retrieved by| Manager\n Manager -->|Select Agent| AnalysisAgent[Analysis Agent]\n AnalysisAgent -->|MCP Request| CodeExec[Code Execution Tool]\n CodeExec -->|Results| AnalysisAgent\n AnalysisAgent -->|Event: Delta/Message| UI\n AnalysisAgent -->|Analysis| Context\n\n Context -->|Retrieved by| Manager\n Manager -->|Select Agent| ReportAgent[Report Agent]\n ReportAgent -->|MCP Request| CodeExec\n ReportAgent -->|Event: Delta/Message| UI\n ReportAgent -->|Report| Context\n\n Manager -->|Event: Final Result| UI\n UI -->|Display| User\n\n style User fill:#e1f5e1\n style UI fill:#e6f3ff\n style Manager fill:#ffe6e6\n style Context fill:#ffe6f0\n style VectorDB fill:#ffe6f0\n style WebSearch fill:#f0f0f0\n style CodeExec fill:#f0f0f0"},{"location":"architecture/workflow-diagrams/#11-mcp-tool-architecture","title":"11. 
MCP Tool Architecture","text":"graph TB\n subgraph \"Agent Layer\"\n Manager[Magentic Manager]\n HypAgent[Hypothesis Agent]\n SearchAgent[Search Agent]\n AnalysisAgent[Analysis Agent]\n ReportAgent[Report Agent]\n end\n\n subgraph \"MCP Protocol Layer\"\n Registry[MCP Tool Registry<br/>\u2022 Discovers tools<br/>\u2022 Routes requests<br/>\u2022 Manages connections]\n end\n\n subgraph \"MCP Servers\"\n Server1[Web Search Server<br/>localhost:8001<br/>\u2022 PubMed<br/>\u2022 arXiv<br/>\u2022 bioRxiv]\n Server2[Code Execution Server<br/>localhost:8002<br/>\u2022 Sandboxed Python<br/>\u2022 Package management]\n Server3[RAG Server<br/>localhost:8003<br/>\u2022 Vector embeddings<br/>\u2022 Similarity search]\n Server4[Visualization Server<br/>localhost:8004<br/>\u2022 Chart generation<br/>\u2022 Plot rendering]\n end\n\n subgraph \"External Services\"\n PubMed[PubMed API]\n ArXiv[arXiv API]\n BioRxiv[bioRxiv API]\n Modal[Modal Sandbox]\n ChromaDB[(ChromaDB)]\n end\n\n SearchAgent -->|Request| Registry\n AnalysisAgent -->|Request| Registry\n ReportAgent -->|Request| Registry\n\n Registry --> Server1\n Registry --> Server2\n Registry --> Server3\n Registry --> Server4\n\n Server1 --> PubMed\n Server1 --> ArXiv\n Server1 --> BioRxiv\n Server2 --> Modal\n Server3 --> ChromaDB\n\n style Manager fill:#ffe6e6\n style Registry fill:#fff4e6\n style Server1 fill:#e6f3ff\n style Server2 fill:#e6f3ff\n style Server3 fill:#e6f3ff\n style Server4 fill:#e6f3ff"},{"location":"architecture/workflow-diagrams/#12-progress-tracking-stall-detection","title":"12. 
Progress Tracking & Stall Detection","text":"stateDiagram-v2\n [*] --> Initialization: User Query\n\n Initialization --> Planning: Manager starts\n\n Planning --> AgentExecution: Select agent\n\n AgentExecution --> Assessment: Collect results\n\n Assessment --> QualityCheck: Evaluate output\n\n QualityCheck --> AgentExecution: Poor quality<br/>(retry < max_rounds)\n QualityCheck --> Planning: Poor quality<br/>(try different agent)\n QualityCheck --> NextAgent: Good quality<br/>(task incomplete)\n QualityCheck --> Synthesis: Good quality<br/>(task complete)\n\n NextAgent --> AgentExecution: Select next agent\n\n state StallDetection <<choice>>\n Assessment --> StallDetection: Check progress\n StallDetection --> Planning: No progress<br/>(stall count < max)\n StallDetection --> ErrorRecovery: No progress<br/>(max stalls reached)\n\n ErrorRecovery --> PartialReport: Generate partial results\n PartialReport --> [*]\n\n Synthesis --> FinalReport: Combine all outputs\n FinalReport --> [*]\n\n note right of QualityCheck\n Manager assesses:\n \u2022 Output completeness\n \u2022 Quality metrics\n \u2022 Progress made\n end note\n\n note right of StallDetection\n Stall = no new progress\n after agent execution\n Triggers plan reset\n end note"},{"location":"architecture/workflow-diagrams/#13-gradio-ui-integration","title":"13. 
Gradio UI Integration","text":"graph TD\n App[Gradio App<br/>DeepCritical Research Agent]\n\n App --> Input[Input Section]\n App --> Status[Status Section]\n App --> Output[Output Section]\n\n Input --> Query[Research Question<br/>Text Area]\n Input --> Controls[Controls]\n Controls --> MaxHyp[Max Hypotheses: 1-10]\n Controls --> MaxRounds[Max Rounds: 5-20]\n Controls --> Submit[Start Research Button]\n\n Status --> Log[Real-time Event Log<br/>\u2022 Manager planning<br/>\u2022 Agent selection<br/>\u2022 Execution updates<br/>\u2022 Quality assessment]\n Status --> Progress[Progress Tracker<br/>\u2022 Current agent<br/>\u2022 Round count<br/>\u2022 Stall count]\n\n Output --> Tabs[Tabbed Results]\n Tabs --> Tab1[Hypotheses Tab<br/>Generated hypotheses with scores]\n Tabs --> Tab2[Search Results Tab<br/>Papers & sources found]\n Tabs --> Tab3[Analysis Tab<br/>Evidence & verdicts]\n Tabs --> Tab4[Report Tab<br/>Final research report]\n Tab4 --> Download[Download Report<br/>MD / PDF / JSON]\n\n Submit -.->|Triggers| Workflow[Magentic Workflow]\n Workflow -.->|MagenticOrchestratorMessageEvent| Log\n Workflow -.->|MagenticAgentDeltaEvent| Log\n Workflow -.->|MagenticAgentMessageEvent| Log\n Workflow -.->|MagenticFinalResultEvent| Tab4\n\n style App fill:#e1f5e1\n style Input fill:#fff4e6\n style Status fill:#e6f3ff\n style Output fill:#e6ffe6\n style Workflow fill:#ffe6e6"},{"location":"architecture/workflow-diagrams/#14-complete-system-context","title":"14. 
Complete System Context","text":"graph LR\n User[\ud83d\udc64 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepCritical<br/>Magentic Workflow]\n\n DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]\n DC -->|Preprint search| ArXiv[arXiv API<br/>Scientific preprints]\n DC -->|Biology search| BioRxiv[bioRxiv API<br/>Biology preprints]\n DC -->|Agent reasoning| Claude[Claude API<br/>Sonnet 4 / Opus]\n DC -->|Code execution| Modal[Modal Sandbox<br/>Safe Python env]\n DC -->|Vector storage| Chroma[ChromaDB<br/>Embeddings & RAG]\n\n DC -->|Deployed on| HF[HuggingFace Spaces<br/>Gradio 6.0]\n\n PubMed -->|Results| DC\n ArXiv -->|Results| DC\n BioRxiv -->|Results| DC\n Claude -->|Responses| DC\n Modal -->|Output| DC\n Chroma -->|Context| DC\n\n DC -->|Research report| User\n\n style User fill:#e1f5e1\n style DC fill:#ffe6e6\n style PubMed fill:#e6f3ff\n style ArXiv fill:#e6f3ff\n style BioRxiv fill:#e6f3ff\n style Claude fill:#ffd6d6\n style Modal fill:#f0f0f0\n style Chroma fill:#ffe6f0\n style HF fill:#d4edda"},{"location":"architecture/workflow-diagrams/#15-workflow-timeline-simplified","title":"15. 
Workflow Timeline (Simplified)","text":"gantt\n title DeepCritical Magentic Workflow - Typical Execution\n dateFormat mm:ss\n axisFormat %M:%S\n\n section Manager Planning\n Initial planning :p1, 00:00, 10s\n\n section Hypothesis Agent\n Generate hypotheses :h1, after p1, 30s\n Manager assessment :h2, after h1, 5s\n\n section Search Agent\n Search hypothesis 1 :s1, after h2, 20s\n Search hypothesis 2 :s2, after s1, 20s\n Search hypothesis 3 :s3, after s2, 20s\n RAG processing :s4, after s3, 15s\n Manager assessment :s5, after s4, 5s\n\n section Analysis Agent\n Evidence extraction :a1, after s5, 15s\n Code generation :a2, after a1, 20s\n Code execution :a3, after a2, 25s\n Synthesis :a4, after a3, 20s\n Manager assessment :a5, after a4, 5s\n\n section Report Agent\n Report assembly :r1, after a5, 30s\n Visualization :r2, after r1, 15s\n Formatting :r3, after r2, 10s\n\n section Manager Synthesis\n Final synthesis :f1, after r3, 10s"},{"location":"architecture/workflow-diagrams/#key-differences-from-original-design","title":"Key Differences from Original Design","text":"Aspect: Original (Judge-in-Loop) \u2192 New (Magentic). Control Flow: fixed sequential phases \u2192 dynamic agent selection. Quality Control: separate Judge Agent \u2192 built-in manager assessment. Retry Logic: phase-level with feedback \u2192 agent-level with adaptation. Flexibility: rigid 4-phase pipeline \u2192 adaptive workflow. Complexity: 5 agents (including Judge) \u2192 4 agents (no Judge). Progress Tracking: manual state management \u2192 built-in round/stall detection. Agent Coordination: sequential handoff \u2192 manager-driven dynamic selection. Error Recovery: retry same phase \u2192 try different agent or replan."},{"location":"architecture/workflow-diagrams/#simplified-design-principles","title":"Simplified Design Principles","text":"Simple 4-Agent Setup:
workflow = (\n MagenticBuilder()\n .participants(\n hypothesis=HypothesisAgent(tools=[background_tool]),\n search=SearchAgent(tools=[web_search, rag_tool]),\n analysis=AnalysisAgent(tools=[code_execution]),\n report=ReportAgent(tools=[code_execution, visualization])\n )\n .with_standard_manager(\n chat_client=AnthropicClient(model=\"claude-sonnet-4\"),\n max_round_count=15, # Prevent infinite loops\n max_stall_count=3 # Detect stuck workflows\n )\n .build()\n)\n Manager handles quality assessment in its instructions: - Checks hypothesis quality (testable, novel, clear) - Validates search results (relevant, authoritative, recent) - Assesses analysis soundness (methodology, evidence, conclusions) - Ensures report completeness (all sections, proper citations)
No separate Judge Agent is needed: the manager does it all.
Document Version: 2.0 (Magentic Simplified) Last Updated: 2025-11-24 Architecture: Microsoft Magentic Orchestration Pattern Agents: 4 (Hypothesis, Search, Analysis, Report) + 1 Manager License: MIT
"},{"location":"architecture/workflow-diagrams/#see-also","title":"See Also","text":"Architecture Pattern: Microsoft Magentic Orchestration Design Philosophy: Simple, dynamic, manager-driven coordination Key Innovation: Intelligent manager replaces rigid sequential phases
"},{"location":"architecture/workflows/#1-high-level-magentic-workflow","title":"1. High-Level Magentic Workflow","text":"flowchart TD\n Start([User Query]) --> Manager[Magentic Manager<br/>Plan \u2022 Select \u2022 Assess \u2022 Adapt]\n\n Manager -->|Plans| Task1[Task Decomposition]\n Task1 --> Manager\n\n Manager -->|Selects & Executes| HypAgent[Hypothesis Agent]\n Manager -->|Selects & Executes| SearchAgent[Search Agent]\n Manager -->|Selects & Executes| AnalysisAgent[Analysis Agent]\n Manager -->|Selects & Executes| ReportAgent[Report Agent]\n\n HypAgent -->|Results| Manager\n SearchAgent -->|Results| Manager\n AnalysisAgent -->|Results| Manager\n ReportAgent -->|Results| Manager\n\n Manager -->|Assesses Quality| Decision{Good Enough?}\n Decision -->|No - Refine| Manager\n Decision -->|No - Different Agent| Manager\n Decision -->|No - Stalled| Replan[Reset Plan]\n Replan --> Manager\n\n Decision -->|Yes| Synthesis[Synthesize Final Result]\n Synthesis --> Output([Research Report])\n\n style Start fill:#e1f5e1\n style Manager fill:#ffe6e6\n style HypAgent fill:#fff4e6\n style SearchAgent fill:#fff4e6\n style AnalysisAgent fill:#fff4e6\n style ReportAgent fill:#fff4e6\n style Decision fill:#ffd6d6\n style Synthesis fill:#d4edda\n style Output fill:#e1f5e1"},{"location":"architecture/workflows/#2-magentic-manager-the-6-phase-cycle","title":"2. Magentic Manager: The 6-Phase Cycle","text":"flowchart LR\n P1[1. Planning<br/>Analyze task<br/>Create strategy] --> P2[2. Agent Selection<br/>Pick best agent<br/>for subtask]\n P2 --> P3[3. Execution<br/>Run selected<br/>agent with tools]\n P3 --> P4[4. Assessment<br/>Evaluate quality<br/>Check progress]\n P4 --> Decision{Quality OK?<br/>Progress made?}\n Decision -->|Yes| P6[6. Synthesis<br/>Combine results<br/>Generate report]\n Decision -->|No| P5[5. 
Iteration<br/>Adjust plan<br/>Try again]\n P5 --> P2\n P6 --> Done([Complete])\n\n style P1 fill:#fff4e6\n style P2 fill:#ffe6e6\n style P3 fill:#e6f3ff\n style P4 fill:#ffd6d6\n style P5 fill:#fff3cd\n style P6 fill:#d4edda\n style Done fill:#e1f5e1"},{"location":"architecture/workflows/#3-simplified-agent-architecture","title":"3. Simplified Agent Architecture","text":"graph TB\n subgraph \"Orchestration Layer\"\n Manager[Magentic Manager<br/>\u2022 Plans workflow<br/>\u2022 Selects agents<br/>\u2022 Assesses quality<br/>\u2022 Adapts strategy]\n SharedContext[(Shared Context<br/>\u2022 Hypotheses<br/>\u2022 Search Results<br/>\u2022 Analysis<br/>\u2022 Progress)]\n Manager <--> SharedContext\n end\n\n subgraph \"Specialist Agents\"\n HypAgent[Hypothesis Agent<br/>\u2022 Domain understanding<br/>\u2022 Hypothesis generation<br/>\u2022 Testability refinement]\n SearchAgent[Search Agent<br/>\u2022 Multi-source search<br/>\u2022 RAG retrieval<br/>\u2022 Result ranking]\n AnalysisAgent[Analysis Agent<br/>\u2022 Evidence extraction<br/>\u2022 Statistical analysis<br/>\u2022 Code execution]\n ReportAgent[Report Agent<br/>\u2022 Report assembly<br/>\u2022 Visualization<br/>\u2022 Citation formatting]\n end\n\n subgraph \"MCP Tools\"\n WebSearch[Web Search<br/>PubMed \u2022 arXiv \u2022 bioRxiv]\n CodeExec[Code Execution<br/>Sandboxed Python]\n RAG[RAG Retrieval<br/>Vector DB \u2022 Embeddings]\n Viz[Visualization<br/>Charts \u2022 Graphs]\n end\n\n Manager -->|Selects & Directs| HypAgent\n Manager -->|Selects & Directs| SearchAgent\n Manager -->|Selects & Directs| AnalysisAgent\n Manager -->|Selects & Directs| ReportAgent\n\n HypAgent --> SharedContext\n SearchAgent --> SharedContext\n AnalysisAgent --> SharedContext\n ReportAgent --> SharedContext\n\n SearchAgent --> WebSearch\n SearchAgent --> RAG\n AnalysisAgent --> CodeExec\n ReportAgent --> CodeExec\n ReportAgent --> Viz\n\n style Manager fill:#ffe6e6\n style SharedContext fill:#ffe6f0\n style HypAgent 
fill:#fff4e6\n style SearchAgent fill:#fff4e6\n style AnalysisAgent fill:#fff4e6\n style ReportAgent fill:#fff4e6\n style WebSearch fill:#e6f3ff\n style CodeExec fill:#e6f3ff\n style RAG fill:#e6f3ff\n style Viz fill:#e6f3ff"},{"location":"architecture/workflows/#4-dynamic-workflow-example","title":"4. Dynamic Workflow Example","text":"sequenceDiagram\n participant User\n participant Manager\n participant HypAgent\n participant SearchAgent\n participant AnalysisAgent\n participant ReportAgent\n\n User->>Manager: \"Research protein folding in Alzheimer's\"\n\n Note over Manager: PLAN: Generate hypotheses \u2192 Search \u2192 Analyze \u2192 Report\n\n Manager->>HypAgent: Generate 3 hypotheses\n HypAgent-->>Manager: Returns 3 hypotheses\n Note over Manager: ASSESS: Good quality, proceed\n\n Manager->>SearchAgent: Search literature for hypothesis 1\n SearchAgent-->>Manager: Returns 15 papers\n Note over Manager: ASSESS: Good results, continue\n\n Manager->>SearchAgent: Search for hypothesis 2\n SearchAgent-->>Manager: Only 2 papers found\n Note over Manager: ASSESS: Insufficient, refine search\n\n Manager->>SearchAgent: Refined query for hypothesis 2\n SearchAgent-->>Manager: Returns 12 papers\n Note over Manager: ASSESS: Better, proceed\n\n Manager->>AnalysisAgent: Analyze evidence for all hypotheses\n AnalysisAgent-->>Manager: Returns analysis with code\n Note over Manager: ASSESS: Complete, generate report\n\n Manager->>ReportAgent: Create comprehensive report\n ReportAgent-->>Manager: Returns formatted report\n Note over Manager: SYNTHESIZE: Combine all results\n\n Manager->>User: Final Research Report"},{"location":"architecture/workflows/#5-manager-decision-logic","title":"5. 
Manager Decision Logic","text":"flowchart TD\n Start([Manager Receives Task]) --> Plan[Create Initial Plan]\n\n Plan --> Select[Select Agent for Next Subtask]\n Select --> Execute[Execute Agent]\n Execute --> Collect[Collect Results]\n\n Collect --> Assess[Assess Quality & Progress]\n\n Assess --> Q1{Quality Sufficient?}\n Q1 -->|No| Q2{Same Agent Can Fix?}\n Q2 -->|Yes| Feedback[Provide Specific Feedback]\n Feedback --> Execute\n Q2 -->|No| Different[Try Different Agent]\n Different --> Select\n\n Q1 -->|Yes| Q3{Task Complete?}\n Q3 -->|No| Q4{Making Progress?}\n Q4 -->|Yes| Select\n Q4 -->|No - Stalled| Replan[Reset Plan & Approach]\n Replan --> Plan\n\n Q3 -->|Yes| Synth[Synthesize Final Result]\n Synth --> Done([Return Report])\n\n style Start fill:#e1f5e1\n style Plan fill:#fff4e6\n style Select fill:#ffe6e6\n style Execute fill:#e6f3ff\n style Assess fill:#ffd6d6\n style Q1 fill:#ffe6e6\n style Q2 fill:#ffe6e6\n style Q3 fill:#ffe6e6\n style Q4 fill:#ffe6e6\n style Synth fill:#d4edda\n style Done fill:#e1f5e1"},{"location":"architecture/workflows/#6-hypothesis-agent-workflow","title":"6. Hypothesis Agent Workflow","text":"flowchart LR\n Input[Research Query] --> Domain[Identify Domain<br/>& Key Concepts]\n Domain --> Context[Retrieve Background<br/>Knowledge]\n Context --> Generate[Generate 3-5<br/>Initial Hypotheses]\n Generate --> Refine[Refine for<br/>Testability]\n Refine --> Rank[Rank by<br/>Quality Score]\n Rank --> Output[Return Top<br/>Hypotheses]\n\n Output --> Struct[Hypothesis Structure:<br/>\u2022 Statement<br/>\u2022 Rationale<br/>\u2022 Testability Score<br/>\u2022 Data Requirements<br/>\u2022 Expected Outcomes]\n\n style Input fill:#e1f5e1\n style Output fill:#fff4e6\n style Struct fill:#e6f3ff"},{"location":"architecture/workflows/#7-search-agent-workflow","title":"7. 
Search Agent Workflow","text":"flowchart TD\n Input[Hypotheses] --> Strategy[Formulate Search<br/>Strategy per Hypothesis]\n\n Strategy --> Multi[Multi-Source Search]\n\n Multi --> PubMed[PubMed Search<br/>via MCP]\n Multi --> ArXiv[arXiv Search<br/>via MCP]\n Multi --> BioRxiv[bioRxiv Search<br/>via MCP]\n\n PubMed --> Aggregate[Aggregate Results]\n ArXiv --> Aggregate\n BioRxiv --> Aggregate\n\n Aggregate --> Filter[Filter & Rank<br/>by Relevance]\n Filter --> Dedup[Deduplicate<br/>Cross-Reference]\n Dedup --> Embed[Embed Documents<br/>via MCP]\n Embed --> Vector[(Vector DB)]\n Vector --> RAGRetrieval[RAG Retrieval<br/>Top-K per Hypothesis]\n RAGRetrieval --> Output[Return Contextualized<br/>Search Results]\n\n style Input fill:#fff4e6\n style Multi fill:#ffe6e6\n style Vector fill:#ffe6f0\n style Output fill:#e6f3ff"},{"location":"architecture/workflows/#8-analysis-agent-workflow","title":"8. Analysis Agent Workflow","text":"flowchart TD\n Input1[Hypotheses] --> Extract\n Input2[Search Results] --> Extract[Extract Evidence<br/>per Hypothesis]\n\n Extract --> Methods[Determine Analysis<br/>Methods Needed]\n\n Methods --> Branch{Requires<br/>Computation?}\n Branch -->|Yes| GenCode[Generate Python<br/>Analysis Code]\n Branch -->|No| Qual[Qualitative<br/>Synthesis]\n\n GenCode --> Execute[Execute Code<br/>via MCP Sandbox]\n Execute --> Interpret1[Interpret<br/>Results]\n Qual --> Interpret2[Interpret<br/>Findings]\n\n Interpret1 --> Synthesize[Synthesize Evidence<br/>Across Sources]\n Interpret2 --> Synthesize\n\n Synthesize --> Verdict[Determine Verdict<br/>per Hypothesis]\n Verdict --> Support[\u2022 Supported<br/>\u2022 Refuted<br/>\u2022 Inconclusive]\n Support --> Gaps[Identify Knowledge<br/>Gaps & Limitations]\n Gaps --> Output[Return Analysis<br/>Report]\n\n style Input1 fill:#fff4e6\n style Input2 fill:#e6f3ff\n style Execute fill:#ffe6e6\n style Output fill:#e6ffe6"},{"location":"architecture/workflows/#9-report-agent-workflow","title":"9. 
Report Agent Workflow","text":"flowchart TD\n Input1[Query] --> Assemble\n Input2[Hypotheses] --> Assemble\n Input3[Search Results] --> Assemble\n Input4[Analysis] --> Assemble[Assemble Report<br/>Sections]\n\n Assemble --> Exec[Executive Summary]\n Assemble --> Intro[Introduction]\n Assemble --> Methods[Methods]\n Assemble --> Results[Results per<br/>Hypothesis]\n Assemble --> Discussion[Discussion]\n Assemble --> Future[Future Directions]\n Assemble --> Refs[References]\n\n Results --> VizCheck{Needs<br/>Visualization?}\n VizCheck -->|Yes| GenViz[Generate Viz Code]\n GenViz --> ExecViz[Execute via MCP<br/>Create Charts]\n ExecViz --> Combine\n VizCheck -->|No| Combine[Combine All<br/>Sections]\n\n Exec --> Combine\n Intro --> Combine\n Methods --> Combine\n Discussion --> Combine\n Future --> Combine\n Refs --> Combine\n\n Combine --> Format[Format Output]\n Format --> MD[Markdown]\n Format --> PDF[PDF]\n Format --> JSON[JSON]\n\n MD --> Output[Return Final<br/>Report]\n PDF --> Output\n JSON --> Output\n\n style Input1 fill:#e1f5e1\n style Input2 fill:#fff4e6\n style Input3 fill:#e6f3ff\n style Input4 fill:#e6ffe6\n style Output fill:#d4edda"},{"location":"architecture/workflows/#10-data-flow-event-streaming","title":"10. 
Data Flow & Event Streaming","text":"flowchart TD\n User[\ud83d\udc64 User] -->|Research Query| UI[Gradio UI]\n UI -->|Submit| Manager[Magentic Manager]\n\n Manager -->|Event: Planning| UI\n Manager -->|Select Agent| HypAgent[Hypothesis Agent]\n HypAgent -->|Event: Delta/Message| UI\n HypAgent -->|Hypotheses| Context[(Shared Context)]\n\n Context -->|Retrieved by| Manager\n Manager -->|Select Agent| SearchAgent[Search Agent]\n SearchAgent -->|MCP Request| WebSearch[Web Search Tool]\n WebSearch -->|Results| SearchAgent\n SearchAgent -->|Event: Delta/Message| UI\n SearchAgent -->|Documents| Context\n SearchAgent -->|Embeddings| VectorDB[(Vector DB)]\n\n Context -->|Retrieved by| Manager\n Manager -->|Select Agent| AnalysisAgent[Analysis Agent]\n AnalysisAgent -->|MCP Request| CodeExec[Code Execution Tool]\n CodeExec -->|Results| AnalysisAgent\n AnalysisAgent -->|Event: Delta/Message| UI\n AnalysisAgent -->|Analysis| Context\n\n Context -->|Retrieved by| Manager\n Manager -->|Select Agent| ReportAgent[Report Agent]\n ReportAgent -->|MCP Request| CodeExec\n ReportAgent -->|Event: Delta/Message| UI\n ReportAgent -->|Report| Context\n\n Manager -->|Event: Final Result| UI\n UI -->|Display| User\n\n style User fill:#e1f5e1\n style UI fill:#e6f3ff\n style Manager fill:#ffe6e6\n style Context fill:#ffe6f0\n style VectorDB fill:#ffe6f0\n style WebSearch fill:#f0f0f0\n style CodeExec fill:#f0f0f0"},{"location":"architecture/workflows/#11-mcp-tool-architecture","title":"11. 
MCP Tool Architecture","text":"graph TB\n subgraph \"Agent Layer\"\n Manager[Magentic Manager]\n HypAgent[Hypothesis Agent]\n SearchAgent[Search Agent]\n AnalysisAgent[Analysis Agent]\n ReportAgent[Report Agent]\n end\n\n subgraph \"MCP Protocol Layer\"\n Registry[MCP Tool Registry<br/>\u2022 Discovers tools<br/>\u2022 Routes requests<br/>\u2022 Manages connections]\n end\n\n subgraph \"MCP Servers\"\n Server1[Web Search Server<br/>localhost:8001<br/>\u2022 PubMed<br/>\u2022 arXiv<br/>\u2022 bioRxiv]\n Server2[Code Execution Server<br/>localhost:8002<br/>\u2022 Sandboxed Python<br/>\u2022 Package management]\n Server3[RAG Server<br/>localhost:8003<br/>\u2022 Vector embeddings<br/>\u2022 Similarity search]\n Server4[Visualization Server<br/>localhost:8004<br/>\u2022 Chart generation<br/>\u2022 Plot rendering]\n end\n\n subgraph \"External Services\"\n PubMed[PubMed API]\n ArXiv[arXiv API]\n BioRxiv[bioRxiv API]\n Modal[Modal Sandbox]\n ChromaDB[(ChromaDB)]\n end\n\n SearchAgent -->|Request| Registry\n AnalysisAgent -->|Request| Registry\n ReportAgent -->|Request| Registry\n\n Registry --> Server1\n Registry --> Server2\n Registry --> Server3\n Registry --> Server4\n\n Server1 --> PubMed\n Server1 --> ArXiv\n Server1 --> BioRxiv\n Server2 --> Modal\n Server3 --> ChromaDB\n\n style Manager fill:#ffe6e6\n style Registry fill:#fff4e6\n style Server1 fill:#e6f3ff\n style Server2 fill:#e6f3ff\n style Server3 fill:#e6f3ff\n style Server4 fill:#e6f3ff"},{"location":"architecture/workflows/#12-progress-tracking-stall-detection","title":"12. 
Progress Tracking & Stall Detection","text":"stateDiagram-v2\n [*] --> Initialization: User Query\n\n Initialization --> Planning: Manager starts\n\n Planning --> AgentExecution: Select agent\n\n AgentExecution --> Assessment: Collect results\n\n Assessment --> QualityCheck: Evaluate output\n\n QualityCheck --> AgentExecution: Poor quality<br/>(retry < max_rounds)\n QualityCheck --> Planning: Poor quality<br/>(try different agent)\n QualityCheck --> NextAgent: Good quality<br/>(task incomplete)\n QualityCheck --> Synthesis: Good quality<br/>(task complete)\n\n NextAgent --> AgentExecution: Select next agent\n\n state StallDetection <<choice>>\n Assessment --> StallDetection: Check progress\n StallDetection --> Planning: No progress<br/>(stall count < max)\n StallDetection --> ErrorRecovery: No progress<br/>(max stalls reached)\n\n ErrorRecovery --> PartialReport: Generate partial results\n PartialReport --> [*]\n\n Synthesis --> FinalReport: Combine all outputs\n FinalReport --> [*]\n\n note right of QualityCheck\n Manager assesses:\n \u2022 Output completeness\n \u2022 Quality metrics\n \u2022 Progress made\n end note\n\n note right of StallDetection\n Stall = no new progress\n after agent execution\n Triggers plan reset\n end note"},{"location":"architecture/workflows/#13-gradio-ui-integration","title":"13. 
Gradio UI Integration","text":"graph TD\n App[Gradio App<br/>DeepCritical Research Agent]\n\n App --> Input[Input Section]\n App --> Status[Status Section]\n App --> Output[Output Section]\n\n Input --> Query[Research Question<br/>Text Area]\n Input --> Controls[Controls]\n Controls --> MaxHyp[Max Hypotheses: 1-10]\n Controls --> MaxRounds[Max Rounds: 5-20]\n Controls --> Submit[Start Research Button]\n\n Status --> Log[Real-time Event Log<br/>\u2022 Manager planning<br/>\u2022 Agent selection<br/>\u2022 Execution updates<br/>\u2022 Quality assessment]\n Status --> Progress[Progress Tracker<br/>\u2022 Current agent<br/>\u2022 Round count<br/>\u2022 Stall count]\n\n Output --> Tabs[Tabbed Results]\n Tabs --> Tab1[Hypotheses Tab<br/>Generated hypotheses with scores]\n Tabs --> Tab2[Search Results Tab<br/>Papers & sources found]\n Tabs --> Tab3[Analysis Tab<br/>Evidence & verdicts]\n Tabs --> Tab4[Report Tab<br/>Final research report]\n Tab4 --> Download[Download Report<br/>MD / PDF / JSON]\n\n Submit -.->|Triggers| Workflow[Magentic Workflow]\n Workflow -.->|MagenticOrchestratorMessageEvent| Log\n Workflow -.->|MagenticAgentDeltaEvent| Log\n Workflow -.->|MagenticAgentMessageEvent| Log\n Workflow -.->|MagenticFinalResultEvent| Tab4\n\n style App fill:#e1f5e1\n style Input fill:#fff4e6\n style Status fill:#e6f3ff\n style Output fill:#e6ffe6\n style Workflow fill:#ffe6e6"},{"location":"architecture/workflows/#14-complete-system-context","title":"14. 
Complete System Context","text":"graph LR\n User[\ud83d\udc64 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepCritical<br/>Magentic Workflow]\n\n DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]\n DC -->|Preprint search| ArXiv[arXiv API<br/>Scientific preprints]\n DC -->|Biology search| BioRxiv[bioRxiv API<br/>Biology preprints]\n DC -->|Agent reasoning| Claude[Claude API<br/>Sonnet 4 / Opus]\n DC -->|Code execution| Modal[Modal Sandbox<br/>Safe Python env]\n DC -->|Vector storage| Chroma[ChromaDB<br/>Embeddings & RAG]\n\n DC -->|Deployed on| HF[HuggingFace Spaces<br/>Gradio 6.0]\n\n PubMed -->|Results| DC\n ArXiv -->|Results| DC\n BioRxiv -->|Results| DC\n Claude -->|Responses| DC\n Modal -->|Output| DC\n Chroma -->|Context| DC\n\n DC -->|Research report| User\n\n style User fill:#e1f5e1\n style DC fill:#ffe6e6\n style PubMed fill:#e6f3ff\n style ArXiv fill:#e6f3ff\n style BioRxiv fill:#e6f3ff\n style Claude fill:#ffd6d6\n style Modal fill:#f0f0f0\n style Chroma fill:#ffe6f0\n style HF fill:#d4edda"},{"location":"architecture/workflows/#15-workflow-timeline-simplified","title":"15. 
Workflow Timeline (Simplified)","text":"gantt\n title DeepCritical Magentic Workflow - Typical Execution\n dateFormat mm:ss\n axisFormat %M:%S\n\n section Manager Planning\n Initial planning :p1, 00:00, 10s\n\n section Hypothesis Agent\n Generate hypotheses :h1, after p1, 30s\n Manager assessment :h2, after h1, 5s\n\n section Search Agent\n Search hypothesis 1 :s1, after h2, 20s\n Search hypothesis 2 :s2, after s1, 20s\n Search hypothesis 3 :s3, after s2, 20s\n RAG processing :s4, after s3, 15s\n Manager assessment :s5, after s4, 5s\n\n section Analysis Agent\n Evidence extraction :a1, after s5, 15s\n Code generation :a2, after a1, 20s\n Code execution :a3, after a2, 25s\n Synthesis :a4, after a3, 20s\n Manager assessment :a5, after a4, 5s\n\n section Report Agent\n Report assembly :r1, after a5, 30s\n Visualization :r2, after r1, 15s\n Formatting :r3, after r2, 10s\n\n section Manager Synthesis\n Final synthesis :f1, after r3, 10s"},{"location":"architecture/workflows/#key-differences-from-original-design","title":"Key Differences from Original Design","text":"| Aspect | Original (Judge-in-Loop) | New (Magentic) |\n| --- | --- | --- |\n| Control Flow | Fixed sequential phases | Dynamic agent selection |\n| Quality Control | Separate Judge Agent | Manager assessment built-in |\n| Retry Logic | Phase-level with feedback | Agent-level with adaptation |\n| Flexibility | Rigid 4-phase pipeline | Adaptive workflow |\n| Complexity | 5 agents (including Judge) | 4 agents (no Judge) |\n| Progress Tracking | Manual state management | Built-in round/stall detection |\n| Agent Coordination | Sequential handoff | Manager-driven dynamic selection |\n| Error Recovery | Retry same phase | Try different agent or replan |"},{"location":"architecture/workflows/#simplified-design-principles","title":"Simplified Design Principles","text":"Simple 4-Agent Setup:
workflow = (\n MagenticBuilder()\n .participants(\n hypothesis=HypothesisAgent(tools=[background_tool]),\n search=SearchAgent(tools=[web_search, rag_tool]),\n analysis=AnalysisAgent(tools=[code_execution]),\n report=ReportAgent(tools=[code_execution, visualization])\n )\n .with_standard_manager(\n chat_client=AnthropicClient(model=\"claude-sonnet-4\"),\n max_round_count=15, # Prevent infinite loops\n max_stall_count=3 # Detect stuck workflows\n )\n .build()\n)\n Manager handles quality assessment in its instructions: - Checks hypothesis quality (testable, novel, clear) - Validates search results (relevant, authoritative, recent) - Assesses analysis soundness (methodology, evidence, conclusions) - Ensures report completeness (all sections, proper citations)
No separate Judge Agent is needed; the manager handles quality assessment itself.
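The max_round_count and max_stall_count guards in the builder snippet can be sketched as a simple control loop. This is an illustrative toy model, not code from the project: the function name, event representation, and return values are all assumptions.

```python
# Toy model of round/stall tracking: a stall is a round that makes no new
# progress; hitting max_stall_count (or max_round_count) ends the run with
# a partial report instead of looping forever.
def run_rounds(progress_events, max_round_count=15, max_stall_count=3):
    stalls = 0
    for round_no, made_progress in enumerate(progress_events, start=1):
        if round_no > max_round_count:
            return "partial_report"  # round budget exhausted
        if made_progress:
            stalls = 0  # new progress resets the stall counter
        else:
            stalls += 1
            if stalls >= max_stall_count:
                return "partial_report"  # stuck: error recovery path
    return "final_report"
```

This mirrors the stall-detection state diagram above: progress resets the counter, and repeated stalls route to partial-result recovery.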
Document Version: 2.0 (Magentic Simplified) Last Updated: 2025-11-24 Architecture: Microsoft Magentic Orchestration Pattern Agents: 4 (Hypothesis, Search, Analysis, Report) + 1 Manager License: MIT
"},{"location":"configuration/","title":"Configuration Guide","text":""},{"location":"configuration/#overview","title":"Overview","text":"DeepCritical uses Pydantic Settings for centralized configuration management. All settings are defined in the Settings class in src/utils/config.py and can be configured via environment variables or a .env file.
The configuration system provides:
environment variable loading from a .env file (if present) and a global settings instance for easy access throughout the codebase. To get started, create a .env file in the project root and set at least one LLM API key (OPENAI_API_KEY, ANTHROPIC_API_KEY, or HF_TOKEN). The Settings class extends BaseSettings from pydantic_settings and defines all application configuration:
"},{"location":"configuration/#singleton-instance","title":"Singleton Instance","text":"A global settings instance is available for import:
"},{"location":"configuration/#usage-pattern","title":"Usage Pattern","text":"Access configuration throughout the codebase:
from src.utils.config import settings\n\n# Check if API keys are available\nif settings.has_openai_key:\n # Use OpenAI\n pass\n\n# Access configuration values\nmax_iterations = settings.max_iterations\nweb_search_provider = settings.web_search_provider\n"},{"location":"configuration/#required-configuration","title":"Required Configuration","text":""},{"location":"configuration/#llm-provider","title":"LLM Provider","text":"You must configure at least one LLM provider. The system supports:
OpenAI (OPENAI_API_KEY), Anthropic (ANTHROPIC_API_KEY), and HuggingFace (HF_TOKEN or HUGGINGFACE_API_KEY; can work without a key for public models). Example OpenAI configuration: LLM_PROVIDER=openai\nOPENAI_API_KEY=your_openai_api_key_here\nOPENAI_MODEL=gpt-5.1\n The default model is defined in the Settings class:
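The provider-to-key mapping described above can be sketched as follows. This is a hypothetical illustration; the real logic lives in src/utils/config.py, and only the environment variable names are taken from the documentation.

```python
# Which env vars satisfy each provider; for HuggingFace, HF_TOKEN is
# checked before HUGGINGFACE_API_KEY (the documented preference).
PROVIDER_ENV_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "huggingface": ["HF_TOKEN", "HUGGINGFACE_API_KEY"],
}


class ConfigurationError(Exception):
    """Raised when configuration is invalid."""


def resolve_api_key(provider: str, env: dict) -> str:
    if provider not in PROVIDER_ENV_VARS:
        raise ConfigurationError(f"Unsupported LLM provider: {provider!r}")
    for var in PROVIDER_ENV_VARS[provider]:
        if env.get(var):
            return env[var]
    raise ConfigurationError(f"No API key configured for {provider!r}")
```

Passing the environment as a dict keeps the sketch testable; the real Settings class reads these values at load time instead.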
LLM_PROVIDER=anthropic\nANTHROPIC_API_KEY=your_anthropic_api_key_here\nANTHROPIC_MODEL=claude-sonnet-4-5-20250929\n The default model is defined in the Settings class:
HuggingFace can work without an API key for public models, but an API key provides higher rate limits:
# Option 1: Using HF_TOKEN (preferred)\nHF_TOKEN=your_huggingface_token_here\n\n# Option 2: Using HUGGINGFACE_API_KEY (alternative)\nHUGGINGFACE_API_KEY=your_huggingface_api_key_here\n\n# Default model\nHUGGINGFACE_MODEL=meta-llama/Llama-3.1-8B-Instruct\n The HuggingFace token can be set via either environment variable:
"},{"location":"configuration/#optional-configuration","title":"Optional Configuration","text":""},{"location":"configuration/#embedding-configuration","title":"Embedding Configuration","text":"DeepCritical supports multiple embedding providers for semantic search and RAG:
# Embedding Provider: \"openai\", \"local\", or \"huggingface\"\nEMBEDDING_PROVIDER=local\n\n# OpenAI Embedding Model (used by LlamaIndex RAG)\nOPENAI_EMBEDDING_MODEL=text-embedding-3-small\n\n# Local Embedding Model (sentence-transformers, used by EmbeddingService)\nLOCAL_EMBEDDING_MODEL=all-MiniLM-L6-v2\n\n# HuggingFace Embedding Model\nHUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2\n The embedding provider configuration:
Note: OpenAI embeddings require OPENAI_API_KEY. The local provider (default) uses sentence-transformers and requires no API key.
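The documented per-provider defaults can be collected into a small dispatch table. The helper function is a hypothetical illustration, not the project's EmbeddingService API; only the provider names and model defaults come from the docs above.

```python
# Documented default model per embedding provider.
DEFAULT_EMBEDDING_MODELS = {
    "openai": "text-embedding-3-small",
    "local": "all-MiniLM-L6-v2",
    "huggingface": "sentence-transformers/all-MiniLM-L6-v2",
}


def resolve_embedding_model(provider: str = "local") -> str:
    # "local" is the documented default and needs no API key.
    if provider not in DEFAULT_EMBEDDING_MODELS:
        raise ValueError(f"Unknown embedding provider: {provider!r}")
    return DEFAULT_EMBEDDING_MODELS[provider]
```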
DeepCritical supports multiple web search providers:
# Web Search Provider: \"serper\", \"searchxng\", \"brave\", \"tavily\", or \"duckduckgo\"\n# Default: \"duckduckgo\" (no API key required)\nWEB_SEARCH_PROVIDER=duckduckgo\n\n# Serper API Key (for Google search via Serper)\nSERPER_API_KEY=your_serper_api_key_here\n\n# SearchXNG Host URL (for self-hosted search)\nSEARCHXNG_HOST=http://localhost:8080\n\n# Brave Search API Key\nBRAVE_API_KEY=your_brave_api_key_here\n\n# Tavily API Key\nTAVILY_API_KEY=your_tavily_api_key_here\n The web search provider configuration:
Note: DuckDuckGo is the default and requires no API key, making it ideal for development and testing.
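The credential requirements above reduce to a simple availability check. The helper name is illustrative (the real Settings class exposes a web_search_available property); the provider-to-credential mapping follows the documented configuration.

```python
# Credential each provider needs, per the docs; DuckDuckGo needs none.
REQUIRED_CREDENTIAL = {
    "serper": "SERPER_API_KEY",
    "searchxng": "SEARCHXNG_HOST",
    "brave": "BRAVE_API_KEY",
    "tavily": "TAVILY_API_KEY",
    "duckduckgo": None,
}


def web_search_available(provider: str, env: dict) -> bool:
    if provider not in REQUIRED_CREDENTIAL:
        raise ValueError(f"Unknown web search provider: {provider!r}")
    needed = REQUIRED_CREDENTIAL[provider]
    return needed is None or bool(env.get(needed))
```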
"},{"location":"configuration/#pubmed-configuration","title":"PubMed Configuration","text":"PubMed search supports optional NCBI API key for higher rate limits:
# NCBI API Key (optional, for higher rate limits: 10 req/sec vs 3 req/sec)\nNCBI_API_KEY=your_ncbi_api_key_here\n The PubMed tool uses this configuration:
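The documented rate limits (3 requests/sec without an NCBI key, 10 with one) translate into a minimum delay between requests. The helper name is hypothetical; only the two rate figures come from the docs.

```python
def ncbi_min_request_interval(has_api_key: bool) -> float:
    # 10 req/sec with an NCBI API key, 3 req/sec without.
    requests_per_second = 10 if has_api_key else 3
    return 1.0 / requests_per_second
```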
"},{"location":"configuration/#agent-configuration","title":"Agent Configuration","text":"Control agent behavior and research loop execution:
# Maximum iterations per research loop (1-50, default: 10)\nMAX_ITERATIONS=10\n\n# Search timeout in seconds\nSEARCH_TIMEOUT=30\n\n# Use graph-based execution for research flows\nUSE_GRAPH_EXECUTION=false\n The agent configuration fields:
"},{"location":"configuration/#budget-rate-limiting-configuration","title":"Budget & Rate Limiting Configuration","text":"Control resource limits for research loops:
# Default token budget per research loop (1000-1000000, default: 100000)\nDEFAULT_TOKEN_LIMIT=100000\n\n# Default time limit per research loop in minutes (1-120, default: 10)\nDEFAULT_TIME_LIMIT_MINUTES=10\n\n# Default iterations limit per research loop (1-50, default: 10)\nDEFAULT_ITERATIONS_LIMIT=10\n The budget configuration with validation:
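The documented ranges can be captured in a small validator. This is a sketch only; the actual Settings class enforces these limits declaratively via Pydantic Field constraints (ge=/le=), and the function name here is an assumption.

```python
def validate_budget(tokens: int, minutes: int, iterations: int) -> None:
    # Ranges mirror the documented defaults and bounds.
    if not 1000 <= tokens <= 1_000_000:
        raise ValueError("DEFAULT_TOKEN_LIMIT must be in 1000-1000000")
    if not 1 <= minutes <= 120:
        raise ValueError("DEFAULT_TIME_LIMIT_MINUTES must be in 1-120")
    if not 1 <= iterations <= 50:
        raise ValueError("DEFAULT_ITERATIONS_LIMIT must be in 1-50")
```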
"},{"location":"configuration/#rag-service-configuration","title":"RAG Service Configuration","text":"Configure the Retrieval-Augmented Generation service:
# ChromaDB collection name for RAG\nRAG_COLLECTION_NAME=deepcritical_evidence\n\n# Number of top results to retrieve from RAG (1-50, default: 5)\nRAG_SIMILARITY_TOP_K=5\n\n# Automatically ingest evidence into RAG\nRAG_AUTO_INGEST=true\n The RAG configuration:
"},{"location":"configuration/#chromadb-configuration","title":"ChromaDB Configuration","text":"Configure the vector database for embeddings and RAG:
# ChromaDB storage path\nCHROMA_DB_PATH=./chroma_db\n\n# Whether to persist ChromaDB to disk\nCHROMA_DB_PERSIST=true\n\n# ChromaDB server host (for remote ChromaDB, optional)\nCHROMA_DB_HOST=localhost\n\n# ChromaDB server port (for remote ChromaDB, optional)\nCHROMA_DB_PORT=8000\n The ChromaDB configuration:
"},{"location":"configuration/#external-services","title":"External Services","text":""},{"location":"configuration/#modal-configuration","title":"Modal Configuration","text":"Modal is used for secure sandbox execution of statistical analysis:
# Modal Token ID (for Modal sandbox execution)\nMODAL_TOKEN_ID=your_modal_token_id_here\n\n# Modal Token Secret\nMODAL_TOKEN_SECRET=your_modal_token_secret_here\n The Modal configuration:
"},{"location":"configuration/#logging-configuration","title":"Logging Configuration","text":"Configure structured logging:
# Log Level: \"DEBUG\", \"INFO\", \"WARNING\", or \"ERROR\"\nLOG_LEVEL=INFO\n The logging configuration:
Logging is configured via the configure_logging() function:
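The project wires logging through structlog; as a rough stdlib analogue (the function body here is assumed, not copied from the source), the LOG_LEVEL handling might look like:

```python
import logging

VALID_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}


def configure_logging(level_name: str = "INFO") -> logging.Logger:
    # Reject values outside the documented LOG_LEVEL choices.
    if level_name not in VALID_LEVELS:
        raise ValueError(f"LOG_LEVEL must be one of {sorted(VALID_LEVELS)}")
    logging.basicConfig(level=getattr(logging, level_name))
    return logging.getLogger("deepcritical")
```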
The Settings class provides helpful properties for checking configuration state:
Check which API keys are available:
Usage:
from src.utils.config import settings\n\n# Check API key availability\nif settings.has_openai_key:\n # Use OpenAI\n pass\n\nif settings.has_anthropic_key:\n # Use Anthropic\n pass\n\nif settings.has_huggingface_key:\n # Use HuggingFace\n pass\n\nif settings.has_any_llm_key:\n # At least one LLM is available\n pass\n"},{"location":"configuration/#service-availability","title":"Service Availability","text":"Check if external services are configured:
Usage:
from src.utils.config import settings\n\n# Check service availability\nif settings.modal_available:\n # Use Modal sandbox\n pass\n\nif settings.web_search_available:\n # Web search is configured\n pass\n"},{"location":"configuration/#api-key-retrieval","title":"API Key Retrieval","text":"Get the API key for the configured provider:
For OpenAI-specific operations (e.g., Magentic mode):
"},{"location":"configuration/#configuration-usage-in-codebase","title":"Configuration Usage in Codebase","text":"The configuration system is used throughout the codebase:
"},{"location":"configuration/#llm-factory","title":"LLM Factory","text":"The LLM factory uses settings to create appropriate models:
"},{"location":"configuration/#embedding-service","title":"Embedding Service","text":"The embedding service uses local embedding model configuration:
"},{"location":"configuration/#orchestrator-factory","title":"Orchestrator Factory","text":"The orchestrator factory uses settings to determine mode:
"},{"location":"configuration/#environment-variables-reference","title":"Environment Variables Reference","text":""},{"location":"configuration/#required-at-least-one-llm","title":"Required (at least one LLM)","text":"OPENAI_API_KEY - OpenAI API key (required for OpenAI provider)ANTHROPIC_API_KEY - Anthropic API key (required for Anthropic provider)HF_TOKEN or HUGGINGFACE_API_KEY - HuggingFace API token (optional, can work without for public models)LLM_PROVIDER - Provider to use: \"openai\", \"anthropic\", or \"huggingface\" (default: \"huggingface\")OPENAI_MODEL - OpenAI model name (default: \"gpt-5.1\")ANTHROPIC_MODEL - Anthropic model name (default: \"claude-sonnet-4-5-20250929\")HUGGINGFACE_MODEL - HuggingFace model ID (default: \"meta-llama/Llama-3.1-8B-Instruct\")EMBEDDING_PROVIDER - Provider: \"openai\", \"local\", or \"huggingface\" (default: \"local\")OPENAI_EMBEDDING_MODEL - OpenAI embedding model (default: \"text-embedding-3-small\")LOCAL_EMBEDDING_MODEL - Local sentence-transformers model (default: \"all-MiniLM-L6-v2\")HUGGINGFACE_EMBEDDING_MODEL - HuggingFace embedding model (default: \"sentence-transformers/all-MiniLM-L6-v2\")WEB_SEARCH_PROVIDER - Provider: \"serper\", \"searchxng\", \"brave\", \"tavily\", or \"duckduckgo\" (default: \"duckduckgo\")SERPER_API_KEY - Serper API key (required for Serper provider)SEARCHXNG_HOST - SearchXNG host URL (required for SearchXNG provider)BRAVE_API_KEY - Brave Search API key (required for Brave provider)TAVILY_API_KEY - Tavily API key (required for Tavily provider)NCBI_API_KEY - NCBI API key (optional, increases rate limit from 3 to 10 req/sec)MAX_ITERATIONS - Maximum iterations per research loop (1-50, default: 10)SEARCH_TIMEOUT - Search timeout in seconds (default: 30)USE_GRAPH_EXECUTION - Use graph-based execution (default: false)DEFAULT_TOKEN_LIMIT - Default token budget per research loop (1000-1000000, default: 100000)DEFAULT_TIME_LIMIT_MINUTES - Default time limit in minutes (1-120, default: 
10)DEFAULT_ITERATIONS_LIMIT - Default iterations limit (1-50, default: 10)RAG_COLLECTION_NAME - ChromaDB collection name (default: \"deepcritical_evidence\")RAG_SIMILARITY_TOP_K - Number of top results to retrieve (1-50, default: 5)RAG_AUTO_INGEST - Automatically ingest evidence into RAG (default: true)CHROMA_DB_PATH - ChromaDB storage path (default: \"./chroma_db\")CHROMA_DB_PERSIST - Whether to persist ChromaDB to disk (default: true)CHROMA_DB_HOST - ChromaDB server host (optional, for remote ChromaDB)CHROMA_DB_PORT - ChromaDB server port (optional, for remote ChromaDB)MODAL_TOKEN_ID - Modal token ID (optional, for Modal sandbox execution)MODAL_TOKEN_SECRET - Modal token secret (optional, for Modal sandbox execution)LOG_LEVEL - Log level: \"DEBUG\", \"INFO\", \"WARNING\", or \"ERROR\" (default: \"INFO\")Settings are validated on load using Pydantic validation:
ge=1, le=50 for max_iterations)Literal[\"openai\", \"anthropic\", \"huggingface\"])get_api_key() or get_openai_api_key()The max_iterations field has range validation:
The llm_provider field has literal validation:
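Both validations can be sketched with plain Pydantic. The real class extends pydantic_settings.BaseSettings rather than BaseModel, but the Field range constraint and Literal constraint behave the same way; field names and defaults follow the documentation.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class AgentSettings(BaseModel):
    # Range-validated: 1 <= max_iterations <= 50, default 10.
    max_iterations: int = Field(default=10, ge=1, le=50)
    # Literal-validated: only the three supported providers.
    llm_provider: Literal["openai", "anthropic", "huggingface"] = "huggingface"
```

Constructing AgentSettings(max_iterations=99) or AgentSettings(llm_provider="cohere") raises a ValidationError at load time, so misconfiguration fails fast rather than mid-run.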
Configuration errors raise ConfigurationError from src/utils/exceptions.py:
# src/utils/exceptions.py (lines 22-25)\nclass ConfigurationError(DeepCriticalError):\n \"\"\"Raised when configuration is invalid.\"\"\"\n\n pass\n
"},{"location":"configuration/#error-handling-example","title":"Error Handling Example","text":"from src.utils.config import settings\nfrom src.utils.exceptions import ConfigurationError\n\ntry:\n api_key = settings.get_api_key()\nexcept ConfigurationError as e:\n print(f\"Configuration error: {e}\")\n
ConfigurationError is raised when get_api_key() is called but the required API key is not set, or when llm_provider is set to an unsupported value. Best practices: store sensitive keys in a .env file (added to .gitignore); check properties such as has_openai_key before accessing API keys; handle ConfigurationError when calling get_api_key(). The following configurations are planned for future phases:
DeepCritical uses Pydantic Settings for centralized configuration management. All settings are defined in the Settings class in src/utils/config.py and can be configured via environment variables or a .env file.
The configuration system provides:
environment variable loading from a .env file (if present) and a global settings instance for easy access throughout the codebase. To get started, create a .env file in the project root and set at least one LLM API key (OPENAI_API_KEY, ANTHROPIC_API_KEY, or HF_TOKEN). The Settings class extends BaseSettings from pydantic_settings and defines all application configuration:
A global settings instance is available for import:
Access configuration throughout the codebase:
from src.utils.config import settings\n\n# Check if API keys are available\nif settings.has_openai_key:\n # Use OpenAI\n pass\n\n# Access configuration values\nmax_iterations = settings.max_iterations\nweb_search_provider = settings.web_search_provider\n"},{"location":"configuration/CONFIGURATION/#required-configuration","title":"Required Configuration","text":""},{"location":"configuration/CONFIGURATION/#llm-provider","title":"LLM Provider","text":"You must configure at least one LLM provider. The system supports:
OpenAI (OPENAI_API_KEY), Anthropic (ANTHROPIC_API_KEY), and HuggingFace (HF_TOKEN or HUGGINGFACE_API_KEY; can work without a key for public models). Example OpenAI configuration: LLM_PROVIDER=openai\nOPENAI_API_KEY=your_openai_api_key_here\nOPENAI_MODEL=gpt-5.1\n The default model is defined in the Settings class:
LLM_PROVIDER=anthropic\nANTHROPIC_API_KEY=your_anthropic_api_key_here\nANTHROPIC_MODEL=claude-sonnet-4-5-20250929\n The default model is defined in the Settings class:
HuggingFace can work without an API key for public models, but an API key provides higher rate limits:
# Option 1: Using HF_TOKEN (preferred)\nHF_TOKEN=your_huggingface_token_here\n\n# Option 2: Using HUGGINGFACE_API_KEY (alternative)\nHUGGINGFACE_API_KEY=your_huggingface_api_key_here\n\n# Default model\nHUGGINGFACE_MODEL=meta-llama/Llama-3.1-8B-Instruct\n The HuggingFace token can be set via either environment variable:
"},{"location":"configuration/CONFIGURATION/#optional-configuration","title":"Optional Configuration","text":""},{"location":"configuration/CONFIGURATION/#embedding-configuration","title":"Embedding Configuration","text":"DeepCritical supports multiple embedding providers for semantic search and RAG:
# Embedding Provider: \"openai\", \"local\", or \"huggingface\"\nEMBEDDING_PROVIDER=local\n\n# OpenAI Embedding Model (used by LlamaIndex RAG)\nOPENAI_EMBEDDING_MODEL=text-embedding-3-small\n\n# Local Embedding Model (sentence-transformers, used by EmbeddingService)\nLOCAL_EMBEDDING_MODEL=all-MiniLM-L6-v2\n\n# HuggingFace Embedding Model\nHUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2\n The embedding provider configuration:
Note: OpenAI embeddings require OPENAI_API_KEY. The local provider (default) uses sentence-transformers and requires no API key.
DeepCritical supports multiple web search providers:
# Web Search Provider: \"serper\", \"searchxng\", \"brave\", \"tavily\", or \"duckduckgo\"\n# Default: \"duckduckgo\" (no API key required)\nWEB_SEARCH_PROVIDER=duckduckgo\n\n# Serper API Key (for Google search via Serper)\nSERPER_API_KEY=your_serper_api_key_here\n\n# SearchXNG Host URL (for self-hosted search)\nSEARCHXNG_HOST=http://localhost:8080\n\n# Brave Search API Key\nBRAVE_API_KEY=your_brave_api_key_here\n\n# Tavily API Key\nTAVILY_API_KEY=your_tavily_api_key_here\n The web search provider configuration:
Note: DuckDuckGo is the default and requires no API key, making it ideal for development and testing.
"},{"location":"configuration/CONFIGURATION/#pubmed-configuration","title":"PubMed Configuration","text":"PubMed search supports optional NCBI API key for higher rate limits:
# NCBI API Key (optional, for higher rate limits: 10 req/sec vs 3 req/sec)\nNCBI_API_KEY=your_ncbi_api_key_here\n The PubMed tool uses this configuration:
"},{"location":"configuration/CONFIGURATION/#agent-configuration","title":"Agent Configuration","text":"Control agent behavior and research loop execution:
# Maximum iterations per research loop (1-50, default: 10)\nMAX_ITERATIONS=10\n\n# Search timeout in seconds\nSEARCH_TIMEOUT=30\n\n# Use graph-based execution for research flows\nUSE_GRAPH_EXECUTION=false\n The agent configuration fields:
"},{"location":"configuration/CONFIGURATION/#budget-rate-limiting-configuration","title":"Budget & Rate Limiting Configuration","text":"Control resource limits for research loops:
# Default token budget per research loop (1000-1000000, default: 100000)\nDEFAULT_TOKEN_LIMIT=100000\n\n# Default time limit per research loop in minutes (1-120, default: 10)\nDEFAULT_TIME_LIMIT_MINUTES=10\n\n# Default iterations limit per research loop (1-50, default: 10)\nDEFAULT_ITERATIONS_LIMIT=10\n The budget configuration with validation:
"},{"location":"configuration/CONFIGURATION/#rag-service-configuration","title":"RAG Service Configuration","text":"Configure the Retrieval-Augmented Generation service:
# ChromaDB collection name for RAG\nRAG_COLLECTION_NAME=deepcritical_evidence\n\n# Number of top results to retrieve from RAG (1-50, default: 5)\nRAG_SIMILARITY_TOP_K=5\n\n# Automatically ingest evidence into RAG\nRAG_AUTO_INGEST=true\n The RAG configuration:
"},{"location":"configuration/CONFIGURATION/#chromadb-configuration","title":"ChromaDB Configuration","text":"Configure the vector database for embeddings and RAG:
# ChromaDB storage path\nCHROMA_DB_PATH=./chroma_db\n\n# Whether to persist ChromaDB to disk\nCHROMA_DB_PERSIST=true\n\n# ChromaDB server host (for remote ChromaDB, optional)\nCHROMA_DB_HOST=localhost\n\n# ChromaDB server port (for remote ChromaDB, optional)\nCHROMA_DB_PORT=8000\n The ChromaDB configuration:
"},{"location":"configuration/CONFIGURATION/#external-services","title":"External Services","text":""},{"location":"configuration/CONFIGURATION/#modal-configuration","title":"Modal Configuration","text":"Modal is used for secure sandbox execution of statistical analysis:
# Modal Token ID (for Modal sandbox execution)\nMODAL_TOKEN_ID=your_modal_token_id_here\n\n# Modal Token Secret\nMODAL_TOKEN_SECRET=your_modal_token_secret_here\n The Modal configuration:
"},{"location":"configuration/CONFIGURATION/#logging-configuration","title":"Logging Configuration","text":"Configure structured logging:
# Log Level: \"DEBUG\", \"INFO\", \"WARNING\", or \"ERROR\"\nLOG_LEVEL=INFO\n The logging configuration:
Logging is configured via the configure_logging() function:
The Settings class provides helpful properties for checking configuration state:
Check which API keys are available:
Usage:
from src.utils.config import settings\n\n# Check API key availability\nif settings.has_openai_key:\n # Use OpenAI\n pass\n\nif settings.has_anthropic_key:\n # Use Anthropic\n pass\n\nif settings.has_huggingface_key:\n # Use HuggingFace\n pass\n\nif settings.has_any_llm_key:\n # At least one LLM is available\n pass\n"},{"location":"configuration/CONFIGURATION/#service-availability","title":"Service Availability","text":"Check if external services are configured:
Usage:
from src.utils.config import settings\n\n# Check service availability\nif settings.modal_available:\n # Use Modal sandbox\n pass\n\nif settings.web_search_available:\n # Web search is configured\n pass\n"},{"location":"configuration/CONFIGURATION/#api-key-retrieval","title":"API Key Retrieval","text":"Get the API key for the configured provider:
For OpenAI-specific operations (e.g., Magentic mode):
"},{"location":"configuration/CONFIGURATION/#configuration-usage-in-codebase","title":"Configuration Usage in Codebase","text":"The configuration system is used throughout the codebase:
"},{"location":"configuration/CONFIGURATION/#llm-factory","title":"LLM Factory","text":"The LLM factory uses settings to create appropriate models:
"},{"location":"configuration/CONFIGURATION/#embedding-service","title":"Embedding Service","text":"The embedding service uses local embedding model configuration:
"},{"location":"configuration/CONFIGURATION/#orchestrator-factory","title":"Orchestrator Factory","text":"The orchestrator factory uses settings to determine mode:
"},{"location":"configuration/CONFIGURATION/#environment-variables-reference","title":"Environment Variables Reference","text":""},{"location":"configuration/CONFIGURATION/#required-at-least-one-llm","title":"Required (at least one LLM)","text":"OPENAI_API_KEY - OpenAI API key (required for OpenAI provider)ANTHROPIC_API_KEY - Anthropic API key (required for Anthropic provider)HF_TOKEN or HUGGINGFACE_API_KEY - HuggingFace API token (optional, can work without for public models)LLM_PROVIDER - Provider to use: \"openai\", \"anthropic\", or \"huggingface\" (default: \"huggingface\")OPENAI_MODEL - OpenAI model name (default: \"gpt-5.1\")ANTHROPIC_MODEL - Anthropic model name (default: \"claude-sonnet-4-5-20250929\")HUGGINGFACE_MODEL - HuggingFace model ID (default: \"meta-llama/Llama-3.1-8B-Instruct\")EMBEDDING_PROVIDER - Provider: \"openai\", \"local\", or \"huggingface\" (default: \"local\")OPENAI_EMBEDDING_MODEL - OpenAI embedding model (default: \"text-embedding-3-small\")LOCAL_EMBEDDING_MODEL - Local sentence-transformers model (default: \"all-MiniLM-L6-v2\")HUGGINGFACE_EMBEDDING_MODEL - HuggingFace embedding model (default: \"sentence-transformers/all-MiniLM-L6-v2\")WEB_SEARCH_PROVIDER - Provider: \"serper\", \"searchxng\", \"brave\", \"tavily\", or \"duckduckgo\" (default: \"duckduckgo\")SERPER_API_KEY - Serper API key (required for Serper provider)SEARCHXNG_HOST - SearchXNG host URL (required for SearchXNG provider)BRAVE_API_KEY - Brave Search API key (required for Brave provider)TAVILY_API_KEY - Tavily API key (required for Tavily provider)NCBI_API_KEY - NCBI API key (optional, increases rate limit from 3 to 10 req/sec)MAX_ITERATIONS - Maximum iterations per research loop (1-50, default: 10)SEARCH_TIMEOUT - Search timeout in seconds (default: 30)USE_GRAPH_EXECUTION - Use graph-based execution (default: false)DEFAULT_TOKEN_LIMIT - Default token budget per research loop (1000-1000000, default: 100000)DEFAULT_TIME_LIMIT_MINUTES - Default time limit in minutes 
(1-120, default: 10)DEFAULT_ITERATIONS_LIMIT - Default iterations limit (1-50, default: 10)RAG_COLLECTION_NAME - ChromaDB collection name (default: \"deepcritical_evidence\")RAG_SIMILARITY_TOP_K - Number of top results to retrieve (1-50, default: 5)RAG_AUTO_INGEST - Automatically ingest evidence into RAG (default: true)CHROMA_DB_PATH - ChromaDB storage path (default: \"./chroma_db\")CHROMA_DB_PERSIST - Whether to persist ChromaDB to disk (default: true)CHROMA_DB_HOST - ChromaDB server host (optional, for remote ChromaDB)CHROMA_DB_PORT - ChromaDB server port (optional, for remote ChromaDB)MODAL_TOKEN_ID - Modal token ID (optional, for Modal sandbox execution)MODAL_TOKEN_SECRET - Modal token secret (optional, for Modal sandbox execution)LOG_LEVEL - Log level: \"DEBUG\", \"INFO\", \"WARNING\", or \"ERROR\" (default: \"INFO\")Settings are validated on load using Pydantic validation:
ge=1, le=50 for max_iterations)Literal[\"openai\", \"anthropic\", \"huggingface\"])get_api_key() or get_openai_api_key()The max_iterations field has range validation:
The llm_provider field has literal validation:
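A minimal sketch of how such validation can be expressed with Pydantic. The field names and constraints mirror the settings documented above, but the class shape here is an assumption, not a copy of the actual `src/utils/config.py`:

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class Settings(BaseModel):
    # ge/le give range validation: values outside 1..50 are rejected at load time
    max_iterations: int = Field(default=10, ge=1, le=50)
    # Literal gives enum-style validation: any other provider string is rejected
    llm_provider: Literal["openai", "anthropic", "huggingface"] = "huggingface"


Settings(max_iterations=10, llm_provider="openai")  # valid

try:
    Settings(max_iterations=99)  # violates le=50
except ValidationError:
    print("max_iterations out of range")
```

Invalid values are caught as soon as settings are constructed, which is why misconfiguration surfaces at startup rather than mid-research-loop.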
Configuration errors raise ConfigurationError from src/utils/exceptions.py:
from src.utils.config import settings\nfrom src.utils.exceptions import ConfigurationError\n\ntry:\n api_key = settings.get_api_key()\nexcept ConfigurationError as e:\n print(f\"Configuration error: {e}\")\n"},{"location":"configuration/CONFIGURATION/#common-configuration-errors","title":"Common Configuration Errors","text":"get_api_key() is called but the required API key is not setllm_provider is set to an unsupported value.env File: Store sensitive keys in .env file (add to .gitignore)has_openai_key before accessing API keysConfigurationError when calling get_api_key()The following configurations are planned for future phases:
Thank you for your interest in contributing to DeepCritical! This guide will help you get started.
"},{"location":"contributing/#git-workflow","title":"Git Workflow","text":"main: Production-ready (GitHub)dev: Development integration (GitHub)yourname-devmain or dev on HuggingFacemake install # Install dependencies + pre-commit\nmake check # Lint + typecheck + test (MUST PASS)\nmake test # Run unit tests\nmake lint # Run ruff\nmake format # Format with ruff\nmake typecheck # Run mypy\nmake test-cov # Test with coverage\n"},{"location":"contributing/#getting-started","title":"Getting Started","text":"git clone https://github.com/yourusername/GradioDemo.git\ncd GradioDemo\nmake install\ngit checkout -b yourname-feature-name\nmake check\ngit commit -m \"Description of changes\"\ngit push origin yourname-feature-name\nmypy --strictruff for linting and formattingraise SearchError(...) from estructlogunit, integration, slow@lru_cache(maxsize=1)# CRITICAL: ...src/mcp_tools.py for Claude Desktopmcp_server=True in demo.launch()/gradio_api/mcp/ssr_mode=False to fix hydration issues in HF Spacesfrom e when raising exceptionsmypy --strictmake checkThank you for contributing to DeepCritical!
"},{"location":"contributing/code-quality/","title":"Code Quality & Documentation","text":"This document outlines code quality standards and documentation requirements.
"},{"location":"contributing/code-quality/#linting","title":"Linting","text":"pyproject.toml:PLR0913: Too many arguments (agents need many params)PLR0912: Too many branches (complex orchestrator logic)PLR0911: Too many return statements (complex agent logic)PLR2004: Magic values (statistical constants)PLW0603: Global statement (singleton pattern)PLC0415: Lazy imports for optional dependenciesmypy --strict complianceignore_missing_imports = true (for optional dependencies)reference_repos/, examples/make check before committingmake installExample:
"},{"location":"contributing/code-quality/#code-comments","title":"Code Comments","text":"requests not httpx for ClinicalTrials)# CRITICAL: ...This document outlines the code style and conventions for DeepCritical.
"},{"location":"contributing/code-style/#type-safety","title":"Type Safety","text":"mypy --strict compliance (no Any unless absolutely necessary)TYPE_CHECKING imports for circular dependencies:src/utils/models.py)model_config = {\"frozen\": True}) for immutabilityField() with descriptions for all model fieldsge=, le=, min_length=, max_length= constraintsasync def, await)asyncio.gather() for parallel operationsrun_in_executor():loop = asyncio.get_running_loop()\nresult = await loop.run_in_executor(None, cpu_bound_function, args)\n This document outlines error handling and logging conventions for DeepCritical.
"},{"location":"contributing/error-handling/#exception-hierarchy","title":"Exception Hierarchy","text":"Use custom exception hierarchy (src/utils/exceptions.py):
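The hierarchy can be sketched as follows. The leaf exception names (`SearchError`, `RateLimitError`, `ConfigurationError`) appear throughout these docs; the base-class name and the exact nesting below are assumptions, not a copy of `src/utils/exceptions.py`:

```python
class DeepCriticalError(Exception):
    """Base class: catching this catches any DeepCritical-specific failure."""


class SearchError(DeepCriticalError):
    """A search tool failed (network error, bad response, parse failure)."""


class RateLimitError(SearchError):
    """An upstream API rejected the request due to rate limiting."""


class ConfigurationError(DeepCriticalError):
    """Required configuration (e.g. an API key) is missing or invalid."""
```

A common hierarchy like this lets callers choose their granularity: retry on `RateLimitError`, degrade on `SearchError`, or abort on any `DeepCriticalError`.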
raise SearchError(...) from estructlog:logger.error(\"Operation failed\", error=str(e), context=value)\n structlog for all logging (NOT print or logging)import structlog; logger = structlog.get_logger()logger.info(\"event\", key=value)logger.info(\"Starting search\", query=query, tools=[t.name for t in tools])\nlogger.warning(\"Search tool failed\", tool=tool.name, error=str(result))\nlogger.error(\"Assessment failed\", error=str(e))\n"},{"location":"contributing/error-handling/#error-chaining","title":"Error Chaining","text":"Always preserve exception context:
try:\n result = await api_call()\nexcept httpx.HTTPError as e:\n raise SearchError(f\"API call failed: {e}\") from e\n"},{"location":"contributing/error-handling/#see-also","title":"See Also","text":"This document outlines common implementation patterns used in DeepCritical.
"},{"location":"contributing/implementation-patterns/#search-tools","title":"Search Tools","text":"All tools implement SearchTool protocol (src/tools/base.py):
name propertyasync def search(query, max_results) -> list[Evidence]@retry decorator from tenacity for resilience_rate_limit() for APIs with limits (e.g., PubMed)SearchError or RateLimitError on failuresExample pattern:
class MySearchTool:\n @property\n def name(self) -> str:\n return \"mytool\"\n \n @retry(stop=stop_after_attempt(3), wait=wait_exponential(...))\n async def search(self, query: str, max_results: int = 10) -> list[Evidence]:\n # Implementation\n return evidence_list\n"},{"location":"contributing/implementation-patterns/#judge-handlers","title":"Judge Handlers","text":"JudgeHandlerProtocol (async def assess(question, evidence) -> JudgeAssessment)Agent with output_type=JudgeAssessmentsrc/prompts/judge.pyMockJudgeHandler, HFInferenceJudgeHandlerJudgeAssessment (never raise exceptions)src/agent_factory/)ContextVar for thread-safe state (src/agents/state.py)@lru_cache)Use @lru_cache(maxsize=1) for singletons:
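The `@lru_cache(maxsize=1)` singleton pattern referenced above can look like this; `EmbeddingService` and `get_embedding_service` are illustrative names, not necessarily the real ones:

```python
from functools import lru_cache


class EmbeddingService:
    """Stand-in for an expensive-to-construct service (e.g. a loaded model)."""

    def __init__(self) -> None:
        self.loaded = True  # imagine loading model weights here


@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService:
    # The first call constructs the service; every later call returns
    # the same cached instance, so construction cost is paid once.
    return EmbeddingService()


assert get_embedding_service() is get_embedding_service()
```

Because the cache key is the (empty) argument tuple, a zero-argument factory with `maxsize=1` behaves exactly like a lazily initialized module-level singleton, without a global statement.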
This document outlines prompt engineering guidelines and citation validation rules.
"},{"location":"contributing/prompt-engineering/#judge-prompts","title":"Judge Prompts","text":"src/prompts/judge.pyformat_user_prompt() and format_empty_evidence_prompt() helperstruncate_at_sentence())format_hypothesis_prompt() with embeddings for diversityvalidate_references() from src/utils/citation_validator.pyselect_diverse_evidence() for MMR-based selectionThis document outlines testing requirements and guidelines for DeepCritical.
"},{"location":"contributing/testing/#test-structure","title":"Test Structure","text":"tests/unit/ (mocked, fast)tests/integration/ (real APIs, marked @pytest.mark.integration)unit, integration, slowrespx for httpx mockingpytest-mock for general mockingMockJudgeHandler)tests/conftest.py: mock_httpx_client, mock_llm_responsetests/unit/src/make check (lint + typecheck + test)@pytest.mark.unit\nasync def test_pubmed_search(mock_httpx_client):\n tool = PubMedTool()\n results = await tool.search(\"metformin\", max_results=5)\n assert len(results) > 0\n assert all(isinstance(r, Evidence) for r in results)\n\n@pytest.mark.integration\nasync def test_real_pubmed_search():\n tool = PubMedTool()\n results = await tool.search(\"metformin\", max_results=3)\n assert len(results) <= 3\n"},{"location":"contributing/testing/#test-coverage","title":"Test Coverage","text":"make test-cov for coverage report__init__.py, TYPE_CHECKING blocksThis page provides examples of using DeepCritical for various research tasks.
"},{"location":"getting-started/examples/#basic-research-query","title":"Basic Research Query","text":""},{"location":"getting-started/examples/#example-1-drug-information","title":"Example 1: Drug Information","text":"Query:
What are the latest treatments for Alzheimer's disease?\n What DeepCritical Does: 1. Searches PubMed for recent papers 2. Searches ClinicalTrials.gov for active trials 3. Evaluates evidence quality 4. Synthesizes findings into a comprehensive report
"},{"location":"getting-started/examples/#example-2-clinical-trial-search","title":"Example 2: Clinical Trial Search","text":"Query:
What clinical trials are investigating metformin for cancer prevention?\n What DeepCritical Does: 1. Searches ClinicalTrials.gov for relevant trials 2. Searches PubMed for supporting literature 3. Provides trial details and status 4. Summarizes findings
"},{"location":"getting-started/examples/#advanced-research-queries","title":"Advanced Research Queries","text":""},{"location":"getting-started/examples/#example-3-comprehensive-review","title":"Example 3: Comprehensive Review","text":"Query:
Review the evidence for using metformin as an anti-aging intervention, \nincluding clinical trials, mechanisms of action, and safety profile.\n What DeepCritical Does: 1. Uses deep research mode (multi-section) 2. Searches multiple sources in parallel 3. Generates sections on: - Clinical trials - Mechanisms of action - Safety profile 4. Synthesizes comprehensive report
"},{"location":"getting-started/examples/#example-4-hypothesis-testing","title":"Example 4: Hypothesis Testing","text":"Query:
Test the hypothesis that regular exercise reduces Alzheimer's disease risk.\n What DeepCritical Does: 1. Generates testable hypotheses 2. Searches for supporting/contradicting evidence 3. Performs statistical analysis (if Modal configured) 4. Provides verdict: SUPPORTED, REFUTED, or INCONCLUSIVE
"},{"location":"getting-started/examples/#mcp-tool-examples","title":"MCP Tool Examples","text":""},{"location":"getting-started/examples/#using-search_pubmed","title":"Using search_pubmed","text":"Search PubMed for \"CRISPR gene editing cancer therapy\"\n"},{"location":"getting-started/examples/#using-search_clinical_trials","title":"Using search_clinical_trials","text":"Find active clinical trials for \"diabetes type 2 treatment\"\n"},{"location":"getting-started/examples/#using-search_all","title":"Using search_all","text":"Search all sources for \"COVID-19 vaccine side effects\"\n"},{"location":"getting-started/examples/#using-analyze_hypothesis","title":"Using analyze_hypothesis","text":"Analyze whether vitamin D supplementation reduces COVID-19 severity\n"},{"location":"getting-started/examples/#code-examples","title":"Code Examples","text":""},{"location":"getting-started/examples/#python-api-usage","title":"Python API Usage","text":"from src.orchestrator_factory import create_orchestrator\nfrom src.tools.search_handler import SearchHandler\nfrom src.agent_factory.judges import create_judge_handler\n\n# Create orchestrator\nsearch_handler = SearchHandler()\njudge_handler = create_judge_handler()\n # Run research query\nquery = \"What are the latest treatments for Alzheimer's disease?\"\nasync for event in orchestrator.run(query):\n print(f\"Event: {event.type} - {event.data}\")\n"},{"location":"getting-started/examples/#gradio-ui-integration","title":"Gradio UI Integration","text":"import gradio as gr\nfrom src.app import create_research_interface\n\n# Create interface\ninterface = create_research_interface()\n\n# Launch\ninterface.launch(server_name=\"0.0.0.0\", server_port=7860)\n"},{"location":"getting-started/examples/#research-patterns","title":"Research Patterns","text":""},{"location":"getting-started/examples/#iterative-research","title":"Iterative Research","text":"Single-loop research with search-judge-synthesize cycles:
from src.orchestrator.research_flow import IterativeResearchFlow\n\n# Instantiate the flow first (constructor arguments depend on your configuration)\nflow = IterativeResearchFlow()\nasync for event in flow.run(query):\n # Handle events\n pass\n"},{"location":"getting-started/examples/#deep-research","title":"Deep Research","text":"Multi-section parallel research:
from src.orchestrator.research_flow import DeepResearchFlow\n\n# Instantiate the flow first (constructor arguments depend on your configuration)\nflow = DeepResearchFlow()\nasync for event in flow.run(query):\n # Handle events\n pass\n"},{"location":"getting-started/examples/#configuration-examples","title":"Configuration Examples","text":""},{"location":"getting-started/examples/#basic-configuration","title":"Basic Configuration","text":"# .env file\nLLM_PROVIDER=openai\nOPENAI_API_KEY=your_key_here\nMAX_ITERATIONS=10\n"},{"location":"getting-started/examples/#advanced-configuration","title":"Advanced Configuration","text":"# .env file\nLLM_PROVIDER=anthropic\nANTHROPIC_API_KEY=your_key_here\nEMBEDDING_PROVIDER=local\nWEB_SEARCH_PROVIDER=duckduckgo\nMAX_ITERATIONS=20\nDEFAULT_TOKEN_LIMIT=200000\nUSE_GRAPH_EXECUTION=true\n"},{"location":"getting-started/examples/#next-steps","title":"Next Steps","text":"This guide will help you install and set up DeepCritical on your system.
"},{"location":"getting-started/installation/#prerequisites","title":"Prerequisites","text":"uv package manager (recommended) or pipuv is a fast Python package installer and resolver. Install it with:
pip install uv\n"},{"location":"getting-started/installation/#2-clone-the-repository","title":"2. Clone the Repository","text":"git clone https://github.com/DeepCritical/GradioDemo.git\ncd GradioDemo\n"},{"location":"getting-started/installation/#3-install-dependencies","title":"3. Install Dependencies","text":"Using uv (recommended):
uv sync\n Using pip:
pip install -e .\n"},{"location":"getting-started/installation/#4-install-optional-dependencies","title":"4. Install Optional Dependencies","text":"For embeddings support (local sentence-transformers):
uv sync --extra embeddings\n For Modal sandbox execution:
uv sync --extra modal\n For Magentic orchestration:
uv sync --extra magentic\n Install all extras:
uv sync --all-extras\n"},{"location":"getting-started/installation/#5-configure-environment-variables","title":"5. Configure Environment Variables","text":"Create a .env file in the project root:
# Required: At least one LLM provider\nLLM_PROVIDER=openai # or \"anthropic\" or \"huggingface\"\nOPENAI_API_KEY=your_openai_api_key_here\n\n# Optional: Other services\nNCBI_API_KEY=your_ncbi_api_key_here # For higher PubMed rate limits\nMODAL_TOKEN_ID=your_modal_token_id\nMODAL_TOKEN_SECRET=your_modal_token_secret\n See the Configuration Guide for all available options.
"},{"location":"getting-started/installation/#6-verify-installation","title":"6. Verify Installation","text":"Run the application:
uv run gradio run src/app.py\n Open your browser to http://localhost:7860 to verify the installation.
For development, install dev dependencies:
uv sync --all-extras --dev\n Install pre-commit hooks:
uv run pre-commit install\n"},{"location":"getting-started/installation/#troubleshooting","title":"Troubleshooting","text":""},{"location":"getting-started/installation/#common-issues","title":"Common Issues","text":"Import Errors: - Ensure you've installed all required dependencies - Check that Python 3.11+ is being used
API Key Errors: - Verify your .env file is in the project root - Check that API keys are correctly formatted - Ensure at least one LLM provider is configured
Module Not Found: - Run uv sync or pip install -e . again - Check that you're in the correct virtual environment
Port Already in Use: - Change the port in src/app.py or use environment variable - Kill the process using port 7860
"},{"location":"getting-started/mcp-integration/","title":"MCP Integration","text":"DeepCritical exposes a Model Context Protocol (MCP) server, allowing you to use its search tools directly from Claude Desktop or other MCP clients.
"},{"location":"getting-started/mcp-integration/#what-is-mcp","title":"What is MCP?","text":"The Model Context Protocol (MCP) is a standard for connecting AI assistants to external tools and data sources. DeepCritical implements an MCP server that exposes its search capabilities as MCP tools.
"},{"location":"getting-started/mcp-integration/#mcp-server-url","title":"MCP Server URL","text":"When running locally:
http://localhost:7860/gradio_api/mcp/\n"},{"location":"getting-started/mcp-integration/#claude-desktop-configuration","title":"Claude Desktop Configuration","text":""},{"location":"getting-started/mcp-integration/#1-locate-configuration-file","title":"1. Locate Configuration File","text":"macOS:
~/Library/Application Support/Claude/claude_desktop_config.json\n Windows:
%APPDATA%\\Claude\\claude_desktop_config.json\n Linux:
~/.config/Claude/claude_desktop_config.json\n"},{"location":"getting-started/mcp-integration/#2-add-deepcritical-server","title":"2. Add DeepCritical Server","text":"Edit claude_desktop_config.json and add:
{\n \"mcpServers\": {\n \"deepcritical\": {\n \"url\": \"http://localhost:7860/gradio_api/mcp/\"\n }\n }\n}\n"},{"location":"getting-started/mcp-integration/#3-restart-claude-desktop","title":"3. Restart Claude Desktop","text":"Close and restart Claude Desktop for changes to take effect.
"},{"location":"getting-started/mcp-integration/#4-verify-connection","title":"4. Verify Connection","text":"In Claude Desktop, you should see DeepCritical tools available: - search_pubmed - search_clinical_trials - search_biorxiv - search_all - analyze_hypothesis
Search peer-reviewed biomedical literature from PubMed.
Parameters: - query (string): Search query - max_results (integer, optional): Maximum number of results (default: 10)
Example:
Search PubMed for \"metformin diabetes\"\n"},{"location":"getting-started/mcp-integration/#search_clinical_trials","title":"search_clinical_trials","text":"Search ClinicalTrials.gov for interventional studies.
Parameters: - query (string): Search query - max_results (integer, optional): Maximum number of results (default: 10)
Example:
Search clinical trials for \"Alzheimer's disease treatment\"\n"},{"location":"getting-started/mcp-integration/#search_biorxiv","title":"search_biorxiv","text":"Search bioRxiv/medRxiv preprints via Europe PMC.
Parameters: - query (string): Search query - max_results (integer, optional): Maximum number of results (default: 10)
Example:
Search bioRxiv for \"CRISPR gene editing\"\n"},{"location":"getting-started/mcp-integration/#search_all","title":"search_all","text":"Search all sources simultaneously (PubMed, ClinicalTrials.gov, Europe PMC).
Parameters: - query (string): Search query - max_results (integer, optional): Maximum number of results per source (default: 10)
Example:
Search all sources for \"COVID-19 vaccine efficacy\"\n"},{"location":"getting-started/mcp-integration/#analyze_hypothesis","title":"analyze_hypothesis","text":"Perform secure statistical analysis using Modal sandboxes.
Parameters: - hypothesis (string): Hypothesis to analyze - data (string, optional): Data description or code
Example:
Analyze the hypothesis that metformin reduces cancer risk\n"},{"location":"getting-started/mcp-integration/#using-tools-in-claude-desktop","title":"Using Tools in Claude Desktop","text":"Once configured, you can ask Claude to use DeepCritical tools:
Use DeepCritical to search PubMed for recent papers on Alzheimer's disease treatments.\n Claude will automatically: 1. Call the appropriate DeepCritical tool 2. Retrieve results 3. Use the results in its response
"},{"location":"getting-started/mcp-integration/#troubleshooting","title":"Troubleshooting","text":""},{"location":"getting-started/mcp-integration/#connection-issues","title":"Connection Issues","text":"Server Not Found: - Ensure DeepCritical is running (uv run gradio run src/app.py) - Verify the URL in claude_desktop_config.json is correct - Check that port 7860 is not blocked by firewall
Tools Not Appearing: - Restart Claude Desktop after configuration changes - Check Claude Desktop logs for errors - Verify MCP server is accessible at the configured URL
"},{"location":"getting-started/mcp-integration/#authentication","title":"Authentication","text":"If DeepCritical requires authentication: - Configure API keys in DeepCritical settings - Use HuggingFace OAuth login - Ensure API keys are valid
"},{"location":"getting-started/mcp-integration/#advanced-configuration","title":"Advanced Configuration","text":""},{"location":"getting-started/mcp-integration/#custom-port","title":"Custom Port","text":"If running on a different port, update the URL:
{\n \"mcpServers\": {\n \"deepcritical\": {\n \"url\": \"http://localhost:8080/gradio_api/mcp/\"\n }\n }\n}\n"},{"location":"getting-started/mcp-integration/#multiple-instances","title":"Multiple Instances","text":"You can configure multiple DeepCritical instances:
{\n \"mcpServers\": {\n \"deepcritical-local\": {\n \"url\": \"http://localhost:7860/gradio_api/mcp/\"\n },\n \"deepcritical-remote\": {\n \"url\": \"https://your-server.com/gradio_api/mcp/\"\n }\n }\n}\n"},{"location":"getting-started/mcp-integration/#next-steps","title":"Next Steps","text":"Get up and running with DeepCritical in minutes.
"},{"location":"getting-started/quick-start/#start-the-application","title":"Start the Application","text":"uv run gradio run src/app.py\n Open your browser to http://localhost:7860.
Type your research question in the chat interface, for example: - \"What are the latest treatments for Alzheimer's disease?\" - \"Review the evidence for metformin in cancer prevention\" - \"What clinical trials are investigating COVID-19 vaccines?\"
Click \"Submit\" or press Enter. The system will: - Generate observations about your query - Identify knowledge gaps - Search multiple sources (PubMed, ClinicalTrials.gov, Europe PMC) - Evaluate evidence quality - Synthesize findings into a report
Watch the real-time progress in the chat interface: - Search operations and results - Evidence evaluation - Report generation - Final research report with citations
"},{"location":"getting-started/quick-start/#authentication","title":"Authentication","text":""},{"location":"getting-started/quick-start/#huggingface-oauth-recommended","title":"HuggingFace OAuth (Recommended)","text":"What are the side effects of metformin?\n"},{"location":"getting-started/quick-start/#complex-query","title":"Complex Query","text":"Review the evidence for using metformin as an anti-aging intervention, \nincluding clinical trials, mechanisms of action, and safety profile.\n"},{"location":"getting-started/quick-start/#clinical-trial-query","title":"Clinical Trial Query","text":"What are the active clinical trials investigating Alzheimer's disease treatments?\n"},{"location":"getting-started/quick-start/#next-steps","title":"Next Steps","text":"DeepCritical is a deep research agent system that uses iterative search-and-judge loops to comprehensively answer research questions. The system supports multiple orchestration patterns, graph-based execution, parallel research workflows, and long-running task management with real-time streaming.
"},{"location":"overview/architecture/#core-architecture","title":"Core Architecture","text":""},{"location":"overview/architecture/#orchestration-patterns","title":"Orchestration Patterns","text":"src/orchestrator/graph_orchestrator.py):AsyncGenerator[AgentEvent] for real-time UI updatesFallback to agent chains when graph execution is disabled
Deep Research Flow (src/orchestrator/research_flow.py):
PlannerAgent to break query into report sectionsIterativeResearchFlow instances in parallel per section via WorkflowManagerLongWriterAgent or ProofreaderAgentuse_graph=True) and agent chains (use_graph=False)State synchronization across parallel loops
Iterative Research Flow (src/orchestrator/research_flow.py):
KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent, WriterAgentJudgeHandler assesses evidence sufficiencySupports graph execution and agent chains
Magentic Orchestrator (src/orchestrator_magentic.py):
agent-framework-coreMagenticBuilder with participants: searcher, hypothesizer, judge, reporterOpenAIChatClientAgentEvent for UI streamingSupports long-running workflows with max rounds and stall/reset handling
Hierarchical Orchestrator (src/orchestrator_hierarchical.py):
SubIterationMiddleware with ResearchTeam and LLMSubIterationJudgeSubIterationTeam protocolasyncio.Queue for coordinationSupports sub-iteration patterns for complex research tasks
Legacy Simple Mode (src/legacy_orchestrator.py):
SearchHandlerProtocol and JudgeHandlerProtocolAgentEvent objectsThe system is designed for long-running research tasks with comprehensive state management and streaming:
AgentEvent objects via AsyncGeneratorstarted, searching, search_complete, judging, judge_complete, looping, synthesizing, hypothesizing, complete, errorMetadata includes iteration numbers, tool names, result counts, durations
Budget Tracking (src/middleware/budget_tracker.py):
Budget summaries for monitoring
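A hedged sketch of what such budget tracking might look like. The limits mirror `DEFAULT_TOKEN_LIMIT`, `DEFAULT_TIME_LIMIT_MINUTES`, and `DEFAULT_ITERATIONS_LIMIT` from the configuration reference, but the class name and API below are illustrative, not the actual `src/middleware/budget_tracker.py`:

```python
import time
from dataclasses import dataclass, field


@dataclass
class BudgetTracker:
    token_limit: int = 100_000
    time_limit_minutes: float = 10.0
    iterations_limit: int = 10
    tokens_used: int = 0
    iterations: int = 0
    started_at: float = field(default_factory=time.monotonic)

    def record(self, tokens: int) -> None:
        """Account for one completed iteration and its token spend."""
        self.tokens_used += tokens
        self.iterations += 1

    def exhausted(self) -> bool:
        """True once any one of the three budgets is spent."""
        elapsed_min = (time.monotonic() - self.started_at) / 60
        return (
            self.tokens_used >= self.token_limit
            or self.iterations >= self.iterations_limit
            or elapsed_min >= self.time_limit_minutes
        )


tracker = BudgetTracker(token_limit=1000, iterations_limit=3)
tracker.record(tokens=400)
print(tracker.exhausted())
```

Checking `exhausted()` at the top of each research-loop iteration is what lets the system "stop at nothing" except its configured limits.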
Workflow Manager (src/middleware/workflow_manager.py):
pending, running, completed, failed, cancelledEvidence deduplication across parallel loops
State Management (src/middleware/state_machine.py):
ContextVar for concurrent requestsWorkflowState tracks: evidence, conversation history, embedding serviceSupports both iterative and deep research patterns
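`ContextVar`-based state isolation can be sketched like this. `WorkflowState`'s real fields live in `src/middleware/state_machine.py`; the shape below is an assumption:

```python
import asyncio
from contextvars import ContextVar
from dataclasses import dataclass, field


@dataclass
class WorkflowState:
    evidence: list[str] = field(default_factory=list)


# Each asyncio task gets its own copy of the context, so concurrent
# requests never see each other's research state.
_state: ContextVar[WorkflowState] = ContextVar("workflow_state")


async def handle_request(name: str) -> int:
    _state.set(WorkflowState())
    _state.get().evidence.append(name)
    await asyncio.sleep(0)  # yield to the other tasks
    return len(_state.get().evidence)  # still 1: state is per-task


async def main() -> list[int]:
    return await asyncio.gather(*(handle_request(f"req{i}") for i in range(3)))


print(asyncio.run(main()))  # prints [1, 1, 1]
```

Because `asyncio.gather` wraps each coroutine in a task with a copied context, the `set()` inside one request is invisible to the others, which is what makes this pattern safe for concurrent Gradio sessions.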
Gradio UI (src/app.py):
The graph orchestrator (src/orchestrator/graph_orchestrator.py) implements a flexible graph-based execution model:
Node Types:
KnowledgeGapAgent, ToolSelectorAgent)Edge Types:
Graph Patterns:
[Input] \u2192 [Thinking] \u2192 [Knowledge Gap] \u2192 [Decision: Complete?] \u2192 [Tool Selector] or [Writer][Input] \u2192 [Planner] \u2192 [Parallel Iterative Loops] \u2192 [Synthesizer]Execution Flow:
asyncio.gather()src/orchestrator/, src/orchestrator_*.py)src/orchestrator/research_flow.py)src/agent_factory/graph_builder.py)src/agents/, src/agent_factory/agents.py)src/tools/)src/agent_factory/judges.py)src/services/embeddings.py)src/services/statistical_analyzer.py)src/middleware/)src/mcp_tools.py)src/app.py)The system supports complex research workflows through:
ResearchLoop instancesasyncio.gather()Handles loop failures gracefully
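The parallel-loop pattern described above can be sketched with `asyncio.gather` and `return_exceptions=True`, so one failed section loop does not abort its siblings. `run_section` is an illustrative stand-in for running one iterative research loop, not the project's actual API:

```python
import asyncio


async def run_section(section: str) -> str:
    # Stand-in for one iterative research loop over a report section.
    if section == "bad":
        raise RuntimeError(f"loop for {section!r} failed")
    await asyncio.sleep(0)
    return f"findings for {section}"


async def run_all(sections: list[str]) -> list[str]:
    # return_exceptions=True keeps sibling loops alive when one fails.
    results = await asyncio.gather(
        *(run_section(s) for s in sections), return_exceptions=True
    )
    # Degrade gracefully: keep successful sections, note the failures.
    return [r if isinstance(r, str) else f"[failed: {r}]" for r in results]


print(asyncio.run(run_all(["intro", "bad", "safety"])))
```

Without `return_exceptions=True`, the first raised exception would propagate out of `gather` and cancel the remaining section loops; with it, failures become values the synthesizer can report around.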
Deep Research Pattern: Breaks complex queries into sections
Final synthesis combines all section results
State Synchronization: Thread-safe evidence sharing
src/orchestrator_factory.py):Lazy imports for optional dependencies
Research Modes:
iterative: Single research loopdeep: Multi-section parallel researchauto: Auto-detect based on query complexity
Execution Modes:
use_graph=True: Graph-based execution (parallel, conditional routing)use_graph=False: Agent chains (sequential, backward compatible)DeepCritical provides a comprehensive set of features for AI-assisted research:
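The mode dispatch can be sketched as follows. This is illustrative only, not the actual `src/orchestrator_factory.py`, and the auto-detection heuristic here is invented for the example:

```python
from typing import Literal

Mode = Literal["iterative", "deep", "auto"]


def resolve_mode(mode: Mode, query: str) -> str:
    """Pick a concrete research mode; 'auto' falls back to a heuristic."""
    if mode != "auto":
        return mode
    # Hypothetical heuristic: long, multi-clause queries suggest a
    # multi-section deep-research report; short ones a single loop.
    return "deep" if len(query.split()) > 25 or ";" in query else "iterative"


assert resolve_mode("iterative", "short question") == "iterative"
assert resolve_mode("auto", "What are the side effects of metformin?") == "iterative"
```

The real factory would combine the resolved mode with the `use_graph` flag to choose between graph execution and sequential agent chains.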
"},{"location":"overview/features/#core-features","title":"Core Features","text":""},{"location":"overview/features/#multi-source-search","title":"Multi-Source Search","text":"AsyncGenerator[AgentEvent].env filesGet started with DeepCritical in minutes.
"},{"location":"overview/quick-start/#installation","title":"Installation","text":"# Install uv if you haven't already\npip install uv\n\n# Sync dependencies\nuv sync\n"},{"location":"overview/quick-start/#run-the-ui","title":"Run the UI","text":"# Start the Gradio app\nuv run gradio run src/app.py\n Open your browser to http://localhost:7860.
HuggingFace OAuth Login: - Click the \"Sign in with HuggingFace\" button at the top of the app - Your HuggingFace API token will be automatically used for AI inference - No need to manually enter API keys when logged in
Manual API Key (BYOK): - Provide your own API key in the Settings accordion - Supports HuggingFace, OpenAI, or Anthropic API keys - Manual keys take priority over OAuth tokens
"},{"location":"overview/quick-start/#2-start-a-research-query","title":"2. Start a Research Query","text":"Connect DeepCritical to Claude Desktop:
Add to your claude_desktop_config.json:
{\n \"mcpServers\": {\n \"deepcritical\": {\n \"url\": \"http://localhost:7860/gradio_api/mcp/\"\n }\n }\n}\n Restart Claude Desktop
search_pubmed: Search peer-reviewed biomedical literaturesearch_clinical_trials: Search ClinicalTrials.govsearch_biorxiv: Search bioRxiv/medRxiv preprintssearch_all: Search all sources simultaneouslyanalyze_hypothesis: Secure statistical analysis using Modal sandboxesGeneralist Deep Research Agent - Stops at Nothing Until Finding Precise Answers
The DETERMINATOR is a powerful generalist deep research agent system that uses iterative search-and-judge loops to comprehensively investigate any research question. It stops at nothing until it finds a precise answer, halting only at configured limits (budget, time, iterations).
Key Features: - Generalist: Handles queries from any domain (medical, technical, business, scientific, etc.) - Automatic Source Selection: Automatically determines if medical knowledge sources (PubMed, ClinicalTrials.gov) are needed - Multi-Source Search: Web search, PubMed, ClinicalTrials.gov, Europe PMC, RAG - Iterative Refinement: Continues searching and refining until precise answers are found - Evidence Synthesis: Comprehensive reports with proper citations
Important: The DETERMINATOR is a research tool that synthesizes evidence. It cannot provide medical advice or answer medical questions directly.
"},{"location":"#features","title":"Features","text":"# Install uv if you haven't already (recommended: standalone installer)\n# Unix/macOS/Linux:\ncurl -LsSf https://astral.sh/uv/install.sh | sh\n\n# Windows (PowerShell):\npowershell -ExecutionPolicy ByPass -c \"irm https://astral.sh/uv/install.ps1 | iex\"\n\n# Alternative: pipx install uv\n# Or: pip install uv\n\n# Sync dependencies\nuv sync\n\n# Start the Gradio app\nuv run gradio run src/app.py\n Open your browser to http://localhost:7860.
For detailed installation and setup instructions, see the Getting Started Guide.
"},{"location":"#architecture","title":"Architecture","text":"The DETERMINATOR uses a Vertical Slice Architecture:
The system supports three main research patterns:
Learn more about the Architecture.
"},{"location":"#documentation","title":"Documentation","text":"DeepCritical is licensed under the MIT License.
"},{"location":"LICENSE/#mit-license","title":"MIT License","text":"Copyright (c) 2024 DeepCritical Team
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/","title":"MkDocs & Material UI Improvement Assessment","text":""},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#current-configuration-analysis","title":"Current Configuration Analysis","text":"Your current mkdocs.yml already includes many excellent features: - \u2705 Material theme with light/dark mode toggle - \u2705 Navigation tabs, sections, expand, and top navigation - \u2705 Search with suggestions and highlighting - \u2705 Code annotation and copy buttons - \u2705 Mermaid diagram support - \u2705 Code include plugin - \u2705 Minification for performance - \u2705 Comprehensive markdown extensions
If you plan to maintain multiple versions or branches:
plugins:\n - search\n - mermaid2\n - codeinclude\n - minify:\n minify_html: true\n minify_js: true\n minify_css: true\n - git-revision-date-localized:\n enable_creation_date: true\n type: timeago\n # Optional: For versioning\n # - versioning:\n # version: ['dev', 'main']\n Benefits: Shows when pages were last updated, helps users understand document freshness.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#2-git-integration--revision-information--high-priority","title":"2. Git Integration & Revision Information \u2b50 High Priority","text":"Add revision dates and authors to pages:
plugins:\n - git-revision-date-localized:\n enable_creation_date: true\n type: timeago\n fallback_to_build_date: true\n - git-committers:\n repository: DeepCritical/GradioDemo\n branch: dev\n Benefits: Users see when content was last updated, builds trust in documentation freshness.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#3-enhanced-navigation-features--high-priority","title":"3. Enhanced Navigation Features \u2b50 High Priority","text":"Add breadcrumbs and improve navigation:
theme:\n features:\n - navigation.tabs\n - navigation.sections\n - navigation.expand\n - navigation.top\n - navigation.indexes # Add index pages\n - navigation.instant # Instant page loads\n - navigation.tracking # Track scroll position\n - search.suggest\n - search.highlight\n - content.code.annotate\n - content.code.copy\n - content.tabs.link # Link to specific tabs\n - content.tooltips # Tooltips for abbreviations\n Note: there is no navigation.smooth flag in Material; instant loading and scroll tracking are covered by the flags above. Benefits: Better UX, easier navigation, professional feel.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#4-content-tabs-for-code-examples--high-priority","title":"4. Content Tabs for Code Examples \u2b50 High Priority","text":"Perfect for showing multiple code examples (Python, TypeScript, etc.):
markdown_extensions:\n - pymdownx.tabbed:\n alternate_style: true\n combine_header_slug: true # Add this\n Usage in docs:
=== \"Python\"\n ```python\n def example():\n pass\n ```\n\n=== \"TypeScript\"\n ```typescript\n function example() {}\n ```\n Benefits: Clean way to show multiple implementations without cluttering pages.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#5-enhanced-admonitions--medium-priority","title":"5. Enhanced Admonitions \u2b50 Medium Priority","text":"Add more admonition types and better styling:
markdown_extensions:\n - admonition\n - pymdownx.details\n - pymdownx.superfences:\n custom_fences:\n - name: mermaid\n class: mermaid\n format: !!python/name:pymdownx.superfences.fence_code_format\n Note: the danger, warning, tip, and other admonition types are built into the admonition extension itself; no custom superfences are needed for them. Usage:
!!! danger \"Important\"\n This is a critical warning.\n Benefits: Better visual hierarchy for warnings, tips, and important information.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#6-math-formula-support--medium-priority-if-needed","title":"6. Math Formula Support \u2b50 Medium Priority (if needed)","text":"If your documentation includes mathematical formulas:
markdown_extensions:\n - pymdownx.arithmatex:\n generic: true\n\nextra_javascript:\n - javascripts/mathjax.js\n - https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js\n Note: arithmatex needs no custom superfence, and the polyfill.io script formerly listed in this setup is no longer recommended after the 2024 supply-chain compromise. Benefits: Essential for scientific/technical documentation with formulas.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#7-better-code-highlighting--medium-priority","title":"7. Better Code Highlighting \u2b50 Medium Priority","text":"Add more language support and better themes:
markdown_extensions:\n - pymdownx.highlight:\n anchor_linenums: true\n line_spans: __span\n pygments_lang_class: true\n use_pygments: true\n noclasses: false # Use CSS classes instead of inline styles\n Benefits: Better syntax highlighting, more language support.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#8-social-links-enhancement--low-priority","title":"8. Social Links Enhancement \u2b50 Low Priority","text":"Add more social platforms and better icons:
extra:\n social:\n - icon: fontawesome/brands/github\n link: https://github.com/DeepCritical/GradioDemo\n name: GitHub\n - icon: fontawesome/brands/twitter\n link: https://twitter.com/yourhandle\n name: Twitter\n - icon: material/web\n link: https://huggingface.co/spaces/DataQuests/DeepCritical\n name: HuggingFace Space\n - icon: fontawesome/brands/discord\n link: https://discord.gg/yourserver\n name: Discord\n Benefits: Better community engagement, more ways to connect.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#9-analytics-integration--medium-priority","title":"9. Analytics Integration \u2b50 Medium Priority","text":"Add privacy-respecting analytics:
extra:\n analytics:\n provider: google\n property: G-XXXXXXXXXX\n Material supports Google Analytics out of the box; privacy-focused providers such as Plausible require a small custom-provider integration (an overridden analytics partial). Benefits: Understand how users interact with your documentation.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#10-custom-cssjs-for-branding--low-priority","title":"10. Custom CSS/JS for Branding \u2b50 Low Priority","text":"Add custom styling:
extra_css:\n - stylesheets/extra.css\n\nextra_javascript:\n - javascripts/extra.js\n Benefits: Customize appearance, add interactive features.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#11-better-table-of-contents--medium-priority","title":"11. Better Table of Contents \u2b50 Medium Priority","text":"Enhance TOC with more options:
markdown_extensions:\n - toc:\n permalink: true\n permalink_title: \"Anchor link to this section\"\n baselevel: 1\n toc_depth: 3\n slugify: !!python/object/apply:pymdownx.slugs.slugify\n kwds:\n case: lower\n Benefits: Better navigation within long pages, SEO-friendly anchor links.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#12-image-optimization--medium-priority","title":"12. Image Optimization \u2b50 Medium Priority","text":"Add image handling plugin:
plugins:\n - search\n - mermaid2\n - codeinclude\n - minify:\n minify_html: true\n minify_js: true\n minify_css: true\n - git-revision-date-localized:\n enable_creation_date: true\n type: timeago\n # Optional: Image optimization\n # - awesome-pages # For better page organization\n Benefits: Faster page loads, better mobile experience.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#13-keyboard-shortcuts--low-priority","title":"13. Keyboard Shortcuts \u2b50 Low Priority","text":"Enable keyboard navigation:
Material ships keyboard shortcuts by default when the search plugin is enabled: press / , s , or f to focus search, and p / n for the previous/next page. No extra theme configuration is required. Benefits: Power users can navigate faster.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#14-print-styles--low-priority","title":"14. Print Styles \u2b50 Low Priority","text":"Better printing experience:
Material's stylesheets already include print-friendly rules; for further control add a dedicated stylesheet:\n extra_css:\n - stylesheets/extra.css\n and scope the print tweaks with an @media print block. Benefits: Users can print documentation cleanly.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#15-better-search-configuration--medium-priority","title":"15. Better Search Configuration \u2b50 Medium Priority","text":"Enhance search capabilities:
plugins:\n - search:\n lang:\n - en\n separator: '[\\s\\-,:!=\\[\\]()\"`/]+|\\.(?!\\d)|&[lg]t;|&'\n Note: Material's built-in search plugin accepts lang and separator; the prebuild_index and indexing options belong to older MkDocs search plugin versions and are ignored here. Benefits: Faster, more accurate search results.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#16-api-documentation-enhancements--high-priority-for-your-api-docs","title":"16. API Documentation Enhancements \u2b50 High Priority (for your API docs)","text":"Since you have extensive API documentation, consider:
markdown_extensions:\n - pymdownx.superfences:\n custom_fences:\n - name: mermaid\n class: mermaid\n format: !!python/name:pymdownx.superfences.fence_code_format\n preserve_tabs: true\n # Add API-specific features\n - attr_list\n - md_in_html\n - pymdownx.caret\n - pymdownx.tilde\n Benefits: Better formatting for API endpoints, parameters, responses.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#17-blognews-section--low-priority-if-needed","title":"17. Blog/News Section \u2b50 Low Priority (if needed)","text":"If you want to add a blog:
plugins:\n - blog:\n blog_dir: blog\n blog_description: \"News and updates\"\n post_date_format: full\n post_url_format: '{slug}'\n archive: true\n Benefits: Keep users updated with changelog, announcements.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#18-tags-and-categories--low-priority","title":"18. Tags and Categories \u2b50 Low Priority","text":"Organize content with tags:
markdown_extensions:\n - meta\n plus the Material tags plugin, which actually builds the tag index:\n plugins:\n - tags\n Then in frontmatter:
---\ntags:\n - api\n - agents\n - getting-started\n---\n Benefits: Better content organization, related content discovery.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#19-better-mobile-experience--high-priority","title":"19. Better Mobile Experience \u2b50 High Priority","text":"Ensure mobile optimization:
theme:\n features:\n - navigation.tabs\n - navigation.sections\n - navigation.expand\n - navigation.top\n - navigation.instant # Helps on mobile\n - navigation.tracking\n - search.suggest\n - search.highlight\n - content.code.annotate\n - content.code.copy\n - content.tabs.link\n - content.tooltips\n - toc.integrate # Better mobile TOC\n Benefits: Better experience for mobile users (growing segment).
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#20-feedback-mechanism--medium-priority","title":"20. Feedback Mechanism \u2b50 Medium Priority","text":"Add feedback buttons:
extra:\n analytics:\n provider: google\n property: G-XXXXXXXXXX\n feedback:\n title: \"Was this page helpful?\"\n ratings:\n - icon: material/thumb-up-outline\n name: \"This page was helpful\"\n data: 1\n - icon: material/thumb-down-outline\n name: \"This page could be improved\"\n data: 0\n Note: in Material the feedback widget is configured under extra.analytics, alongside an analytics provider. Benefits: Understand what content needs improvement.
"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#priority-recommendations","title":"Priority Recommendations","text":""},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#immediate-high-impact-easy-implementation","title":"Immediate (High Impact, Easy Implementation)","text":"Here's an enhanced mkdocs.yml with the high-priority improvements:
site_name: The DETERMINATOR\nsite_description: Generalist Deep Research Agent that Stops at Nothing\nsite_author: The DETERMINATOR Team\nsite_url: https://deepcritical.github.io/GradioDemo/\n\nrepo_name: DeepCritical/GradioDemo\nrepo_url: https://github.com/DeepCritical/GradioDemo\nedit_uri: edit/dev/docs/\n\nstrict: false\n\ntheme:\n name: material\n palette:\n - scheme: default\n primary: orange\n accent: red\n toggle:\n icon: material/brightness-7\n name: Switch to dark mode\n - scheme: slate\n primary: orange\n accent: red\n toggle:\n icon: material/brightness-4\n name: Switch to light mode\n features:\n - navigation.tabs\n - navigation.sections\n - navigation.expand\n - navigation.top\n - navigation.indexes\n - navigation.instant\n - navigation.tracking\n - navigation.smooth\n - search.suggest\n - search.highlight\n - content.code.annotate\n - content.code.copy\n - content.tabs.link\n - content.tooltips\n - toc.integrate\n icon:\n repo: fontawesome/brands/github\n language: en\n\nplugins:\n - search:\n lang:\n - en\n separator: '[\\s\\-,:!=\\[\\]()\"`/]+|\\.(?!\\d)|&[lg]t;|&'\n prebuild_index: true\n indexing: full\n - mermaid2\n - codeinclude\n - git-revision-date-localized:\n enable_creation_date: true\n type: timeago\n fallback_to_build_date: true\n - minify:\n minify_html: true\n minify_js: true\n minify_css: true\n\nmarkdown_extensions:\n - dev.docs_plugins:\n base_path: \".\"\n - pymdownx.highlight:\n anchor_linenums: true\n line_spans: __span\n pygments_lang_class: true\n use_pygments: true\n noclasses: false\n - pymdownx.inlinehilite\n - pymdownx.superfences:\n custom_fences:\n - name: mermaid\n class: mermaid\n format: !!python/name:pymdownx.superfences.fence_code_format\n preserve_tabs: true\n - pymdownx.tabbed:\n alternate_style: true\n combine_header_slug: true\n - pymdownx.tasklist:\n custom_checkbox: true\n - pymdownx.emoji:\n emoji_generator: !!python/name:pymdownx.emoji.to_svg\n emoji_index: !!python/name:pymdownx.emoji.twemoji\n - 
pymdownx.snippets\n - admonition\n - pymdownx.details\n - attr_list\n - md_in_html\n - tables\n - meta\n - toc:\n permalink: true\n permalink_title: \"Anchor link to this section\"\n baselevel: 1\n toc_depth: 3\n slugify: !!python/object/apply:pymdownx.slugs.slugify\n kwds:\n case: lower\n\nnav:\n - Home: index.md\n - Overview:\n - overview/architecture.md\n - overview/features.md\n - Getting Started:\n - getting-started/installation.md\n - getting-started/quick-start.md\n - getting-started/mcp-integration.md\n - getting-started/examples.md\n - Configuration:\n - configuration/index.md\n - Architecture:\n - \"Graph Orchestration\": architecture/graph_orchestration.md\n - \"Workflow Diagrams\": architecture/workflow-diagrams.md\n - \"Agents\": architecture/agents.md\n - \"Orchestrators\": architecture/orchestrators.md\n - \"Tools\": architecture/tools.md\n - \"Middleware\": architecture/middleware.md\n - \"Services\": architecture/services.md\n - API Reference:\n - api/agents.md\n - api/tools.md\n - api/orchestrators.md\n - api/services.md\n - api/models.md\n - Contributing:\n - contributing/index.md\n - contributing/code-quality.md\n - contributing/code-style.md\n - contributing/error-handling.md\n - contributing/implementation-patterns.md\n - contributing/prompt-engineering.md\n - contributing/testing.md\n - License: LICENSE.md\n - Team: team.md\n\nextra:\n social:\n - icon: fontawesome/brands/github\n link: https://github.com/DeepCritical/GradioDemo\n name: GitHub\n - icon: material/web\n link: https://huggingface.co/spaces/DataQuests/DeepCritical\n name: HuggingFace Space\n version:\n provider: mike\n generator:\n enabled: false\n\ncopyright: Copyright © 2024 DeepCritical Team\n"},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#additional-documentation-improvements","title":"Additional Documentation Improvements","text":""},{"location":"MKDOCS_IMPROVEMENTS_ASSESSMENT/#content-structure","title":"Content Structure","text":"DeepCritical is developed by a team of 
researchers and developers working on AI-assisted research.
"},{"location":"team/#team-members","title":"Team Members","text":""},{"location":"team/#zj","title":"ZJ","text":"The DeepCritical team met online in the Alzheimer's Critical Literature Review Group in the Hugging Science initiative. We're building the agent framework we want to use for AI-assisted research to turn the vast amounts of clinical data into cures.
"},{"location":"team/#contributing","title":"Contributing","text":"We welcome contributions! See the Contributing Guide for details.
"},{"location":"team/#links","title":"Links","text":"This page documents the API for DeepCritical agents.
"},{"location":"api/agents/#knowledgegapagent","title":"KnowledgeGapAgent","text":"Module: src.agents.knowledge_gap
Purpose: Evaluates research state and identifies knowledge gaps.
"},{"location":"api/agents/#methods","title":"Methods","text":""},{"location":"api/agents/#evaluate","title":"evaluate","text":"Evaluates research completeness and identifies outstanding knowledge gaps.
Parameters: - query: Research query string - background_context: Background context for the query (default: \"\") - conversation_history: History of actions, findings, and thoughts as string (default: \"\") - iteration: Current iteration number (default: 0) - time_elapsed_minutes: Elapsed time in minutes (default: 0.0) - max_time_minutes: Maximum time limit in minutes (default: 10)
Returns: KnowledgeGapOutput with: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps
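The output shape and the loop-gating decision it drives can be sketched as follows. This is illustrative only: the real KnowledgeGapOutput is a Pydantic model in src.utils.models, and should_continue is a hypothetical helper, not the agent's actual code.

```python
from dataclasses import dataclass, field


# Illustrative stand-in for src.utils.models.KnowledgeGapOutput (a Pydantic model)
@dataclass
class KnowledgeGapOutput:
    research_complete: bool
    outstanding_gaps: list[str] = field(default_factory=list)


# Hypothetical loop gate: keep researching while gaps remain and budget allows
def should_continue(result: KnowledgeGapOutput, iteration: int,
                    max_iterations: int = 10) -> bool:
    return (not result.research_complete
            and bool(result.outstanding_gaps)
            and iteration < max_iterations)


result = KnowledgeGapOutput(research_complete=False,
                            outstanding_gaps=["no dosage data for compound X"])
print(should_continue(result, iteration=3))
```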
Module: src.agents.tool_selector
Purpose: Selects appropriate tools for addressing knowledge gaps.
"},{"location":"api/agents/#methods_1","title":"Methods","text":""},{"location":"api/agents/#select_tools","title":"select_tools","text":"Selects tools for addressing a knowledge gap.
Parameters: - gap: The knowledge gap to address - query: Research query string - background_context: Optional background context (default: \"\") - conversation_history: History of actions, findings, and thoughts as string (default: \"\")
Returns: AgentSelectionPlan with list of AgentTask objects.
Module: src.agents.writer
Purpose: Generates final reports from research findings.
"},{"location":"api/agents/#methods_2","title":"Methods","text":""},{"location":"api/agents/#write_report","title":"write_report","text":"Generates a markdown report from research findings.
Parameters: - query: Research query string - findings: Research findings to include in report - output_length: Optional description of desired output length (default: \"\") - output_instructions: Optional additional instructions for report generation (default: \"\")
Returns: Markdown string with numbered citations.
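The numbered-citation convention in the writer's output can be illustrated with a small sketch; the function and reference format here are hypothetical, not the agent's actual rendering code.

```python
# Hypothetical sketch of numbered citations as they appear in the report body
# and reference list; the real writer emits these as part of its markdown output.
def render_references(titles: list[str]) -> str:
    return "\n".join(f"[{i}] {title}" for i, title in enumerate(titles, start=1))


print(render_references(["Smith 2021, PubMed", "NCT01234567, ClinicalTrials.gov"]))
```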
"},{"location":"api/agents/#longwriteragent","title":"LongWriterAgent","text":"Module: src.agents.long_writer
Purpose: Long-form report generation with section-by-section writing.
"},{"location":"api/agents/#methods_3","title":"Methods","text":""},{"location":"api/agents/#write_next_section","title":"write_next_section","text":"Writes the next section of a long-form report.
Parameters: - original_query: The original research query - report_draft: Current report draft as string (all sections written so far) - next_section_title: Title of the section to write - next_section_draft: Draft content for the next section
Returns: LongWriterOutput with formatted section and references.
"},{"location":"api/agents/#write_report_1","title":"write_report","text":"Generates final report from draft.
Parameters: - query: Research query string - report_title: Title of the report - report_draft: Complete report draft
Returns: Final markdown report string.
"},{"location":"api/agents/#proofreaderagent","title":"ProofreaderAgent","text":"Module: src.agents.proofreader
Purpose: Proofreads and polishes report drafts.
"},{"location":"api/agents/#methods_4","title":"Methods","text":""},{"location":"api/agents/#proofread","title":"proofread","text":"Proofreads and polishes a report draft.
Parameters: - query: Research query string - report_title: Title of the report - report_draft: Report draft to proofread
Returns: Polished markdown string.
"},{"location":"api/agents/#thinkingagent","title":"ThinkingAgent","text":"Module: src.agents.thinking
Purpose: Generates observations from conversation history.
"},{"location":"api/agents/#methods_5","title":"Methods","text":""},{"location":"api/agents/#generate_observations","title":"generate_observations","text":"Generates observations from conversation history.
Parameters: - query: Research query string - background_context: Optional background context (default: \"\") - conversation_history: History of actions, findings, and thoughts as string (default: \"\") - iteration: Current iteration number (default: 1)
Returns: Observation string.
"},{"location":"api/agents/#inputparseragent","title":"InputParserAgent","text":"Module: src.agents.input_parser
Purpose: Parses and improves user queries, detects research mode.
"},{"location":"api/agents/#methods_6","title":"Methods","text":""},{"location":"api/agents/#parse","title":"parse","text":"Parses and improves a user query.
Parameters: - query: Original query string
Returns: ParsedQuery with: - original_query: Original query string - improved_query: Refined query string - research_mode: \"iterative\" or \"deep\" - key_entities: List of key entities - research_questions: List of research questions
All agents have factory functions in src.agent_factory.agents:
Parameters: - model: Optional Pydantic AI model. If None, uses get_model() from settings. - oauth_token: Optional OAuth token from HuggingFace login (takes priority over env vars)
Returns: Agent instance.
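The factory contract described above can be sketched generically. Everything here is illustrative: real factories live in src.agent_factory.agents and return Pydantic AI Agent instances, whereas this stand-in only shows how the two optional parameters are resolved.

```python
# Sketch of the documented factory contract: a None model falls back to the
# settings default, and an OAuth token takes priority over environment vars.
# Names and the dict return shape are illustrative, not the real API.
def create_agent(model=None, oauth_token=None):
    resolved_model = model if model is not None else "model-from-settings"
    token_source = "oauth" if oauth_token else "environment"
    return {"model": resolved_model, "token_source": token_source}


print(create_agent())
print(create_agent(model="my-model", oauth_token="hf_token"))
```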
"},{"location":"api/agents/#see-also","title":"See Also","text":"This page documents the Pydantic models used throughout DeepCritical.
"},{"location":"api/models/#evidence","title":"Evidence","text":"Module: src.utils.models
Purpose: Represents evidence from search results.
Fields: - citation: Citation information (title, URL, date, authors) - content: Evidence text content - relevance: Relevance score (0.0-1.0) - metadata: Additional metadata dictionary
Module: src.utils.models
Purpose: Citation information for evidence.
Fields: - source: Source name (e.g., \"pubmed\", \"clinicaltrials\", \"europepmc\", \"web\", \"rag\") - title: Article/trial title - url: Source URL - date: Publication date (YYYY-MM-DD or \"Unknown\") - authors: List of authors (optional)
Module: src.utils.models
Purpose: Output from knowledge gap evaluation.
Fields: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps
Module: src.utils.models
Purpose: Plan for tool/agent selection.
Fields: - tasks: List of agent tasks to execute
Module: src.utils.models
Purpose: Individual agent task.
Fields: - gap: The knowledge gap being addressed (optional) - agent: Name of agent to use - query: The specific query for the agent - entity_website: The website of the entity being researched, if known (optional)
Module: src.utils.models
Purpose: Draft structure for long-form reports.
Fields: - sections: List of report sections
Module: src.utils.models
Purpose: Individual section in a report draft.
Fields: - section_title: The title of the section - section_content: The content of the section
Module: src.utils.models
Purpose: Parsed and improved query.
Fields: - original_query: Original query string - improved_query: Refined query string - research_mode: Research mode (\"iterative\" or \"deep\") - key_entities: List of key entities - research_questions: List of research questions
Module: src.utils.models
Purpose: Conversation history with iterations.
Fields: - history: List of iteration data
Module: src.utils.models
Purpose: Data for a single iteration.
Fields: - gap: The gap addressed in the iteration - tool_calls: The tool calls made - findings: The findings collected from tool calls - thought: The thinking done to reflect on the success of the iteration and next steps
Module: src.utils.models
Purpose: Event emitted during research execution.
Fields: - type: Event type (e.g., \"started\", \"search_complete\", \"complete\") - iteration: Iteration number (optional) - data: Event data dictionary
Module: src.utils.models
Purpose: Current budget status.
Fields: - tokens_used: Total tokens used - tokens_limit: Token budget limit - time_elapsed_seconds: Time elapsed in seconds - time_limit_seconds: Time budget limit (default: 600.0 seconds / 10 minutes) - iterations: Number of iterations completed - iterations_limit: Maximum iterations (default: 10) - iteration_tokens: Tokens used per iteration (iteration number -> token count)
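How the three limits combine into a stop condition can be sketched as follows; the dataclass is a stand-in for the Pydantic model in src.utils.models, and exhausted() is a hypothetical helper, not the project's actual budget logic.

```python
from dataclasses import dataclass, field


# Illustrative stand-in for src.utils.models.BudgetStatus (a Pydantic model)
@dataclass
class BudgetStatus:
    tokens_used: int = 0
    tokens_limit: int = 100_000
    time_elapsed_seconds: float = 0.0
    time_limit_seconds: float = 600.0   # 10 minutes
    iterations: int = 0
    iterations_limit: int = 10
    iteration_tokens: dict[int, int] = field(default_factory=dict)


# Hypothetical check: any exhausted budget dimension stops the research loop
def exhausted(budget: BudgetStatus) -> bool:
    return (budget.tokens_used >= budget.tokens_limit
            or budget.time_elapsed_seconds >= budget.time_limit_seconds
            or budget.iterations >= budget.iterations_limit)


print(exhausted(BudgetStatus(iterations=10)))
```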
This page documents the API for DeepCritical orchestrators.
"},{"location":"api/orchestrators/#iterativeresearchflow","title":"IterativeResearchFlow","text":"Module: src.orchestrator.research_flow
Purpose: Single-loop research with search-judge-synthesize cycles.
"},{"location":"api/orchestrators/#methods","title":"Methods","text":""},{"location":"api/orchestrators/#run","title":"run","text":"Runs iterative research flow.
Parameters: - query: Research query string - background_context: Background context (default: \"\") - output_length: Optional description of desired output length (default: \"\") - output_instructions: Optional additional instructions for report generation (default: \"\")
Returns: Final report string.
Note: max_iterations, max_time_minutes, and token_budget are constructor parameters, not run() parameters.
Module: src.orchestrator.research_flow
Purpose: Multi-section parallel research with planning and synthesis.
"},{"location":"api/orchestrators/#methods_1","title":"Methods","text":""},{"location":"api/orchestrators/#run_1","title":"run","text":"Runs deep research flow.
Parameters: - query: Research query string
Returns: Final report string.
Note: max_iterations_per_section, max_time_minutes, and token_budget are constructor parameters, not run() parameters.
Module: src.orchestrator.graph_orchestrator
Purpose: Graph-based execution using Pydantic AI agents as nodes.
"},{"location":"api/orchestrators/#methods_2","title":"Methods","text":""},{"location":"api/orchestrators/#run_2","title":"run","text":"Runs graph-based research orchestration.
Parameters: - query: Research query string
Yields: AgentEvent objects during graph execution.
Note: research_mode and use_graph are constructor parameters, not run() parameters.
Module: src.orchestrator_factory
Purpose: Factory for creating orchestrators.
"},{"location":"api/orchestrators/#functions","title":"Functions","text":""},{"location":"api/orchestrators/#create_orchestrator","title":"create_orchestrator","text":"Creates an orchestrator instance.
Parameters: - search_handler: Search handler protocol implementation (optional, required for simple mode) - judge_handler: Judge handler protocol implementation (optional, required for simple mode) - config: Configuration object (optional) - mode: Orchestrator mode (\"simple\", \"advanced\", \"magentic\", \"iterative\", \"deep\", \"auto\", or None for auto-detect) - oauth_token: Optional OAuth token from HuggingFace login (takes priority over env vars)
Returns: Orchestrator instance.
Raises: - ValueError: If requirements not met
Modes: - \"simple\": Legacy orchestrator - \"advanced\" or \"magentic\": Magentic orchestrator (requires OpenAI API key) - \"iterative\" or \"deep\": Corresponding research flow - \"auto\" or None: Auto-detect based on API key availability
Module: src.orchestrator_magentic
Purpose: Multi-agent coordination using Microsoft Agent Framework.
"},{"location":"api/orchestrators/#methods_3","title":"Methods","text":""},{"location":"api/orchestrators/#run_3","title":"run","text":"Runs Magentic orchestration.
Parameters: - query: Research query string
Yields: AgentEvent objects converted from Magentic events.
Note: max_rounds and max_stalls are constructor parameters, not run() parameters.
Requirements: - agent-framework-core package - OpenAI API key
This page documents the API for DeepCritical services.
"},{"location":"api/services/#embeddingservice","title":"EmbeddingService","text":"Module: src.services.embeddings
Purpose: Local sentence-transformers for semantic search and deduplication.
"},{"location":"api/services/#methods","title":"Methods","text":""},{"location":"api/services/#embed","title":"embed","text":"Generates embedding for a text string.
Parameters: - text: Text to embed
Returns: Embedding vector as list of floats.
"},{"location":"api/services/#embed_batch","title":"embed_batch","text":"async def embed_batch(self, texts: list[str]) -> list[list[float]]\n Generates embeddings for multiple texts.
Parameters: - texts: List of texts to embed
Returns: List of embedding vectors.
"},{"location":"api/services/#similarity","title":"similarity","text":"async def similarity(self, text1: str, text2: str) -> float\n Calculates similarity between two texts.
Parameters: - text1: First text - text2: Second text
Returns: Similarity score (0.0-1.0).
"},{"location":"api/services/#find_duplicates","title":"find_duplicates","text":"async def find_duplicates(\n self,\n texts: list[str],\n threshold: float = 0.85\n) -> list[tuple[int, int]]\n Finds duplicate texts based on similarity threshold.
Parameters: - texts: List of texts to check - threshold: Similarity threshold (default: 0.85)
Returns: List of (index1, index2) tuples for duplicate pairs.
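The pairwise-similarity idea behind find_duplicates can be shown with a plain-Python sketch; the real service compares sentence-transformer embeddings, whereas the toy vectors and helper functions here are illustrative only.

```python
import math


# Cosine similarity between two dense vectors (0.0 when either has zero norm)
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# All index pairs whose similarity meets the threshold, mirroring the
# (index1, index2) return shape documented for find_duplicates
def find_duplicate_pairs(vectors: list[list[float]],
                         threshold: float = 0.85) -> list[tuple[int, int]]:
    return [(i, j)
            for i in range(len(vectors))
            for j in range(i + 1, len(vectors))
            if cosine(vectors[i], vectors[j]) >= threshold]


vecs = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
print(find_duplicate_pairs(vecs))  # [(0, 1)]
```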
"},{"location":"api/services/#add_evidence","title":"add_evidence","text":"async def add_evidence(\n self,\n evidence_id: str,\n content: str,\n metadata: dict[str, Any]\n) -> None\n Adds evidence to vector store for semantic search.
Parameters: - evidence_id: Unique identifier for the evidence - content: Evidence text content - metadata: Additional metadata dictionary
"},{"location":"api/services/#search_similar","title":"search_similar","text":"async def search_similar(\n self,\n query: str,\n n_results: int = 5\n) -> list[dict[str, Any]]\n Finds semantically similar evidence.
Parameters: - query: Search query string - n_results: Number of results to return (default: 5)
Returns: List of dictionaries with id, content, metadata, and distance keys.
"},{"location":"api/services/#deduplicate","title":"deduplicate","text":"async def deduplicate(\n self,\n new_evidence: list[Evidence],\n threshold: float = 0.9\n) -> list[Evidence]\n Removes semantically duplicate evidence.
Parameters: - new_evidence: List of evidence items to deduplicate - threshold: Similarity threshold (default: 0.9, where 0.9 = 90% similar is duplicate)
Returns: List of unique evidence items (not already in vector store).
"},{"location":"api/services/#factory-function","title":"Factory Function","text":""},{"location":"api/services/#get_embedding_service","title":"get_embedding_service","text":"@lru_cache(maxsize=1)\ndef get_embedding_service() -> EmbeddingService\n Returns singleton EmbeddingService instance.
"},{"location":"api/services/#llamaindexragservice","title":"LlamaIndexRAGService","text":"Module: src.services.rag
Purpose: Retrieval-Augmented Generation using LlamaIndex.
"},{"location":"api/services/#methods_1","title":"Methods","text":""},{"location":"api/services/#ingest_evidence","title":"ingest_evidence","text":"Ingests evidence into RAG service.
Parameters: - evidence_list: List of Evidence objects to ingest
Note: Supports multiple embedding providers (OpenAI, local sentence-transformers, Hugging Face).
"},{"location":"api/services/#retrieve","title":"retrieve","text":"def retrieve(\n self,\n query: str,\n top_k: int | None = None\n) -> list[dict[str, Any]]\n Retrieves relevant documents for a query.
Parameters: - query: Search query string - top_k: Number of top results to return (defaults to similarity_top_k from constructor)
Returns: List of dictionaries with text, score, and metadata keys.
"},{"location":"api/services/#query","title":"query","text":"def query(\n self,\n query_str: str,\n top_k: int | None = None\n) -> str\n Queries RAG service and returns synthesized response.
Parameters: - query_str: Query string - top_k: Number of results to use (defaults to similarity_top_k from constructor)
Returns: Synthesized response string.
Raises: - ConfigurationError: If no LLM API key is available for query synthesis
"},{"location":"api/services/#ingest_documents","title":"ingest_documents","text":"def ingest_documents(self, documents: list[Any]) -> None\n Ingests raw LlamaIndex Documents.
Parameters: - documents: List of LlamaIndex Document objects
"},{"location":"api/services/#clear_collection","title":"clear_collection","text":"def clear_collection(self) -> None\n Clears all documents from the collection.
"},{"location":"api/services/#factory-function_1","title":"Factory Function","text":""},{"location":"api/services/#get_rag_service","title":"get_rag_service","text":"def get_rag_service(\n collection_name: str = \"deepcritical_evidence\",\n oauth_token: str | None = None,\n **kwargs: Any\n) -> LlamaIndexRAGService\n Get or create a RAG service instance.
Parameters: - collection_name: Name of the ChromaDB collection (default: \"deepcritical_evidence\") - oauth_token: Optional OAuth token from HuggingFace login (takes priority over env vars) - **kwargs: Additional arguments for LlamaIndexRAGService (e.g., use_openai_embeddings=False)
Returns: Configured LlamaIndexRAGService instance.
Note: By default, uses local embeddings (sentence-transformers) which require no API keys.
"},{"location":"api/services/#statisticalanalyzer","title":"StatisticalAnalyzer","text":"Module: src.services.statistical_analyzer
Purpose: Secure execution of AI-generated statistical code.
"},{"location":"api/services/#methods_2","title":"Methods","text":""},{"location":"api/services/#analyze","title":"analyze","text":"async def analyze(\n self,\n query: str,\n evidence: list[Evidence],\n hypothesis: dict[str, Any] | None = None\n) -> AnalysisResult\n Analyzes a research question using statistical methods.
Parameters: - query: The research question - evidence: List of Evidence objects to analyze - hypothesis: Optional hypothesis dict with drug, target, pathway, effect, confidence keys
Returns: AnalysisResult with: - verdict: SUPPORTED, REFUTED, or INCONCLUSIVE - confidence: Confidence in verdict (0.0-1.0) - statistical_evidence: Summary of statistical findings - code_generated: Python code that was executed - execution_output: Output from code execution - key_takeaways: Key takeaways from analysis - limitations: List of limitations
Note: Requires Modal credentials for sandbox execution.
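The three documented verdict values and a downstream decision on them can be sketched as follows; the enum and the needs_more_research gate are illustrative (the actual AnalysisResult field may well be a plain string), and the 0.6 threshold is an assumption.

```python
from enum import Enum


# Illustrative enum for the three documented verdict values
class Verdict(str, Enum):
    SUPPORTED = "SUPPORTED"
    REFUTED = "REFUTED"
    INCONCLUSIVE = "INCONCLUSIVE"


# Hypothetical gate: an inconclusive verdict, or low confidence in any
# verdict, warrants another research pass (threshold is an assumption)
def needs_more_research(verdict: Verdict, confidence: float,
                        threshold: float = 0.6) -> bool:
    return verdict is Verdict.INCONCLUSIVE or confidence < threshold


print(needs_more_research(Verdict.SUPPORTED, 0.9))
```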
"},{"location":"api/services/#see-also","title":"See Also","text":"This page documents the API for DeepCritical search tools.
"},{"location":"api/tools/#searchtool-protocol","title":"SearchTool Protocol","text":"All tools implement the SearchTool protocol:
class SearchTool(Protocol):\n @property\n def name(self) -> str: ...\n \n async def search(\n self, \n query: str, \n max_results: int = 10\n ) -> list[Evidence]: ...\n"},{"location":"api/tools/#pubmedtool","title":"PubMedTool","text":"Module: src.tools.pubmed
Purpose: Search peer-reviewed biomedical literature from PubMed.
"},{"location":"api/tools/#properties","title":"Properties","text":""},{"location":"api/tools/#name","title":"name","text":"@property\ndef name(self) -> str\n Returns tool name: \"pubmed\"
search","text":"async def search(\n self,\n query: str,\n max_results: int = 10\n) -> list[Evidence]\n Searches PubMed for articles.
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: List of Evidence objects with PubMed articles.
Raises: - SearchError: If search fails (timeout, HTTP error, XML parsing error) - RateLimitError: If rate limit is exceeded (429 status code)
Note: Uses NCBI E-utilities (ESearch \u2192 EFetch). Rate limit: 0.34s between requests. Handles both single-article and multi-article responses.
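To make the protocol and the rate-limit note concrete, here is a minimal sketch of a tool that satisfies the SearchTool protocol and spaces its requests 0.34s apart. The StubPubMedTool class and the simplified Evidence dataclass are illustrative stand-ins, not the project's actual implementations.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Evidence:
    # Simplified stand-in for the project's Evidence model.
    title: str
    url: str


class SearchTool(Protocol):
    @property
    def name(self) -> str: ...

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]: ...


class StubPubMedTool:
    """Toy tool showing the protocol shape plus a minimal rate limit."""

    RATE_LIMIT_SECONDS = 0.34  # documented spacing between NCBI requests

    def __init__(self) -> None:
        self._last_request: float = 0.0

    @property
    def name(self) -> str:
        return "pubmed"

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        # Sleep just long enough to keep RATE_LIMIT_SECONDS between requests.
        now = asyncio.get_running_loop().time()
        wait = self.RATE_LIMIT_SECONDS - (now - self._last_request)
        if wait > 0:
            await asyncio.sleep(wait)
        self._last_request = asyncio.get_running_loop().time()
        # A real implementation would call ESearch -> EFetch here.
        results = [Evidence(title=f"Result for {query}",
                            url="https://pubmed.ncbi.nlm.nih.gov/12345/")]
        return results[:max_results]
```

Because the protocol is structural (typing.Protocol), any class with a matching `name` property and `search` coroutine is accepted without explicit inheritance.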
"},{"location":"api/tools/#clinicaltrialstool","title":"ClinicalTrialsTool","text":"Module: src.tools.clinicaltrials
Purpose: Search ClinicalTrials.gov for interventional studies.
"},{"location":"api/tools/#properties_1","title":"Properties","text":""},{"location":"api/tools/#name_1","title":"name","text":"@property\ndef name(self) -> str\n Returns tool name: \"clinicaltrials\"
search","text":"async def search(\n self,\n query: str,\n max_results: int = 10\n) -> list[Evidence]\n Searches ClinicalTrials.gov for trials.
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: List of Evidence objects with clinical trials.
Note: Only returns interventional studies with status: COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, ENROLLING_BY_INVITATION. Uses the requests library rather than httpx (the site's WAF blocks httpx). Runs in a thread pool for async compatibility.
Raises: - SearchError: If search fails (HTTP error, request exception)
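Since requests is a blocking client, running it "in a thread pool for async compatibility" typically means handing the call to a worker thread. A minimal sketch of that pattern (blocking_fetch and search_trials are hypothetical names, not the tool's real functions):

```python
import asyncio


def blocking_fetch(query: str) -> list[dict]:
    # Stand-in for a synchronous requests.get(...) call to ClinicalTrials.gov.
    return [{"nct_id": "NCT00000000", "query": query}]


async def search_trials(query: str) -> list[dict]:
    # asyncio.to_thread runs the blocking client in a worker thread,
    # keeping the event loop responsive while the HTTP call is in flight.
    return await asyncio.to_thread(blocking_fetch, query)
```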
Module: src.tools.europepmc
Purpose: Search Europe PMC for preprints and peer-reviewed articles.
"},{"location":"api/tools/#properties_2","title":"Properties","text":""},{"location":"api/tools/#name_2","title":"name","text":"@property\ndef name(self) -> str\n Returns tool name: \"europepmc\"
search","text":"async def search(\n self,\n query: str,\n max_results: int = 10\n) -> list[Evidence]\n Searches Europe PMC for articles and preprints.
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: List of Evidence objects with articles/preprints.
Note: Includes both preprints (marked with [PREPRINT - Not peer-reviewed]) and peer-reviewed articles. Builds article URLs from the DOI or PMID.
Raises: - SearchError: If search fails (HTTP error, connection error)
Module: src.tools.rag_tool
Purpose: Semantic search within collected evidence.
"},{"location":"api/tools/#initialization","title":"Initialization","text":"def __init__(\n self,\n rag_service: LlamaIndexRAGService | None = None,\n oauth_token: str | None = None\n) -> None\n Parameters: - rag_service: Optional RAG service instance. If None, will be lazy-initialized. - oauth_token: Optional OAuth token from HuggingFace login (for RAG LLM)
name","text":"@property\ndef name(self) -> str\n Returns tool name: \"rag\"
search","text":"async def search(\n self,\n query: str,\n max_results: int = 10\n) -> list[Evidence]\n Searches collected evidence using semantic similarity.
Parameters: - query: Search query string - max_results: Maximum number of results to return (default: 10)
Returns: List of Evidence objects from collected evidence.
Raises: - ConfigurationError: If RAG service is unavailable
Note: Requires evidence to be ingested into RAG service first. Wraps LlamaIndexRAGService. Returns Evidence from RAG results.
Module: src.tools.search_handler
Purpose: Orchestrates parallel searches across multiple tools.
"},{"location":"api/tools/#initialization_1","title":"Initialization","text":"def __init__(\n self,\n tools: list[SearchTool],\n timeout: float = 30.0,\n include_rag: bool = False,\n auto_ingest_to_rag: bool = True,\n oauth_token: str | None = None\n) -> None\n Parameters: - tools: List of search tools to use - timeout: Timeout for each search in seconds (default: 30.0) - include_rag: Whether to include RAG tool in searches (default: False) - auto_ingest_to_rag: Whether to automatically ingest results into RAG (default: True) - oauth_token: Optional OAuth token from HuggingFace login (for RAG LLM)
execute","text":"Searches multiple tools in parallel.
Parameters: - query: Search query string - max_results_per_tool: Maximum results per tool (default: 10)
Returns: SearchResult with: - query: The search query - evidence: Aggregated list of evidence - sources_searched: List of source names searched - total_found: Total number of results - errors: List of error messages from failed tools
Raises: - SearchError: If search times out
Note: Uses asyncio.gather() for parallel execution. Handles tool failures gracefully (returns errors in SearchResult.errors). Automatically ingests evidence into RAG if enabled.
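The "handles tool failures gracefully" behavior comes from gathering with `return_exceptions=True`, so one failing tool cannot cancel the others. A self-contained sketch of that aggregation step (run_tool and search_all are hypothetical stand-ins for the handler's internals):

```python
import asyncio


async def run_tool(name: str, fail: bool) -> list[str]:
    # Stand-in for one tool's search() call.
    if fail:
        raise RuntimeError(f"{name} failed")
    return [f"evidence from {name}"]


async def search_all(tools: list[tuple[str, bool]]) -> tuple[list[str], list[str]]:
    # return_exceptions=True keeps one failing tool from aborting the batch.
    results = await asyncio.gather(
        *(run_tool(name, fail) for name, fail in tools),
        return_exceptions=True,
    )
    evidence: list[str] = []
    errors: list[str] = []
    for (name, _), result in zip(tools, results):
        if isinstance(result, Exception):
            errors.append(f"{name}: {result}")  # collected, not raised
        else:
            evidence.extend(result)
    return evidence, errors
```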
DeepCritical uses Pydantic AI agents for all AI-powered operations. All agents follow a consistent pattern and use structured output types.
"},{"location":"architecture/agents/#agent-pattern","title":"Agent Pattern","text":""},{"location":"architecture/agents/#pydantic-ai-agents","title":"Pydantic AI Agents","text":"Pydantic AI agents use the Agent class with the following structure:
- Constructor: __init__(model: Any | None = None) - Methods such as async def evaluate(), async def write_report() - Factory: def create_agent_name(model: Any | None = None, oauth_token: str | None = None) -> AgentName Note: Factory functions accept an optional oauth_token parameter for HuggingFace authentication, which takes priority over environment variables.
Agents use get_model() from src/agent_factory/judges.py if no model is provided. This supports:
The model selection is based on the configured LLM_PROVIDER in settings.
Agents return fallback values on failure rather than raising exceptions:
Example fallback: KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...]). All errors are logged with context using structlog.
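The fallback-instead-of-raise pattern can be sketched as a thin wrapper. The KnowledgeGapOutput dataclass below is a simplified stand-in for the Pydantic model in src/utils/models.py, and evaluate_safely is an illustrative helper, not the agents' actual code:

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class KnowledgeGapOutput:
    # Simplified stand-in for the model in src/utils/models.py.
    research_complete: bool
    outstanding_gaps: list[str] = field(default_factory=list)


async def evaluate_safely(evaluate) -> KnowledgeGapOutput:
    # On any failure, return a conservative fallback instead of raising,
    # so the research loop keeps running (and a logger would record context).
    try:
        return await evaluate()
    except Exception:
        return KnowledgeGapOutput(
            research_complete=False,
            outstanding_gaps=["Evaluation failed; retry with remaining budget"],
        )
```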
"},{"location":"architecture/agents/#input-validation","title":"Input Validation","text":"All agents validate inputs:
Agents use structured output types from src/utils/models.py:
- KnowledgeGapOutput: Research completeness evaluation - AgentSelectionPlan: Tool selection plan - ReportDraft: Long-form report structure - ParsedQuery: Query parsing and mode detection For text output (writer agents), agents return str directly.
File: src/agents/knowledge_gap.py
Purpose: Evaluates research state and identifies knowledge gaps.
Output: KnowledgeGapOutput with: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps
Methods: - async def evaluate(query, background_context, conversation_history, iteration, time_elapsed_minutes, max_time_minutes) -> KnowledgeGapOutput
File: src/agents/tool_selector.py
Purpose: Selects appropriate tools for addressing knowledge gaps.
Output: AgentSelectionPlan with list of AgentTask objects.
Available Agents: - WebSearchAgent: General web search for fresh information - SiteCrawlerAgent: Research specific entities/companies - RAGAgent: Semantic search within collected evidence
File: src/agents/writer.py
Purpose: Generates final reports from research findings.
Output: Markdown string with numbered citations.
Methods: - async def write_report(query, findings, output_length, output_instructions) -> str
Features: - Validates inputs - Truncates very long findings (max 50000 chars) with warning - Retry logic for transient failures (3 retries) - Citation validation before returning
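The truncation and retry features above can be sketched as two standalone helpers. The 50,000-character limit is from the documentation; the function names and the zero-delay backoff are illustrative assumptions:

```python
import asyncio

MAX_FINDINGS_CHARS = 50_000  # documented truncation limit


def truncate_findings(findings: str, limit: int = MAX_FINDINGS_CHARS) -> str:
    if len(findings) <= limit:
        return findings
    # A real agent would also log a warning when truncating.
    return findings[:limit] + "\n[truncated]"


async def call_with_retries(func, retries: int = 3):
    # Retry transient failures up to `retries` times, then re-raise.
    last_error: Exception | None = None
    for _ in range(retries):
        try:
            return await func()
        except Exception as exc:
            last_error = exc
            await asyncio.sleep(0)  # placeholder for a real backoff delay
    raise last_error
```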
"},{"location":"architecture/agents/#long-writer-agent","title":"Long Writer Agent","text":"File: src/agents/long_writer.py
Purpose: Long-form report generation with section-by-section writing.
Input/Output: Uses ReportDraft models.
Methods: - async def write_next_section(query, draft, section_title, section_content) -> LongWriterOutput - async def write_report(query, report_title, report_draft) -> str
Features: - Writes sections iteratively - Aggregates references across sections - Reformats section headings and references - Deduplicates and renumbers references
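Deduplicating and renumbering references across sections amounts to merging per-section reference lists and rewriting the [n] markers in each section's text. A self-contained sketch of that bookkeeping (the function name and the `(text, refs)` tuple shape are assumptions, not the agent's real API):

```python
import re


def renumber_references(sections: list[tuple[str, list[str]]]) -> tuple[str, list[str]]:
    """Merge per-section reference lists, dropping duplicates and rewriting [n] markers."""
    merged: list[str] = []
    index: dict[str, int] = {}   # reference -> global number
    out_sections: list[str] = []
    for text, refs in sections:
        mapping: dict[int, int] = {}  # local number -> global number
        for old_num, ref in enumerate(refs, start=1):
            if ref not in index:
                merged.append(ref)
                index[ref] = len(merged)
            mapping[old_num] = index[ref]
        # Rewrite [1], [2], ... in this section to their global numbers.
        out_sections.append(
            re.sub(r"\[(\d+)\]", lambda m: f"[{mapping[int(m.group(1))]}]", text)
        )
    return "\n\n".join(out_sections), merged
```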
"},{"location":"architecture/agents/#proofreader-agent","title":"Proofreader Agent","text":"File: src/agents/proofreader.py
Purpose: Proofreads and polishes report drafts.
Input: ReportDraft Output: Polished markdown string
Methods: - async def proofread(query, report_title, report_draft) -> str
Features: - Removes duplicate content across sections - Adds executive summary if multiple sections - Preserves all references and citations - Improves flow and readability
"},{"location":"architecture/agents/#thinking-agent","title":"Thinking Agent","text":"File: src/agents/thinking.py
Purpose: Generates observations from conversation history.
Output: Observation string
Methods: - async def generate_observations(query, background_context, conversation_history) -> str
File: src/agents/input_parser.py
Purpose: Parses and improves user queries, detects research mode.
Output: ParsedQuery with: - original_query: Original query string - improved_query: Refined query string - research_mode: \"iterative\" or \"deep\" - key_entities: List of key entities - research_questions: List of research questions
The following agents use the BaseAgent pattern from agent-framework and are used exclusively with MagenticOrchestrator:
File: src/agents/hypothesis_agent.py
Purpose: Generates mechanistic hypotheses based on evidence.
Pattern: BaseAgent from agent-framework
Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse
Features: - Uses internal Pydantic AI Agent with HypothesisAssessment output type - Accesses shared evidence_store for evidence - Uses embedding service for diverse evidence selection (MMR algorithm) - Stores hypotheses in shared context
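The MMR (Maximal Marginal Relevance) selection mentioned above balances relevance to the query against redundancy with already-selected items. A generic sketch over precomputed similarity scores (the function signature and lambda weighting are illustrative, not the agent's implementation):

```python
def mmr_select(
    query_sim: list[float],           # similarity of each item to the query
    pairwise_sim: list[list[float]],  # item-to-item similarity matrix
    k: int,
    lambda_param: float = 0.5,        # 1.0 = pure relevance, 0.0 = pure diversity
) -> list[int]:
    """Greedily pick k indices trading off relevance vs. redundancy."""
    selected: list[int] = []
    candidates = list(range(len(query_sim)))
    while candidates and len(selected) < k:
        def score(i: int) -> float:
            # Penalize items too similar to anything already selected.
            redundancy = max((pairwise_sim[i][j] for j in selected), default=0.0)
            return lambda_param * query_sim[i] - (1 - lambda_param) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```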
File: src/agents/search_agent.py
Purpose: Wraps SearchHandler as an agent for Magentic orchestrator.
Pattern: BaseAgent from agent-framework
Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse
Features: - Executes searches via SearchHandlerProtocol - Deduplicates evidence using embedding service - Searches for semantically related evidence - Updates shared evidence store
File: src/agents/analysis_agent.py
Purpose: Performs statistical analysis using Modal sandbox.
Pattern: BaseAgent from agent-framework
Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse
Features: - Wraps StatisticalAnalyzer service - Analyzes evidence and hypotheses - Returns verdict (SUPPORTED/REFUTED/INCONCLUSIVE) - Stores analysis results in shared context
File: src/agents/report_agent.py
Purpose: Generates structured scientific reports from evidence and hypotheses.
Pattern: BaseAgent from agent-framework
Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse
Features: - Uses internal Pydantic AI Agent with ResearchReport output type - Accesses shared evidence store and hypotheses - Validates citations before returning - Formats report as markdown
File: src/agents/judge_agent.py
Purpose: Evaluates evidence quality and determines if sufficient for synthesis.
Pattern: BaseAgent from agent-framework
Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse - async def run_stream(messages, thread, **kwargs) -> AsyncIterable[AgentRunResponseUpdate]
Features: - Wraps JudgeHandlerProtocol - Accesses shared evidence store - Returns JudgeAssessment with sufficient flag, confidence, and recommendation
DeepCritical uses two distinct agent patterns:
"},{"location":"architecture/agents/#1-pydantic-ai-agents-traditional-pattern","title":"1. Pydantic AI Agents (Traditional Pattern)","text":"These agents use the Pydantic AI Agent class directly and are used in iterative and deep research flows:
- Construction: Agent(model, output_type, system_prompt) - Constructor: __init__(model: Any | None = None) - Methods such as async def evaluate(), async def write_report() - Agents: KnowledgeGapAgent, ToolSelectorAgent, WriterAgent, LongWriterAgent, ProofreaderAgent, ThinkingAgent, InputParserAgent These agents use the BaseAgent class from agent-framework and are used in the Magentic orchestrator:
- Base class: BaseAgent from agent-framework with an async def run() method - Constructor: __init__(evidence_store, embedding_service, ...) - Methods: async def run(messages, thread, **kwargs) -> AgentRunResponse - Agents: HypothesisAgent, SearchAgent, AnalysisAgent, ReportAgent, JudgeAgent Note: Magentic agents are used exclusively with the MagenticOrchestrator and follow the agent-framework protocol for multi-agent coordination.
All agents have factory functions in src/agent_factory/agents.py:
Factory functions: - Use get_model() if no model provided - Accept oauth_token parameter for HuggingFace authentication - Raise ConfigurationError if creation fails - Log agent creation
DeepCritical implements a graph-based orchestration system for research workflows using Pydantic AI agents as nodes. This enables better parallel execution, conditional routing, and state management compared to simple agent chains.
"},{"location":"architecture/graph_orchestration/#graph-patterns","title":"Graph Patterns","text":""},{"location":"architecture/graph_orchestration/#iterative-research-graph","title":"Iterative Research Graph","text":"The iterative research graph follows this pattern:
[Input] \u2192 [Thinking] \u2192 [Knowledge Gap] \u2192 [Decision: Complete?]\n \u2193 No \u2193 Yes\n [Tool Selector] [Writer]\n \u2193\n [Execute Tools] \u2192 [Loop Back]\n Node IDs: thinking \u2192 knowledge_gap \u2192 continue_decision \u2192 tool_selector/writer \u2192 execute_tools \u2192 (loop back to thinking)
Special Node Handling: - execute_tools: State node that uses search_handler to execute searches and add evidence to workflow state - continue_decision: Decision node that routes based on research_complete flag from KnowledgeGapOutput
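The continue_decision node's routing can be sketched as a tiny pure function: it maps the research_complete flag (plus budget status) to the next node ID. This is an illustrative sketch; the real decision function also consults the budget tracker described elsewhere on this page.

```python
def continue_decision(research_complete: bool, budget_ok: bool = True) -> str:
    # Route to the writer when research is complete (or budget is exhausted);
    # otherwise loop back through tool selection.
    if research_complete or not budget_ok:
        return "writer"
    return "tool_selector"
```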
The deep research graph follows this pattern:
[Input] \u2192 [Planner] \u2192 [Store Plan] \u2192 [Parallel Loops] \u2192 [Collect Drafts] \u2192 [Synthesizer]\n \u2193 \u2193 \u2193\n [Loop1] [Loop2] [Loop3]\n Node IDs: planner \u2192 store_plan \u2192 parallel_loops \u2192 collect_drafts \u2192 synthesizer
Special Node Handling: - planner: Agent node that creates ReportPlan with report outline - store_plan: State node that stores ReportPlan in context for parallel loops - parallel_loops: Parallel node that executes IterativeResearchFlow instances for each section - collect_drafts: State node that collects section drafts from parallel loops - synthesizer: Agent node that calls LongWriterAgent.write_report() directly with ReportDraft
\nsequenceDiagram\n actor User\n participant GraphOrchestrator\n participant InputParser\n participant GraphBuilder\n participant GraphExecutor\n participant Agent\n participant BudgetTracker\n participant WorkflowState\n\n User->>GraphOrchestrator: run(query)\n GraphOrchestrator->>InputParser: detect_research_mode(query)\n InputParser-->>GraphOrchestrator: mode (iterative/deep)\n GraphOrchestrator->>GraphBuilder: build_graph(mode)\n GraphBuilder-->>GraphOrchestrator: ResearchGraph\n GraphOrchestrator->>WorkflowState: init_workflow_state()\n GraphOrchestrator->>BudgetTracker: create_budget()\n GraphOrchestrator->>GraphExecutor: _execute_graph(graph)\n \n loop For each node in graph\n GraphExecutor->>Agent: execute_node(agent_node)\n Agent->>Agent: process_input\n Agent-->>GraphExecutor: result\n GraphExecutor->>WorkflowState: update_state(result)\n GraphExecutor->>BudgetTracker: add_tokens(used)\n GraphExecutor->>BudgetTracker: check_budget()\n alt Budget exceeded\n GraphExecutor->>GraphOrchestrator: emit(error_event)\n else Continue\n GraphExecutor->>GraphOrchestrator: emit(progress_event)\n end\n end\n \n GraphOrchestrator->>User: AsyncGenerator[AgentEvent]\n"},{"location":"architecture/graph_orchestration/#iterative-research","title":"Iterative Research","text":"sequenceDiagram\n participant IterativeFlow\n participant ThinkingAgent\n participant KnowledgeGapAgent\n participant ToolSelector\n participant ToolExecutor\n participant JudgeHandler\n participant WriterAgent\n\n IterativeFlow->>IterativeFlow: run(query)\n \n loop Until complete or max_iterations\n IterativeFlow->>ThinkingAgent: generate_observations()\n ThinkingAgent-->>IterativeFlow: observations\n \n IterativeFlow->>KnowledgeGapAgent: evaluate_gaps()\n KnowledgeGapAgent-->>IterativeFlow: KnowledgeGapOutput\n \n alt Research complete\n IterativeFlow->>WriterAgent: create_final_report()\n WriterAgent-->>IterativeFlow: final_report\n else Gaps remain\n IterativeFlow->>ToolSelector: select_agents(gap)\n 
ToolSelector-->>IterativeFlow: AgentSelectionPlan\n \n IterativeFlow->>ToolExecutor: execute_tool_tasks()\n ToolExecutor-->>IterativeFlow: ToolAgentOutput[]\n \n IterativeFlow->>JudgeHandler: assess_evidence()\n JudgeHandler-->>IterativeFlow: should_continue\n end\n end"},{"location":"architecture/graph_orchestration/#graph-structure","title":"Graph Structure","text":""},{"location":"architecture/graph_orchestration/#nodes","title":"Nodes","text":"Graph nodes represent different stages in the research workflow:
Examples: KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent
State Nodes: Update or read workflow state
Examples: Update evidence, update conversation history
Decision Nodes: Make routing decisions based on conditions
Examples: Continue research vs. complete research
Parallel Nodes: Execute multiple nodes concurrently
Edges define transitions between nodes:
Condition: None (always True)
Conditional Edges: Traversed based on condition
Example: If research complete \u2192 go to writer, else \u2192 continue loop
Parallel Edges: Used for parallel execution branches
State is managed via WorkflowState using ContextVar for thread-safe isolation:
State transitions occur at state nodes, which update the global workflow state.
"},{"location":"architecture/graph_orchestration/#execution-flow","title":"Execution Flow","text":"create_iterative_graph() or create_deep_graph()ResearchGraph.validate_structure()GraphOrchestrator._execute_graph()agent.run() with transformed inputstate_updater functiondecision_function to get next node IDasyncio.gather()asyncio.gather() for parallel nodesGraphExecutionContext.update_state()AgentEvent objects during execution for UIThe GraphExecutionContext class manages execution state during graph traversal:
- WorkflowState instance - BudgetTracker instance for budget enforcement Methods: - set_node_result(node_id, result): Store result from node execution - get_node_result(node_id): Retrieve stored result - has_visited(node_id): Check if node was visited - mark_visited(node_id): Mark node as visited - update_state(updater, data): Update workflow state
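The bookkeeping described above can be sketched as a minimal class. The attribute layout below is an assumption; only the method names and their documented behaviors are taken from this page:

```python
from typing import Any


class GraphExecutionContext:
    """Minimal sketch of the execution-context bookkeeping described above."""

    def __init__(self, state: dict[str, Any], budget: Any = None) -> None:
        self.state = state      # stands in for the WorkflowState instance
        self.budget = budget    # stands in for the BudgetTracker instance
        self._results: dict[str, Any] = {}
        self._visited: set[str] = set()

    def set_node_result(self, node_id: str, result: Any) -> None:
        self._results[node_id] = result

    def get_node_result(self, node_id: str) -> Any:
        return self._results.get(node_id)

    def has_visited(self, node_id: str) -> bool:
        return node_id in self._visited

    def mark_visited(self, node_id: str) -> None:
        self._visited.add(node_id)

    def update_state(self, updater, data: Any) -> None:
        # Delegate the mutation to the node's state_updater function.
        updater(self.state, data)
```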
Decision nodes evaluate conditions and return next node IDs:
Example: research_complete \u2192 writer, else \u2192 tool selector. Parallel nodes execute multiple nodes concurrently:
Budget constraints are enforced at decision nodes:
If any budget is exceeded, execution routes to exit node.
"},{"location":"architecture/graph_orchestration/#error-handling","title":"Error Handling","text":"Errors are handled at multiple levels:
Errors are logged and yield error events for UI.
"},{"location":"architecture/graph_orchestration/#backward-compatibility","title":"Backward Compatibility","text":"Graph execution is optional via feature flag:
- USE_GRAPH_EXECUTION=true: Use graph-based execution - USE_GRAPH_EXECUTION=false: Use agent chain execution (existing) This allows gradual migration and fallback if needed.
"},{"location":"architecture/graph_orchestration/#see-also","title":"See Also","text":"DeepCritical uses middleware for state management, budget tracking, and workflow coordination.
"},{"location":"architecture/middleware/#state-management","title":"State Management","text":""},{"location":"architecture/middleware/#workflowstate","title":"WorkflowState","text":"File: src/middleware/state_machine.py
Purpose: Thread-safe state management for research workflows
Implementation: Uses ContextVar for thread-safe isolation
State Components: - evidence: list[Evidence]: Collected evidence from searches - conversation: Conversation: Iteration history (gaps, tool calls, findings, thoughts) - embedding_service: Any: Embedding service for semantic search
Methods: - add_evidence(new_evidence: list[Evidence]) -> int: Adds evidence with URL-based deduplication. Returns the number of new items added (excluding duplicates). - async search_related(query: str, n_results: int = 5) -> list[Evidence]: Semantic search for related evidence using embedding service
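The URL-based deduplication in add_evidence can be sketched as follows. The standalone function and simplified Evidence dataclass are illustrative stand-ins for the WorkflowState method and the real model:

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    # Simplified stand-in for the project's Evidence model.
    url: str
    content: str


def add_evidence(existing: list[Evidence], new_evidence: list[Evidence]) -> int:
    """Append only evidence whose URL has not been seen; return the count added."""
    seen = {e.url for e in existing}
    added = 0
    for item in new_evidence:
        if item.url not in seen:
            existing.append(item)
            seen.add(item.url)
            added += 1
    return added
```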
Initialization:
Access:
"},{"location":"architecture/middleware/#workflow-manager","title":"Workflow Manager","text":"File: src/middleware/workflow_manager.py
Purpose: Coordinates parallel research loops
Methods: - async add_loop(loop_id: str, query: str) -> ResearchLoop: Add a new research loop to manage - async run_loops_parallel(loop_configs: list[dict], loop_func: Callable, judge_handler: Any | None = None, budget_tracker: Any | None = None) -> list[Any]: Run multiple research loops in parallel. Takes configuration dicts and a loop function. - async update_loop_status(loop_id: str, status: LoopStatus, error: str | None = None): Update loop status - async sync_loop_evidence_to_state(loop_id: str): Synchronize evidence from a specific loop to global state
Features: - Uses asyncio.gather() for parallel execution - Handles errors per loop (doesn't fail all if one fails) - Tracks loop status: pending, running, completed, failed, cancelled - Evidence deduplication across parallel loops
Usage:
from src.middleware.workflow_manager import WorkflowManager\n\nmanager = WorkflowManager()\nawait manager.add_loop(\"loop1\", \"Research query 1\")\nawait manager.add_loop(\"loop2\", \"Research query 2\")\n\nasync def run_research(config: dict) -> str:\n loop_id = config[\"loop_id\"]\n query = config[\"query\"]\n # ... research logic ...\n return \"report\"\n\nresults = await manager.run_loops_parallel(\n loop_configs=[\n {\"loop_id\": \"loop1\", \"query\": \"Research query 1\"},\n {\"loop_id\": \"loop2\", \"query\": \"Research query 2\"},\n ],\n loop_func=run_research,\n)\n"},{"location":"architecture/middleware/#budget-tracker","title":"Budget Tracker","text":"File: src/middleware/budget_tracker.py
Purpose: Tracks and enforces resource limits
Budget Components: - Tokens: LLM token usage - Time: Elapsed time in seconds - Iterations: Number of iterations
Methods: - create_budget(loop_id: str, tokens_limit: int = 100000, time_limit_seconds: float = 600.0, iterations_limit: int = 10) -> BudgetStatus: Create a budget for a specific loop - add_tokens(loop_id: str, tokens: int): Add token usage to a loop's budget - start_timer(loop_id: str): Start time tracking for a loop - update_timer(loop_id: str): Update elapsed time for a loop - increment_iteration(loop_id: str): Increment iteration count for a loop - check_budget(loop_id: str) -> tuple[bool, str]: Check if a loop's budget has been exceeded. Returns (exceeded: bool, reason: str) - can_continue(loop_id: str) -> bool: Check if a loop can continue based on budget
Token Estimation: - estimate_tokens(text: str) -> int: ~4 chars per token - estimate_llm_call_tokens(prompt: str, response: str) -> int: Estimate LLM call tokens
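The documented ~4-characters-per-token heuristic is simple enough to sketch directly; the bodies below are a plausible reading of the described behavior, not the tracker's exact code:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic from the docs: about 4 characters per token.
    return max(1, len(text) // 4)


def estimate_llm_call_tokens(prompt: str, response: str) -> int:
    # Total cost of one LLM call = prompt tokens + response tokens.
    return estimate_tokens(prompt) + estimate_tokens(response)
```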
Usage:
from src.middleware.budget_tracker import BudgetTracker\n\ntracker = BudgetTracker()\nbudget = tracker.create_budget(\n loop_id=\"research_loop\",\n tokens_limit=100000,\n time_limit_seconds=600,\n iterations_limit=10\n)\ntracker.start_timer(\"research_loop\")\n# ... research operations ...\ntracker.add_tokens(\"research_loop\", 5000)\ntracker.update_timer(\"research_loop\")\nexceeded, reason = tracker.check_budget(\"research_loop\")\nif exceeded:\n # Budget exceeded, stop research\n pass\nif not tracker.can_continue(\"research_loop\"):\n # Budget exceeded, stop research\n pass\n"},{"location":"architecture/middleware/#models","title":"Models","text":"All middleware models are defined in src/utils/models.py:
- IterationData: Data for a single iteration - Conversation: Conversation history with iterations - ResearchLoop: Research loop state and configuration - BudgetStatus: Current budget status All middleware components use ContextVar for thread-safe isolation:
DeepCritical supports multiple orchestration patterns for research workflows.
"},{"location":"architecture/orchestrators/#research-flows","title":"Research Flows","text":""},{"location":"architecture/orchestrators/#iterativeresearchflow","title":"IterativeResearchFlow","text":"File: src/orchestrator/research_flow.py
Pattern: Generate observations \u2192 Evaluate gaps \u2192 Select tools \u2192 Execute \u2192 Judge \u2192 Continue/Complete
Agents Used: - KnowledgeGapAgent: Evaluates research completeness - ToolSelectorAgent: Selects tools for addressing gaps - ThinkingAgent: Generates observations - WriterAgent: Creates final report - JudgeHandler: Assesses evidence sufficiency
Features: - Tracks iterations, time, budget - Supports graph execution (use_graph=True) and agent chains (use_graph=False) - Iterates until research complete or constraints met
Usage:
"},{"location":"architecture/orchestrators/#deepresearchflow","title":"DeepResearchFlow","text":"File: src/orchestrator/research_flow.py
Pattern: Planner \u2192 Parallel iterative loops per section \u2192 Synthesizer
Agents Used: - PlannerAgent: Breaks query into report sections - IterativeResearchFlow: Per-section research (parallel) - LongWriterAgent or ProofreaderAgent: Final synthesis
Features: - Uses WorkflowManager for parallel execution - Budget tracking per section and globally - State synchronization across parallel loops - Supports graph execution and agent chains
Usage:
"},{"location":"architecture/orchestrators/#graph-orchestrator","title":"Graph Orchestrator","text":"File: src/orchestrator/graph_orchestrator.py
Purpose: Graph-based execution using Pydantic AI agents as nodes
Features: - Uses graph execution (use_graph=True) or agent chains (use_graph=False) as fallback - Routes based on research mode (iterative/deep/auto) - Streams AgentEvent objects for UI - Uses GraphExecutionContext to manage execution state
Node Types: - Agent Nodes: Execute Pydantic AI agents - State Nodes: Update or read workflow state - Decision Nodes: Make routing decisions - Parallel Nodes: Execute multiple nodes concurrently
Edge Types: - Sequential Edges: Always traversed - Conditional Edges: Traversed based on condition - Parallel Edges: Used for parallel execution branches
Special Node Handling:
The GraphOrchestrator has special handling for certain nodes:
- execute_tools node: State node that uses search_handler to execute searches and add evidence to workflow state - parallel_loops node: Parallel node that executes IterativeResearchFlow instances for each section in deep research mode - synthesizer node: Agent node that calls LongWriterAgent.write_report() directly with ReportDraft instead of using agent.run() - writer node: Agent node that calls WriterAgent.write_report() directly with findings instead of using agent.run() GraphExecutionContext:
The orchestrator uses GraphExecutionContext to manage execution state: - Tracks current node, visited nodes, and node results - Manages workflow state and budget tracker - Provides methods to store and retrieve node execution results
File: src/orchestrator_factory.py
Purpose: Factory for creating orchestrators
Modes: - Simple: Legacy orchestrator (backward compatible) - Advanced: Magentic orchestrator (requires OpenAI API key) - Auto-detect: Chooses based on API key availability
Usage:
"},{"location":"architecture/orchestrators/#magentic-orchestrator","title":"Magentic Orchestrator","text":"File: src/orchestrator_magentic.py
Purpose: Multi-agent coordination using Microsoft Agent Framework
Features: - Uses agent-framework-core - ChatAgent pattern with internal LLMs per agent - MagenticBuilder with participants: - searcher: SearchAgent (wraps SearchHandler) - hypothesizer: HypothesisAgent (generates hypotheses) - judge: JudgeAgent (evaluates evidence) - reporter: ReportAgent (generates final report) - Manager orchestrates agents via chat client (OpenAI or HuggingFace) - Event-driven: converts Magentic events to AgentEvent for UI streaming via _process_event() method - Supports max rounds, stall detection, and reset handling
Event Processing:
The orchestrator processes Magentic events and converts them to AgentEvent: - MagenticOrchestratorMessageEvent \u2192 AgentEvent with type based on message content - MagenticAgentMessageEvent \u2192 AgentEvent with type based on agent name - MagenticAgentDeltaEvent \u2192 AgentEvent for streaming updates - MagenticFinalResultEvent \u2192 AgentEvent with type \"complete\"
Requirements: - agent-framework-core package - OpenAI API key or HuggingFace authentication
File: src/orchestrator_hierarchical.py
Purpose: Hierarchical orchestrator using middleware and sub-teams
Features: - Uses SubIterationMiddleware with ResearchTeam and LLMSubIterationJudge - Adapts Magentic ChatAgent to SubIterationTeam protocol - Event-driven via asyncio.Queue for coordination - Supports sub-iteration patterns for complex research tasks
File: src/legacy_orchestrator.py
Purpose: Linear search-judge-synthesize loop
Features: - Uses SearchHandlerProtocol and JudgeHandlerProtocol - Generator-based design yielding AgentEvent objects - Backward compatibility for simple use cases
All orchestrators must initialize workflow state:
"},{"location":"architecture/orchestrators/#event-streaming","title":"Event Streaming","text":"All orchestrators yield AgentEvent objects:
Event Types: - started: Research started - searching: Search in progress - search_complete: Search completed - judging: Evidence evaluation in progress - judge_complete: Evidence evaluation completed - looping: Iteration in progress - hypothesizing: Generating hypotheses - analyzing: Statistical analysis in progress - analysis_complete: Statistical analysis completed - synthesizing: Synthesizing results - complete: Research completed - error: Error occurred - streaming: Streaming update (delta events)
Event Structure:
"},{"location":"architecture/orchestrators/#see-also","title":"See Also","text":"DeepCritical provides several services for embeddings, RAG, and statistical analysis.
"},{"location":"architecture/services/#embedding-service","title":"Embedding Service","text":"File: src/services/embeddings.py
Purpose: Local sentence-transformers for semantic search and deduplication
Features: - No API Key Required: Uses local sentence-transformers models - Async-Safe: All operations use run_in_executor() to avoid blocking the event loop - ChromaDB Storage: In-memory vector storage for embeddings - Deduplication: 0.9 similarity threshold by default (90% similarity = duplicate, configurable)
Model: Configurable via settings.local_embedding_model (default: all-MiniLM-L6-v2)
Methods: - async def embed(text: str) -> list[float]: Generate embeddings (async-safe via run_in_executor()) - async def embed_batch(texts: list[str]) -> list[list[float]]: Batch embedding (more efficient) - async def add_evidence(evidence_id: str, content: str, metadata: dict[str, Any]) -> None: Add evidence to vector store - async def search_similar(query: str, n_results: int = 5) -> list[dict[str, Any]]: Find semantically similar evidence - async def deduplicate(new_evidence: list[Evidence], threshold: float = 0.9) -> list[Evidence]: Remove semantically duplicate evidence
Usage:
from src.services.embeddings import get_embedding_service\n\nservice = get_embedding_service()\nembedding = await service.embed(\"text to embed\")\n"},{"location":"architecture/services/#llamaindex-rag-service","title":"LlamaIndex RAG Service","text":"File: src/services/llamaindex_rag.py
Purpose: Retrieval-Augmented Generation using LlamaIndex
Features: - Multiple Embedding Providers: OpenAI embeddings (requires OPENAI_API_KEY) or local sentence-transformers (no API key) - Multiple LLM Providers: HuggingFace LLM (preferred) or OpenAI LLM (fallback) for query synthesis - ChromaDB Storage: Vector database for document storage (supports in-memory mode) - Metadata Preservation: Preserves source, title, URL, date, authors - Lazy Initialization: Graceful fallback if dependencies not available
Initialization Parameters: - use_openai_embeddings: bool | None: Force OpenAI embeddings (None = auto-detect) - use_in_memory: bool: Use in-memory ChromaDB client (useful for tests) - oauth_token: str | None: Optional OAuth token from HuggingFace login (takes priority over env vars)
Methods: - async def ingest_evidence(evidence: list[Evidence]) -> None: Ingest evidence into RAG - async def retrieve(query: str, top_k: int = 5) -> list[Document]: Retrieve relevant documents - async def query(query: str, top_k: int = 5) -> str: Query with RAG
Usage:
from src.services.llamaindex_rag import get_rag_service\n\nservice = get_rag_service(\n use_openai_embeddings=False, # Use local embeddings\n use_in_memory=True, # Use in-memory ChromaDB\n oauth_token=token # Optional HuggingFace token\n)\nif service:\n documents = await service.retrieve(\"query\", top_k=5)\n"},{"location":"architecture/services/#statistical-analyzer","title":"Statistical Analyzer","text":"File: src/services/statistical_analyzer.py
Purpose: Secure execution of AI-generated statistical code
Features: - Modal Sandbox: Secure, isolated execution environment - Code Generation: Generates Python code via LLM - Library Pinning: Version-pinned libraries in SANDBOX_LIBRARIES - Network Isolation: block_network=True by default
Libraries Available: - pandas, numpy, scipy - matplotlib, scikit-learn - statsmodels
Output: AnalysisResult with: - verdict: SUPPORTED, REFUTED, or INCONCLUSIVE - code: Generated analysis code - output: Execution output - error: Error message if execution failed
Usage:
from src.services.statistical_analyzer import StatisticalAnalyzer\n\nanalyzer = StatisticalAnalyzer()\nresult = await analyzer.analyze(\n hypothesis=\"Metformin reduces cancer risk\",\n evidence=evidence_list\n)\n"},{"location":"architecture/services/#singleton-pattern","title":"Singleton Pattern","text":"Services use singleton patterns for lazy initialization:
EmbeddingService: Uses a global variable pattern:
LlamaIndexRAGService: Direct instantiation (no caching):
For the EmbeddingService, this ensures: - Single instance per process - Lazy initialization - No dependencies required at import time
"},{"location":"architecture/services/#service-availability","title":"Service Availability","text":"Services check availability before use:
from src.utils.config import settings\n\nif settings.modal_available:\n # Use Modal sandbox\n pass\n\nif settings.has_openai_key:\n # Use OpenAI embeddings for RAG\n pass\n"},{"location":"architecture/services/#see-also","title":"See Also","text":"DeepCritical implements a protocol-based search tool system for retrieving evidence from multiple sources.
"},{"location":"architecture/tools/#searchtool-protocol","title":"SearchTool Protocol","text":"All tools implement the SearchTool protocol from src/tools/base.py:
All tools use the @retry decorator from tenacity:
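The decorator's configuration is not captured here; as an illustration of what tenacity-style retry with exponential backoff provides, a hand-rolled stdlib equivalent might look like this (the real tools use tenacity's `@retry`, not this sketch):

```python
import functools
import time

def retry(attempts: int = 3, base_delay: float = 0.01):
    """Hand-rolled stand-in for a retry decorator with exponential backoff."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts - 1:
                        raise  # out of attempts: propagate the error
                    time.sleep(base_delay * 2**attempt)  # 0.01s, 0.02s, ...
        return wrapper
    return decorator

calls = {"n": 0}

@retry(attempts=3)
def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = flaky()  # succeeds on the third attempt
```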
Tools with API rate limits implement _rate_limit() method and use shared rate limiters from src/tools/rate_limiter.py.
Tools raise custom exceptions:
SearchError: General search failures - RateLimitError: Rate limit exceeded. Tools handle HTTP errors (429, 500, timeout) and return empty lists on non-critical errors (with warning logs).
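The error policy can be sketched as follows. Whether `RateLimitError` actually subclasses `SearchError` in the codebase is an assumption, and `handle_response` is a hypothetical helper illustrating the described behavior:

```python
class SearchError(Exception):
    """General search failure."""

class RateLimitError(SearchError):
    """Upstream API rate limit exceeded (assumed subclass for illustration)."""

def handle_response(status: int, payload: list) -> list:
    # Illustrative policy: 429 raises, 5xx degrades to an empty result
    # (the real tools also log a warning), success passes through.
    if status == 429:
        raise RateLimitError("HTTP 429: rate limit exceeded")
    if status >= 500:
        return []
    return payload
```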
"},{"location":"architecture/tools/#query-preprocessing","title":"Query Preprocessing","text":"Tools use preprocess_query() from src/tools/query_utils.py to:
All tools convert API responses to Evidence objects with:
citation: Title, URL, date, authors - content: Evidence text - relevance_score: 0.0-1.0 relevance score - metadata: Additional metadata. Missing fields are handled gracefully with defaults.
"},{"location":"architecture/tools/#tool-implementations","title":"Tool Implementations","text":""},{"location":"architecture/tools/#pubmed-tool","title":"PubMed Tool","text":"File: src/tools/pubmed.py
API: NCBI E-utilities (ESearch \u2192 EFetch)
Rate Limiting: - 0.34s between requests (3 req/sec without API key) - 0.1s between requests (10 req/sec with NCBI API key)
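The minimum-interval enforcement can be sketched with a small async limiter. This is not the shared limiter from src/tools/rate_limiter.py — the class and method names are illustrative, and the interval is shrunk so the sketch runs quickly:

```python
import asyncio
import time

class RateLimiter:
    """Illustrative limiter: enforce a minimum delay between requests."""
    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._last = 0.0
        self._lock = asyncio.Lock()

    async def wait(self) -> None:
        async with self._lock:
            elapsed = time.monotonic() - self._last
            if elapsed < self.min_interval:
                await asyncio.sleep(self.min_interval - elapsed)
            self._last = time.monotonic()

async def main() -> float:
    # Stand-in for 0.34s (3 req/sec without an NCBI key)
    limiter = RateLimiter(0.05)
    start = time.monotonic()
    for _ in range(3):
        await limiter.wait()  # first call passes, later calls are paced
    return time.monotonic() - start

elapsed = asyncio.run(main())
```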
Features: - XML parsing with xmltodict - Handles single vs. multiple articles - Query preprocessing - Evidence conversion with metadata extraction
File: src/tools/clinicaltrials.py
API: ClinicalTrials.gov API v2
Important: Uses requests library (NOT httpx) because WAF blocks httpx TLS fingerprint.
Execution: Runs in thread pool: await asyncio.to_thread(requests.get, ...)
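The thread-pool offloading pattern looks like this; `blocking_fetch` is a stand-in for `requests.get` so the sketch is self-contained, and the URL is illustrative:

```python
import asyncio
import time

def blocking_fetch(url: str) -> str:
    # Stand-in for requests.get(url); any blocking call works the same way.
    time.sleep(0.05)
    return f"response from {url}"

async def search(url: str) -> str:
    # Offload the blocking call to a worker thread so the event loop
    # stays responsive while the request is in flight.
    return await asyncio.to_thread(blocking_fetch, url)

result = asyncio.run(search("https://clinicaltrials.gov/api/v2/studies"))
```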
Filtering: - Only interventional studies - Status: COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, ENROLLING_BY_INVITATION
Features: - Parses nested JSON structure - Extracts trial metadata - Evidence conversion
"},{"location":"architecture/tools/#europe-pmc-tool","title":"Europe PMC Tool","text":"File: src/tools/europepmc.py
API: Europe PMC REST API
Features: - Handles preprint markers: [PREPRINT - Not peer-reviewed] - Builds URLs from DOI or PMID - Checks pubTypeList for preprint detection - Includes both preprints and peer-reviewed articles
File: src/tools/rag_tool.py
Purpose: Semantic search within collected evidence
Implementation: Wraps LlamaIndexRAGService
Features: - Returns Evidence from RAG results - Handles evidence ingestion - Semantic similarity search - Metadata preservation
"},{"location":"architecture/tools/#search-handler","title":"Search Handler","text":"File: src/tools/search_handler.py
Purpose: Orchestrates parallel searches across multiple tools
Initialization Parameters: - tools: list[SearchTool]: List of search tools to use - timeout: float = 30.0: Timeout for each search in seconds - include_rag: bool = False: Whether to include RAG tool in searches - auto_ingest_to_rag: bool = True: Whether to automatically ingest results into RAG - oauth_token: str | None = None: Optional OAuth token from HuggingFace login (for RAG LLM)
Methods: - async def execute(query: str, max_results_per_tool: int = 10) -> SearchResult: Execute search across all tools in parallel
Features: - Uses asyncio.gather() with return_exceptions=True for parallel execution - Aggregates results into SearchResult with evidence and metadata - Handles tool failures gracefully (continues with other tools) - Deduplicates results by URL - Automatically ingests results into RAG if auto_ingest_to_rag=True - Can add RAG tool dynamically via add_rag_tool() method
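The parallel fan-out with graceful failure handling can be sketched like this (tool names and the `search_tool` coroutine are illustrative, not the handler's real internals):

```python
import asyncio

async def search_tool(name: str, query: str) -> list[str]:
    if name == "broken":
        raise RuntimeError("tool failure")
    return [f"{name}:{query}"]

async def execute(query: str) -> list[str]:
    tools = ["pubmed", "broken", "europepmc"]
    results = await asyncio.gather(
        *(search_tool(t, query) for t in tools),
        return_exceptions=True,  # a failing tool doesn't cancel the others
    )
    evidence: list[str] = []
    for r in results:
        if isinstance(r, Exception):
            continue  # skip the failed tool, keep results from the rest
        evidence.extend(r)
    return evidence

evidence = asyncio.run(execute("metformin"))
```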
Tools are registered in the search handler:
from src.tools.pubmed import PubMedTool\nfrom src.tools.clinicaltrials import ClinicalTrialsTool\nfrom src.tools.europepmc import EuropePMCTool\nfrom src.tools.search_handler import SearchHandler\n\nsearch_handler = SearchHandler(\n tools=[\n PubMedTool(),\n ClinicalTrialsTool(),\n EuropePMCTool(),\n ],\n include_rag=True, # Include RAG tool for semantic search\n auto_ingest_to_rag=True, # Automatically ingest results into RAG\n oauth_token=token # Optional HuggingFace token for RAG LLM\n)\n\n# Execute search\nresult = await search_handler.execute(\"query\", max_results_per_tool=10)\n"},{"location":"architecture/tools/#see-also","title":"See Also","text":"Architecture Pattern: Microsoft Magentic Orchestration Design Philosophy: Simple, dynamic, manager-driven coordination Key Innovation: Intelligent manager replaces rigid sequential phases
"},{"location":"architecture/workflow-diagrams/#1-high-level-magentic-workflow","title":"1. High-Level Magentic Workflow","text":"flowchart TD\n Start([User Query]) --> Manager[Magentic Manager<br/>Plan \u2022 Select \u2022 Assess \u2022 Adapt]\n\n Manager -->|Plans| Task1[Task Decomposition]\n Task1 --> Manager\n\n Manager -->|Selects & Executes| HypAgent[Hypothesis Agent]\n Manager -->|Selects & Executes| SearchAgent[Search Agent]\n Manager -->|Selects & Executes| AnalysisAgent[Analysis Agent]\n Manager -->|Selects & Executes| ReportAgent[Report Agent]\n\n HypAgent -->|Results| Manager\n SearchAgent -->|Results| Manager\n AnalysisAgent -->|Results| Manager\n ReportAgent -->|Results| Manager\n\n Manager -->|Assesses Quality| Decision{Good Enough?}\n Decision -->|No - Refine| Manager\n Decision -->|No - Different Agent| Manager\n Decision -->|No - Stalled| Replan[Reset Plan]\n Replan --> Manager\n\n Decision -->|Yes| Synthesis[Synthesize Final Result]\n Synthesis --> Output([Research Report])\n\n style Start fill:#e1f5e1\n style Manager fill:#ffe6e6\n style HypAgent fill:#fff4e6\n style SearchAgent fill:#fff4e6\n style AnalysisAgent fill:#fff4e6\n style ReportAgent fill:#fff4e6\n style Decision fill:#ffd6d6\n style Synthesis fill:#d4edda\n style Output fill:#e1f5e1"},{"location":"architecture/workflow-diagrams/#2-magentic-manager-the-6-phase-cycle","title":"2. Magentic Manager: The 6-Phase Cycle","text":"flowchart LR\n P1[1. Planning<br/>Analyze task<br/>Create strategy] --> P2[2. Agent Selection<br/>Pick best agent<br/>for subtask]\n P2 --> P3[3. Execution<br/>Run selected<br/>agent with tools]\n P3 --> P4[4. Assessment<br/>Evaluate quality<br/>Check progress]\n P4 --> Decision{Quality OK?<br/>Progress made?}\n Decision -->|Yes| P6[6. Synthesis<br/>Combine results<br/>Generate report]\n Decision -->|No| P5[5. 
Iteration<br/>Adjust plan<br/>Try again]\n P5 --> P2\n P6 --> Done([Complete])\n\n style P1 fill:#fff4e6\n style P2 fill:#ffe6e6\n style P3 fill:#e6f3ff\n style P4 fill:#ffd6d6\n style P5 fill:#fff3cd\n style P6 fill:#d4edda\n style Done fill:#e1f5e1"},{"location":"architecture/workflow-diagrams/#3-simplified-agent-architecture","title":"3. Simplified Agent Architecture","text":"graph TB\n subgraph \"Orchestration Layer\"\n Manager[Magentic Manager<br/>\u2022 Plans workflow<br/>\u2022 Selects agents<br/>\u2022 Assesses quality<br/>\u2022 Adapts strategy]\n SharedContext[(Shared Context<br/>\u2022 Hypotheses<br/>\u2022 Search Results<br/>\u2022 Analysis<br/>\u2022 Progress)]\n Manager <--> SharedContext\n end\n\n subgraph \"Specialist Agents\"\n HypAgent[Hypothesis Agent<br/>\u2022 Domain understanding<br/>\u2022 Hypothesis generation<br/>\u2022 Testability refinement]\n SearchAgent[Search Agent<br/>\u2022 Multi-source search<br/>\u2022 RAG retrieval<br/>\u2022 Result ranking]\n AnalysisAgent[Analysis Agent<br/>\u2022 Evidence extraction<br/>\u2022 Statistical analysis<br/>\u2022 Code execution]\n ReportAgent[Report Agent<br/>\u2022 Report assembly<br/>\u2022 Visualization<br/>\u2022 Citation formatting]\n end\n\n subgraph \"MCP Tools\"\n WebSearch[Web Search<br/>PubMed \u2022 arXiv \u2022 bioRxiv]\n CodeExec[Code Execution<br/>Sandboxed Python]\n RAG[RAG Retrieval<br/>Vector DB \u2022 Embeddings]\n Viz[Visualization<br/>Charts \u2022 Graphs]\n end\n\n Manager -->|Selects & Directs| HypAgent\n Manager -->|Selects & Directs| SearchAgent\n Manager -->|Selects & Directs| AnalysisAgent\n Manager -->|Selects & Directs| ReportAgent\n\n HypAgent --> SharedContext\n SearchAgent --> SharedContext\n AnalysisAgent --> SharedContext\n ReportAgent --> SharedContext\n\n SearchAgent --> WebSearch\n SearchAgent --> RAG\n AnalysisAgent --> CodeExec\n ReportAgent --> CodeExec\n ReportAgent --> Viz\n\n style Manager fill:#ffe6e6\n style SharedContext fill:#ffe6f0\n style HypAgent 
fill:#fff4e6\n style SearchAgent fill:#fff4e6\n style AnalysisAgent fill:#fff4e6\n style ReportAgent fill:#fff4e6\n style WebSearch fill:#e6f3ff\n style CodeExec fill:#e6f3ff\n style RAG fill:#e6f3ff\n style Viz fill:#e6f3ff"},{"location":"architecture/workflow-diagrams/#4-dynamic-workflow-example","title":"4. Dynamic Workflow Example","text":"sequenceDiagram\n participant User\n participant Manager\n participant HypAgent\n participant SearchAgent\n participant AnalysisAgent\n participant ReportAgent\n\n User->>Manager: \"Research protein folding in Alzheimer's\"\n\n Note over Manager: PLAN: Generate hypotheses \u2192 Search \u2192 Analyze \u2192 Report\n\n Manager->>HypAgent: Generate 3 hypotheses\n HypAgent-->>Manager: Returns 3 hypotheses\n Note over Manager: ASSESS: Good quality, proceed\n\n Manager->>SearchAgent: Search literature for hypothesis 1\n SearchAgent-->>Manager: Returns 15 papers\n Note over Manager: ASSESS: Good results, continue\n\n Manager->>SearchAgent: Search for hypothesis 2\n SearchAgent-->>Manager: Only 2 papers found\n Note over Manager: ASSESS: Insufficient, refine search\n\n Manager->>SearchAgent: Refined query for hypothesis 2\n SearchAgent-->>Manager: Returns 12 papers\n Note over Manager: ASSESS: Better, proceed\n\n Manager->>AnalysisAgent: Analyze evidence for all hypotheses\n AnalysisAgent-->>Manager: Returns analysis with code\n Note over Manager: ASSESS: Complete, generate report\n\n Manager->>ReportAgent: Create comprehensive report\n ReportAgent-->>Manager: Returns formatted report\n Note over Manager: SYNTHESIZE: Combine all results\n\n Manager->>User: Final Research Report"},{"location":"architecture/workflow-diagrams/#5-manager-decision-logic","title":"5. 
Manager Decision Logic","text":"flowchart TD\n Start([Manager Receives Task]) --> Plan[Create Initial Plan]\n\n Plan --> Select[Select Agent for Next Subtask]\n Select --> Execute[Execute Agent]\n Execute --> Collect[Collect Results]\n\n Collect --> Assess[Assess Quality & Progress]\n\n Assess --> Q1{Quality Sufficient?}\n Q1 -->|No| Q2{Same Agent Can Fix?}\n Q2 -->|Yes| Feedback[Provide Specific Feedback]\n Feedback --> Execute\n Q2 -->|No| Different[Try Different Agent]\n Different --> Select\n\n Q1 -->|Yes| Q3{Task Complete?}\n Q3 -->|No| Q4{Making Progress?}\n Q4 -->|Yes| Select\n Q4 -->|No - Stalled| Replan[Reset Plan & Approach]\n Replan --> Plan\n\n Q3 -->|Yes| Synth[Synthesize Final Result]\n Synth --> Done([Return Report])\n\n style Start fill:#e1f5e1\n style Plan fill:#fff4e6\n style Select fill:#ffe6e6\n style Execute fill:#e6f3ff\n style Assess fill:#ffd6d6\n style Q1 fill:#ffe6e6\n style Q2 fill:#ffe6e6\n style Q3 fill:#ffe6e6\n style Q4 fill:#ffe6e6\n style Synth fill:#d4edda\n style Done fill:#e1f5e1"},{"location":"architecture/workflow-diagrams/#6-hypothesis-agent-workflow","title":"6. Hypothesis Agent Workflow","text":"flowchart LR\n Input[Research Query] --> Domain[Identify Domain<br/>& Key Concepts]\n Domain --> Context[Retrieve Background<br/>Knowledge]\n Context --> Generate[Generate 3-5<br/>Initial Hypotheses]\n Generate --> Refine[Refine for<br/>Testability]\n Refine --> Rank[Rank by<br/>Quality Score]\n Rank --> Output[Return Top<br/>Hypotheses]\n\n Output --> Struct[Hypothesis Structure:<br/>\u2022 Statement<br/>\u2022 Rationale<br/>\u2022 Testability Score<br/>\u2022 Data Requirements<br/>\u2022 Expected Outcomes]\n\n style Input fill:#e1f5e1\n style Output fill:#fff4e6\n style Struct fill:#e6f3ff"},{"location":"architecture/workflow-diagrams/#7-search-agent-workflow","title":"7. 
Search Agent Workflow","text":"flowchart TD\n Input[Hypotheses] --> Strategy[Formulate Search<br/>Strategy per Hypothesis]\n\n Strategy --> Multi[Multi-Source Search]\n\n Multi --> PubMed[PubMed Search<br/>via MCP]\n Multi --> ArXiv[arXiv Search<br/>via MCP]\n Multi --> BioRxiv[bioRxiv Search<br/>via MCP]\n\n PubMed --> Aggregate[Aggregate Results]\n ArXiv --> Aggregate\n BioRxiv --> Aggregate\n\n Aggregate --> Filter[Filter & Rank<br/>by Relevance]\n Filter --> Dedup[Deduplicate<br/>Cross-Reference]\n Dedup --> Embed[Embed Documents<br/>via MCP]\n Embed --> Vector[(Vector DB)]\n Vector --> RAGRetrieval[RAG Retrieval<br/>Top-K per Hypothesis]\n RAGRetrieval --> Output[Return Contextualized<br/>Search Results]\n\n style Input fill:#fff4e6\n style Multi fill:#ffe6e6\n style Vector fill:#ffe6f0\n style Output fill:#e6f3ff"},{"location":"architecture/workflow-diagrams/#8-analysis-agent-workflow","title":"8. Analysis Agent Workflow","text":"flowchart TD\n Input1[Hypotheses] --> Extract\n Input2[Search Results] --> Extract[Extract Evidence<br/>per Hypothesis]\n\n Extract --> Methods[Determine Analysis<br/>Methods Needed]\n\n Methods --> Branch{Requires<br/>Computation?}\n Branch -->|Yes| GenCode[Generate Python<br/>Analysis Code]\n Branch -->|No| Qual[Qualitative<br/>Synthesis]\n\n GenCode --> Execute[Execute Code<br/>via MCP Sandbox]\n Execute --> Interpret1[Interpret<br/>Results]\n Qual --> Interpret2[Interpret<br/>Findings]\n\n Interpret1 --> Synthesize[Synthesize Evidence<br/>Across Sources]\n Interpret2 --> Synthesize\n\n Synthesize --> Verdict[Determine Verdict<br/>per Hypothesis]\n Verdict --> Support[\u2022 Supported<br/>\u2022 Refuted<br/>\u2022 Inconclusive]\n Support --> Gaps[Identify Knowledge<br/>Gaps & Limitations]\n Gaps --> Output[Return Analysis<br/>Report]\n\n style Input1 fill:#fff4e6\n style Input2 fill:#e6f3ff\n style Execute fill:#ffe6e6\n style Output fill:#e6ffe6"},{"location":"architecture/workflow-diagrams/#9-report-agent-workflow","title":"9. 
Report Agent Workflow","text":"flowchart TD\n Input1[Query] --> Assemble\n Input2[Hypotheses] --> Assemble\n Input3[Search Results] --> Assemble\n Input4[Analysis] --> Assemble[Assemble Report<br/>Sections]\n\n Assemble --> Exec[Executive Summary]\n Assemble --> Intro[Introduction]\n Assemble --> Methods[Methods]\n Assemble --> Results[Results per<br/>Hypothesis]\n Assemble --> Discussion[Discussion]\n Assemble --> Future[Future Directions]\n Assemble --> Refs[References]\n\n Results --> VizCheck{Needs<br/>Visualization?}\n VizCheck -->|Yes| GenViz[Generate Viz Code]\n GenViz --> ExecViz[Execute via MCP<br/>Create Charts]\n ExecViz --> Combine\n VizCheck -->|No| Combine[Combine All<br/>Sections]\n\n Exec --> Combine\n Intro --> Combine\n Methods --> Combine\n Discussion --> Combine\n Future --> Combine\n Refs --> Combine\n\n Combine --> Format[Format Output]\n Format --> MD[Markdown]\n Format --> PDF[PDF]\n Format --> JSON[JSON]\n\n MD --> Output[Return Final<br/>Report]\n PDF --> Output\n JSON --> Output\n\n style Input1 fill:#e1f5e1\n style Input2 fill:#fff4e6\n style Input3 fill:#e6f3ff\n style Input4 fill:#e6ffe6\n style Output fill:#d4edda"},{"location":"architecture/workflow-diagrams/#10-data-flow--event-streaming","title":"10. 
Data Flow & Event Streaming","text":"flowchart TD\n User[\ud83d\udc64 User] -->|Research Query| UI[Gradio UI]\n UI -->|Submit| Manager[Magentic Manager]\n\n Manager -->|Event: Planning| UI\n Manager -->|Select Agent| HypAgent[Hypothesis Agent]\n HypAgent -->|Event: Delta/Message| UI\n HypAgent -->|Hypotheses| Context[(Shared Context)]\n\n Context -->|Retrieved by| Manager\n Manager -->|Select Agent| SearchAgent[Search Agent]\n SearchAgent -->|MCP Request| WebSearch[Web Search Tool]\n WebSearch -->|Results| SearchAgent\n SearchAgent -->|Event: Delta/Message| UI\n SearchAgent -->|Documents| Context\n SearchAgent -->|Embeddings| VectorDB[(Vector DB)]\n\n Context -->|Retrieved by| Manager\n Manager -->|Select Agent| AnalysisAgent[Analysis Agent]\n AnalysisAgent -->|MCP Request| CodeExec[Code Execution Tool]\n CodeExec -->|Results| AnalysisAgent\n AnalysisAgent -->|Event: Delta/Message| UI\n AnalysisAgent -->|Analysis| Context\n\n Context -->|Retrieved by| Manager\n Manager -->|Select Agent| ReportAgent[Report Agent]\n ReportAgent -->|MCP Request| CodeExec\n ReportAgent -->|Event: Delta/Message| UI\n ReportAgent -->|Report| Context\n\n Manager -->|Event: Final Result| UI\n UI -->|Display| User\n\n style User fill:#e1f5e1\n style UI fill:#e6f3ff\n style Manager fill:#ffe6e6\n style Context fill:#ffe6f0\n style VectorDB fill:#ffe6f0\n style WebSearch fill:#f0f0f0\n style CodeExec fill:#f0f0f0"},{"location":"architecture/workflow-diagrams/#11-mcp-tool-architecture","title":"11. 
MCP Tool Architecture","text":"graph TB\n subgraph \"Agent Layer\"\n Manager[Magentic Manager]\n HypAgent[Hypothesis Agent]\n SearchAgent[Search Agent]\n AnalysisAgent[Analysis Agent]\n ReportAgent[Report Agent]\n end\n\n subgraph \"MCP Protocol Layer\"\n Registry[MCP Tool Registry<br/>\u2022 Discovers tools<br/>\u2022 Routes requests<br/>\u2022 Manages connections]\n end\n\n subgraph \"MCP Servers\"\n Server1[Web Search Server<br/>localhost:8001<br/>\u2022 PubMed<br/>\u2022 arXiv<br/>\u2022 bioRxiv]\n Server2[Code Execution Server<br/>localhost:8002<br/>\u2022 Sandboxed Python<br/>\u2022 Package management]\n Server3[RAG Server<br/>localhost:8003<br/>\u2022 Vector embeddings<br/>\u2022 Similarity search]\n Server4[Visualization Server<br/>localhost:8004<br/>\u2022 Chart generation<br/>\u2022 Plot rendering]\n end\n\n subgraph \"External Services\"\n PubMed[PubMed API]\n ArXiv[arXiv API]\n BioRxiv[bioRxiv API]\n Modal[Modal Sandbox]\n ChromaDB[(ChromaDB)]\n end\n\n SearchAgent -->|Request| Registry\n AnalysisAgent -->|Request| Registry\n ReportAgent -->|Request| Registry\n\n Registry --> Server1\n Registry --> Server2\n Registry --> Server3\n Registry --> Server4\n\n Server1 --> PubMed\n Server1 --> ArXiv\n Server1 --> BioRxiv\n Server2 --> Modal\n Server3 --> ChromaDB\n\n style Manager fill:#ffe6e6\n style Registry fill:#fff4e6\n style Server1 fill:#e6f3ff\n style Server2 fill:#e6f3ff\n style Server3 fill:#e6f3ff\n style Server4 fill:#e6f3ff"},{"location":"architecture/workflow-diagrams/#12-progress-tracking--stall-detection","title":"12. 
Progress Tracking & Stall Detection","text":"stateDiagram-v2\n [*] --> Initialization: User Query\n\n Initialization --> Planning: Manager starts\n\n Planning --> AgentExecution: Select agent\n\n AgentExecution --> Assessment: Collect results\n\n Assessment --> QualityCheck: Evaluate output\n\n QualityCheck --> AgentExecution: Poor quality<br/>(retry < max_rounds)\n QualityCheck --> Planning: Poor quality<br/>(try different agent)\n QualityCheck --> NextAgent: Good quality<br/>(task incomplete)\n QualityCheck --> Synthesis: Good quality<br/>(task complete)\n\n NextAgent --> AgentExecution: Select next agent\n\n state StallDetection <<choice>>\n Assessment --> StallDetection: Check progress\n StallDetection --> Planning: No progress<br/>(stall count < max)\n StallDetection --> ErrorRecovery: No progress<br/>(max stalls reached)\n\n ErrorRecovery --> PartialReport: Generate partial results\n PartialReport --> [*]\n\n Synthesis --> FinalReport: Combine all outputs\n FinalReport --> [*]\n\n note right of QualityCheck\n Manager assesses:\n \u2022 Output completeness\n \u2022 Quality metrics\n \u2022 Progress made\n end note\n\n note right of StallDetection\n Stall = no new progress\n after agent execution\n Triggers plan reset\n end note"},{"location":"architecture/workflow-diagrams/#13-gradio-ui-integration","title":"13. 
Gradio UI Integration","text":"graph TD\n App[Gradio App<br/>DeepCritical Research Agent]\n\n App --> Input[Input Section]\n App --> Status[Status Section]\n App --> Output[Output Section]\n\n Input --> Query[Research Question<br/>Text Area]\n Input --> Controls[Controls]\n Controls --> MaxHyp[Max Hypotheses: 1-10]\n Controls --> MaxRounds[Max Rounds: 5-20]\n Controls --> Submit[Start Research Button]\n\n Status --> Log[Real-time Event Log<br/>\u2022 Manager planning<br/>\u2022 Agent selection<br/>\u2022 Execution updates<br/>\u2022 Quality assessment]\n Status --> Progress[Progress Tracker<br/>\u2022 Current agent<br/>\u2022 Round count<br/>\u2022 Stall count]\n\n Output --> Tabs[Tabbed Results]\n Tabs --> Tab1[Hypotheses Tab<br/>Generated hypotheses with scores]\n Tabs --> Tab2[Search Results Tab<br/>Papers & sources found]\n Tabs --> Tab3[Analysis Tab<br/>Evidence & verdicts]\n Tabs --> Tab4[Report Tab<br/>Final research report]\n Tab4 --> Download[Download Report<br/>MD / PDF / JSON]\n\n Submit -.->|Triggers| Workflow[Magentic Workflow]\n Workflow -.->|MagenticOrchestratorMessageEvent| Log\n Workflow -.->|MagenticAgentDeltaEvent| Log\n Workflow -.->|MagenticAgentMessageEvent| Log\n Workflow -.->|MagenticFinalResultEvent| Tab4\n\n style App fill:#e1f5e1\n style Input fill:#fff4e6\n style Status fill:#e6f3ff\n style Output fill:#e6ffe6\n style Workflow fill:#ffe6e6"},{"location":"architecture/workflow-diagrams/#14-complete-system-context","title":"14. 
Complete System Context","text":"graph LR\n User[\ud83d\udc64 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepCritical<br/>Magentic Workflow]\n\n DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]\n DC -->|Preprint search| ArXiv[arXiv API<br/>Scientific preprints]\n DC -->|Biology search| BioRxiv[bioRxiv API<br/>Biology preprints]\n DC -->|Agent reasoning| Claude[Claude API<br/>Sonnet 4 / Opus]\n DC -->|Code execution| Modal[Modal Sandbox<br/>Safe Python env]\n DC -->|Vector storage| Chroma[ChromaDB<br/>Embeddings & RAG]\n\n DC -->|Deployed on| HF[HuggingFace Spaces<br/>Gradio 6.0]\n\n PubMed -->|Results| DC\n ArXiv -->|Results| DC\n BioRxiv -->|Results| DC\n Claude -->|Responses| DC\n Modal -->|Output| DC\n Chroma -->|Context| DC\n\n DC -->|Research report| User\n\n style User fill:#e1f5e1\n style DC fill:#ffe6e6\n style PubMed fill:#e6f3ff\n style ArXiv fill:#e6f3ff\n style BioRxiv fill:#e6f3ff\n style Claude fill:#ffd6d6\n style Modal fill:#f0f0f0\n style Chroma fill:#ffe6f0\n style HF fill:#d4edda"},{"location":"architecture/workflow-diagrams/#15-workflow-timeline-simplified","title":"15. 
Workflow Timeline (Simplified)","text":"gantt\n title DeepCritical Magentic Workflow - Typical Execution\n dateFormat mm:ss\n axisFormat %M:%S\n\n section Manager Planning\n Initial planning :p1, 00:00, 10s\n\n section Hypothesis Agent\n Generate hypotheses :h1, after p1, 30s\n Manager assessment :h2, after h1, 5s\n\n section Search Agent\n Search hypothesis 1 :s1, after h2, 20s\n Search hypothesis 2 :s2, after s1, 20s\n Search hypothesis 3 :s3, after s2, 20s\n RAG processing :s4, after s3, 15s\n Manager assessment :s5, after s4, 5s\n\n section Analysis Agent\n Evidence extraction :a1, after s5, 15s\n Code generation :a2, after a1, 20s\n Code execution :a3, after a2, 25s\n Synthesis :a4, after a3, 20s\n Manager assessment :a5, after a4, 5s\n\n section Report Agent\n Report assembly :r1, after a5, 30s\n Visualization :r2, after r1, 15s\n Formatting :r3, after r2, 10s\n\n section Manager Synthesis\n Final synthesis :f1, after r3, 10s"},{"location":"architecture/workflow-diagrams/#key-differences-from-original-design","title":"Key Differences from Original Design","text":"Aspect Original (Judge-in-Loop) New (Magentic) Control Flow Fixed sequential phases Dynamic agent selection Quality Control Separate Judge Agent Manager assessment built-in Retry Logic Phase-level with feedback Agent-level with adaptation Flexibility Rigid 4-phase pipeline Adaptive workflow Complexity 5 agents (including Judge) 4 agents (no Judge) Progress Tracking Manual state management Built-in round/stall detection Agent Coordination Sequential handoff Manager-driven dynamic selection Error Recovery Retry same phase Try different agent or replan"},{"location":"architecture/workflow-diagrams/#simplified-design-principles","title":"Simplified Design Principles","text":"Simple 4-Agent Setup:
Manager handles quality assessment in its instructions: - Checks hypothesis quality (testable, novel, clear) - Validates search results (relevant, authoritative, recent) - Assesses analysis soundness (methodology, evidence, conclusions) - Ensures report completeness (all sections, proper citations)
No separate Judge Agent needed - manager does it all!
Document Version: 2.0 (Magentic Simplified) Last Updated: 2025-11-24 Architecture: Microsoft Magentic Orchestration Pattern Agents: 4 (Hypothesis, Search, Analysis, Report) + 1 Manager License: MIT
"},{"location":"architecture/workflow-diagrams/#see-also","title":"See Also","text":"DeepCritical uses Pydantic Settings for centralized configuration management. All settings are defined in the Settings class in src/utils/config.py and can be configured via environment variables or a .env file.
The configuration system provides:
- Loading from a .env file (if present) - A global settings instance for easy access throughout the codebase. To get started, create a .env file in the project root and set at least one LLM provider key (OPENAI_API_KEY, ANTHROPIC_API_KEY, or HF_TOKEN). The [Settings][settings-class] class extends BaseSettings from pydantic_settings and defines all application configuration:
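The shape of such a settings class can be sketched without pydantic; this stdlib stand-in only illustrates the env-var-with-default pattern, and the field names mirror ones documented below rather than the full real class:

```python
import os
from dataclasses import dataclass, field

def _env(name: str, default: str) -> str:
    return os.getenv(name, default)

@dataclass(frozen=True)
class Settings:
    """Illustrative stand-in for the pydantic-settings Settings class:
    each field reads its environment variable, falling back to a default."""
    llm_provider: str = field(default_factory=lambda: _env("LLM_PROVIDER", "openai"))
    web_search_provider: str = field(default_factory=lambda: _env("WEB_SEARCH_PROVIDER", "duckduckgo"))
    max_iterations: int = field(default_factory=lambda: int(_env("MAX_ITERATIONS", "10")))

    @property
    def has_openai_key(self) -> bool:
        return bool(os.getenv("OPENAI_API_KEY"))

settings = Settings()
```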
"},{"location":"configuration/#singleton-instance","title":"Singleton Instance","text":"A global settings instance is available for import:
"},{"location":"configuration/#usage-pattern","title":"Usage Pattern","text":"Access configuration throughout the codebase:
from src.utils.config import settings\n\n# Check if API keys are available\nif settings.has_openai_key:\n # Use OpenAI\n pass\n\n# Access configuration values\nmax_iterations = settings.max_iterations\nweb_search_provider = settings.web_search_provider\n"},{"location":"configuration/#required-configuration","title":"Required Configuration","text":""},{"location":"configuration/#llm-provider","title":"LLM Provider","text":"You must configure at least one LLM provider. The system supports:
OpenAI: OPENAI_API_KEY - Anthropic: ANTHROPIC_API_KEY - HuggingFace: HF_TOKEN or HUGGINGFACE_API_KEY (can work without a key for public models). LLM_PROVIDER=openai\nOPENAI_API_KEY=your_openai_api_key_here\nOPENAI_MODEL=gpt-5.1\n The default model is defined in the Settings class:
LLM_PROVIDER=anthropic\nANTHROPIC_API_KEY=your_anthropic_api_key_here\nANTHROPIC_MODEL=claude-sonnet-4-5-20250929\n The default model is defined in the Settings class:
HuggingFace can work without an API key for public models, but an API key provides higher rate limits:
# Option 1: Using HF_TOKEN (preferred)\nHF_TOKEN=your_huggingface_token_here\n\n# Option 2: Using HUGGINGFACE_API_KEY (alternative)\nHUGGINGFACE_API_KEY=your_huggingface_api_key_here\n\n# Default model\nHUGGINGFACE_MODEL=meta-llama/Llama-3.1-8B-Instruct\n The HuggingFace token can be set via either environment variable:
"},{"location":"configuration/#optional-configuration","title":"Optional Configuration","text":""},{"location":"configuration/#embedding-configuration","title":"Embedding Configuration","text":"DeepCritical supports multiple embedding providers for semantic search and RAG:
# Embedding Provider: \"openai\", \"local\", or \"huggingface\"\nEMBEDDING_PROVIDER=local\n\n# OpenAI Embedding Model (used by LlamaIndex RAG)\nOPENAI_EMBEDDING_MODEL=text-embedding-3-small\n\n# Local Embedding Model (sentence-transformers, used by EmbeddingService)\nLOCAL_EMBEDDING_MODEL=all-MiniLM-L6-v2\n\n# HuggingFace Embedding Model\nHUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2\n The embedding provider configuration:
Note: OpenAI embeddings require OPENAI_API_KEY. The local provider (default) uses sentence-transformers and requires no API key.
DeepCritical supports multiple web search providers:
# Web Search Provider: \"serper\", \"searchxng\", \"brave\", \"tavily\", or \"duckduckgo\"\n# Default: \"duckduckgo\" (no API key required)\nWEB_SEARCH_PROVIDER=duckduckgo\n\n# Serper API Key (for Google search via Serper)\nSERPER_API_KEY=your_serper_api_key_here\n\n# SearchXNG Host URL (for self-hosted search)\nSEARCHXNG_HOST=http://localhost:8080\n\n# Brave Search API Key\nBRAVE_API_KEY=your_brave_api_key_here\n\n# Tavily API Key\nTAVILY_API_KEY=your_tavily_api_key_here\n The web search provider configuration:
Note: DuckDuckGo is the default and requires no API key, making it ideal for development and testing.
"},{"location":"configuration/#pubmed-configuration","title":"PubMed Configuration","text":"PubMed search supports optional NCBI API key for higher rate limits:
# NCBI API Key (optional, for higher rate limits: 10 req/sec vs 3 req/sec)\nNCBI_API_KEY=your_ncbi_api_key_here\n The PubMed tool uses this configuration:
"},{"location":"configuration/#agent-configuration","title":"Agent Configuration","text":"Control agent behavior and research loop execution:
# Maximum iterations per research loop (1-50, default: 10)\nMAX_ITERATIONS=10\n\n# Search timeout in seconds\nSEARCH_TIMEOUT=30\n\n# Use graph-based execution for research flows\nUSE_GRAPH_EXECUTION=false\n The agent configuration fields:
"},{"location":"configuration/#budget--rate-limiting-configuration","title":"Budget & Rate Limiting Configuration","text":"Control resource limits for research loops:
# Default token budget per research loop (1000-1000000, default: 100000)\nDEFAULT_TOKEN_LIMIT=100000\n\n# Default time limit per research loop in minutes (1-120, default: 10)\nDEFAULT_TIME_LIMIT_MINUTES=10\n\n# Default iterations limit per research loop (1-50, default: 10)\nDEFAULT_ITERATIONS_LIMIT=10\n The budget configuration with validation:
"},{"location":"configuration/#rag-service-configuration","title":"RAG Service Configuration","text":"Configure the Retrieval-Augmented Generation service:
# ChromaDB collection name for RAG\nRAG_COLLECTION_NAME=deepcritical_evidence\n\n# Number of top results to retrieve from RAG (1-50, default: 5)\nRAG_SIMILARITY_TOP_K=5\n\n# Automatically ingest evidence into RAG\nRAG_AUTO_INGEST=true\n The RAG configuration:
"},{"location":"configuration/#chromadb-configuration","title":"ChromaDB Configuration","text":"Configure the vector database for embeddings and RAG:
# ChromaDB storage path\nCHROMA_DB_PATH=./chroma_db\n\n# Whether to persist ChromaDB to disk\nCHROMA_DB_PERSIST=true\n\n# ChromaDB server host (for remote ChromaDB, optional)\nCHROMA_DB_HOST=localhost\n\n# ChromaDB server port (for remote ChromaDB, optional)\nCHROMA_DB_PORT=8000\n The ChromaDB configuration:
"},{"location":"configuration/#external-services","title":"External Services","text":""},{"location":"configuration/#modal-configuration","title":"Modal Configuration","text":"Modal is used for secure sandbox execution of statistical analysis:
# Modal Token ID (for Modal sandbox execution)\nMODAL_TOKEN_ID=your_modal_token_id_here\n\n# Modal Token Secret\nMODAL_TOKEN_SECRET=your_modal_token_secret_here\n The Modal configuration:
"},{"location":"configuration/#logging-configuration","title":"Logging Configuration","text":"Configure structured logging:
# Log Level: \"DEBUG\", \"INFO\", \"WARNING\", or \"ERROR\"\nLOG_LEVEL=INFO\n The logging configuration:
Logging is configured via the configure_logging() function:
The Settings class provides helpful properties for checking configuration state:
Check which API keys are available:
Usage:
from src.utils.config import settings\n\n# Check API key availability\nif settings.has_openai_key:\n # Use OpenAI\n pass\n\nif settings.has_anthropic_key:\n # Use Anthropic\n pass\n\nif settings.has_huggingface_key:\n # Use HuggingFace\n pass\n\nif settings.has_any_llm_key:\n # At least one LLM is available\n pass\n"},{"location":"configuration/#service-availability","title":"Service Availability","text":"Check if external services are configured:
Usage:
from src.utils.config import settings\n\n# Check service availability\nif settings.modal_available:\n # Use Modal sandbox\n pass\n\nif settings.web_search_available:\n # Web search is configured\n pass\n"},{"location":"configuration/#api-key-retrieval","title":"API Key Retrieval","text":"Get the API key for the configured provider:
For OpenAI-specific operations (e.g., Magentic mode):
"},{"location":"configuration/#configuration-usage-in-codebase","title":"Configuration Usage in Codebase","text":"The configuration system is used throughout the codebase:
"},{"location":"configuration/#llm-factory","title":"LLM Factory","text":"The LLM factory uses settings to create appropriate models:
"},{"location":"configuration/#embedding-service","title":"Embedding Service","text":"The embedding service uses local embedding model configuration:
"},{"location":"configuration/#orchestrator-factory","title":"Orchestrator Factory","text":"The orchestrator factory uses settings to determine mode:
"},{"location":"configuration/#environment-variables-reference","title":"Environment Variables Reference","text":""},{"location":"configuration/#required-at-least-one-llm","title":"Required (at least one LLM)","text":"OPENAI_API_KEY - OpenAI API key (required for OpenAI provider)ANTHROPIC_API_KEY - Anthropic API key (required for Anthropic provider)HF_TOKEN or HUGGINGFACE_API_KEY - HuggingFace API token (optional, can work without for public models)LLM_PROVIDER - Provider to use: \"openai\", \"anthropic\", or \"huggingface\" (default: \"huggingface\")OPENAI_MODEL - OpenAI model name (default: \"gpt-5.1\")ANTHROPIC_MODEL - Anthropic model name (default: \"claude-sonnet-4-5-20250929\")HUGGINGFACE_MODEL - HuggingFace model ID (default: \"meta-llama/Llama-3.1-8B-Instruct\")EMBEDDING_PROVIDER - Provider: \"openai\", \"local\", or \"huggingface\" (default: \"local\")OPENAI_EMBEDDING_MODEL - OpenAI embedding model (default: \"text-embedding-3-small\")LOCAL_EMBEDDING_MODEL - Local sentence-transformers model (default: \"all-MiniLM-L6-v2\")HUGGINGFACE_EMBEDDING_MODEL - HuggingFace embedding model (default: \"sentence-transformers/all-MiniLM-L6-v2\")WEB_SEARCH_PROVIDER - Provider: \"serper\", \"searchxng\", \"brave\", \"tavily\", or \"duckduckgo\" (default: \"duckduckgo\")SERPER_API_KEY - Serper API key (required for Serper provider)SEARCHXNG_HOST - SearchXNG host URL (required for SearchXNG provider)BRAVE_API_KEY - Brave Search API key (required for Brave provider)TAVILY_API_KEY - Tavily API key (required for Tavily provider)NCBI_API_KEY - NCBI API key (optional, increases rate limit from 3 to 10 req/sec)MAX_ITERATIONS - Maximum iterations per research loop (1-50, default: 10)SEARCH_TIMEOUT - Search timeout in seconds (default: 30)USE_GRAPH_EXECUTION - Use graph-based execution (default: false)DEFAULT_TOKEN_LIMIT - Default token budget per research loop (1000-1000000, default: 100000)DEFAULT_TIME_LIMIT_MINUTES - Default time limit in minutes (1-120, default: 
10)DEFAULT_ITERATIONS_LIMIT - Default iterations limit (1-50, default: 10)RAG_COLLECTION_NAME - ChromaDB collection name (default: \"deepcritical_evidence\")RAG_SIMILARITY_TOP_K - Number of top results to retrieve (1-50, default: 5)RAG_AUTO_INGEST - Automatically ingest evidence into RAG (default: true)CHROMA_DB_PATH - ChromaDB storage path (default: \"./chroma_db\")CHROMA_DB_PERSIST - Whether to persist ChromaDB to disk (default: true)CHROMA_DB_HOST - ChromaDB server host (optional, for remote ChromaDB)CHROMA_DB_PORT - ChromaDB server port (optional, for remote ChromaDB)MODAL_TOKEN_ID - Modal token ID (optional, for Modal sandbox execution)MODAL_TOKEN_SECRET - Modal token secret (optional, for Modal sandbox execution)LOG_LEVEL - Log level: \"DEBUG\", \"INFO\", \"WARNING\", or \"ERROR\" (default: \"INFO\")Settings are validated on load using Pydantic validation:
Validation includes range constraints (e.g. ge=1, le=50 for max_iterations), literal constraints (Literal[\"openai\", \"anthropic\", \"huggingface\"]), and API key checks in get_api_key() or get_openai_api_key(). The max_iterations field has range validation:
The llm_provider field has literal validation:
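The two validation styles above (range and literal) can be sketched with stdlib stand-ins. This is illustrative only, not the project's Pydantic `Settings` code; the class name `SettingsSketch` is hypothetical:

```python
from dataclasses import dataclass

VALID_PROVIDERS = ("openai", "anthropic", "huggingface")


@dataclass(frozen=True)
class SettingsSketch:
    """Stdlib stand-in for the Pydantic constraints described above."""

    max_iterations: int = 10
    llm_provider: str = "huggingface"

    def __post_init__(self) -> None:
        if not 1 <= self.max_iterations <= 50:  # mirrors ge=1, le=50
            raise ValueError("max_iterations must be in 1..50")
        if self.llm_provider not in VALID_PROVIDERS:  # mirrors Literal[...]
            raise ValueError(f"llm_provider must be one of {VALID_PROVIDERS}")
```

In the real codebase these checks run automatically when settings load, so an out-of-range MAX_ITERATIONS fails fast at startup rather than mid-research.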
Configuration errors raise ConfigurationError from src/utils/exceptions.py:
From src/utils/exceptions.py (lines 22-25):\n\nclass ConfigurationError(DeepCriticalError):\n    \"\"\"Raised when configuration is invalid.\"\"\"\n\n    pass\n
"},{"location":"configuration/#error-handling-example","title":"Error Handling Example","text":"python from src.utils.config import settings from src.utils.exceptions import ConfigurationError try: api_key = settings.get_api_key() except ConfigurationError as e: print(f\"Configuration error: {e}\")
ConfigurationError is raised when get_api_key() is called but the required API key is not set, or when llm_provider is set to an unsupported value.\nBest practices:\n- .env File: Store sensitive keys in a .env file (add it to .gitignore)\n- Check has_openai_key before accessing API keys\n- Handle ConfigurationError when calling get_api_key()\nThe following configurations are planned for future phases:
Thank you for your interest in contributing to The DETERMINATOR! This guide will help you get started.
Note on Project Names: \"The DETERMINATOR\" is the product name, \"DeepCritical\" is the organization/project name, and \"determinator\" is the Python package name.
"},{"location":"contributing/#git-workflow","title":"Git Workflow","text":"main: Production-ready (GitHub)dev: Development integration (GitHub)yourname-devmain or dev on HuggingFaceDeepCritical/GradioDemo (source of truth, PRs, code review)DataQuests/DeepCritical (deployment/demo)determinator (Python package name in pyproject.toml)This project uses a dual repository setup:
GitHub (DeepCritical/GradioDemo): Source of truth for code, PRs, and code review\nHuggingFace (DataQuests/DeepCritical): Deployment target for the Gradio demo\nWhen cloning, set up remotes as follows:
# Clone from GitHub\ngit clone https://github.com/DeepCritical/GradioDemo.git\ncd GradioDemo\n\n# Add HuggingFace remote (optional, for deployment)\ngit remote add huggingface-upstream https://huggingface.co/spaces/DataQuests/DeepCritical\n Important: Never push directly to main or dev on HuggingFace. Always work through GitHub PRs. GitHub is the source of truth; HuggingFace is for deployment/demo only.
This project uses uv as the package manager. All commands should be prefixed with uv run to ensure they run in the correct environment.
# Install uv if you haven't already (recommended: standalone installer)\n# Unix/macOS/Linux:\ncurl -LsSf https://astral.sh/uv/install.sh | sh\n\n# Windows (PowerShell):\npowershell -ExecutionPolicy ByPass -c \"irm https://astral.sh/uv/install.ps1 | iex\"\n\n# Alternative: pipx install uv\n# Or: pip install uv\n\n# Sync all dependencies including dev extras\nuv sync --all-extras\n\n# Install pre-commit hooks\nuv run pre-commit install\n"},{"location":"contributing/#development-commands","title":"Development Commands","text":"# Installation\nuv sync --all-extras # Install all dependencies including dev\nuv run pre-commit install # Install pre-commit hooks\n\n# Code Quality Checks (run all before committing)\nuv run ruff check src tests # Lint with ruff\nuv run ruff format src tests # Format with ruff\nuv run mypy src # Type checking\nuv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m \"not openai\" -p no:logfire # Tests with coverage\n\n# Testing Commands\nuv run pytest tests/unit/ -v -m \"not openai\" -p no:logfire # Run unit tests (excludes OpenAI tests)\nuv run pytest tests/ -v -m \"huggingface\" -p no:logfire # Run HuggingFace tests\nuv run pytest tests/ -v -p no:logfire # Run all tests\nuv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m \"not openai\" -p no:logfire # Tests with terminal coverage\nuv run pytest --cov=src --cov-report=html -p no:logfire # Generate HTML coverage report (opens htmlcov/index.html)\n\n# Documentation Commands\nuv run mkdocs build # Build documentation\nuv run mkdocs serve # Serve documentation locally (http://127.0.0.1:8000)\n"},{"location":"contributing/#test-markers","title":"Test Markers","text":"The project uses pytest markers to categorize tests. See Testing Guidelines for details:
unit: Unit tests (mocked, fast)\nintegration: Integration tests (real APIs)\nslow: Slow tests\nopenai: Tests requiring OpenAI API key\nhuggingface: Tests requiring HuggingFace API key\nembedding_provider: Tests requiring API-based embedding providers\nlocal_embeddings: Tests using local embeddings\nNote: The -p no:logfire flag disables the logfire plugin to avoid conflicts during testing.
Fork the repository on GitHub: DeepCritical/GradioDemo
Clone your fork:
git clone https://github.com/yourusername/GradioDemo.git\ncd GradioDemo\n Install dependencies and hooks:\nuv sync --all-extras\nuv run pre-commit install\n Create a feature branch:\ngit checkout -b yourname-feature-name\n Make your changes following the guidelines below
Run checks:
uv run ruff check src tests\nuv run mypy src\nuv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m \"not openai\" -p no:logfire\n git commit -m \"Description of changes\"\ngit push origin yourname-feature-name\n Guidelines:\n- mypy --strict type checking\n- ruff for linting and formatting\n- raise SearchError(...) from e for error chaining\n- structlog for logging\n- Test markers: unit, integration, slow\n- @lru_cache(maxsize=1) for singletons\n- # CRITICAL: ... comments for critical constraints\n- src/mcp_tools.py for Claude Desktop\n- mcp_server=True in demo.launch()\n- MCP endpoint at /gradio_api/mcp/\n- ssr_mode=False to fix hydration issues in HF Spaces\n- from e when raising exceptions\nBefore submitting, run all checks: uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m \"not openai\" -p no:logfire\nProject layout:\n- src/: Main source code\n- tests/: Test files (unit/ and integration/)\n- docs/: Documentation source files (MkDocs)\n- examples/: Example usage scripts\n- pyproject.toml: Project configuration and dependencies\n- .pre-commit-config.yaml: Pre-commit hook configuration\nThank you for contributing to The DETERMINATOR!
"},{"location":"contributing/code-quality/","title":"Code Quality & Documentation","text":"This document outlines code quality standards and documentation requirements for The DETERMINATOR.
"},{"location":"contributing/code-quality/#linting","title":"Linting","text":"pyproject.toml:PLR0913: Too many arguments (agents need many params)PLR0912: Too many branches (complex orchestrator logic)PLR0911: Too many return statements (complex agent logic)PLR2004: Magic values (statistical constants)PLW0603: Global statement (singleton pattern)PLC0415: Lazy imports for optional dependenciesE402: Module level import not at top (needed for pytest.importorskip)E501: Line too long (ignore line length violations)RUF100: Unused noqa (version differences between local/CI)mypy --strict complianceignore_missing_imports = true (for optional dependencies)reference_repos/, examples/Pre-commit hooks run automatically on commit to ensure code quality. Configuration is in .pre-commit-config.yaml.
# Install dependencies (includes pre-commit package)\nuv sync --all-extras\n\n# Set up git hooks (must be run separately)\nuv run pre-commit install\n Note: uv sync --all-extras installs the pre-commit package, but you must run uv run pre-commit install separately to set up the git hooks.
The following hooks run automatically on commit:
ruff: Lints code with ruff\n Scope: src/ (excludes tests/, reference_repos/). Auto-fixes: Yes
ruff-format: Formats code with ruff
Scope: src/ (excludes tests/, reference_repos/). Auto-fixes: Yes
mypy: Type checking
Scope: src/ (excludes folder/). Additional dependencies: pydantic, pydantic-settings, tenacity, pydantic-ai
pytest-unit: Runs unit tests (excludes OpenAI and embedding_provider tests)
Scope: tests/unit/ with -m \"not openai and not embedding_provider\". Always runs: Yes (not just on changed files)
pytest-local-embeddings: Runs local embedding tests
Scope: tests/ with -m \"local_embeddings\".\nTo run pre-commit hooks manually (without committing):
uv run pre-commit run --all-files\n"},{"location":"contributing/code-quality/#troubleshooting","title":"Troubleshooting","text":"- To skip hooks: git commit --no-verify (not recommended)\n- If hooks are not installed: uv run pre-commit install\n- If dependencies are missing: uv sync --all-extras\nDocumentation is built using MkDocs. Source files are in docs/, and the configuration is in mkdocs.yml.
# Build documentation\nuv run mkdocs build\n\n# Serve documentation locally (http://127.0.0.1:8000)\nuv run mkdocs serve\n The documentation site is published at: https://deepcritical.github.io/GradioDemo/
"},{"location":"contributing/code-quality/#docstrings","title":"Docstrings","text":"Example:
"},{"location":"contributing/code-quality/#code-comments","title":"Code Comments","text":"requests not httpx for ClinicalTrials)# CRITICAL: ...This document outlines the code style and conventions for The DETERMINATOR.
"},{"location":"contributing/code-style/#package-manager","title":"Package Manager","text":"This project uses uv as the package manager. All commands should be prefixed with uv run to ensure they run in the correct environment.
# Install uv if you haven't already (recommended: standalone installer)\n# Unix/macOS/Linux:\ncurl -LsSf https://astral.sh/uv/install.sh | sh\n\n# Windows (PowerShell):\npowershell -ExecutionPolicy ByPass -c \"irm https://astral.sh/uv/install.ps1 | iex\"\n\n# Alternative: pipx install uv\n# Or: pip install uv\n\n# Sync all dependencies including dev extras\nuv sync --all-extras\n"},{"location":"contributing/code-style/#running-commands","title":"Running Commands","text":"All development commands should use uv run prefix:
# Instead of: pytest tests/\nuv run pytest tests/\n\n# Instead of: ruff check src\nuv run ruff check src\n\n# Instead of: mypy src\nuv run mypy src\n This ensures commands run in the correct virtual environment managed by uv.
- mypy --strict compliance (no Any unless absolutely necessary)\n- TYPE_CHECKING imports for circular dependencies\nPydantic models (src/utils/models.py):\n- Frozen models (model_config = {\"frozen\": True}) for immutability\n- Field() with descriptions for all model fields\n- ge=, le=, min_length=, max_length= constraints\nAsync patterns (async def, await):\n- asyncio.gather() for parallel operations\n- Offload CPU-bound work with run_in_executor():\nloop = asyncio.get_running_loop()\nresult = await loop.run_in_executor(None, cpu_bound_function, args)\n This document outlines error handling and logging conventions for The DETERMINATOR.
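The run_in_executor() fragment above can be made into a self-contained, runnable sketch. The worker function here (`cpu_bound_digest`) is a placeholder for real CPU-heavy work, not project code:

```python
import asyncio
import hashlib


def cpu_bound_digest(data: bytes) -> str:
    # Stand-in for CPU-heavy work that would block the event loop
    return hashlib.sha256(data).hexdigest()


async def main() -> str:
    loop = asyncio.get_running_loop()
    # Offload to the default thread pool so the event loop stays responsive
    return await loop.run_in_executor(None, cpu_bound_digest, b"evidence")


digest = asyncio.run(main())
```

Passing None as the executor uses the loop's default ThreadPoolExecutor; positional arguments follow the callable.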
"},{"location":"contributing/error-handling/#exception-hierarchy","title":"Exception Hierarchy","text":"Use custom exception hierarchy (src/utils/exceptions.py):
- Chain exceptions: raise SearchError(...) from e\n- Log errors with structlog: logger.error(\"Operation failed\", error=str(e), context=value)\n- Use structlog for all logging (NOT print or logging)\n- Get a logger with import structlog; logger = structlog.get_logger()\n- Log events with key-value pairs: logger.info(\"event\", key=value)\nExamples:\nlogger.info(\"Starting search\", query=query, tools=[t.name for t in tools])\nlogger.warning(\"Search tool failed\", tool=tool.name, error=str(result))\nlogger.error(\"Assessment failed\", error=str(e))\n"},{"location":"contributing/error-handling/#error-chaining","title":"Error Chaining","text":"Always preserve exception context:
try:\n result = await api_call()\nexcept httpx.HTTPError as e:\n raise SearchError(f\"API call failed: {e}\") from e\n"},{"location":"contributing/error-handling/#see-also","title":"See Also","text":"This document outlines common implementation patterns used in The DETERMINATOR.
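Chaining with `from e`, as shown in the error-chaining example, preserves the original exception on `__cause__`, which is what makes tracebacks show both errors. A self-contained sketch (this `SearchError` is a local stand-in for the project's exception class):

```python
class SearchError(Exception):
    """Stand-in for the project's SearchError (illustrative)."""


def fetch() -> None:
    raise ConnectionError("connection refused")


def search() -> None:
    try:
        fetch()
    except ConnectionError as e:
        # `from e` stores the original error on __cause__
        raise SearchError(f"API call failed: {e}") from e


try:
    search()
except SearchError as err:
    cause = err.__cause__
```

Without `from e`, Python would still record the original error on `__context__`, but `from e` makes the causal link explicit and survives re-raising.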
"},{"location":"contributing/implementation-patterns/#search-tools","title":"Search Tools","text":"All tools implement SearchTool protocol (src/tools/base.py):
- name property\n- async def search(query, max_results) -> list[Evidence]\n- @retry decorator from tenacity for resilience\n- _rate_limit() for APIs with limits (e.g., PubMed)\n- Raise SearchError or RateLimitError on failures\nExample pattern:
class MySearchTool:\n    @property\n    def name(self) -> str:\n        return \"mytool\"\n\n    @retry(stop=stop_after_attempt(3), wait=wait_exponential(...))\n    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:\n        # Implementation\n        return evidence_list\n"},{"location":"contributing/implementation-patterns/#judge-handlers","title":"Judge Handlers","text":"- Implement JudgeHandlerProtocol (async def assess(question, evidence) -> JudgeAssessment)\n- Use an Agent with output_type=JudgeAssessment\n- Prompts live in src/prompts/judge.py\n- Test doubles: MockJudgeHandler, HFInferenceJudgeHandler\n- Always return a JudgeAssessment (never raise exceptions)\nFactories live in src/agent_factory/; use ContextVar for thread-safe state (src/agents/state.py) and cache singletons (@lru_cache).\nUse @lru_cache(maxsize=1) for singletons:
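A minimal sketch of the @lru_cache(maxsize=1) singleton pattern. The service class and factory name here are hypothetical, not the project's actual module layout:

```python
from functools import lru_cache


class EmbeddingService:
    """Hypothetical heavyweight service (real module path differs)."""

    def __init__(self) -> None:
        self.model_name = "all-MiniLM-L6-v2"


@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService:
    # First call constructs the service; subsequent calls return the cached instance
    return EmbeddingService()
```

All callers share one instance (get_embedding_service() is get_embedding_service()), and tests can reset state with get_embedding_service.cache_clear().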
This document outlines prompt engineering guidelines and citation validation rules.
"},{"location":"contributing/prompt-engineering/#judge-prompts","title":"Judge Prompts","text":"src/prompts/judge.pyformat_user_prompt() and format_empty_evidence_prompt() helperstruncate_at_sentence())format_hypothesis_prompt() with embeddings for diversityvalidate_references() from src/utils/citation_validator.pyselect_diverse_evidence() for MMR-based selectionThis document outlines testing requirements and guidelines for The DETERMINATOR.
"},{"location":"contributing/testing/#test-structure","title":"Test Structure","text":"tests/unit/ (mocked, fast)tests/integration/ (real APIs, marked @pytest.mark.integration)unit, integration, slow, openai, huggingface, embedding_provider, local_embeddingsThe project uses pytest markers to categorize tests. These markers are defined in pyproject.toml:
@pytest.mark.unit: Unit tests (mocked, fast) - Run with -m \"unit\"\n@pytest.mark.integration: Integration tests (real APIs) - Run with -m \"integration\"\n@pytest.mark.slow: Slow tests - Run with -m \"slow\"\n@pytest.mark.openai: Tests requiring OpenAI API key - Run with -m \"openai\" or exclude with -m \"not openai\"\n@pytest.mark.huggingface: Tests requiring HuggingFace API key or using HuggingFace models - Run with -m \"huggingface\"\n@pytest.mark.embedding_provider: Tests requiring API-based embedding providers (OpenAI, etc.) - Run with -m \"embedding_provider\"\n@pytest.mark.local_embeddings: Tests using local embeddings (sentence-transformers, ChromaDB) - Run with -m \"local_embeddings\"\n# Run only unit tests (excludes OpenAI tests by default)\nuv run pytest tests/unit/ -v -m \"not openai\" -p no:logfire\n\n# Run HuggingFace tests\nuv run pytest tests/ -v -m \"huggingface\" -p no:logfire\n\n# Run all tests\nuv run pytest tests/ -v -p no:logfire\n\n# Run only local embedding tests\nuv run pytest tests/ -v -m \"local_embeddings\" -p no:logfire\n\n# Exclude slow tests\nuv run pytest tests/ -v -m \"not slow\" -p no:logfire\n Note: The -p no:logfire flag disables the logfire plugin to avoid conflicts during testing.
- respx for httpx mocking\n- pytest-mock for general mocking\n- Mock judge handlers (MockJudgeHandler)\n- Shared fixtures in tests/conftest.py: mock_httpx_client, mock_llm_response\n- Unit tests in tests/unit/ mirror the structure of src/\nRun all checks before committing: uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m \"not openai\" -p no:logfire\n# Run unit tests (default, excludes OpenAI tests)\nuv run pytest tests/unit/ -v -m \"not openai\" -p no:logfire\n\n# Run HuggingFace tests\nuv run pytest tests/ -v -m \"huggingface\" -p no:logfire\n\n# Run all tests\nuv run pytest tests/ -v -p no:logfire\n"},{"location":"contributing/testing/#test-examples","title":"Test Examples","text":"@pytest.mark.unit\nasync def test_pubmed_search(mock_httpx_client):\n    tool = PubMedTool()\n    results = await tool.search(\"metformin\", max_results=5)\n    assert len(results) > 0\n    assert all(isinstance(r, Evidence) for r in results)\n\n@pytest.mark.integration\nasync def test_real_pubmed_search():\n    tool = PubMedTool()\n    results = await tool.search(\"metformin\", max_results=3)\n    assert len(results) <= 3\n"},{"location":"contributing/testing/#test-coverage","title":"Test Coverage","text":""},{"location":"contributing/testing/#terminal-coverage-report","title":"Terminal Coverage Report","text":"uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m \"not openai\" -p no:logfire\n This shows coverage with missing lines highlighted in the terminal output.
"},{"location":"contributing/testing/#html-coverage-report","title":"HTML Coverage Report","text":"uv run pytest --cov=src --cov-report=html -p no:logfire\n This generates an HTML coverage report in htmlcov/index.html. Open this file in your browser to see detailed coverage information.
Coverage excludes __init__.py and TYPE_CHECKING blocks; configuration is in pyproject.toml under [tool.coverage.*]. This page provides examples of using The DETERMINATOR for various research tasks.
"},{"location":"getting-started/examples/#basic-research-query","title":"Basic Research Query","text":""},{"location":"getting-started/examples/#example-1-drug-information","title":"Example 1: Drug Information","text":"Query:
What are the latest treatments for Alzheimer's disease?\n What The DETERMINATOR Does: 1. Searches PubMed for recent papers 2. Searches ClinicalTrials.gov for active trials 3. Evaluates evidence quality 4. Synthesizes findings into a comprehensive report
"},{"location":"getting-started/examples/#example-2-clinical-trial-search","title":"Example 2: Clinical Trial Search","text":"Query:
What clinical trials are investigating metformin for cancer prevention?\n What The DETERMINATOR Does:
Query:
Review the evidence for using metformin as an anti-aging intervention, \nincluding clinical trials, mechanisms of action, and safety profile.\n What The DETERMINATOR Does: 1. Uses deep research mode (multi-section) 2. Searches multiple sources in parallel 3. Generates sections on: - Clinical trials - Mechanisms of action - Safety profile 4. Synthesizes comprehensive report
"},{"location":"getting-started/examples/#example-4-hypothesis-testing","title":"Example 4: Hypothesis Testing","text":"Query:
Test the hypothesis that regular exercise reduces Alzheimer's disease risk.\n What The DETERMINATOR Does: 1. Generates testable hypotheses 2. Searches for supporting/contradicting evidence 3. Performs statistical analysis (if Modal configured) 4. Provides verdict: SUPPORTED, REFUTED, or INCONCLUSIVE
"},{"location":"getting-started/examples/#mcp-tool-examples","title":"MCP Tool Examples","text":""},{"location":"getting-started/examples/#using-search_pubmed","title":"Using search_pubmed","text":"Search PubMed for \"CRISPR gene editing cancer therapy\"\n"},{"location":"getting-started/examples/#using-search_clinical_trials","title":"Using search_clinical_trials","text":"Find active clinical trials for \"diabetes type 2 treatment\"\n"},{"location":"getting-started/examples/#using-search_all","title":"Using search_all","text":"Search all sources for \"COVID-19 vaccine side effects\"\n"},{"location":"getting-started/examples/#using-analyze_hypothesis","title":"Using analyze_hypothesis","text":"Analyze whether vitamin D supplementation reduces COVID-19 severity\n"},{"location":"getting-started/examples/#code-examples","title":"Code Examples","text":""},{"location":"getting-started/examples/#python-api-usage","title":"Python API Usage","text":"from src.orchestrator_factory import create_orchestrator\nfrom src.tools.search_handler import SearchHandler\nfrom src.agent_factory.judges import create_judge_handler\n\n# Create orchestrator\nsearch_handler = SearchHandler()\njudge_handler = create_judge_handler()\n # Run research query\nquery = \"What are the latest treatments for Alzheimer's disease?\"\nasync for event in orchestrator.run(query):\n print(f\"Event: {event.type} - {event.data}\")\n"},{"location":"getting-started/examples/#gradio-ui-integration","title":"Gradio UI Integration","text":"import gradio as gr\nfrom src.app import create_research_interface\n\n# Create interface\ninterface = create_research_interface()\n\n# Launch\ninterface.launch(server_name=\"0.0.0.0\", server_port=7860)\n"},{"location":"getting-started/examples/#research-patterns","title":"Research Patterns","text":""},{"location":"getting-started/examples/#iterative-research","title":"Iterative Research","text":"Single-loop research with search-judge-synthesize cycles:
from src.orchestrator.research_flow import IterativeResearchFlow\n\nflow = IterativeResearchFlow()  # constructor arguments omitted for brevity\nasync for event in flow.run(query):\n    # Handle events\n    pass\n"},{"location":"getting-started/examples/#deep-research","title":"Deep Research","text":"Multi-section parallel research:
from src.orchestrator.research_flow import DeepResearchFlow\n\nflow = DeepResearchFlow()  # constructor arguments omitted for brevity\nasync for event in flow.run(query):\n    # Handle events\n    pass\n"},{"location":"getting-started/examples/#configuration-examples","title":"Configuration Examples","text":""},{"location":"getting-started/examples/#basic-configuration","title":"Basic Configuration","text":"# .env file\nLLM_PROVIDER=openai\nOPENAI_API_KEY=your_key_here\nMAX_ITERATIONS=10\n"},{"location":"getting-started/examples/#advanced-configuration","title":"Advanced Configuration","text":"# .env file\nLLM_PROVIDER=anthropic\nANTHROPIC_API_KEY=your_key_here\nEMBEDDING_PROVIDER=local\nWEB_SEARCH_PROVIDER=duckduckgo\nMAX_ITERATIONS=20\nDEFAULT_TOKEN_LIMIT=200000\nUSE_GRAPH_EXECUTION=true\n"},{"location":"getting-started/examples/#next-steps","title":"Next Steps","text":"This guide will help you install and set up DeepCritical on your system.
"},{"location":"getting-started/installation/#prerequisites","title":"Prerequisites","text":"uv package manager (recommended) or pipuv is a fast Python package installer and resolver. Install it using the standalone installer (recommended):
Unix/macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh\n Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c \"irm https://astral.sh/uv/install.ps1 | iex\"\n Alternative methods:
# Using pipx (recommended if you have pipx installed)\npipx install uv\n\n# Or using pip\npip install uv\n After installation, restart your terminal or add ~/.cargo/bin to your PATH.
git clone https://github.com/DeepCritical/GradioDemo.git\ncd GradioDemo\n"},{"location":"getting-started/installation/#3-install-dependencies","title":"3. Install Dependencies","text":"Using uv (recommended):
uv sync\n Using pip:
pip install -e .\n"},{"location":"getting-started/installation/#4-install-optional-dependencies","title":"4. Install Optional Dependencies","text":"For embeddings support (local sentence-transformers):
uv sync --extra embeddings\n For Modal sandbox execution:
uv sync --extra modal\n For Magentic orchestration:
uv sync --extra magentic\n Install all extras:
uv sync --all-extras\n"},{"location":"getting-started/installation/#5-configure-environment-variables","title":"5. Configure Environment Variables","text":"Create a .env file in the project root:
# Required: At least one LLM provider\nLLM_PROVIDER=openai # or \"anthropic\" or \"huggingface\"\nOPENAI_API_KEY=your_openai_api_key_here\n\n# Optional: Other services\nNCBI_API_KEY=your_ncbi_api_key_here # For higher PubMed rate limits\nMODAL_TOKEN_ID=your_modal_token_id\nMODAL_TOKEN_SECRET=your_modal_token_secret\n See the Configuration Guide for all available options.
"},{"location":"getting-started/installation/#6-verify-installation","title":"6. Verify Installation","text":"Run the application:
uv run gradio run src/app.py\n Open your browser to http://localhost:7860 to verify the installation.
For development, install dev dependencies:
uv sync --all-extras --dev\n Install pre-commit hooks:
uv run pre-commit install\n"},{"location":"getting-started/installation/#troubleshooting","title":"Troubleshooting","text":""},{"location":"getting-started/installation/#common-issues","title":"Common Issues","text":"Import Errors: - Ensure you've installed all required dependencies - Check that Python 3.11+ is being used
API Key Errors: - Verify your .env file is in the project root - Check that API keys are correctly formatted - Ensure at least one LLM provider is configured
Module Not Found: - Run uv sync or pip install -e . again - Check that you're in the correct virtual environment
Port Already in Use: - Change the port in src/app.py or use environment variable - Kill the process using port 7860
The DETERMINATOR exposes a Model Context Protocol (MCP) server, allowing you to use its search tools directly from Claude Desktop or other MCP clients.
"},{"location":"getting-started/mcp-integration/#what-is-mcp","title":"What is MCP?","text":"The Model Context Protocol (MCP) is a standard for connecting AI assistants to external tools and data sources. The DETERMINATOR implements an MCP server that exposes its search capabilities as MCP tools.
"},{"location":"getting-started/mcp-integration/#mcp-server-url","title":"MCP Server URL","text":"When running locally:
http://localhost:7860/gradio_api/mcp/\n"},{"location":"getting-started/mcp-integration/#claude-desktop-configuration","title":"Claude Desktop Configuration","text":""},{"location":"getting-started/mcp-integration/#1-locate-configuration-file","title":"1. Locate Configuration File","text":"macOS:
~/Library/Application Support/Claude/claude_desktop_config.json\n Windows:
%APPDATA%\\Claude\\claude_desktop_config.json\n Linux:
~/.config/Claude/claude_desktop_config.json\n"},{"location":"getting-started/mcp-integration/#2-add-the-determinator-server","title":"2. Add The DETERMINATOR Server","text":"Edit claude_desktop_config.json and add:
{\n \"mcpServers\": {\n \"determinator\": {\n \"url\": \"http://localhost:7860/gradio_api/mcp/\"\n }\n }\n}\n"},{"location":"getting-started/mcp-integration/#3-restart-claude-desktop","title":"3. Restart Claude Desktop","text":"Close and restart Claude Desktop for changes to take effect.
"},{"location":"getting-started/mcp-integration/#4-verify-connection","title":"4. Verify Connection","text":"In Claude Desktop, you should see The DETERMINATOR tools available: - search_pubmed - search_clinical_trials - search_biorxiv - search_all - analyze_hypothesis
Search peer-reviewed biomedical literature from PubMed.
Parameters: - query (string): Search query - max_results (integer, optional): Maximum number of results (default: 10)
Example:
Search PubMed for \"metformin diabetes\"\n"},{"location":"getting-started/mcp-integration/#search_clinical_trials","title":"search_clinical_trials","text":"Search ClinicalTrials.gov for interventional studies.
Parameters: - query (string): Search query - max_results (integer, optional): Maximum number of results (default: 10)
Example:
Search clinical trials for \"Alzheimer's disease treatment\"\n"},{"location":"getting-started/mcp-integration/#search_biorxiv","title":"search_biorxiv","text":"Search bioRxiv/medRxiv preprints via Europe PMC.
Parameters: - query (string): Search query - max_results (integer, optional): Maximum number of results (default: 10)
Example:
Search bioRxiv for \"CRISPR gene editing\"\n"},{"location":"getting-started/mcp-integration/#search_all","title":"search_all","text":"Search all sources simultaneously (PubMed, ClinicalTrials.gov, Europe PMC).
Parameters: - query (string): Search query - max_results (integer, optional): Maximum number of results per source (default: 10)
Example:
Search all sources for \"COVID-19 vaccine efficacy\"\n"},{"location":"getting-started/mcp-integration/#analyze_hypothesis","title":"analyze_hypothesis","text":"Perform secure statistical analysis using Modal sandboxes.
Parameters: - hypothesis (string): Hypothesis to analyze - data (string, optional): Data description or code
Example:
Analyze the hypothesis that metformin reduces cancer risk\n"},{"location":"getting-started/mcp-integration/#using-tools-in-claude-desktop","title":"Using Tools in Claude Desktop","text":"Once configured, you can ask Claude to use DeepCritical tools:
Use DeepCritical to search PubMed for recent papers on Alzheimer's disease treatments.\n Claude will automatically: 1. Call the appropriate DeepCritical tool 2. Retrieve results 3. Use the results in its response
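At the protocol level, step 1 above corresponds to an MCP tools/call request. A minimal sketch of the JSON-RPC payload for search_pubmed (the exact envelope and transport framing are handled by the MCP client, so treat this as illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_pubmed",
    "arguments": { "query": "metformin diabetes", "max_results": 10 }
  }
}
```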
"},{"location":"getting-started/mcp-integration/#troubleshooting","title":"Troubleshooting","text":""},{"location":"getting-started/mcp-integration/#connection-issues","title":"Connection Issues","text":"Server Not Found: - Ensure DeepCritical is running (uv run gradio run src/app.py) - Verify the URL in claude_desktop_config.json is correct - Check that port 7860 is not blocked by firewall
Tools Not Appearing: - Restart Claude Desktop after configuration changes - Check Claude Desktop logs for errors - Verify MCP server is accessible at the configured URL
"},{"location":"getting-started/mcp-integration/#authentication","title":"Authentication","text":"If DeepCritical requires authentication: - Configure API keys in DeepCritical settings - Use HuggingFace OAuth login - Ensure API keys are valid
"},{"location":"getting-started/mcp-integration/#advanced-configuration","title":"Advanced Configuration","text":""},{"location":"getting-started/mcp-integration/#custom-port","title":"Custom Port","text":"If running on a different port, update the URL:
{\n \"mcpServers\": {\n \"deepcritical\": {\n \"url\": \"http://localhost:8080/gradio_api/mcp/\"\n }\n }\n}\n"},{"location":"getting-started/mcp-integration/#multiple-instances","title":"Multiple Instances","text":"You can configure multiple DeepCritical instances:
{\n \"mcpServers\": {\n \"deepcritical-local\": {\n \"url\": \"http://localhost:7860/gradio_api/mcp/\"\n },\n \"deepcritical-remote\": {\n \"url\": \"https://your-server.com/gradio_api/mcp/\"\n }\n }\n}\n"},{"location":"getting-started/mcp-integration/#next-steps","title":"Next Steps","text":"Deploy with Docker instantly with a single command:
docker run -it -p 7860:7860 --platform=linux/amd64 \\\n -e DB_KEY=\"YOUR_VALUE_HERE\" \\\n -e SERP_API=\"YOUR_VALUE_HERE\" \\\n -e INFERENCE_API=\"YOUR_VALUE_HERE\" \\\n -e MODAL_TOKEN_ID=\"YOUR_VALUE_HERE\" \\\n -e MODAL_TOKEN_SECRET=\"YOUR_VALUE_HERE\" \\\n -e NCBI_API_KEY=\"YOUR_VALUE_HERE\" \\\n -e SERPER_API_KEY=\"YOUR_VALUE_HERE\" \\\n -e CHROMA_DB_PATH=\"./chroma_db\" \\\n -e CHROMA_DB_HOST=\"localhost\" \\\n -e CHROMA_DB_PORT=\"8000\" \\\n -e RAG_COLLECTION_NAME=\"deepcritical_evidence\" \\\n -e RAG_SIMILARITY_TOP_K=\"5\" \\\n -e RAG_AUTO_INGEST=\"true\" \\\n -e USE_GRAPH_EXECUTION=\"false\" \\\n -e DEFAULT_TOKEN_LIMIT=\"100000\" \\\n -e DEFAULT_TIME_LIMIT_MINUTES=\"10\" \\\n -e DEFAULT_ITERATIONS_LIMIT=\"10\" \\\n -e WEB_SEARCH_PROVIDER=\"duckduckgo\" \\\n -e MAX_ITERATIONS=\"10\" \\\n -e SEARCH_TIMEOUT=\"30\" \\\n -e LOG_LEVEL=\"DEBUG\" \\\n -e EMBEDDING_PROVIDER=\"local\" \\\n -e OPENAI_EMBEDDING_MODEL=\"text-embedding-3-small\" \\\n -e LOCAL_EMBEDDING_MODEL=\"BAAI/bge-small-en-v1.5\" \\\n -e HUGGINGFACE_EMBEDDING_MODEL=\"sentence-transformers/all-MiniLM-L6-v2\" \\\n -e HF_FALLBACK_MODELS=\"Qwen/Qwen3-Next-80B-A3B-Thinking,Qwen/Qwen3-Next-80B-A3B-Instruct,meta-llama/Llama-3.3-70B-Instruct,meta-llama/Llama-3.1-8B-Instruct,HuggingFaceH4/zephyr-7b-beta,Qwen/Qwen2-7B-Instruct\" \\\n -e HUGGINGFACE_MODEL=\"Qwen/Qwen3-Next-80B-A3B-Thinking\" \\\n registry.hf.space/dataquests-deepcritical:latest python src/app.py\n ```\n\n## Quick start guide\n\nGet up and running with The DETERMINATOR in minutes.\n\n## Start the Application\n\n```bash\ngradio src/app.py\n Open your browser to http://localhost:7860.
Type your research question in the chat interface, for example: - \"What are the latest treatments for Alzheimer's disease?\" - \"Review the evidence for metformin in cancer prevention\" - \"What clinical trials are investigating COVID-19 vaccines?\"
Click \"Submit\" or press Enter. The system will: - Generate observations about your query - Identify knowledge gaps - Search multiple sources (PubMed, ClinicalTrials.gov, Europe PMC) - Evaluate evidence quality - Synthesize findings into a report
Watch the real-time progress in the chat interface: - Search operations and results - Evidence evaluation - Report generation - Final research report with citations
"},{"location":"getting-started/quick-start/#authentication","title":"Authentication","text":""},{"location":"getting-started/quick-start/#huggingface-oauth-recommended","title":"HuggingFace OAuth (Recommended)","text":"What are the side effects of metformin?\n"},{"location":"getting-started/quick-start/#complex-query","title":"Complex Query","text":"Review the evidence for using metformin as an anti-aging intervention, \nincluding clinical trials, mechanisms of action, and safety profile.\n"},{"location":"getting-started/quick-start/#clinical-trial-query","title":"Clinical Trial Query","text":"What are the active clinical trials investigating Alzheimer's disease treatments?\n"},{"location":"getting-started/quick-start/#next-steps","title":"Next Steps","text":"The DETERMINATOR is a powerful generalist deep research agent system that uses iterative search-and-judge loops to comprehensively investigate any research question. It stops at nothing until it finds precise answers, halting only at configured limits (budget, time, iterations). The system automatically determines if medical knowledge sources are needed and adapts its search strategy accordingly. It supports multiple orchestration patterns, graph-based execution, parallel research workflows, and long-running task management with real-time streaming.
"},{"location":"overview/architecture/#core-architecture","title":"Core Architecture","text":""},{"location":"overview/architecture/#orchestration-patterns","title":"Orchestration Patterns","text":"Graph Orchestrator (src/orchestrator/graph_orchestrator.py): - Streams AsyncGenerator[AgentEvent] for real-time UI updates - Falls back to agent chains when graph execution is disabled
Deep Research Flow (src/orchestrator/research_flow.py):
- PlannerAgent breaks the query into report sections - IterativeResearchFlow instances run in parallel per section via WorkflowManager - Final write-up via LongWriterAgent or ProofreaderAgent - Supports both graph execution (use_graph=True) and agent chains (use_graph=False) - State synchronization across parallel loops
Iterative Research Flow (src/orchestrator/research_flow.py):
- Agents: KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent, WriterAgent - JudgeHandler assesses evidence sufficiency - Supports graph execution and agent chains
Magentic Orchestrator (src/orchestrator_magentic.py):
- Built on agent-framework-core - MagenticBuilder with participants: searcher, hypothesizer, judge, reporter - Uses OpenAIChatClient - Emits AgentEvent for UI streaming - Supports long-running workflows with max rounds and stall/reset handling
Hierarchical Orchestrator (src/orchestrator_hierarchical.py):
- SubIterationMiddleware with ResearchTeam and LLMSubIterationJudge - Implements the SubIterationTeam protocol - asyncio.Queue for coordination - Supports sub-iteration patterns for complex research tasks
Legacy Simple Mode (src/legacy_orchestrator.py):
- SearchHandlerProtocol and JudgeHandlerProtocol - Emits AgentEvent objects
The system is designed for long-running research tasks with comprehensive state management and streaming:
- Streams AgentEvent objects via AsyncGenerator - Event types: started, searching, search_complete, judging, judge_complete, looping, synthesizing, hypothesizing, complete, error - Metadata includes iteration numbers, tool names, result counts, durations
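The streaming model above can be sketched as an async generator of events. This is a simplified stand-in, assuming a minimal AgentEvent shape; the real class carries whatever metadata the orchestrators attach:

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncGenerator

@dataclass
class AgentEvent:
    type: str                                      # e.g. "started", "searching", "complete"
    message: str = ""
    metadata: dict = field(default_factory=dict)   # iteration, tool name, counts, durations

async def fake_run() -> AsyncGenerator[AgentEvent, None]:
    # Stand-in for an orchestrator's event stream.
    yield AgentEvent("started", metadata={"iteration": 1})
    yield AgentEvent("searching", metadata={"tool": "search_pubmed"})
    yield AgentEvent("complete")

async def consume() -> list[str]:
    seen = []
    async for event in fake_run():
        seen.append(event.type)   # a UI would render each event incrementally
    return seen

print(asyncio.run(consume()))  # ['started', 'searching', 'complete']
```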
Budget Tracking (src/middleware/budget_tracker.py):
Budget summaries for monitoring
Workflow Manager (src/middleware/workflow_manager.py):
- Loop states: pending, running, completed, failed, cancelled - Evidence deduplication across parallel loops
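A minimal sketch of the per-loop state enum and evidence deduplication described above (deduplicating on a URL field is an assumption for illustration; the real implementation may key on content instead):

```python
from enum import Enum

class LoopState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"

def dedupe_evidence(batches: list[list[dict]]) -> list[dict]:
    """Merge evidence gathered by parallel loops, keeping one copy per URL."""
    seen: set[str] = set()
    merged: list[dict] = []
    for batch in batches:
        for item in batch:
            if item["url"] not in seen:
                seen.add(item["url"])
                merged.append(item)
    return merged
```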
State Management (src/middleware/state_machine.py):
- ContextVar for concurrent requests - WorkflowState tracks: evidence, conversation history, embedding service - Supports both iterative and deep research patterns
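The ContextVar pattern can be sketched as follows; WorkflowState here is a simplified stand-in for the real class, and the helper names are illustrative:

```python
from contextvars import ContextVar
from dataclasses import dataclass, field

@dataclass
class WorkflowState:
    evidence: list = field(default_factory=list)
    history: list = field(default_factory=list)

# One state per logical request; ContextVar keeps concurrent requests isolated
# because each asyncio task sees its own copy of the context.
_state = ContextVar("workflow_state", default=None)

def init_state() -> WorkflowState:
    state = WorkflowState()
    _state.set(state)
    return state

def get_state() -> WorkflowState:
    state = _state.get()
    if state is None:
        raise RuntimeError("init_state() must be called first")
    return state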
Gradio UI (src/app.py):
The graph orchestrator (src/orchestrator/graph_orchestrator.py) implements a flexible graph-based execution model:
Node Types:
- Agent nodes (e.g., KnowledgeGapAgent, ToolSelectorAgent)
Edge Types:
Graph Patterns:
- Iterative: [Input] \u2192 [Thinking] \u2192 [Knowledge Gap] \u2192 [Decision: Complete?] \u2192 [Tool Selector] or [Writer]
- Deep: [Input] \u2192 [Planner] \u2192 [Parallel Iterative Loops] \u2192 [Synthesizer]
Execution Flow:
- Parallel execution via asyncio.gather()
Key components:
- Orchestrators (src/orchestrator/, src/orchestrator_*.py)
- Research flows (src/orchestrator/research_flow.py)
- Graph builder (src/agent_factory/graph_builder.py)
- Agents (src/agents/, src/agent_factory/agents.py)
- Tools (src/tools/)
- Judges (src/agent_factory/judges.py)
- Embeddings (src/services/embeddings.py)
- Statistical analyzer (src/services/statistical_analyzer.py)
- Multimodal and audio processing (src/services/multimodal_processing.py, src/services/audio_processing.py)
- Middleware (src/middleware/)
- MCP tools (src/mcp_tools.py)
- Gradio UI (src/app.py)
The system supports complex research workflows through:
- Runs multiple ResearchLoop instances - Concurrent execution via asyncio.gather() - Handles loop failures gracefully
Deep Research Pattern: Breaks complex queries into sections
Final synthesis combines all section results
State Synchronization: Thread-safe evidence sharing
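The parallel pattern above can be sketched with asyncio.gather(return_exceptions=True), which lets one failed section loop surface as a returned exception instead of cancelling its siblings (all names here are illustrative, not DeepCritical's actual API):

```python
import asyncio

async def run_section(title: str) -> str:
    if title == "bad":
        raise RuntimeError(f"loop for {title} failed")
    await asyncio.sleep(0)          # stand-in for search/judge iterations
    return f"## {title}\n(findings)"

async def run_all(sections: list[str]) -> list[str]:
    results = await asyncio.gather(
        *(run_section(s) for s in sections), return_exceptions=True
    )
    # Keep successful sections; failures would be logged and skipped, not fatal.
    return [r for r in results if isinstance(r, str)]

# Keeps "intro" and "methods"; the "bad" section's failure is absorbed.
print(asyncio.run(run_all(["intro", "bad", "methods"])))
```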
Orchestrator Factory (src/orchestrator_factory.py): - Lazy imports for optional dependencies
Orchestrator Modes (selected in UI or via factory):
- simple: Legacy linear search-judge loop (Free Tier)
- advanced or magentic: Multi-agent coordination using Microsoft Agent Framework (requires OpenAI API key)
- iterative: Knowledge-gap-driven research with a single loop (Free Tier)
- deep: Parallel section-based research with planning (Free Tier)
- auto: Intelligent mode detection based on query complexity (Free Tier)
Graph Research Modes (used within graph orchestrator, separate from orchestrator mode):
- iterative: Single research loop pattern
- deep: Multi-section parallel research pattern
- auto: Auto-detect pattern based on query complexity
Execution Modes:
- use_graph=True: Graph-based execution (parallel, conditional routing)
- use_graph=False: Agent chains (sequential, backward compatible)
Note: The UI provides separate controls for orchestrator mode and graph research mode. When using graph-based orchestrators (iterative/deep/auto), the graph research mode determines the specific pattern used within the graph execution.
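A hedged sketch of how auto mode might resolve to a concrete pattern. The heuristic below is invented purely for illustration; the real detector presumably uses an LLM or richer query signals:

```python
def select_pattern(mode: str, query: str) -> str:
    """Resolve an effective research pattern from a requested mode."""
    if mode != "auto":
        return mode
    # Illustrative heuristic only: long or multi-part questions get the
    # deep (multi-section) pattern, short ones stay iterative.
    if len(query.split()) > 20 or ";" in query:
        return "deep"
    return "iterative"
```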
"},{"location":"overview/features/","title":"Features","text":"The DETERMINATOR provides a comprehensive set of features for AI-assisted research:
"},{"location":"overview/features/#core-features","title":"Core Features","text":""},{"location":"overview/features/#multi-source-search","title":"Multi-Source Search","text":"API keys via HF_TOKEN or HUGGINGFACE_API_KEY
Orchestrator Modes: - simple: Legacy linear search-judge loop - advanced (or magentic): Multi-agent coordination (requires OpenAI API key) - iterative: Knowledge-gap-driven research with single loop - deep: Parallel section-based research with planning - auto: Intelligent mode detection based on query complexity
Graph Research Modes (used within graph orchestrator): - iterative: Single research loop pattern - deep: Multi-section parallel research pattern - auto: Auto-detect pattern based on query complexity
Execution Modes: - use_graph=True: Graph-based execution with parallel and conditional routing - use_graph=False: Agent chains with sequential execution (backward compatible)
- Streaming via AsyncGenerator[AgentEvent] - Configuration via .env files
Get started with DeepCritical in minutes.
"},{"location":"overview/quick-start/#installation","title":"Installation","text":"# Install uv if you haven't already (recommended: standalone installer)\n# Unix/macOS/Linux:\ncurl -LsSf https://astral.sh/uv/install.sh | sh\n\n# Windows (PowerShell):\npowershell -ExecutionPolicy ByPass -c \"irm https://astral.sh/uv/install.ps1 | iex\"\n\n# Alternative: pipx install uv\n# Or: pip install uv\n\n# Sync dependencies\nuv sync\n"},{"location":"overview/quick-start/#run-the-ui","title":"Run the UI","text":"# Start the Gradio app\nuv run gradio run src/app.py\n Open your browser to http://localhost:7860.
Authentication is mandatory - you must authenticate before using the application. The app will display an error message if you try to use it without authentication.
HuggingFace OAuth Login (Recommended): - Click the \"Sign in with HuggingFace\" button at the top of the app - Your HuggingFace API token will be automatically used for AI inference - No need to manually enter API keys when logged in
Manual API Key (Alternative): - Set environment variable HF_TOKEN or HUGGINGFACE_API_KEY before starting the app - The app will automatically use these tokens if OAuth login is not available - Supports HuggingFace API keys only (OpenAI/Anthropic keys are not used in the current implementation)
Multimodal Features: - Configure image/audio input and output in the sidebar settings - Image OCR and audio STT/TTS can be enabled/disabled independently - TTS voice and speed can be customized in the Audio Output settings
"},{"location":"overview/quick-start/#3-mcp-integration-optional","title":"3. MCP Integration (Optional)","text":"Connect DeepCritical to Claude Desktop:
Add to your claude_desktop_config.json:
{\n \"mcpServers\": {\n \"deepcritical\": {\n \"url\": \"http://localhost:7860/gradio_api/mcp/\"\n }\n }\n}\n Restart Claude Desktop
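The claude_desktop_config.json edit above can also be scripted, which avoids clobbering entries for other MCP servers. A minimal sketch; merge_mcp_server is a hypothetical helper, not part of DeepCritical:

```python
import json
from pathlib import Path

def merge_mcp_server(config_path: Path, name: str, url: str) -> dict:
    """Add or update one MCP server entry, preserving existing servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"url": url}
    config_path.write_text(json.dumps(config, indent=2))
    return config

if __name__ == "__main__":
    # Adjust the path for your OS (see the configuration-file locations above).
    path = Path("claude_desktop_config.json")
    merge_mcp_server(path, "deepcritical", "http://localhost:7860/gradio_api/mcp/")
```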
- search_pubmed: Search peer-reviewed biomedical literature
- search_clinical_trials: Search ClinicalTrials.gov
- search_biorxiv: Search bioRxiv/medRxiv preprints
- search_neo4j: Search Neo4j knowledge graph for papers and disease relationships
- search_all: Search all sources simultaneously
- analyze_hypothesis: Secure statistical analysis using Modal sandboxes
Note: The application automatically uses all available search tools (Neo4j, PubMed, ClinicalTrials.gov, Europe PMC, Web search, RAG) based on query analysis. Neo4j knowledge graph search is included by default for biomedical queries.
"},{"location":"overview/quick-start/#next-steps","title":"Next Steps","text":"DeepCritical is developed by a team of researchers and developers working on AI-assisted research.
The DeepCritical team met online in the Alzheimer's Critical Literature Review Group in the Hugging Science initiative. We're building the agent framework we want to use for AI-assisted research to turn the vast amounts of clinical data into cures.
We welcome contributions! See the Contributing Guide for details.