@vitali87
Owner

@vitali87 vitali87 commented Dec 9, 2025

Summary

  • Add RAGDeps dataclass to centralize all tool dependencies
  • Refactor tools to use RunContext[RAGDeps] instead of closure-based factories
  • Update create_rag_orchestrator to use deps_type parameter
  • Simplify MCP tools to use service classes directly

Changes

  • Created codebase_rag/deps.py with RAGDeps dataclass
  • Created codebase_rag/exceptions.py to break circular imports
  • Refactored all tools to accept RunContext[RAGDeps] as first parameter
  • Updated main.py to create deps at runtime and pass to agent
  • Simplified MCP registry by removing intermediate tool wrappers
  • Updated all related tests
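
For reviewers who want the shape of the change at a glance, here is a minimal sketch of the before/after pattern. It is not the PR's code: the RunContext class below is a stdlib stand-in for pydantic_ai.RunContext (only its deps attribute is modeled), FileReader is a simplified hypothetical service, and create_deps is an assumed analogue of the _create_deps helper in main.py.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

DepsT = TypeVar("DepsT")


@dataclass
class RunContext(Generic[DepsT]):
    """Stand-in for pydantic_ai.RunContext; only `deps` is modeled here."""

    deps: DepsT


class FileReader:
    """Hypothetical, simplified service class used only for illustration."""

    def __init__(self, files: dict[str, str]) -> None:
        self._files = files

    def read(self, path: str) -> str:
        return self._files.get(path, f"Error: {path} not found")


# Before: a factory closes over the service and returns the tool function.
def create_file_reader_tool(reader: FileReader):
    def read_file(path: str) -> str:
        return reader.read(path)

    return read_file


# After: all services travel in one dataclass, reached through the context.
@dataclass
class RAGDeps:
    file_reader: FileReader


def read_file(ctx: RunContext[RAGDeps], path: str) -> str:
    # The tool is a plain module-level function; no factory is needed.
    return ctx.deps.file_reader.read(path)


def create_deps() -> RAGDeps:
    # Assumed analogue of _create_deps: build the services once at runtime.
    return RAGDeps(file_reader=FileReader({"README.md": "# demo"}))


ctx = RunContext(deps=create_deps())
print(read_file(ctx, "README.md"))
```

With the real library, the agent would be constructed with deps_type=RAGDeps and the deps instance passed when the agent is run, so the framework builds the context instead of the caller.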

Benefits

  • Explicit, type-safe dependency injection
  • Reduced coupling between components
  • Simplified tool registration (no factory functions needed)
  • Improved testability through proper dependency injection pattern

Test plan

  • All existing tests pass
  • MCP tools work correctly with service classes
  • Pre-commit hooks pass

@gemini-code-assist
Contributor

Summary of Changes

Hello @vitali87, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the codebase to implement a robust dependency injection pattern using Pydantic AI's RunContext and a new RAGDeps dataclass. The primary goal is to centralize and explicitly manage dependencies for various tools, improving type safety, reducing component coupling, and simplifying the overall architecture. This change streamlines tool registration and enhances the testability of the system by making dependencies more manageable and visible.

Highlights

  • Dependency Injection with Pydantic AI: Introduced a new RAGDeps dataclass to centralize all tool dependencies, enabling explicit and type-safe dependency injection for the RAG orchestrator.
  • Refactored Tool Usage: All tools have been refactored to accept RunContext[RAGDeps] as their first parameter, moving away from closure-based factories and simplifying tool registration.
  • Simplified MCP Tools: The Model Context Protocol (MCP) tools now directly utilize service classes, removing intermediate tool wrappers and reducing coupling between components.
  • Improved Modularity and Testability: The changes lead to a more modular architecture, reduced coupling, and enhanced testability through a proper dependency injection pattern.
  • Centralized Dependency Creation: A new _create_deps function was added to main.py to handle the instantiation of all service classes at runtime, which are then passed to the agent.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant refactoring of dependency management and tool integration, primarily by implementing a RAGDeps dataclass for explicit, type-safe dependency injection. The LLMGenerationError exception was moved to its own file. Many docstrings were removed from various functions, including Typer commands and internal utility functions; this degrades the CLI's self-documentation and removes useful explanations of non-obvious behavior.

The main.py and mcp/tools.py files were updated to use the new RAGDeps structure, passing dependencies explicitly to functions and agents and calling service methods directly instead of using factory functions to create pydantic_ai.Tool objects. The create_rag_orchestrator function now lists the tool functions directly and accepts RAGDeps as its dependency context. Correspondingly, all tool files (code_retrieval.py, codebase_query.py, directory_lister.py, document_analyzer.py, file_editor.py, file_reader.py, file_writer.py, semantic_search.py, shell_command.py) were refactored to accept RunContext[RAGDeps] as their first argument, removing their create_*_tool factory functions and their internal service class instantiations.

Integration tests for MCP tools were updated to reflect these changes: they now interact with service methods directly and assert on ingestor.fetch_all calls. One of these tests could be strengthened to also verify the cypher_gen.generate call.

Comment on lines +14 to +21
cypher_generator: Any
code_retriever: Any
file_reader: Any
file_writer: Any
file_editor: Any
shell_commander: Any
directory_lister: Any
document_analyzer: Any

high

The RAGDeps dataclass uses Any for most of its fields. This undermines the goal of "Explicit, type-safe dependency injection" mentioned in the PR description. To improve type safety and developer experience, you should use specific class types for these dependencies.

To avoid circular imports that this might cause, you can use a typing.TYPE_CHECKING block and string forward references.

First, add these imports at the top of the file:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from .services.llm import CypherGenerator
    from .tools.code_retrieval import CodeRetriever
    from .tools.directory_lister import DirectoryLister
    from .tools.document_analyzer import DocumentAnalyzer
    from .tools.file_editor import FileEditor
    from .tools.file_reader import FileReader
    from .tools.file_writer import FileWriter
    from .tools.shell_command import ShellCommander

Then, update the dataclass fields:

Suggested change
- cypher_generator: Any
- code_retriever: Any
- file_reader: Any
- file_writer: Any
- file_editor: Any
- shell_commander: Any
- directory_lister: Any
- document_analyzer: Any
+ cypher_generator: "CypherGenerator"
+ code_retriever: "CodeRetriever"
+ file_reader: "FileReader"
+ file_writer: "FileWriter"
+ file_editor: "FileEditor"
+ shell_commander: "ShellCommander"
+ directory_lister: "DirectoryLister"
+ document_analyzer: "DocumentAnalyzer"
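
To illustrate why this avoids the import cycle, here is a small self-contained sketch with a single field. The import path mirrors the one suggested above; FakeReader is a hypothetical test double and not part of the codebase. Because typing.TYPE_CHECKING is False at runtime, the guarded import never executes, and the dataclass keeps the annotation as a plain string.

```python
import dataclasses
from dataclasses import dataclass
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers; skipped at runtime, so this
    # import cannot take part in a circular import.
    from codebase_rag.tools.file_reader import FileReader


@dataclass
class RAGDeps:
    # String forward reference: stored as text, not evaluated at runtime.
    file_reader: "FileReader"


class FakeReader:
    """Hypothetical test double used only for this illustration."""

    def read(self, path: str) -> str:
        return f"contents of {path}"


# Dataclasses do not enforce annotations at runtime, so this constructs
# fine; a static type checker is what would flag the mismatched type.
deps = RAGDeps(file_reader=FakeReader())
field_types = {f.name: f.type for f in dataclasses.fields(RAGDeps)}
print(field_types)
```

The trade-off is that the specific types help only tools that read annotations statically (type checkers, IDEs); nothing changes at runtime.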

        help="Number of buffered nodes/relationships before flushing to Memgraph",
    ),
) -> None:
    """Starts the Codebase RAG CLI."""

high

The docstring for this Typer command has been removed. Docstrings on Typer command functions are used to generate the help text for the CLI (e.g., graph-code start --help). Removing them degrades the user experience by providing no description for the command. This issue also affects other Typer commands in this file: index, export, optimize, mcp-server, graph-loader-command, and language-command. Please restore the docstrings for all Typer commands to ensure the CLI remains self-documenting.



def get_multiline_input(prompt_text: str = "Ask a question") -> str:
    """Get multiline input from user with Ctrl+J to submit."""

medium

The docstring for get_multiline_input was useful as it explained the key bindings for submitting (Ctrl+J) versus creating a new line (Enter). This is non-obvious behavior that is important for maintainers to understand. Please consider restoring it.

await mcp_registry.query_code_graph(query)

mcp_registry._query_tool.function.assert_called_once_with(query)
mcp_registry.ingestor.fetch_all.assert_called_once() # type: ignore[attr-defined]

medium

This test is weaker than its previous version. It only asserts that ingestor.fetch_all was called, but doesn't verify that the natural language query was passed to the cypher generator. You could strengthen this test by also asserting that cypher_gen.generate was called with the correct query.

Suggested change
- mcp_registry.ingestor.fetch_all.assert_called_once() # type: ignore[attr-defined]
+ mcp_registry.cypher_gen.generate.assert_called_once_with(query)
+ mcp_registry.ingestor.fetch_all.assert_called_once()
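
As a rough, self-contained illustration of the stronger assertion, the sketch below uses only the standard library. The Registry class is a hypothetical stand-in for the MCP tools registry, and it assumes cypher_gen.generate is awaitable while ingestor.fetch_all is synchronous; adjust the mocks if the real signatures differ.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


class Registry:
    """Hypothetical stand-in for the MCP tools registry under test."""

    def __init__(self, cypher_gen, ingestor) -> None:
        self.cypher_gen = cypher_gen
        self.ingestor = ingestor

    async def query_code_graph(self, query: str):
        # Natural language -> Cypher -> rows from the graph database.
        cypher = await self.cypher_gen.generate(query)
        return self.ingestor.fetch_all(cypher)


def test_query_reaches_both_services() -> None:
    cypher_gen = MagicMock()
    cypher_gen.generate = AsyncMock(return_value="MATCH (n) RETURN n")
    ingestor = MagicMock()
    ingestor.fetch_all.return_value = [{"name": "foo"}]
    registry = Registry(cypher_gen, ingestor)

    query = "Find all functions"
    rows = asyncio.run(registry.query_code_graph(query))

    # Verify the natural-language query reached the generator...
    cypher_gen.generate.assert_called_once_with(query)
    # ...and that the generated Cypher, not the raw query, hit the database.
    ingestor.fetch_all.assert_called_once_with("MATCH (n) RETURN n")
    assert rows == [{"name": "foo"}]


test_query_reaches_both_services()
```

Asserting on the arguments at both hops catches regressions where the query is dropped or the wrong string is sent to the database, which a bare assert_called_once() would miss.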
