AGENTS.md - Gemini Desktop Agent

This file provides guidance for AI agents working on this codebase.

Plugin Management

  • WebSearch Plugin: The custom WebSearch plugin located in plugins/web_search_plugin/ has been effectively disabled for Gemini Function Calling. Its function_declaration has been removed from its manifest.json. This is because the agent now utilizes Gemini's native google_search tool, which provides grounding and citation capabilities. The old plugin's code remains in the repository for reference or potential future use cases not covered by native function calling.
  • Plugin Manifests: When adding new plugins that are intended to be used with Gemini's Function Calling, ensure that the manifest.json includes a valid function_declaration section. Refer to the Calculator plugin (plugins/calculator_plugin/manifest.json) for an example.
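
As a rough illustration of the manifest check described above, the sketch below tests whether a plugin manifest carries a usable function_declaration section. The project's actual manifest schema is not shown in this file, so the required fields (name, description, parameters) are an assumption modeled on the general shape of a Gemini function declaration. The helper validate_manifest and the sample manifest are hypothetical and not copied from the repository.

```python
import json

# Assumed required keys, modeled on the shape of a Gemini function declaration.
REQUIRED_KEYS = ("name", "description", "parameters")


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of problems; an empty list means the manifest looks usable."""
    declaration = manifest.get("function_declaration")
    if not isinstance(declaration, dict):
        return ["missing or invalid function_declaration section"]

    problems = []
    for key in REQUIRED_KEYS:
        if key not in declaration:
            problems.append(f"function_declaration is missing '{key}'")

    parameters = declaration.get("parameters")
    if parameters is not None and not isinstance(parameters, dict):
        problems.append("function_declaration.parameters must be an object")
    return problems


# Hypothetical manifest content for illustration only.
SAMPLE_MANIFEST = json.loads("""
{
  "name": "calculator_plugin",
  "function_declaration": {
    "name": "calculate",
    "description": "Evaluate a basic arithmetic expression.",
    "parameters": {
      "type": "object",
      "properties": {"expression": {"type": "string"}},
      "required": ["expression"]
    }
  }
}
""")

if __name__ == "__main__":
    print(validate_manifest(SAMPLE_MANIFEST))
    print(validate_manifest({"name": "web_search_plugin"}))
```

A manifest without the section, as with the disabled WebSearch plugin, is reported as unusable for function calling. That result is expected for plugins that are deliberately not exposed to Gemini.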

API Key Management

  • The application requires a Google API Key for accessing Gemini models.
  • Users are prompted to enter their API key on first launch.
  • The key can be optionally saved to a configuration file (config.json) in the user's application data directory.
  • Handle API key errors gracefully and provide clear instructions to the user.
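
The optional key storage described above could look roughly like the following sketch. The config path and the two keys come from the Configuration notes later in this file. The function names (load_config, save_api_key, get_api_key) are hypothetical, and the fallback to defaults on a corrupted file is only one possible reading of "resetting to defaults".

```python
import json
from pathlib import Path

# Location and keys as stated in the Configuration notes of this file.
DEFAULT_CONFIG_PATH = Path.home() / ".gemini_desktop_agent" / "config.json"
DEFAULT_CONFIG = {"api_key": "", "auto_start_enabled": False}


def load_config(path: Path = DEFAULT_CONFIG_PATH) -> dict:
    """Load the config, falling back to defaults if it is missing or corrupted."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON.
        return dict(DEFAULT_CONFIG)
    if not isinstance(data, dict):
        return dict(DEFAULT_CONFIG)
    # Defaults fill in any keys the stored file does not have.
    return {**DEFAULT_CONFIG, **data}


def save_api_key(api_key: str, path: Path = DEFAULT_CONFIG_PATH) -> None:
    """Persist the key while keeping other stored settings intact."""
    config = load_config(path)
    config["api_key"] = api_key.strip()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")


def get_api_key(path: Path = DEFAULT_CONFIG_PATH):
    """Return the saved key, or None so the caller can prompt the user."""
    key = load_config(path).get("api_key")
    return key if isinstance(key, str) and key else None
```

Returning None instead of raising lets the first-launch prompt and the "no saved key" case share one code path.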

UI Development (PySide6) - Glassmorphism Theme

  • Overall Design: The application is being transitioned to a modern, minimalist design based on Glassmorphism principles: dark themes, transparency, blur effects (simulated where direct blur isn't feasible in Qt Widgets), and vibrant accent colors.
  • Three-Column Layout: The main window (MainApp) now implements a three-column layout inspired by Google AI Studio:
    • Left Column (LeftColumnWidget): Navigation and Chat History. Contains user profile (placeholder), history search, new chat button, and the chat history list (history_list_widget).
    • Middle Column (MiddleColumnWidget): Main Chat/Interaction Area. Includes a chat header (AI profile, status, and action icons, all placeholders), the main chat display (chat_area), and an input area at the bottom (input field, attachment buttons, send button, placeholder prompt chips).
    • Right Column (RightColumnWidget): Settings, Tools, and Details. Contains title, placeholders for model selection, temperature controls, tool toggles, and the context/sources panel (context_panel).
  • Styling:
    • A global stylesheet (get_base_stylesheet() in main.py) defines the foundational look and feel, using constants from STYLE_CONSTANTS.
    • Key widgets are styled using object names (e.g., QWidget#LeftColumnWidget, QTextEdit#ChatArea) for specific Glassmorphism panel effects (semi-transparent backgrounds, rounded borders).
    • The aim is a "frosted glass" effect. True dynamic background blur on arbitrary widgets is complex and currently simulated through layered transparencies over a dark main window background.
    • Native OS window decorations (title bar for close/minimize/maximize) are retained.
  • Key UI Files:
    • main.py: Contains MainApp and the primary layout and styling logic.
    • quick_ask_overlay.py: Implements the separate Insight Overlay, which also uses the Glassmorphism theme.
  • Development Notes:
    • Styling chat bubbles with full Glassmorphism effects within QTextEdit is challenging. This will require further work, potentially using custom delegates if QSS proves insufficient.
    • Focus on maintaining readability and usability despite the transparent and layered design.
  • Utilize QThreadPool and QRunnable for long-running tasks (like API calls) to keep the UI responsive.
  • Ensure that UI elements are updated on the main thread, typically using Qt signals and slots if updates are triggered from worker threads.

General Coding Practices

  • Follow PEP 8 guidelines for Python code.
  • Write clear and concise comments where necessary.
  • Add unit tests for new functionalities if feasible.

Background Mode and Auto-Start

  • System Tray Icon: The application now integrates with the system tray (if available).
    • Closing the main window hides it to the tray.
    • The tray icon menu provides options to "Show/Hide Agent" and "Quit Agent".
    • Left-clicking the tray icon also toggles the main window's visibility.
    • app.setQuitOnLastWindowClosed(False) is used to enable background running.
  • Auto-Start Functionality:
    • A "Start with System" option is available in the tray menu.
    • This feature relies on the autorun PyPI library (pip install autorun). If the library is not installed, this menu option will be disabled, and a warning will be shown if the user tries to enable it.
    • The application registers itself to run on system startup using autorun.add("GeminiDesktopAgent", sys.executable, args=[os.path.abspath(__file__), "--started-from-autorun"]).
    • The --started-from-autorun argument is used to launch the application with its main window initially hidden.
    • Auto-start preference is stored in config.json under the auto_start_enabled key.
    • Error handling is in place for enabling/disabling auto-start, including feedback to the user.
  • Configuration (config.json):
    • Located in ~/.gemini_desktop_agent/config.json.
    • Stores api_key and auto_start_enabled (boolean).
    • The application attempts to handle corrupted config files by resetting to defaults.
  • Icon:
    • The application will look for icons/app_icon.png for the tray and window icon. If not found, it falls back to system/standard icons or a dynamically generated one. It's recommended to create and place an app_icon.png (e.g., 64x64 or 128x128) in an icons directory at the project root.
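
The --started-from-autorun handling described under Auto-Start Functionality might be sketched as below. This shows only the argument parsing and the resulting visibility decision. The helper names parse_launch_args and should_show_main_window are hypothetical, and the sketch does not call the autorun library.

```python
import argparse


def parse_launch_args(argv):
    """Parse launch arguments; argv excludes the program name."""
    parser = argparse.ArgumentParser(description="Gemini Desktop Agent")
    parser.add_argument(
        "--started-from-autorun",
        action="store_true",
        help="Set when launched at system startup; keeps the main window hidden.",
    )
    # parse_known_args tolerates extra arguments, such as Qt's own flags.
    args, _unknown = parser.parse_known_args(argv)
    return args


def should_show_main_window(argv):
    """The window starts hidden only when the autorun flag is present."""
    return not parse_launch_args(argv).started_from_autorun


if __name__ == "__main__":
    import sys
    print(should_show_main_window(sys.argv[1:]))
```

With a decision function like this, the startup code can call show() on the main window only when it returns True, leaving the tray icon as the sole entry point after an automatic start.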