This file provides guidance for AI agents working on this codebase.
- WebSearch Plugin: The custom `WebSearch` plugin located in `plugins/web_search_plugin/` has been effectively disabled for Gemini Function Calling. Its `function_declaration` has been removed from its `manifest.json`. This is because the agent now utilizes Gemini's native `google_search` tool, which provides grounding and citation capabilities. The old plugin's code remains in the repository for reference or potential future use cases not covered by native function calling.
- Plugin Manifests: When adding new plugins that are intended to be used with Gemini's Function Calling, ensure that the `manifest.json` includes a valid `function_declaration` section. Refer to the `Calculator` plugin (`plugins/calculator_plugin/manifest.json`) for an example.
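A quick check along the following lines can catch a manifest that is missing its declaration before the plugin is registered. This is only a sketch: the helper names, the top-level placement of `function_declaration`, and the required keys (`name`, `description`, `parameters`) are assumptions based on the general shape of a Gemini function declaration, not on this repository's plugin loader.

```python
import json
from pathlib import Path

# Assumed minimum keys of a function declaration; adjust to the real loader.
REQUIRED_KEYS = ("name", "description", "parameters")


def has_function_declaration(manifest: dict) -> bool:
    """Return True if the manifest carries a usable function_declaration."""
    declaration = manifest.get("function_declaration")
    if not isinstance(declaration, dict):
        return False
    return all(key in declaration for key in REQUIRED_KEYS)


def load_manifest(path: Path) -> dict:
    """Read a plugin manifest (a JSON file) from disk."""
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Under these assumptions, a manifest without the declaration, such as the disabled `WebSearch` one, would fail the check, while the `Calculator` manifest would be expected to pass.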
- The application requires a Google API Key for accessing Gemini models.
- Users are prompted to enter their API key on first launch.
- The key can be optionally saved to a configuration file (`config.json`) in the user's application data directory.
- Handle API key errors gracefully and provide clear instructions to the user.
- Overall Design: The application is being transitioned to a modern, minimalist design using glassmorphism principles. This involves dark themes, transparency, blur effects (simulated where direct blur isn't feasible in Qt Widgets), and vibrant accent colors.
- Three-Column Layout: The main window (`MainApp`) now implements a three-column layout inspired by Google AI Studio:
    - Left Column (`LeftColumnWidget`): Navigation and Chat History. Contains user profile (placeholder), history search, new chat button, and the chat history list (`history_list_widget`).
    - Middle Column (`MiddleColumnWidget`): Main Chat/Interaction Area. Includes a chat header (AI profile, status, action icons - placeholders), the main chat display (`chat_area`), and an input area at the bottom (input field, attachment buttons, send button, placeholder prompt chips).
    - Right Column (`RightColumnWidget`): Settings, Tools, and Details. Contains title, placeholders for model selection, temperature controls, tool toggles, and the context/sources panel (`context_panel`).
- Styling:
    - A global stylesheet (`get_base_stylesheet()` in `main.py`) defines the foundational look and feel, using constants from `STYLE_CONSTANTS`.
    - Key widgets are styled using object names (e.g., `QWidget#LeftColumnWidget`, `QTextEdit#ChatArea`) for specific glassmorphism panel effects (semi-transparent backgrounds, rounded borders).
    - The aim is a "frosted glass" effect. True dynamic background blur on arbitrary widgets is complex and currently simulated through layered transparencies over a dark main window background.
    - Native OS window decorations (title bar for close/minimize/maximize) are retained.
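To illustrate the styling approach described above, here is a rough sketch of how a stylesheet function could turn a constants dictionary into QSS. The keys and values in `STYLE_CONSTANTS` are hypothetical, and the `QMainWindow` selector assumes `MainApp` derives from `QMainWindow`; the real `get_base_stylesheet()` in `main.py` may look quite different.

```python
# Hypothetical values; the real STYLE_CONSTANTS in main.py may differ.
STYLE_CONSTANTS = {
    "window_bg": "#12141c",
    "panel_bg": "rgba(255, 255, 255, 8%)",
    "panel_border": "rgba(255, 255, 255, 16%)",
    "panel_radius": "12px",
}


def get_base_stylesheet(constants: dict = STYLE_CONSTANTS) -> str:
    """Build a QSS string with semi-transparent, rounded panels."""
    # Doubled braces are literal braces inside an f-string.
    return f"""
QMainWindow {{
    background-color: {constants['window_bg']};
}}
QWidget#LeftColumnWidget, QTextEdit#ChatArea {{
    background-color: {constants['panel_bg']};
    border: 1px solid {constants['panel_border']};
    border-radius: {constants['panel_radius']};
}}
"""
```

The returned string would typically be passed to `setStyleSheet()` on the application or main window. The frosted look here comes only from the translucent panel color over the dark window background, in line with the simulated blur described in the notes.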
- Key UI Files:
    - `main.py`: Contains `MainApp` and the primary layout and styling logic.
    - `quick_ask_overlay.py`: Implements the separate Insight Overlay, which also uses glassmorphism.
- Development Notes:
    - Styling chat bubbles with full glassmorphism effects within `QTextEdit` is challenging. This will require further work, potentially using custom delegates if QSS proves insufficient.
    - Focus on maintaining readability and usability despite the transparent and layered design.
- Utilize `QThreadPool` and `QRunnable` for long-running tasks (like API calls) to keep the UI responsive.
- Ensure that UI elements are updated on the main thread, typically using Qt signals and slots if updates are triggered from worker threads.
- Follow PEP 8 guidelines for Python code.
- Write clear and concise comments where necessary.
- Add unit tests for new functionalities if feasible.
- System Tray Icon: The application now integrates with the system tray (if available).
    - Closing the main window hides it to the tray.
    - The tray icon menu provides options to "Show/Hide Agent" and "Quit Agent".
    - Left-clicking the tray icon also toggles the main window's visibility.
    - `app.setQuitOnLastWindowClosed(False)` is used to enable background running.
- Auto-Start Functionality:
    - A "Start with System" option is available in the tray menu.
    - This feature relies on the `autorun` PyPI library (`pip install autorun`). If the library is not installed, this menu option will be disabled, and a warning will be shown if the user tries to enable it.
    - The application registers itself to run on system startup using `autorun.add("GeminiDesktopAgent", sys.executable, args=[os.path.abspath(__file__), "--started-from-autorun"])`.
    - The `--started-from-autorun` argument is used to launch the application with its main window initially hidden.
    - Auto-start preference is stored in `config.json` under the `auto_start_enabled` key.
    - Error handling is in place for enabling/disabling auto-start, including feedback to the user.
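The optional dependency check and the hidden-start flag described above might be handled roughly as follows. This is a sketch under assumptions: `AUTOSTART_AVAILABLE`, `parse_args`, and `should_start_hidden` are illustrative names, and the actual code in this repository may detect the library and read the flag differently.

```python
import argparse
import importlib.util

# True only if the optional `autorun` package can be found; the tray menu
# entry could be disabled when this is False. find_spec does not import it.
AUTOSTART_AVAILABLE = importlib.util.find_spec("autorun") is not None


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the command line, recognizing only the auto-start flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--started-from-autorun",
        action="store_true",
        help="Launch with the main window initially hidden.",
    )
    # parse_known_args ignores unrelated arguments instead of exiting.
    args, _unknown = parser.parse_known_args(argv)
    return args


def should_start_hidden(argv=None) -> bool:
    """Return True when the app was launched by the auto-start entry."""
    return parse_args(argv).started_from_autorun
```

At startup, the main window would be shown only when `should_start_hidden()` returns `False`; otherwise the application would stay in the tray.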
- Configuration (`config.json`):
    - Located in `~/.gemini_desktop_agent/config.json`.
    - Stores `api_key` and `auto_start_enabled` (boolean).
    - The application attempts to handle corrupted config files by resetting to defaults.
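The reset-to-defaults behavior for a corrupted config file can be sketched as below. The function names and the default values (`None` for `api_key`, `False` for `auto_start_enabled`) are assumptions made for illustration; only the file location and the two keys come from the notes above.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".gemini_desktop_agent" / "config.json"
# Assumed defaults; the application's real defaults may differ.
DEFAULT_CONFIG = {"api_key": None, "auto_start_enabled": False}


def load_config(path: Path = CONFIG_PATH) -> dict:
    """Load the config, falling back to defaults if missing or corrupted."""
    try:
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        # Missing or unreadable file, or invalid JSON (JSONDecodeError
        # is a subclass of ValueError).
        return dict(DEFAULT_CONFIG)
    if not isinstance(data, dict):
        return dict(DEFAULT_CONFIG)
    # Fill in any keys the stored file lacks.
    return {**DEFAULT_CONFIG, **data}


def save_config(config: dict, path: Path = CONFIG_PATH) -> None:
    """Write the config, creating the parent directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
```

Merging the loaded data over the defaults means a config file written by an older version, lacking `auto_start_enabled`, would still load with a usable value.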
- Icon:
    - The application will look for `icons/app_icon.png` for the tray and window icon. If not found, it falls back to system/standard icons or a dynamically generated one. It's recommended to create and place an `app_icon.png` (e.g., 64x64 or 128x128) in an `icons` directory at the project root.
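The first step of the icon lookup can be sketched as a simple path check. `find_app_icon` is a hypothetical helper, and how the project root is determined is left to the caller; the fallback to system or generated icons is not shown because it depends on the Qt code in this repository.

```python
from pathlib import Path
from typing import Optional


def find_app_icon(project_root: Path) -> Optional[Path]:
    """Return the path to icons/app_icon.png, or None if it is absent."""
    candidate = project_root / "icons" / "app_icon.png"
    return candidate if candidate.is_file() else None
```

A `None` result would signal the caller to use one of the fallback icons instead.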