A Node.js/TypeScript-based interactive AI agent system that provides a rich Web UI for real-time, multimodal interaction between AI models and users. This tool implements the Model Context Protocol (MCP) and is designed for use with AI editors like Cursor, Windsurf, and other intelligent development environments.
- Multi-modal Interaction: Support for text and image inputs with real-time feedback
- Web Notifications: Browser-native notifications ensure you never miss important messages, even when the tab is in the background
- Desktop GUI: Automatically downloads and opens the interface in a Sidecar window
- Dual MCP Tools:
  - `solicit-input`: Interactive mode that collects user feedback through a web interface
  - `notify-user`: Notification mode that sends information to users without waiting for a response
- Lazy Server Start: HTTP server only starts when needed to avoid port conflicts
- Session Management: Smart session handling with timeouts and auto-reconnection
- Real-time Communication: WebSocket-based live updates between frontend and backend
- Responsive UI: Modern, clean interface with dark/light theme support
- Page Visibility Detection: Automatically detects when users switch to background tabs
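
The lazy server start listed above can be pictured as a start-once wrapper: nothing is created until the first caller asks for it. The sketch below illustrates that general pattern only. `LazyService`, `startFn`, and `FakeServer` are hypothetical names, and the stand-in object does not bind a real port.

```typescript
// Hypothetical sketch: start an expensive resource only on first use.
// `LazyService` and `startFn` are illustrative names, not part of dynamic-interaction.
class LazyService<T> {
  private instance: T | undefined = undefined;
  private starts = 0;
  private startFn: () => T;

  constructor(startFn: () => T) {
    this.startFn = startFn;
  }

  // Returns the running instance, starting it on the first call only.
  get(): T {
    if (this.instance === undefined) {
      this.instance = this.startFn();
      this.starts += 1;
    }
    return this.instance;
  }

  // True once the underlying resource has been started.
  get started(): boolean {
    return this.instance !== undefined;
  }

  // Number of times the start function actually ran.
  get startCount(): number {
    return this.starts;
  }
}

// Stand-in for an HTTP server handle; a real one would bind a port here.
interface FakeServer {
  port: number;
}

const lazyServer = new LazyService<FakeServer>(() => ({ port: 10086 }));
```

Until `lazyServer.get()` is called, no server object exists, so a process that never needs the Web UI never touches the port.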
- Backend: Node.js, Express.js, TypeScript, WebSocket, Winston
- Frontend: TypeScript, HTML5, CSS3, Web Notifications API, Page Visibility API
- Protocol: Model Context Protocol (MCP)
- Libraries: marked.js, highlight.js, lucide icons
- Install the package globally:
  npm install -g dynamic-interaction
- Or install from source:
  git clone https://github.com/stone2401/dynamic-interaction.git
  cd dynamic-interaction
  pnpm install
  pnpm run build
  make link # Links as global CLI tool
During installation, the appropriate Sidecar binary is downloaded automatically so the desktop GUI works out of the box.
As MCP Server: Add to your MCP client configuration (e.g., Claude Desktop):
"dynamic-interaction": {
  "command": "npx",
  "args": [
    "-y",
    "dynamic-interaction@latest"
  ],
  "env": {}
},

AI rules: add the following content to your AI editor's rules:
- **Interaction Channel**: All interactions with the user must be conducted through the `dynamic-interaction` interface.
- **Solicit Input (`solicit-input`)**: This is the **only** way for the AI to initiate any interaction. Whether it's requesting approval, reporting an obstacle, seeking guidance, or asking to conclude the conversation after a task is complete, it must be done through this interface. After calling it, the AI must pause and wait for the user's response.
- **Notify User (`notify-user`)**: This interface is used to send one-way notifications to the user that do not require an immediate response, such as mid-task progress updates. After calling it, the AI can continue its execution without waiting.
- **Task Closure Constraint**: The AI is **strictly prohibited** from unilaterally deciding to end the current conversation or task. After all tasks are completed, it must use the `solicit-input` interface.
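
The difference between the two tools in the rules above is blocking versus fire-and-forget. The following is a minimal model of that contract, not the project's real API: `InteractionChannel`, `reply`, and `waitingCount` are hypothetical names, and the user's answer is simulated in memory rather than arriving over a WebSocket.

```typescript
// Hypothetical model of the two interaction modes; not the project's real API.
interface PendingRequest {
  prompt: string;
  resolve: (reply: string) => void;
}

class InteractionChannel {
  readonly notifications: string[] = [];
  private pending: PendingRequest[] = [];

  // solicit-input: the returned promise stays pending until the user replies.
  solicitInput(prompt: string): Promise<string> {
    return new Promise<string>((resolve) => {
      this.pending.push({ prompt, resolve });
    });
  }

  // notify-user: records the message and returns immediately.
  notifyUser(message: string): void {
    this.notifications.push(message);
  }

  // Simulates the user answering the oldest outstanding request.
  reply(text: string): boolean {
    const next = this.pending.shift();
    if (next === undefined) return false;
    next.resolve(text);
    return true;
  }

  // Number of solicit-input calls still waiting for an answer.
  get waitingCount(): number {
    return this.pending.length;
  }
}
```

A caller would `await channel.solicitInput(...)` and stop there until `reply` runs, whereas `notifyUser` never adds anything to the waiting queue.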
Direct CLI Usage:
# Start the MCP server
dynamic-interaction
# Development mode
pnpm run dev
# Build project
pnpm run build

Environment variables can be configured via a .env file or direct export:
| Variable | Description | Default |
|---|---|---|
| `PORT` | HTTP server port | 10086 |
| `SESSION_TIMEOUT` | Session timeout in seconds | 300 |
| `TIMEOUT_PROMPT` | Default prompt on session timeout | "continue" |
| `DEFAULT_LANGUAGE` | Default interface language ("zh", "en", etc.) | "zh" |
| `AUTO_OPEN_SIDECAR` | Launch Sidecar desktop GUI on server start | true |
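
As an illustration of how the variables in the table above could be read with their defaults, here is a small sketch. The variable names and default values come from the table; `AppConfig`, `loadConfig`, `intOr`, and `boolOr` are hypothetical and are not taken from the project's `config.ts`. The fallback behavior for invalid values is also an assumption.

```typescript
// Sketch only: names and defaults come from the table above, but
// `AppConfig` and `loadConfig` are hypothetical, not the project's config.ts.
interface AppConfig {
  port: number;
  sessionTimeoutSeconds: number;
  timeoutPrompt: string;
  defaultLanguage: string;
  autoOpenSidecar: boolean;
}

type Env = Record<string, string | undefined>;

// Parses a positive integer, falling back to the default on missing or invalid input.
function intOr(value: string | undefined, fallback: number): number {
  const n = Number.parseInt(value ?? "", 10);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Treats only the strings "true"/"false" (any case) as booleans; anything else uses the default.
function boolOr(value: string | undefined, fallback: boolean): boolean {
  const v = value?.trim().toLowerCase();
  if (v === "true") return true;
  if (v === "false") return false;
  return fallback;
}

function loadConfig(env: Env): AppConfig {
  return {
    port: intOr(env["PORT"], 10086),
    sessionTimeoutSeconds: intOr(env["SESSION_TIMEOUT"], 300),
    timeoutPrompt: env["TIMEOUT_PROMPT"] ?? "continue",
    defaultLanguage: env["DEFAULT_LANGUAGE"] ?? "zh",
    autoOpenSidecar: boolOr(env["AUTO_OPEN_SIDECAR"], true),
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`; taking the environment as a parameter keeps the function easy to test.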
| Variable | Description | Default |
|---|---|---|
| `LOG_ENABLED` | Enable logging system | false |
| `LOG_DIR` | Log files storage directory | ~/.dynamic-interaction/logs |
| `LOG_ERROR_FILE` | Error log filename | error.log |
| `LOG_COMBINED_FILE` | Combined log filename | combined.log |
| `LOG_LEVEL` | Log level (error, warn, info, http, verbose, debug, silly) | info |
| `LOG_COLORIZE` | Colorized console output | true |
| `LOG_TO_FILE` | Output logs to file (requires LOG_ENABLED=true) | true |
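
Two details of the logging options are easy to misread: the level names are ordered by priority (they match Winston's default npm-style levels, where `error` is the most severe), and file output depends on two switches. The sketch below spells both out. `LogSettings`, `shouldLog`, and `writesToFile` are hypothetical helpers, and treating `LOG_ENABLED=false` as "emit nothing" is an assumption.

```typescript
// Winston's default npm-style levels, ordered from most to least severe.
const LOG_LEVELS = ["error", "warn", "info", "http", "verbose", "debug", "silly"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

// Hypothetical settings shape mirroring three of the LOG_* variables.
interface LogSettings {
  enabled: boolean; // LOG_ENABLED
  toFile: boolean;  // LOG_TO_FILE
  level: LogLevel;  // LOG_LEVEL
}

// A message is emitted when it is at least as severe as the configured level.
// Assumption: a disabled logging system emits nothing at all.
function shouldLog(settings: LogSettings, messageLevel: LogLevel): boolean {
  if (!settings.enabled) return false;
  return LOG_LEVELS.indexOf(messageLevel) <= LOG_LEVELS.indexOf(settings.level);
}

// File output needs both switches on, per the LOG_TO_FILE description.
function writesToFile(settings: LogSettings): boolean {
  return settings.enabled && settings.toFile;
}
```

With the default `LOG_LEVEL` of `info`, this model lets `error`, `warn`, and `info` messages through and drops `http` and anything more verbose.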
Example:
PORT=8080 LOG_ENABLED=true dynamic-interaction

The system provides comprehensive notification support:
- Browser Notifications: Native browser notifications when the tab is in background
- Permission Management: Automatic permission requests and status checking
- Page Visibility Detection: Uses Page Visibility API to detect when users switch tabs
- Smart Notification Logic: Only shows browser notifications when the page is not visible
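
The notification rule described above reduces to a small decision: show a browser notification only when the page is hidden and permission has been granted. A minimal sketch follows, with `shouldShowBrowserNotification` as a hypothetical helper name. The string unions are declared locally so the snippet runs outside a browser; they mirror the values of `document.visibilityState` and `Notification.permission`.

```typescript
// Local unions mirroring document.visibilityState and Notification.permission.
type Visibility = "visible" | "hidden";
type Permission = "granted" | "denied" | "default";

// Hypothetical decision helper; not the project's actual implementation.
function shouldShowBrowserNotification(visibility: Visibility, permission: Permission): boolean {
  // A visible page can show the message in the UI itself.
  if (visibility === "visible") return false;
  // A hidden page can only notify if the user has already granted permission.
  return permission === "granted";
}
```

In a browser, the inputs would come from `document.visibilityState` and `Notification.permission`, and a `true` result would be followed by `new Notification(title, { body })`.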
- `solicit-input`
  - Opens interactive web interface
  - Supports text and image inputs
  - Real-time session management
  - Automatic cleanup on timeout
- `notify-user`
  - Sends notifications without waiting for user input
  - Shows browser notifications for background users
  - Customizable notification content
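
The "automatic cleanup on timeout" behavior of `solicit-input` can be sketched as a periodic sweep over open sessions. This is an illustration under stated assumptions, not the project's code: `SessionTracker` and its methods are hypothetical, the clock is passed in explicitly to keep the example deterministic, and returning the `TIMEOUT_PROMPT` value as the reply for an expired session is an assumed interpretation of that setting.

```typescript
// Hypothetical session tracker; timestamps are passed in so the sketch is deterministic.
interface Session {
  id: string;
  startedAtMs: number;
}

interface ExpiredSession {
  id: string;
  reply: string; // assumed: the TIMEOUT_PROMPT value stands in for the missing user reply
}

class SessionTracker {
  private sessions: Session[] = [];
  private timeoutMs: number;
  private timeoutPrompt: string;

  constructor(timeoutSeconds: number, timeoutPrompt: string) {
    this.timeoutMs = timeoutSeconds * 1000;
    this.timeoutPrompt = timeoutPrompt;
  }

  // Registers a new session that started at the given time.
  open(id: string, nowMs: number): void {
    this.sessions.push({ id, startedAtMs: nowMs });
  }

  // Removes sessions older than the timeout and returns a fallback reply for each.
  sweep(nowMs: number): ExpiredSession[] {
    const isExpired = (s: Session): boolean => nowMs - s.startedAtMs >= this.timeoutMs;
    const expired = this.sessions.filter(isExpired);
    this.sessions = this.sessions.filter((s) => !isExpired(s));
    return expired.map((s) => ({ id: s.id, reply: this.timeoutPrompt }));
  }

  // Number of sessions that are still open.
  get activeCount(): number {
    return this.sessions.length;
  }
}
```

With the documented defaults this would be constructed as `new SessionTracker(300, "continue")` and swept on a timer using `Date.now()`.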
- Modular Design: Clean separation between MCP server, HTTP server, and WebSocket transport
- Session Management: Smart session handling with timeouts and cleanup
- State Management: Centralized server state management
- Error Handling: Comprehensive error handling and graceful degradation
- Security: Input validation and sanitization
src/
├── mcp/ # MCP server implementation
├── server/ # HTTP server and WebSocket handling
├── public/ # Frontend assets
│ ├── ts/ # TypeScript frontend code
│ ├── css/ # Stylesheets
│ └── index.html # Main UI
├── types/ # Shared TypeScript interfaces
├── utils/ # Utility functions
└── config.ts # Configuration management
# Install dependencies
pnpm install
# Development mode with hot reload
pnpm run dev
# Build for production
pnpm run build
# Start built application
pnpm start
# Alternative build using Makefile
make build
make start

The frontend uses a modular TypeScript architecture:
- Services: WebSocket communication, notifications, themes
- Components: UI components like modals, status bars, feedback forms
- Utils: Helper functions and DOM utilities
- Core: Application core, event system, and type definitions
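
To give a feel for what a typed event system in such a frontend core might look like, here is a minimal sketch. It is not the project's implementation: `EventBus`, `AppEvents`, and the event names `theme:changed` and `ws:status` are invented for illustration.

```typescript
// Hypothetical event map; the event names are invented for illustration.
interface AppEvents {
  "theme:changed": { theme: "dark" | "light" };
  "ws:status": { connected: boolean };
}

type AnyHandler = (payload: unknown) => void;

interface Listener<E> {
  event: keyof E;
  handler: AnyHandler;
}

class EventBus<E> {
  private listeners: Listener<E>[] = [];

  // Registers a handler and returns a function that removes it again.
  on<K extends keyof E>(event: K, handler: (payload: E[K]) => void): () => void {
    // The cast is safe because emit() only passes payloads of type E[K] for this event.
    const entry: Listener<E> = { event, handler: handler as AnyHandler };
    this.listeners.push(entry);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== entry);
    };
  }

  // Calls every handler registered for the event, in registration order.
  emit<K extends keyof E>(event: K, payload: E[K]): void {
    for (const listener of this.listeners) {
      if (listener.event === event) listener.handler(payload);
    }
  }
}
```

The generic parameter ties each event name to its payload type, so a call such as `bus.emit("ws:status", { connected: "yes" })` on an `EventBus<AppEvents>` would be rejected at compile time.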
For detailed documentation, see the docs/ directory:
- Architecture overview
- API documentation
- Deployment guides
- Configuration options
We welcome contributions! Please feel free to submit a Pull Request or create an Issue.
This project is licensed under the MIT License.
Built for the AI development community to enhance human-AI interaction in modern development workflows.
