Releases: isac322/kwin-mcp
v0.7.0 — Live Desktop Automation
kwin-mcp v0.7.0 transforms the project from a virtual-only testing tool into a dual-mode desktop automation platform — now supporting both isolated virtual sessions and live desktop connections. AI agents can connect to a real KDE Plasma desktop (or a KWin instance inside a container) to automate, observe, and collaborate on a running session, unlocking use cases like "share my screen" AI collaboration and container-based agent desktops.
Highlights
- `session_connect` (live desktop automation): New MCP tool to attach to an existing KWin session instead of creating an isolated virtual one. Defaults to `$DBUS_SESSION_BUS_ADDRESS` and `$WAYLAND_DISPLAY` from the environment. All 30 MCP tools (mouse, keyboard, touch, clipboard, accessibility tree, screenshot, window management) work identically in both virtual and live modes.
- `--default-live-session` flag: Start the MCP server or CLI in live-session-default mode. When active, `session_connect` becomes the default and `session_start` requires explicit invocation. Supports both `kwin-mcp --default-live-session` and `kwin-mcp-cli --default-live-session`.
- Safe live session lifecycle: `session_stop` on a live session only disconnects; it never kills KWin or pre-existing applications. Clipboard is always enabled for live sessions (no `enable_clipboard` parameter needed).
- Container desktop support: Connect to KWin running inside `systemd-nspawn` or other container environments by passing explicit `dbus_address` and `wayland_display` parameters to `session_connect`.
New Tool
| Tool | Parameters | Description |
|---|---|---|
| `session_connect` | `dbus_address?` str, `wayland_display?` str, `keep_screenshots?` bool (false) | Connect to an existing KWin session (real desktop or container) |
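The environment fallback for `session_connect` can be pictured with a minimal sketch. The helper `resolve_target` and the `ConnectionTarget` type are hypothetical names used only for illustration, not part of kwin-mcp's API; the only facts taken from the notes above are the two parameters and the two environment variables they default to.

```python
import os
from typing import Mapping, NamedTuple, Optional


class ConnectionTarget(NamedTuple):
    """Hypothetical container for the two values a live connection needs."""
    dbus_address: str
    wayland_display: str


def resolve_target(
    dbus_address: Optional[str] = None,
    wayland_display: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> ConnectionTarget:
    """Prefer explicit arguments; otherwise fall back to the environment."""
    dbus = dbus_address or env.get("DBUS_SESSION_BUS_ADDRESS")
    display = wayland_display or env.get("WAYLAND_DISPLAY")
    if not dbus or not display:
        # Assumption: a missing value is treated as an error in this sketch.
        raise ValueError("no D-Bus address or Wayland display available")
    return ConnectionTarget(dbus, display)
```

For a container desktop, both values would be passed explicitly; on a regular desktop session, calling the helper with no arguments picks up the host's own bus and display.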
Architecture
Both session modes share the same 30 MCP tools and delegate to AutomationEngine (core.py):
session_start (virtual) → dbus-run-session + kwin_wayland --virtual
session_connect (live) → existing KWin (real desktop / container)
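The difference between the two modes shows up mainly at shutdown. The sketch below is an illustrative model, not the code in `core.py`: `SessionSketch` and its fields are hypothetical, and how apps launched by the agent inside a live session are treated is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SessionSketch:
    """Hypothetical model of a session; mode is "virtual" or "live"."""
    mode: str
    owned_pids: List[int] = field(default_factory=list)
    connected: bool = True

    def stop(self) -> List[int]:
        """Disconnect and return the PIDs that should be terminated."""
        self.connected = False
        if self.mode == "live":
            # Live mode only disconnects: KWin and pre-existing
            # applications belong to the user, so nothing is killed.
            # (Assumption: this sketch also leaves agent-launched apps alone.)
            return []
        # Virtual mode owns the whole isolated session and cleans it up.
        to_kill, self.owned_pids = self.owned_pids, []
        return to_kill
```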
Technologies: AT-SPI2 for accessibility, KWin EIS D-Bus + libei for input injection, KWin ScreenShot2 D-Bus for screenshots, PyGObject for GObject introspection, wl-clipboard and wtype for clipboard and Unicode input.
Installation
# Using uv (recommended)
uv tool install kwin-mcp
# Or using pip
pip install kwin-mcp
Full Changelog: v0.6.0...v0.7.0
v0.6.0
kwin-mcp v0.6.0 adds home directory isolation for reproducible testing environments and smarter AT-SPI2 queries with state/role filtering and automatic retry logic.
Highlights
- Home directory isolation: `session_start` now supports `isolate_home=true` to create a temporary HOME with isolated XDG directories (`XDG_CONFIG_HOME`, `XDG_DATA_HOME`, `XDG_CACHE_HOME`, `XDG_STATE_HOME`), preventing apps from reading or writing host user settings. Use `keep_home=true` to preserve the isolated home after `session_stop` for inspection.
- AT-SPI2 state-aware queries: `find_ui_elements` gains a `states` parameter for filtering by AT-SPI2 states (e.g. `["focused"]`, `["active", "visible"]`), and `wait_for_element` gains `expected_states` for polling until elements reach specific states.
- Role-based tree filtering: `accessibility_tree` now accepts a `role` parameter to filter the tree to specific element types (e.g. `"button"`, `"check box"`). Non-matching elements are hidden but their children are still traversed.
- Window state markers: `list_windows` now shows per-window titles with `[active]`/`[focused]` state markers via AT-SPI2.
- AT-SPI2 retry logic: Subprocess queries (`_run_atspi`) now retry once with a 0.5s delay on failure, improving resilience against transient AT-SPI2 bus instability.
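Two of the behaviors above lend themselves to a short sketch: hiding non-matching elements while still descending into their children, and retrying a query once after a 0.5s pause. `Node`, `filter_by_role`, and `retry_once` are hypothetical names for illustration; the real tree comes from AT-SPI2 and the real retry wraps a subprocess call.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


@dataclass
class Node:
    """Hypothetical stand-in for an accessibility tree element."""
    role: str
    name: str
    children: List["Node"] = field(default_factory=list)


def filter_by_role(node: Node, role: str, depth: int = 0) -> Iterator[Tuple[int, Node]]:
    """Yield (depth, node) for matching nodes; always descend into children."""
    if node.role == role:
        yield depth, node
    for child in node.children:
        # A non-matching parent is hidden, but its subtree is still searched.
        yield from filter_by_role(child, role, depth + 1)


def retry_once(
    query: Callable[[], T],
    delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run query; on failure wait `delay` seconds and try exactly once more."""
    try:
        return query()
    except Exception:
        sleep(delay)
        return query()  # a second failure propagates to the caller
```

With a tree of frame > panel > button, filtering on `"button"` still finds the button even though neither ancestor matches.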
Installation
# Using uv (recommended)
uv tool install kwin-mcp
# Or using pip
pip install kwin-mcp
Full Changelog: v0.5.1...v0.6.0
v0.5.1
kwin-mcp v0.5.1 fixes screen resolution parameters and adds screenshot preservation for debugging.
What's Changed
- Fix screen resolution: `session_start` `screen_width`/`screen_height` parameters were being ignored; they are now correctly passed as `--width`/`--height` flags to `kwin_wayland`
- `keep_screenshots` option: New `session_start` parameter to preserve screenshot files after `session_stop`, useful for debugging and CI artifact collection
- SEO & documentation: Added documentation guidelines to CLAUDE.md, docs-seo custom agent, release-notes skill, GitHub issue/PR templates, and CONTRIBUTING.md
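The resolution fix amounts to forwarding the two parameters onto the compositor's command line. A minimal sketch, assuming a hypothetical helper `kwin_virtual_argv` and illustrative default values (the project's actual defaults are not stated here):

```python
from typing import List


def kwin_virtual_argv(screen_width: int = 1920, screen_height: int = 1080) -> List[str]:
    """Build a kwin_wayland command line for a virtual output of the given size."""
    return [
        "kwin_wayland",
        "--virtual",
        "--width", str(screen_width),    # previously dropped, per the fix above
        "--height", str(screen_height),
    ]
```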
Full Changelog: v0.5.0...v0.5.1
v0.5.0
kwin-mcp v0.5.0 introduces AutomationEngine — a clean separation of MCP-independent automation logic — and an interactive CLI for rapid testing of all 29 MCP tools.
Highlights
- AutomationEngine (`core.py`): All automation logic (session, input, screenshot, accessibility) extracted from `server.py` into a standalone class, making it reusable outside MCP contexts
- Interactive CLI (`kwin-mcp-cli`): New entry point with REPL and pipe mode for testing tools without an MCP client
- Thin MCP server: `server.py` reduced to lightweight wrappers delegating to AutomationEngine
- Improved session startup: Cleaner AT-SPI2 bus address propagation and reduced launcher sleep time
Installation
uv tool install kwin-mcp
pip install kwin-mcp
Full Changelog: v0.4.2...v0.5.0
v0.4.2
What's Changed
Improve MCP tool spec quality for better LLM integration
- Add `Annotated[type, Field(description=...)]` to all parameters across all 29 tools, so MCP clients expose per-parameter descriptions in JSON Schema
- Remove redundant docstring `Args:` sections that duplicated Field descriptions
- Improve tool-level docstrings with prerequisites, return value formats, and usage guidance
- Add coordinate system explanation to mouse/touch tools (0,0 = top-left)
- Clarify `keyboard_type` vs `keyboard_type_unicode` distinction
- Document `enable_clipboard=true` prerequisite for clipboard tools
- Add dbus-send argument format examples to `dbus_call`
- Note AT-SPI2 visibility limitation in `list_windows`
- Clarify case-insensitive substring matching in `focus_window`
Full Changelog: v0.4.1...v0.4.2
v0.4.1
Fixed
- Explicitly pass `KWIN_WAYLAND_NO_PERMISSION_CHECKS` and `KWIN_SCREENSHOT_NO_PERMISSION_CHECKS` env vars directly to the KWin process in the wrapper script
- Environment variable inheritance through `dbus-run-session` was unreliable, causing the restricted Wayland protocol access added in v0.4.0 (`org_kde_plasma_window_management`, etc.) and `.desktop` file `X-KDE-Wayland-Interfaces` declarations to not take effect
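The idea behind this fix is to place the variables on the KWin command itself and not rely on inheritance. The project does this in its wrapper script; the sketch below shows the same idea with the standard `env` utility. The helper name `with_explicit_env` is hypothetical, and the value `1` for the screenshot variable is an assumption.

```python
from typing import Dict, List

# Variables named in the fix above; values are assumed to be "1".
PERMISSION_ENV: Dict[str, str] = {
    "KWIN_WAYLAND_NO_PERMISSION_CHECKS": "1",
    "KWIN_SCREENSHOT_NO_PERMISSION_CHECKS": "1",
}


def with_explicit_env(argv: List[str], extra_env: Dict[str, str]) -> List[str]:
    """Prefix argv with `env KEY=VALUE ...` so the child sees the variables
    regardless of what the intermediate launcher passes through."""
    assignments = [f"{key}={value}" for key, value in extra_env.items()]
    return ["env", *assignments, *argv]


# The variables are attached to kwin_wayland directly, inside dbus-run-session.
command = ["dbus-run-session", "--", *with_explicit_env(["kwin_wayland", "--virtual"], PERMISSION_ENV)]
```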
Full Changelog: v0.4.0...v0.4.1
v0.4.0
Highlights
Restricted Wayland Protocol Access (Critical)
Set KWIN_WAYLAND_NO_PERMISSION_CHECKS=1 in isolated sessions, allowing clients to bind KWin-blacklisted Wayland protocols.
- Enables access to `org_kde_plasma_window_management`, `org_kde_kwin_fake_input`, and other restricted protocols
- Apps using Plasma's TasksModel / window management APIs can now be tested
- Same pattern as the existing `KWIN_SCREENSHOT_NO_PERMISSION_CHECKS`; safe in isolated virtual sessions
App stdout/stderr Capture
launch_app and session_start now redirect app output to per-app log files.
- New `read_app_log` MCP tool to retrieve logs by PID
- Isolated log files per app (`app_{name}_{counter}.log`)
- Automatic cleanup on session stop
Wayland Protocol Diagnostics
New wayland_info MCP tool enumerates Wayland globals exposed in the session.
- `filter_protocol` parameter for searching specific protocols
- Useful for debugging protocol access issues
Environment Variable Passthrough & Shell Parsing
- `session_start` and `launch_app` now accept an `env` parameter for passing extra environment variables to launched apps
- Command parsing changed from `str.split()` to `shlex.split()`, which properly handles quoted shell commands (e.g. `bash -c 'echo hello world'`)
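The parsing change is easy to see on the quoted command from the example above. The `child_env` helper is a hypothetical illustration of how extra variables might be layered over the inherited environment; the project's actual merge logic is not shown in these notes.

```python
import os
import shlex
from typing import Dict, Mapping

command = "bash -c 'echo hello world'"

# str.split() breaks on every space and keeps the quote characters.
naive = command.split()
# naive == ['bash', '-c', "'echo", 'hello', "world'"]

# shlex.split() follows shell quoting rules, so the script stays one argument.
parsed = shlex.split(command)
# parsed == ['bash', '-c', 'echo hello world']


def child_env(extra: Mapping[str, str], base: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Assumed merge order: caller-supplied variables override inherited ones."""
    return {**base, **extra}
```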
Internal Changes
- `Session.launch_app()` return type: `int` → `AppInfo` dataclass (pid, command, log_path, process)
- `SessionInfo` now tracks all launched apps via `apps: dict[int, AppInfo]`
- MCP tool count: 27 → 30
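The shape of these internal types can be sketched from the field names listed above. The field types, the `log_name` helper, and the reduced `SessionInfo` are assumptions for illustration; only the field names, the `apps` mapping, and the log file pattern come from the notes.

```python
import subprocess
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, Optional


@dataclass
class AppInfo:
    """Fields as listed in the release notes; types are assumed."""
    pid: int
    command: str
    log_path: Path
    process: Optional[subprocess.Popen] = None


@dataclass
class SessionInfo:
    """Reduced to the one attribute mentioned above: apps keyed by PID."""
    apps: Dict[int, AppInfo] = field(default_factory=dict)


def log_name(name: str, counter: int) -> str:
    """Per-app log file name following the app_{name}_{counter}.log pattern."""
    return f"app_{name}_{counter}.log"


session = SessionInfo()
info = AppInfo(pid=4242, command="kcalc", log_path=Path("/tmp") / log_name("kcalc", 1))
session.apps[info.pid] = info  # lookup by PID, as read_app_log would need
```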
Full Changelog: v0.3.0...v0.4.0
v0.3.0
Added
- M5.1 E2E input features: touch input (tap, swipe, pinch, multi-finger swipe), clipboard (get/set), Unicode text input (wtype/wl-copy fallback), window management (`launch_app`, `list_windows`, `focus_window`), `dbus_call`, and `wait_for_element`; 17 new MCP tools total
Fixed
- External binary missing errors now return helpful install instructions instead of raw `FileNotFoundError` (affects `wl-clipboard`, `wtype`, `dbus-send`, `spectacle`)
v0.2.0
What's New
Composite Frame Capture
Action tools now support capturing screenshots at specified delays after an action completes — all within a single MCP call. This enables AI agents to observe transient UI states like hover effects, click animations, and menu transitions without extra round-trips.
New parameter: screenshot_after_ms on mouse_click, mouse_move, mouse_drag, keyboard_type, keyboard_key
mouse_click(x=100, y=200, screenshot_after_ms=[0, 50, 100, 200, 500])
Returns the action result plus captured frame paths:
Clicked left at (100, 200)
Captured 5 frames:
0ms: /tmp/.../frame_000_0ms.png (75.8 KB)
50ms: /tmp/.../frame_001_50ms.png (78.5 KB)
...
Fast D-Bus Screenshot Capture
Frame capture uses KWin's ScreenShot2 D-Bus interface directly (~30-70ms per frame), bypassing the spectacle CLI (~200-300ms). Burst capture is further optimized with a two-phase pipeline: raw frame capture with accurate timing, then deferred PNG encoding.
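The two-phase pipeline can be sketched as follows. This is an illustrative model under stated assumptions, not the project's implementation: `burst_capture`, `capture_raw`, and `encode_png` are hypothetical names, with the two callables standing in for the ScreenShot2 D-Bus call and the PNG encoder. The file naming follows the sample output above.

```python
import time
from typing import Callable, List, Sequence, Tuple


def burst_capture(
    delays_ms: Sequence[int],
    capture_raw: Callable[[], bytes],
    encode_png: Callable[[bytes], bytes],
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[str, bytes]]:
    """Capture raw frames at the requested offsets, then encode them all."""
    start = clock()
    raw_frames: List[Tuple[int, bytes]] = []

    # Phase 1: timing-sensitive. Only grab raw pixels so that slow PNG
    # encoding cannot delay the next capture.
    for delay in sorted(delays_ms):
        remaining = start + delay / 1000 - clock()
        if remaining > 0:
            sleep(remaining)
        raw_frames.append((delay, capture_raw()))

    # Phase 2: deferred encoding, after the last frame has been taken.
    results: List[Tuple[str, bytes]] = []
    for index, (delay, raw) in enumerate(raw_frames):
        name = f"frame_{index:03d}_{delay}ms.png"
        results.append((name, encode_png(raw)))
    return results
```

Deferring the encode step keeps the gap between consecutive captures close to the requested delays, which matters when the offsets are only tens of milliseconds apart.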
Full Changelog
v0.1.0 — Initial Release
The first release of kwin-mcp — an MCP server for GUI automation on KDE Plasma 6 Wayland.
This release enables AI agents (like Claude Code) to autonomously launch, interact with, and observe any Wayland GUI application in a fully isolated virtual KWin session.
Highlights
- Isolated KWin sessions: `session_start` creates a sandboxed environment using `dbus-run-session` + `kwin_wayland --virtual`, completely isolated from the host desktop (D-Bus, display, and input).
- 11 MCP tools for the full observe-act loop: session management, screenshot capture, accessibility tree inspection, element search, mouse input (click, move, scroll, drag), and keyboard input (typing and key combinations).
- Zero authorization prompts: Connects directly to KWin's private EIS D-Bus interface, bypassing the XDG RemoteDesktop portal.
- Structured UI understanding: AT-SPI2 accessibility tree provides widget roles, names, states, coordinates, and available actions without requiring screenshot analysis.
- Full input emulation via libei: absolute pointer positioning, US QWERTY keyboard with modifier support (Ctrl, Alt, Shift, Super), smooth drag interpolation.
Available Tools
| Tool | Description |
|---|---|
| `session_start` | Start an isolated KWin Wayland session, optionally launching an app |
| `session_stop` | Stop the session and clean up all processes |
| `screenshot` | Capture a screenshot of the virtual display (PNG) |
| `accessibility_tree` | Get the AT-SPI2 widget tree |
| `find_ui_elements` | Search for UI elements by name, role, or description |
| `mouse_click` | Click (left/right/middle, single/double) at coordinates |
| `mouse_move` | Move the cursor to coordinates |
| `mouse_scroll` | Scroll vertically or horizontally |
| `mouse_drag` | Drag between two points with smooth interpolation |
| `keyboard_type` | Type text (US QWERTY) |
| `keyboard_key` | Press key combinations (e.g., ctrl+c, alt+F4) |
Installation
# Using uv (recommended)
uv tool install kwin-mcp
# Or using pip
pip install kwin-mcp
Requirements
- KDE Plasma 6 (Wayland session)
- Python 3.12+
- System packages: `spectacle`, `at-spi2-core`, `python-gobject`, `dbus-python`
See the README for distro-specific installation instructions.