Releases: isac322/kwin-mcp

v0.7.0 — Live Desktop Automation

29 Mar 08:42
a119f39

kwin-mcp v0.7.0 transforms the project from a virtual-only testing tool into a dual-mode desktop automation platform — now supporting both isolated virtual sessions and live desktop connections. AI agents can connect to a real KDE Plasma desktop (or a KWin instance inside a container) to automate, observe, and collaborate on a running session, unlocking use cases like "share my screen" AI collaboration and container-based agent desktops.

Highlights

  • session_connect — Live desktop automation — New MCP tool to attach to an existing KWin session instead of creating an isolated virtual one. Defaults to $DBUS_SESSION_BUS_ADDRESS and $WAYLAND_DISPLAY from the environment. All 30 MCP tools (mouse, keyboard, touch, clipboard, accessibility tree, screenshot, window management) work identically in both virtual and live modes.

  • --default-live-session flag — Start the MCP server or CLI in live-session-default mode. When active, session_connect becomes the default and session_start requires explicit invocation. Supports both kwin-mcp --default-live-session and kwin-mcp-cli --default-live-session.

  • Safe live session lifecycle: session_stop on a live session only disconnects; it never kills KWin or pre-existing applications. The clipboard is always enabled for live sessions (no enable_clipboard parameter needed).

  • Container desktop support — Connect to KWin running inside systemd-nspawn or other container environments by passing explicit dbus_address and wayland_display parameters to session_connect.

New Tool

| Tool | Parameters | Description |
| --- | --- | --- |
| session_connect | dbus_address? str, wayland_display? str, keep_screenshots? bool (false) | Connect to an existing KWin session (real desktop or container) |

Architecture

Both session modes share the same 30 MCP tools and delegate to AutomationEngine (core.py):

session_start (virtual) → dbus-run-session + kwin_wayland --virtual
session_connect (live)  → existing KWin (real desktop / container)
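
The difference between the two modes matters most at shutdown. The following is a rough sketch of that rule only; SessionSketch and its fields are invented for illustration and do not mirror the real AutomationEngine.

```python
from dataclasses import dataclass, field


@dataclass
class SessionSketch:
    """Illustrative stand-in for a session handle (not the real class)."""

    mode: str  # "virtual" (session_start) or "live" (session_connect)
    processes: list[str] = field(default_factory=list)
    connected: bool = True

    def stop(self) -> list[str]:
        """Disconnect and return the processes that would be terminated."""
        self.connected = False
        if self.mode == "live":
            # Live session: KWin and pre-existing apps are left untouched.
            return []
        # Virtual session: everything the session spawned is cleaned up.
        terminated, self.processes = self.processes, []
        return terminated
```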

Technologies: AT-SPI2 for accessibility, KWin EIS D-Bus + libei for input injection, KWin ScreenShot2 D-Bus for screenshots, PyGObject for GObject introspection, wl-clipboard and wtype for clipboard and Unicode input.

Installation

# Using uv (recommended)
uv tool install kwin-mcp

# Or using pip
pip install kwin-mcp

Full Changelog: v0.6.0...v0.7.0

v0.6.0

25 Feb 06:56
35bfc5a

kwin-mcp v0.6.0 adds home directory isolation for reproducible testing environments and smarter AT-SPI2 queries with state/role filtering and automatic retry logic.

Highlights

  • Home directory isolation: session_start now supports isolate_home=true to create a temporary HOME with isolated XDG directories (XDG_CONFIG_HOME, XDG_DATA_HOME, XDG_CACHE_HOME, XDG_STATE_HOME), preventing apps from reading or writing host user settings. Use keep_home=true to preserve the isolated home after session_stop for inspection.

  • AT-SPI2 state-aware queries: find_ui_elements gains a states parameter for filtering by AT-SPI2 states (e.g. ["focused"], ["active", "visible"]), and wait_for_element gains expected_states for polling until elements reach specific states.

  • Role-based tree filtering: accessibility_tree now accepts a role parameter to filter the tree to specific element types (e.g. "button", "check box"). Non-matching elements are hidden but their children are still traversed.

  • Window state markers: list_windows now shows per-window titles with [active]/[focused] state markers via AT-SPI2.

  • AT-SPI2 retry logic — Subprocess queries (_run_atspi) now retry once with a 0.5s delay on failure, improving resilience against transient AT-SPI2 bus instability.

Installation

# Using uv (recommended)
uv tool install kwin-mcp

# Or using pip
pip install kwin-mcp

Full Changelog: v0.5.1...v0.6.0

v0.5.1

23 Feb 13:49
c2bf989

kwin-mcp v0.5.1 fixes screen resolution parameters and adds screenshot preservation for debugging.

What's Changed

  • Fix screen resolution: session_start screen_width/screen_height parameters were being ignored — now correctly passed as --width/--height flags to kwin_wayland
  • keep_screenshots option: New session_start parameter to preserve screenshot files after session_stop, useful for debugging and CI artifact collection
  • SEO & documentation: Added documentation guidelines to CLAUDE.md, docs-seo custom agent, release-notes skill, GitHub issue/PR templates, and CONTRIBUTING.md
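
To make the resolution fix concrete, here is a minimal sketch of how the two parameters might end up on the compositor command line. The helper name and the exact way dbus-run-session wraps kwin_wayland are assumptions; only the flag names and parameter names are taken from these notes.

```python
def build_kwin_argv(screen_width: int, screen_height: int) -> list[str]:
    """Hypothetical helper: forward the requested size to kwin_wayland."""
    if screen_width <= 0 or screen_height <= 0:
        raise ValueError("screen size must be positive")
    return [
        "dbus-run-session", "--",
        "kwin_wayland", "--virtual",
        "--width", str(screen_width),
        "--height", str(screen_height),
    ]
```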

Full Changelog: v0.5.0...v0.5.1

v0.5.0

23 Feb 13:49
2349bb5

kwin-mcp v0.5.0 introduces AutomationEngine — a clean separation of MCP-independent automation logic — and an interactive CLI for rapid testing of all 29 MCP tools.

Highlights

  • AutomationEngine (core.py): All automation logic (session, input, screenshot, accessibility) extracted from server.py into a standalone class, making it reusable outside MCP contexts
  • Interactive CLI (kwin-mcp-cli): New entry point with REPL and pipe mode for testing tools without an MCP client
  • Thin MCP server: server.py reduced to lightweight wrappers delegating to AutomationEngine
  • Improved session startup: Cleaner AT-SPI2 bus address propagation and reduced launcher sleep time

Installation

uv tool install kwin-mcp
pip install kwin-mcp

Full Changelog: v0.4.2...v0.5.0

v0.4.2

22 Feb 13:51
4a19406

What's Changed

Improve MCP tool spec quality for better LLM integration

  • Add Annotated[type, Field(description=...)] to all parameters across all 29 tools, so MCP clients expose per-parameter descriptions in JSON Schema
  • Remove redundant docstring Args: sections that duplicated Field descriptions
  • Improve tool-level docstrings with prerequisites, return value formats, and usage guidance
  • Add coordinate system explanation to mouse/touch tools (0,0 = top-left)
  • Clarify keyboard_type vs keyboard_type_unicode distinction
  • Document enable_clipboard=true prerequisite for clipboard tools
  • Add dbus-send argument format examples to dbus_call
  • Note AT-SPI2 visibility limitation in list_windows
  • Clarify case-insensitive substring matching in focus_window
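
The Annotated pattern from the first item can be demonstrated without any third-party dependency. In this sketch, Field is a tiny stand-in dataclass written for the example (the real project presumably uses a pydantic-style Field), and parameter_descriptions is a hypothetical helper that shows how a client-facing schema could pick up per-parameter text.

```python
from dataclasses import dataclass
from typing import Annotated, get_type_hints


@dataclass(frozen=True)
class Field:
    """Stand-in for a pydantic-style Field; holds only a description."""

    description: str


def mouse_click(
    x: Annotated[int, Field(description="X coordinate in pixels; 0 is the left edge")],
    y: Annotated[int, Field(description="Y coordinate in pixels; 0 is the top edge")],
) -> str:
    return f"Clicked left at ({x}, {y})"


def parameter_descriptions(func) -> dict[str, str]:
    """Collect the Field description attached to each annotated parameter."""
    hints = get_type_hints(func, include_extras=True)
    descriptions = {}
    for name, hint in hints.items():
        if name == "return":
            continue
        for meta in getattr(hint, "__metadata__", ()):
            if isinstance(meta, Field):
                descriptions[name] = meta.description
    return descriptions
```

Because the description lives next to the type, a separate Args: section in the docstring would only repeat it.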

Full Changelog: v0.4.1...v0.4.2

v0.4.1

22 Feb 12:53
49dcb34

Fixed

  • Pass the KWIN_WAYLAND_NO_PERMISSION_CHECKS and KWIN_SCREENSHOT_NO_PERMISSION_CHECKS env vars directly to the KWin process in the wrapper script
  • Environment variable inheritance through dbus-run-session was unreliable, so the restricted Wayland protocol access added in v0.4.0 (org_kde_plasma_window_management, etc.) and the X-KDE-Wayland-Interfaces declarations in .desktop files did not take effect
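
One way to avoid relying on inheritance is to write the variables into the wrapper script itself. The sketch below generates such a script; the function name and the script layout are hypothetical, and the value 1 for the screenshot variable is an assumption.

```python
import shlex


def render_kwin_wrapper(kwin_argv: list[str], env: dict[str, str]) -> str:
    """Hypothetical helper: emit a shell wrapper that sets env vars on KWin itself."""
    assignments = " ".join(
        f"{name}={shlex.quote(value)}" for name, value in env.items()
    )
    # `exec env NAME=value ... command` sets the variables for that one process.
    return f"#!/bin/sh\nexec env {assignments} {shlex.join(kwin_argv)}\n"


script = render_kwin_wrapper(
    ["kwin_wayland", "--virtual"],
    {
        "KWIN_WAYLAND_NO_PERMISSION_CHECKS": "1",
        "KWIN_SCREENSHOT_NO_PERMISSION_CHECKS": "1",  # value assumed
    },
)
```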

Full Changelog: v0.4.0...v0.4.1

v0.4.0

22 Feb 11:19
3024e1d

Highlights

Restricted Wayland Protocol Access (Critical)

Set KWIN_WAYLAND_NO_PERMISSION_CHECKS=1 in isolated sessions, allowing clients to bind KWin-blacklisted Wayland protocols.

  • Enables access to org_kde_plasma_window_management, org_kde_kwin_fake_input, and other restricted protocols
  • Apps using Plasma's TasksModel / window management APIs can now be tested
  • Same pattern as the existing KWIN_SCREENSHOT_NO_PERMISSION_CHECKS — safe in isolated virtual sessions

App stdout/stderr Capture

launch_app and session_start now redirect app output to per-app log files.

  • New read_app_log MCP tool to retrieve logs by PID
  • Isolated log files per app (app_{name}_{counter}.log)
  • Automatic cleanup on session stop
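
A minimal sketch of per-app output capture, assuming a log directory owned by the session. The function name launch_with_log is hypothetical; the AppInfo field names follow the Internal Changes note in this release, and the file name follows the app_{name}_{counter}.log pattern.

```python
import shlex
import subprocess
from dataclasses import dataclass
from pathlib import Path


@dataclass
class AppInfo:
    pid: int
    command: str
    log_path: Path
    process: subprocess.Popen


def launch_with_log(command: str, log_dir: Path, counter: int) -> AppInfo:
    """Hypothetical helper: start an app with stdout/stderr sent to its own log."""
    argv = shlex.split(command)
    name = Path(argv[0]).name
    log_path = log_dir / f"app_{name}_{counter}.log"
    with open(log_path, "wb") as log_file:
        # The child keeps its own copy of the file descriptor after we close ours.
        process = subprocess.Popen(argv, stdout=log_file, stderr=subprocess.STDOUT)
    return AppInfo(process.pid, command, log_path, process)
```

Looking up a log by PID then reduces to a dictionary from pid to AppInfo, and cleanup on session stop amounts to removing the log directory.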

Wayland Protocol Diagnostics

New wayland_info MCP tool enumerates Wayland globals exposed in the session.

  • filter_protocol parameter for searching specific protocols
  • Useful for debugging protocol access issues

Environment Variable Passthrough & Shell Parsing

  • session_start and launch_app now accept an env parameter for passing extra environment variables to launched apps
  • Command parsing changed from str.split() to shlex.split() — properly handles quoted shell commands (e.g. bash -c 'echo hello world')
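
The effect of the parsing change is easy to see on the example command from the note above:

```python
import shlex

command = "bash -c 'echo hello world'"

naive = command.split()        # splits on whitespace only, ignores quoting
parsed = shlex.split(command)  # applies POSIX shell quoting rules

print(naive)   # ['bash', '-c', "'echo", 'hello', "world'"]
print(parsed)  # ['bash', '-c', 'echo hello world']
```

With str.split() the quoted script is broken into three fragments, so bash would receive a malformed -c argument; shlex.split() keeps it as a single argument.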

Internal Changes

  • Session.launch_app() return type: int → AppInfo dataclass (pid, command, log_path, process)
  • SessionInfo now tracks all launched apps via apps: dict[int, AppInfo]
  • MCP tool count: 27 → 30

Full Changelog: v0.3.0...v0.4.0

v0.3.0

22 Feb 09:29
2add0bd

Added

  • M5.1 E2E input features: touch input (tap, swipe, pinch, multi-finger swipe), clipboard (get/set), Unicode text input (wtype/wl-copy fallback), window management (launch_app, list_windows, focus_window), dbus_call, and wait_for_element — 17 new MCP tools total

Fixed

  • External binary missing errors now return helpful install instructions instead of raw FileNotFoundError (affects wl-clipboard, wtype, dbus-send, spectacle)

v0.2.0

20 Feb 14:39
72a5c51

What's New

Composite Frame Capture

Action tools now support capturing screenshots at specified delays after an action completes — all within a single MCP call. This enables AI agents to observe transient UI states like hover effects, click animations, and menu transitions without extra round-trips.

New parameter: screenshot_after_ms on mouse_click, mouse_move, mouse_drag, keyboard_type, keyboard_key

mouse_click(x=100, y=200, screenshot_after_ms=[0, 50, 100, 200, 500])

Returns the action result plus captured frame paths:

Clicked left at (100, 200)
Captured 5 frames:
  0ms: /tmp/.../frame_000_0ms.png (75.8 KB)
  50ms: /tmp/.../frame_001_50ms.png (78.5 KB)
  ...

Fast D-Bus Screenshot Capture

Frame capture uses KWin's ScreenShot2 D-Bus interface directly (~30-70ms per frame), bypassing the spectacle CLI (~200-300ms). Burst capture is further optimized with a two-phase pipeline: raw frame capture with accurate timing, then deferred PNG encoding.
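
The two-phase idea can be sketched independently of D-Bus. Here grab_raw and encode_png are caller-supplied callables standing in for the real capture and encoding steps, and capture_burst is a hypothetical name; only the frame file naming follows the sample output above.

```python
import time
from collections.abc import Callable


def capture_burst(
    delays_ms: list[int],
    grab_raw: Callable[[], bytes],
    encode_png: Callable[[bytes], bytes],
) -> list[tuple[str, bytes]]:
    """Hypothetical helper: grab raw frames on schedule, encode them afterwards."""
    start = time.monotonic()
    raw_frames: list[tuple[int, bytes]] = []

    # Phase 1: time-critical. Only sleep and grab, no encoding work.
    for delay in sorted(delays_ms):
        remaining = start + delay / 1000 - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)
        raw_frames.append((delay, grab_raw()))

    # Phase 2: deferred. Encoding cost no longer shifts capture timestamps.
    return [
        (f"frame_{index:03d}_{delay}ms.png", encode_png(raw))
        for index, (delay, raw) in enumerate(raw_frames)
    ]
```

Keeping encoding out of the first loop is what lets closely spaced delays such as 0 ms and 50 ms stay close to their requested times.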

Full Changelog

https://github.com/isac322/kwin-mcp/blob/main/CHANGELOG.md

v0.1.0 — Initial Release

20 Feb 09:24
877dcaf

The first release of kwin-mcp — an MCP server for GUI automation on KDE Plasma 6 Wayland.

This release enables AI agents (like Claude Code) to autonomously launch, interact with, and observe any Wayland GUI application in a fully isolated virtual KWin session.

Highlights

  • Isolated KWin sessions: session_start creates a sandboxed environment using dbus-run-session + kwin_wayland --virtual, completely isolated from the host desktop (D-Bus, display, and input).
  • 11 MCP tools for the full observe-act loop: session management, screenshot capture, accessibility tree inspection, element search, mouse input (click, move, scroll, drag), and keyboard input (typing and key combinations).
  • Zero authorization prompts — Connects directly to KWin's private EIS D-Bus interface, bypassing the XDG RemoteDesktop portal.
  • Structured UI understanding — AT-SPI2 accessibility tree provides widget roles, names, states, coordinates, and available actions without requiring screenshot analysis.
  • Full input emulation via libei — absolute pointer positioning, US QWERTY keyboard with modifier support (Ctrl, Alt, Shift, Super), smooth drag interpolation.
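
As a rough picture of drag interpolation, the pointer path between two points can be broken into evenly spaced absolute positions. This sketch assumes plain linear interpolation with a hypothetical function name; the notes do not say which curve or step count kwin-mcp actually uses.

```python
def interpolate_drag(
    start: tuple[int, int], end: tuple[int, int], steps: int
) -> list[tuple[int, int]]:
    """Return steps + 1 pointer positions from start to end, inclusive."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    return [
        (round(x0 + (x1 - x0) * i / steps), round(y0 + (y1 - y0) * i / steps))
        for i in range(steps + 1)
    ]


print(interpolate_drag((0, 0), (100, 40), 4))
# [(0, 0), (25, 10), (50, 20), (75, 30), (100, 40)]
```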

Available Tools

| Tool | Description |
| --- | --- |
| session_start | Start an isolated KWin Wayland session, optionally launching an app |
| session_stop | Stop the session and clean up all processes |
| screenshot | Capture a screenshot of the virtual display (PNG) |
| accessibility_tree | Get the AT-SPI2 widget tree |
| find_ui_elements | Search for UI elements by name, role, or description |
| mouse_click | Click (left/right/middle, single/double) at coordinates |
| mouse_move | Move the cursor to coordinates |
| mouse_scroll | Scroll vertically or horizontally |
| mouse_drag | Drag between two points with smooth interpolation |
| keyboard_type | Type text (US QWERTY) |
| keyboard_key | Press key combinations (e.g., ctrl+c, alt+F4) |

Installation

# Using uv (recommended)
uv tool install kwin-mcp

# Or using pip
pip install kwin-mcp

Requirements

  • KDE Plasma 6 (Wayland session)
  • Python 3.12+
  • System packages: spectacle, at-spi2-core, python-gobject, dbus-python

See the README for distro-specific installation instructions.

Links