This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Geisterhand is a macOS screen automation tool that provides both an HTTP API and a CLI for controlling the mouse and keyboard and for capturing screenshots. It requires macOS Accessibility and Screen Recording permissions.
# Build all targets
swift build
# Build for release
swift build -c release
# Run the CLI tool
swift run geisterhand
# Run the menu bar app
swift run GeisterhandApp
# Run tests
swift test
# Run a single test
swift test --filter "keyCodeMapLetters"
The project has three main targets:
The shared core functionality used by both the CLI and the app:
- Server/HTTPServer.swift: Hummingbird-based HTTP server (`GeisterhandServer` actor) running on port 7676 by default. `ServerManager` provides a synchronous wrapper for app lifecycle management.
- Server/Routes/: Individual route handlers for each endpoint (`/status`, `/screenshot`, `/click`, `/click/element`, `/type`, `/key`, `/scroll`, `/wait`, `/accessibility/*`, `/menu`)
- Input/KeyboardController.swift: CGEvent-based keyboard automation. Uses `KeyCodeMap` for key name to keycode translation. Supports PID-targeted key events via `pressKey(key:modifiers:targetPid:)`.
- Input/MouseController.swift: CGEvent-based mouse automation (clicks, scrolling). Supports PID-targeted scroll via `scroll(x:y:deltaX:deltaY:targetPid:)`.
- Screen/ScreenCaptureService.swift: ScreenCaptureKit-based screen capture (actor). Finds windows including off-screen ones for background capture.
- Accessibility/AccessibilityService.swift: AXUIElement-based UI element tree traversal, element search, and action execution (`@MainActor` singleton)
- Accessibility/MenuService.swift: Application menu discovery and triggering via accessibility APIs. Supports background mode (skip `app.activate()`).
- Permissions/PermissionManager.swift: Checks and requests Accessibility (`AXIsProcessTrusted`) and Screen Recording permissions
- Models/APIModels.swift: Codable request/response types for the HTTP API
- Models/AccessibilityModels.swift: Types for accessibility operations (`ElementPath`, `UIElementInfo`, `ElementQuery`, `AccessibilityAction`, etc.)
SwiftUI menu bar application that:
- Lives in the system menu bar with a hand icon
- Shows permission and server status via icon color (green/yellow/red)
- Manages server lifecycle through `AppDelegate`
- Uses `StatusMonitor` for periodic status updates
ArgumentParser-based CLI with subcommands: run, screenshot, click, type, key, scroll, status, server
The run subcommand is the main way to use Geisterhand. It launches (or attaches to) an app in the background (without stealing focus) and starts an HTTP server scoped to that app's PID:
geisterhand run Calculator # by app name
geisterhand run /Applications/Safari.app # by bundle path
geisterhand run com.apple.TextEdit # by bundle identifier
geisterhand run /usr/bin/python3 script.py # raw executable with args
geisterhand run Calculator --port 7676 # pin a specific port
How it works:
- Launches the app in the background (`NSWorkspace.OpenConfiguration.activates = false` / `open -g`); it does NOT steal focus or come to the foreground
- If the app is already running, attaches to it instead of launching a new instance
- Starts an HTTP server on an auto-selected free port (or `--port` to pin one)
- Prints a single JSON line to stdout: `{"port":49152,"pid":12345,"app":"Calculator","host":"127.0.0.1"}`
- All routes automatically scope to the target app's PID, so there is no need to pass `pid` or `app` in each request
- Screenshots work via ScreenCaptureKit even when the app is behind other windows
- Auto-exits when the target app/process terminates
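A client would typically read that single JSON line from stdout to learn where the server is listening. The following is a minimal Python sketch of that step, assuming the line has exactly the shape shown above. The helper name `parse_run_line` and the added `base_url` key are hypothetical and not part of Geisterhand.

```python
import json


def parse_run_line(line: str) -> dict:
    """Parse the JSON line printed by `geisterhand run` and add a base URL."""
    info = json.loads(line)
    # The expected keys are the ones shown in the example output above.
    missing = {"port", "pid", "app", "host"} - set(info)
    if missing:
        raise ValueError(f"unexpected run output, missing keys: {sorted(missing)}")
    # "base_url" is a convenience key added by this sketch, not by Geisterhand.
    info["base_url"] = f"http://{info['host']}:{info['port']}"
    return info


if __name__ == "__main__":
    sample = '{"port":49152,"pid":12345,"app":"Calculator","host":"127.0.0.1"}'
    print(parse_run_line(sample)["base_url"])
```

In practice the line would come from the first line of the child process's stdout rather than from a literal string.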
Implementation: Sources/geisterhand/main.swift — Run struct (line ~292). Uses NSWorkspace.shared.openApplication() for .app bundles, Foundation.Process for raw executables, and open -g -a as a fallback for app names. Each route handler receives a TargetApp containing pid, appName, and bundleIdentifier, and falls back to it when no explicit targeting params are provided.
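The fallback rule described above can be illustrated with a small Python stand-in. This is only a sketch of the idea, not the Swift code in `main.swift`: the `TargetApp` dataclass mirrors the three properties named above, and `resolve_pid` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TargetApp:
    """Python stand-in for the TargetApp value passed to route handlers."""
    pid: int
    app_name: str
    bundle_identifier: Optional[str] = None


def resolve_pid(explicit_pid: Optional[int], target: Optional[TargetApp]) -> Optional[int]:
    """Prefer a pid supplied in the request; otherwise use the scoped target app."""
    if explicit_pid is not None:
        return explicit_pid
    # No explicit targeting: fall back to the app the server was started for.
    return target.pid if target is not None else None


if __name__ == "__main__":
    calc = TargetApp(pid=12345, app_name="Calculator")
    print(resolve_pid(None, calc))  # request carried no pid
    print(resolve_pid(777, calc))   # request named its own pid
```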
All endpoints run on the host/port from geisterhand run output (or 127.0.0.1:7676 with geisterhand server):
- `GET /status` - System info and permission status
- `GET /screenshot` - Capture screen or specific window (supports `?app=Name` for background windows)
- `POST /click` - Click at coordinates
- `POST /click/element` - Click element by title/role/label (supports `use_accessibility_action` for background)
- `POST /type` - Type text (supports `pid`/`path`/`role`/`title` for background AX setValue)
- `POST /key` - Press key with modifiers (supports `pid` for PID-targeted, `path` for AX action)
- `POST /scroll` - Scroll at position (supports `pid`/`path` for background targeting)
- `POST /wait` - Wait for element to appear/disappear/become enabled
- `GET /accessibility/tree` - Get UI element hierarchy (supports `?format=compact`)
- `GET /accessibility/elements` - Find elements by role/title/label
- `GET /accessibility/focused` - Get focused element
- `POST /accessibility/action` - Perform action on element (press, setValue, focus, etc.)
- `GET /menu` - Get application menu structure
- `POST /menu` - Trigger menu item (supports `background: true` to skip activation)
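As a rough illustration of calling these endpoints, the Python sketch below builds requests with the standard library without sending them. The query parameters `app` and `format` and the `pid` body field come from the list above. The `text` body field is an assumption, since the exact request schema lives in `Models/APIModels.swift` and is not reproduced here. `build_request` is a hypothetical helper.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_request(base_url: str, method: str, path: str,
                  query: Optional[dict] = None, body: Optional[dict] = None) -> Request:
    """Build (but do not send) a request for one of the endpoints listed above."""
    url = base_url.rstrip("/") + path
    if query:
        url += "?" + urlencode(query)
    # POST bodies are sent as JSON; GET requests carry no body.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(url, data=data, headers=headers, method=method)


if __name__ == "__main__":
    base = "http://127.0.0.1:7676"  # default address for `geisterhand server`
    tree = build_request(base, "GET", "/accessibility/tree", query={"format": "compact"})
    shot = build_request(base, "GET", "/screenshot", query={"app": "Calculator"})
    # "text" is an assumed field name; check APIModels.swift for the real schema.
    typed = build_request(base, "POST", "/type", body={"text": "hello", "pid": 12345})
    for req in (tree, shot, typed):
        print(req.get_method(), req.full_url)
```

With a server running, any of these could be sent with `urllib.request.urlopen(req)`.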
- Core services use singletons (`.shared`) for shared state
- `ScreenCaptureService` and `GeisterhandServer` are actors for thread safety
- `AccessibilityService` and `MenuService` are `@MainActor` singletons (AX APIs require main thread)
- Route handlers that use accessibility services are marked `@MainActor`
- Tests use Swift Testing framework (`@Test`, `#expect`)
- JSON uses snake_case encoding/decoding strategy
- Background mode: Input routes (`/type`, `/key`, `/scroll`) accept optional `pid`, `path`, and element query params. When present, they use accessibility APIs or PID-targeted CGEvents instead of global events, enabling automation of background apps without bringing them to the foreground.
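For client authors, the snake_case convention means that a Swift property such as `bundleIdentifier` is expected to appear on the wire as `bundle_identifier`. The Python sketch below shows a simplified version of that mapping. It is an approximation for illustration only and may differ from Swift's own key strategy for acronyms and other edge cases. Both helper names are hypothetical.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a camelCase identifier to snake_case (simplified)."""
    # Insert "_" before an uppercase letter that follows a lowercase letter or digit.
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def snake_keys(payload: dict) -> dict:
    """Return a shallow copy of payload with every top-level key converted."""
    return {to_snake_case(key): value for key, value in payload.items()}


if __name__ == "__main__":
    print(snake_keys({"useAccessibilityAction": True, "appName": "Calculator"}))
```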
One command publishes a new version end-to-end (version bump, build, sign, notarize, GitHub release, Homebrew tap update):
make publish NEW_VERSION=1.2.0
This runs `scripts/publish.sh` which does:
- Bumps `CFBundleVersion` and `CFBundleShortVersionString` in `Sources/GeisterhandApp/Info.plist`
- Commits the version bump, creates a git tag `v<version>`, pushes both
- Runs `make clean && make release` (build → sign → DMG → notarize → staple)
- Creates a GitHub release with the DMG and auto-generated notes
- Clones `Geisterhand-io/homebrew-tap`, updates SHA256 hashes in both `Casks/geisterhand.rb` and `Formula/geisterhand.rb`, pushes
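The SHA256 values written into the tap are lowercase hex digests of the release artifacts. `scripts/publish.sh` is not reproduced here, so the Python sketch below is not its implementation. It only shows one way to compute such a digest when verifying a tap update by hand, and `sha256_of_file` is a hypothetical helper.

```python
import hashlib
import sys
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Chunked reads keep memory use flat for large DMG files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Usage: python sha256_sketch.py <path-to-artifact>
    if len(sys.argv) == 2:
        print(sha256_of_file(Path(sys.argv[1])))
    else:
        print("usage: sha256_sketch.py <path-to-artifact>")
```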
- Code signing identity: `Developer ID Application: Skelpo GmbH (K6UW5YV9F7)` (hardcoded in Makefile)
- Notarization credentials: Stored in macOS Keychain as profile `GeisterhandNotarize`. If missing, re-create with: `xcrun notarytool store-credentials "GeisterhandNotarize" --apple-id "info@skelpo.com" --team-id "K6UW5YV9F7" --password "<app-specific-password>"`
- GitHub CLI (`gh`): Must be authenticated for release creation and tap push
- `Makefile`: build/sign/notarize pipeline and `publish` target
- `scripts/publish.sh`: full publish orchestration script
- `Geisterhand.entitlements`: hardened runtime entitlements for notarization
- `homebrew/geisterhand.rb`: cask formula template (reference copy)
- `homebrew/geisterhand-formula.rb`: source formula template (reference copy)
- Homebrew tap repo: `Geisterhand-io/homebrew-tap` (Casks/ and Formula/)
- Homebrew cask (recommended): `brew install --cask geisterhand-io/tap/geisterhand`
- Homebrew source formula (CLI only): `brew install geisterhand-io/tap/geisterhand`
- GitHub releases: https://github.com/Geisterhand-io/macos/releases