The easiest way to use this from aria-app is to install the sibling
desktop-agent and kg-agent repos into one Python environment, then start the
packaged desktop-agent-bridge command.
The bridge lets aria-app / UI-TARS keep its Electron UI and local operator, while the next-action policy comes from your sibling desktop-agent repo:
baseline: plaindesktop-agentgrounded:desktop-agentwithkg-agentgraph review / grounding enabled
That gives you a clean demo surface for the story:
- run the exact same task in the same desktop app
- toggle
Force Workflow Mode - show that the grounded run takes fewer steps or drifts less
The bridge exposes the same endpoints aria-app already expects:
POST /v1/chat/completionsPOST /v1/feedbackGET /healthz
Internally it:
- parses the UI-TARS request
- extracts the current screenshot and original task instruction
- calls
desktop-agent'sKimiAgent.predict(...) - optionally injects
GraphHintProviderwhen mode isgrounded - translates the returned
pyautoguiaction into UI-TARS action syntax - returns
backend_metaso the frontend can show workflow state and submit feedback
The bridge is best for normal desktop-productivity tasks such as Finder, browser, spreadsheet, or file-dialog workflows.
It is not a good fit for game-like actions that depend on:
- long key holds
mouseDown()/mouseUp()- raw cursor movement without a click target
That limitation comes from the mismatch between:
desktop-agentaction output:pyautogui- UI-TARS action space:
click,type,drag,hotkey,scroll,wait,finished
For your expense-report / receipt demo, that tradeoff is usually fine.
You need all three projects present as sibling folders:
/path/to/aria-app/path/to/desktop-agent/path/to/kg-agent
You also need a Python environment that can import both desktop-agent and kg-agent.
At minimum, grounded mode needs:
neo4jPython package- KG dependencies from
kg-agent GEMINI_API_KEY- a running Neo4j instance
Baseline mode needs the desktop-agent runtime dependencies and your controller model credentials, such as:
KIMI_API_KEYfor--model-provider moonshotANTHROPIC_API_KEYfor--model-provider anthropic
Recommended setup:
python3 -m pip install -e /path/to/desktop-agent
python3 -m pip install -e /path/to/kg-agentThat gives you a stable CLI entrypoint:
desktop-agent-bridgeIf you are not installing kg-agent into the same environment yet, you can
still point the bridge at the checkout with --kg-agent-path.
Grounded mode only helps if the graph already contains a successful memory for the task or a closely related task.
If you have not ingested that memory yet, do that first from the kg-agent repo.
The kg-agent README already includes example commands for:
- rebuilding the graph from a known successful trajectory
- running a desktop eval with graph hints enabled
If your demo task is:
Find the Uber receipt in Downloads and create an expense report spreadsheet
then the best setup is to ingest one successful run of that task, or a close variant, before the live comparison.
Run the bridge from the Python environment where you installed desktop-agent
and kg-agent.
Example:
desktop-agent-bridge \
--model-provider moonshot \
--controller-model kimi-k2.5 \
--auto-mode groundedIf you want Anthropic instead:
desktop-agent-bridge \
--model-provider anthropic \
--controller-model claude-sonnet-4-5 \
--auto-mode groundedIf you prefer explicit sibling-repo paths during local development:
desktop-agent-bridge \
--desktop-agent-path /path/to/desktop-agent \
--kg-agent-path /path/to/kg-agent \
--model-provider moonshot \
--controller-model kimi-k2.5 \
--auto-mode groundedHelpful behavior:
X-Force-Workflow-Mode: baseline-> no KGX-Force-Workflow-Mode: grounded-> KG enabled- no force header -> uses
--auto-mode
Check that it is up:
curl -s http://127.0.0.1:8000/healthzFrom this repo:
cd /path/to/aria-app
corepack enable
corepack pnpm install
corepack pnpm run dev:ui-tarsIf the root dev script is noisy:
cd /path/to/aria-app/apps/ui-tars
corepack pnpm run build:deps
corepack pnpm run devIn Settings:
- Provider:
Hugging Face for UI-TARS-1.5 - Base URL:
http://127.0.0.1:8000/v1 - API Key:
dummy-key - Model name:
cuakg-default
Then use Force Workflow Mode for deterministic demos:
Baseline: plain desktop-agentGrounded: desktop-agent + kg-agent
Recommended flow:
- Put the desktop into the same clean start state before each run.
- Run the task once with
Force Workflow Mode = Baseline. - Record the total step count from the run or feedback payload.
- Reset the desktop to the same start state.
- Run the same task with
Force Workflow Mode = Grounded. - Compare: baseline step count vs grounded step count drift / retries / unnecessary detours
The bridge logs the useful signals directly to stdout:
- requested mode
- effective mode
- raw
desktop-agentcode - translated UI-TARS action
- workflow metadata
The frontend feedback POST also includes:
total_stepsmodeworkflow_statusretrieval_confidence
That is usually enough to support the claim that KG grounding improves the run.
- Use the exact same desktop state and files for both runs.
- Use a task that has already been ingested into Neo4j.
- Keep the task in the productivity-UI regime rather than raw games or free-camera apps.
- Watch the bridge logs during grounded mode to confirm that the graph path is active.
The most common causes are:
neo4jPython package is missing- Neo4j is not running
GEMINI_API_KEYis not set- the Python environment only has
desktop-agentdeps but notkg-agentdeps
If baseline works and grounded fails immediately, that almost always points to KG runtime setup rather than the Electron app.