Computer use API for AI agents — virtual Linux desktop with REST control.
Built February 2025.
Docker container running a virtual X11 desktop (Xvfb + Fluxbox) with a .NET REST API. AI agents can take screenshots, click, type, scroll, launch apps, run shell commands — full computer use over HTTP.
| Method | Endpoint | Description |
|---|---|---|
| GET | /screenshot | Capture the current screen |
| POST | /click | Click at coordinates |
| POST | /double-click | Double-click at coordinates |
| POST | /type | Type text |
| POST | /key | Press a key or key combination |
| POST | /drag | Drag from one point to another |
| POST | /scroll | Scroll the mouse wheel |
| GET | /resolution | Get screen resolution |
| GET | /desktop-info | Get desktop environment info |
| GET | /analyze-ui | Detect UI elements via OmniParser |
| GET | /analyze-ui-with-image | Analyze UI with annotated image |
| GET | /test-omniparser | Test OmniParser connectivity |
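As a rough illustration of how an agent might call the desktop control endpoints, the sketch below prepares requests with Python's standard library. The base URL, the port, and the JSON field names (`x`, `y`, `text`) are assumptions made for this example; this README does not specify request bodies, so check the API source before relying on them.

```python
import json
import urllib.request

# Assumption: the README does not state a host or port; adjust to your setup.
BASE_URL = "http://localhost:5000"


def build_request(method, path, payload=None, base_url=BASE_URL):
    """Build (but do not send) an HTTP request for one of the endpoints."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url + path, data=data, headers=headers, method=method
    )


def click_request(x, y):
    # Assumption: the body fields are named "x" and "y".
    return build_request("POST", "/click", {"x": x, "y": y})


def type_request(text):
    # Assumption: the body field is named "text".
    return build_request("POST", "/type", {"text": text})


def screenshot_request():
    # GET /screenshot takes no body in this sketch.
    return build_request("GET", "/screenshot")


def send(request, timeout=10.0):
    """Send a prepared request and return the raw response body as bytes."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With the container running, something like `send(click_request(100, 200))` would issue the click. Building and sending are kept separate so the request can be inspected or logged before it touches the desktop.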
| Method | Endpoint | Description |
|---|---|---|
| POST | /launch | Launch an application |
| POST | /kill | Kill a running application |
| GET | /launch-browser | Launch a web browser |
| GET | /launch-terminal | Launch a terminal |
| GET | /launch-file-explorer | Launch a file explorer |
| GET | /launch-text-editor | Launch a text editor |
| GET | /is-installed | Check if an application is installed |
| GET | /check-browsers | Check available browsers |
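A client would typically check that an application exists before launching it. The following sketch shows one way to prepare both calls. The query parameter `name`, the JSON field `name`, and the base URL are hypothetical; this README does not document how `/is-installed` or `/launch` receive the application name.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: host and port are placeholders, not taken from this README.
BASE_URL = "http://localhost:5000"


def is_installed_url(app_name, base_url=BASE_URL):
    """Return the GET URL for checking whether an application is installed."""
    # Assumption: the application is passed as a "name" query parameter.
    return f"{base_url}/is-installed?{urlencode({'name': app_name})}"


def launch_request(app_name, base_url=BASE_URL):
    """Build (but do not send) a POST /launch request."""
    # Assumption: the JSON body carries the application in a "name" field.
    body = json.dumps({"name": app_name}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/launch",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

`urlencode` escapes spaces and other special characters, so application names do not need manual quoting.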
| Method | Endpoint | Description |
|---|---|---|
| POST | /execute | Execute a shell command |
| GET | /environment | Get environment info |
| GET | /installed-browsers | List installed browsers |
| POST | /install-package | Install a system package |
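Because `/execute` runs a shell command, a caller that assembles commands from separate arguments should quote them first. The sketch below does this with `shlex.join`. The `command` field name and the base URL are assumptions; the actual request schema is not described in this README.

```python
import json
import shlex
import urllib.request

# Assumption: host and port are placeholders, not taken from this README.
BASE_URL = "http://localhost:5000"


def execute_request(argv, base_url=BASE_URL):
    """Build (but do not send) a POST /execute request from an argument list."""
    # shlex.join quotes each argument so spaces and shell metacharacters
    # inside an argument are not interpreted by the shell.
    command = shlex.join(argv)
    # Assumption: the JSON body carries the command line in a "command" field.
    body = json.dumps({"command": command}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

For example, `execute_request(["ls", "-la", "/tmp/my dir"])` produces the command string `ls -la '/tmp/my dir'`, keeping the path with a space as a single argument.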
Docker Container
├── Xvfb (virtual X11 display)
├── Fluxbox (window manager)
├── .NET 9 REST API
└── OmniParser sidecar (UI element detection)
- C# / .NET 9 / ASP.NET Core
- Docker
- Xvfb + Fluxbox
- xdotool
- ImageMagick
- OmniParser
docker-compose up
This API exposes shell execution and full desktop control. Run only in sandboxed/isolated environments. Do not expose to untrusted networks.
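One way a client could respect the warning above is to refuse to talk to endpoints that are not on the local machine or a private network. The helper below is a coarse sketch of that idea and is not part of this project. A private address is not automatically a trusted one, so treat it as a sanity check, not a security boundary.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_endpoint(base_url):
    """Return True if the URL's host is localhost, loopback, or a private address."""
    host = urlparse(base_url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Other hostnames are not resolved here; treat them as not local.
        return False
    return address.is_loopback or address.is_private
```

A client could call `is_local_endpoint(base_url)` once at startup and abort if it returns `False`.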
MIT — see LICENSE.