Nodenester/VirtualWorkerApi

VirtualWorkerApi

A computer-use API for AI agents: a virtual Linux desktop controlled over REST.

Built February 2025.

What it does

A Docker container that runs a virtual X11 desktop (Xvfb + Fluxbox) alongside a .NET REST API. AI agents can take screenshots, click, type, scroll, launch apps, and run shell commands, giving them full computer use over HTTP.
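
As a rough illustration of how an agent might drive this API, here is a minimal observe-act step in Python. The `transport` and `decide` callables are hypothetical stand-ins (for an HTTP client and the agent's model), and the shape of the action dictionary is an assumption; the README does not document request bodies.

```python
from typing import Callable, Optional

# Hypothetical transport signature: (method, path, json_body) -> response bytes.
Transport = Callable[[str, str, Optional[dict]], bytes]


def agent_step(transport: Transport, decide: Callable[[bytes], dict]) -> dict:
    """Run one observe-act cycle: take a screenshot, pick an action, send it."""
    # Observe: GET /screenshot is listed under api/desktop.
    screenshot = transport("GET", "/api/desktop/screenshot", None)
    # Decide: `decide` stands in for the agent's model. It is assumed to return
    # something like {"endpoint": "click", "body": {"x": 10, "y": 20}}.
    action = decide(screenshot)
    # Act: POST the chosen desktop action with its (assumed) JSON body.
    transport("POST", f"/api/desktop/{action['endpoint']}", action["body"])
    return action
```

Keeping the transport injectable means the loop can be exercised with a fake before pointing it at a running container.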

API Endpoints

Desktop (api/desktop)

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /screenshot | Capture the current screen |
| POST | /click | Click at coordinates |
| POST | /double-click | Double-click at coordinates |
| POST | /type | Type text |
| POST | /key | Press a key or key combination |
| POST | /drag | Drag from one point to another |
| POST | /scroll | Scroll the mouse wheel |
| GET | /resolution | Get screen resolution |
| GET | /desktop-info | Get desktop environment info |
| GET | /analyze-ui | Detect UI elements via OmniParser |
| GET | /analyze-ui-with-image | Analyze UI with annotated image |
| GET | /test-omniparser | Test OmniParser connectivity |
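
A client call to the desktop endpoints might look like the following sketch, which builds requests with Python's standard library without sending them. The base URL, port, and the `x`/`y` field names are assumptions; the README does not specify the host, port, or request schema.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:5000"  # assumption: host and port are not given


def desktop_request(endpoint: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request to an api/desktop endpoint."""
    url = f"{BASE_URL}/api/desktop/{endpoint.lstrip('/')}"
    if body is None:
        # No body: treat as one of the GET endpoints, e.g. /screenshot.
        return urllib.request.Request(url, method="GET")
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


screenshot_req = desktop_request("screenshot")
click_req = desktop_request("click", {"x": 100, "y": 200})  # field names are assumed
```

Once the container is running, either request can be sent with `urllib.request.urlopen(...)`.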

Application (api/application)

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /launch | Launch an application |
| POST | /kill | Kill a running application |
| GET | /launch-browser | Launch a web browser |
| GET | /launch-terminal | Launch a terminal |
| GET | /launch-file-explorer | Launch a file explorer |
| GET | /launch-text-editor | Launch a text editor |
| GET | /is-installed | Check if an application is installed |
| GET | /check-browsers | Check available browsers |

Command (api/command)

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /execute | Execute a shell command |
| GET | /environment | Get environment info |
| GET | /installed-browsers | List installed browsers |
| POST | /install-package | Install a system package |
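
For client code, the three endpoint groups above can be captured in a small lookup table. The sketch below only transcribes the methods and paths listed in this README; the `ROUTES` and `resolve` names are illustrative and not part of the project.

```python
# HTTP method for each endpoint, grouped by controller prefix (api/<group>).
ROUTES = {
    "desktop": {
        "screenshot": "GET", "click": "POST", "double-click": "POST",
        "type": "POST", "key": "POST", "drag": "POST", "scroll": "POST",
        "resolution": "GET", "desktop-info": "GET", "analyze-ui": "GET",
        "analyze-ui-with-image": "GET", "test-omniparser": "GET",
    },
    "application": {
        "launch": "POST", "kill": "POST", "launch-browser": "GET",
        "launch-terminal": "GET", "launch-file-explorer": "GET",
        "launch-text-editor": "GET", "is-installed": "GET",
        "check-browsers": "GET",
    },
    "command": {
        "execute": "POST", "environment": "GET",
        "installed-browsers": "GET", "install-package": "POST",
    },
}


def resolve(group: str, endpoint: str) -> tuple:
    """Return (method, path) for a listed endpoint; raises KeyError otherwise."""
    method = ROUTES[group][endpoint]
    return method, f"/api/{group}/{endpoint}"
```

Looking routes up this way lets a client reject a misspelled endpoint before any request leaves the machine.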

Architecture

Docker Container
├── Xvfb (virtual X11 display)
├── Fluxbox (window manager)
├── .NET 9 REST API
└── OmniParser sidecar (UI element detection)

Tech Stack

  • C# / .NET 9 / ASP.NET Core
  • Docker
  • Xvfb + Fluxbox
  • xdotool
  • ImageMagick
  • OmniParser
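
The README does not show how the API invokes xdotool or ImageMagick. As one plausible mapping, offered only as an assumption, desktop actions could translate into command lines like the ones built below. The helper names are hypothetical, and nothing here is executed.

```python
def click_argv(x: int, y: int, button: int = 1) -> list:
    """xdotool chains commands: move the pointer, then click (1 = left button)."""
    return ["xdotool", "mousemove", str(x), str(y), "click", str(button)]


def type_argv(text: str) -> list:
    """Type a string into the focused window."""
    return ["xdotool", "type", text]


def key_argv(combo: str) -> list:
    """Press a key or combination, e.g. "ctrl+c" or "Return"."""
    return ["xdotool", "key", combo]


def screenshot_argv(path: str) -> list:
    """ImageMagick's `import` captures the root window of the current display."""
    return ["import", "-window", "root", path]
```

Each list could be passed to `subprocess.run` with `DISPLAY` set to the Xvfb display; the display number itself is not stated in the README.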

Getting Started

docker-compose up
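
After the containers start, a client may want to wait until the API answers before sending actions. The sketch below polls an injected probe; using `/api/desktop/resolution` as the readiness check and `localhost:5000` as the address are both assumptions, since the README names neither a health endpoint nor a port.

```python
import time
import urllib.request
from typing import Callable


def wait_until_ready(probe: Callable[[], bool], attempts: int = 30, delay: float = 1.0) -> bool:
    """Call `probe` until it returns True or the attempts run out."""
    for _ in range(attempts):
        try:
            if probe():
                return True
        except OSError:
            # urllib's URLError subclasses OSError; a refused connection
            # usually means the container is still starting.
            pass
        time.sleep(delay)
    return False


def http_probe() -> bool:
    """Example probe (assumed URL): succeeds once the API returns HTTP 200."""
    url = "http://localhost:5000/api/desktop/resolution"
    with urllib.request.urlopen(url, timeout=2) as response:
        return response.status == 200
```

A typical call would be `wait_until_ready(http_probe)` right after `docker-compose up`.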

Security

This API exposes shell execution and full desktop control. Run only in sandboxed/isolated environments. Do not expose to untrusted networks.
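
One client-side precaution, sketched here as an illustration rather than a feature of this project, is to refuse to talk to the API unless its address is a loopback address. The function name is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """Return True only if the URL's host is localhost or a loopback IP."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other DNS name is treated as non-local.
        return False
```

A guard like this complements network-level isolation of the container and does not replace it.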

License

MIT. See LICENSE.
