Skip to content

Latest commit

 

History

History
201 lines (113 loc) · 8.31 KB

File metadata and controls

201 lines (113 loc) · 8.31 KB

Quick Start

We're excited to announce the support for UI-TARS-1.5! 🎉🎉🎉

The previous version of UI-TARS Desktop version 0.0.8 will be upgraded to a new Desktop App 0.1.0 with support for both Computer and Browser operator.


Prerequisites

Please install Chrome (stable/beta/dev/canary), Edge (stable/beta/dev/canary), or Firefox (stable/beta/dev/nightly) for Browser Operator.

UI-TARS-desktop is currently only available for single monitor setup. Multi-monitor configuration may cause failure for some tasks.


Download

You can download the latest release version of UI-TARS Desktop from our releases page.

Note: If you have Homebrew installed, you can install UI-TARS Desktop by running the following command:

brew install --cask ui-tars

Install

MacOS

  1. Drag UI TARS application into the Applications folder

  1. Enable the permission of UI TARS in MacOS:
  • System Settings -> Privacy & Security -> Accessibility
  • System Settings -> Privacy & Security -> Screen Recording

  1. Then open UI TARS application, you can see the following interface:

Windows

Still to run the application, you can see the following interface:


Run remote operator

Tip

This feature is currently available only in Mainland China. It is not supported in other regions at this time. We appreciate your understanding and support.

By downloading UI-TARS Desktop App version 0.2.0 or above, you can use remote computer and browser operation features directly within the application.

On the home page, you’ll find the “Use Remote Computer” and “Use Remote Browser” buttons—click either one to start your experience.


Simply enter the GUI tasks you want to accomplish in the chat panel on the left, and the AI model will operate the remote device for you. Each session gives you 30 minutes of free remote access, and after the session ends, you can immediately start a new 30-minute free instance—explore and enjoy without limits.


Note

Notice for Commercial Use:
Beyond the free trial, if you wish to deploy your own Remote Computer and Browser Agent, you can explore more on Volcano Engine's OS Agent Services via deployment links (in Chinese) Computer Use Agent and Browser Use Agent.


Get model and run local operator

UI-TARS-1.5 on Hugging Face

  1. Click the button Deploy from Hugging Face on the top right corner of the page

  1. Select the model UI-TARS-1.5-7B

  1. Refer to README_deploy.md for detailed deployment instructions to obtain the Base URL, API Key, and Model Name.

  2. Open the UI-TARS Desktop App Settings and configure:

Language: en
VLM Provider: Hugging Face for UI-TARS-1.5
VLM Base URL: https:xxx
VLM API KEY: hf_xxx
VLM Model Name: xxx

Note

  1. For VLM Provider, make sure to select "Hugging Face for UI-TARS-1.5" to ensure proper VLM Action parsing.
  2. For VLM Base URL & VLM Model Name, you can checkout your huggingface endpoint page to see detail information. Please make sure Base URL ends with '/v1/'

  1. Click button starting a new chat

  1. Input the command to start a round of GUI operation tasks!


Doubao-1.5-UI-TARS on VolcEngine

  1. Visit the VolcEngine Doubao-1.5-UI-TARS page

  2. Click the button Try (立即体验) on the top right corner of the page

  1. Click the API inference (API 接入) link

  1. Get your API Key from STEP 1 in the drawer panel.

  1. In STEP 2, authenticate your user info and switch to the OpenAI SDK tab to obtain Base Url and Model name

  1. Open the UI-TARS Desktop App Settings and configure:
Language: cn
VLM Provider: VolcEngine Ark for Doubao-1.5-UI-TARS
VLM Base URL: https://ark.cn-beijing.volces.com/api/v3
VLM API KEY: ARK_API_KEY
VLM Model Name: doubao-1.5-ui-tars-250328

Note

For VLM Provider, make sure to select "VolcEngine Ark for Doubao-1.5-UI-TARS" to ensure proper VLM Action parsing.

  1. Select the desired usage scenario before starting a new chat

Note

Before using Browser Operator mode, please ensure that Chrome, Edge, or Firefox is installed on your device.

  1. Input the command to start a round of GUI operation tasks!


Try out our free remote operators

  1. Open the app and agree to our User Agreement

Note

We promise all records on the servers will be exclusively used for academic research purposes and will not be utilized for any other activities.

  1. Use for free for 30 minutes

  1. Easily take control of a remote device

  1. How to exit/close

More

At this point, you should have successfully launched the UI-TARS-Desktop App! To get the most out of UI-TARS and ensure stable usage, we recommend reviewing the following documentation: