pockebot/openpocket

OpenPocket

An Intelligent Phone That Never Sleeps
Let AI handle your phone-use tasks: locally, privately, autonomously.

Website · Documentation · Quickstart · Discord · Reddit · Contributing

Node.js >= 20 · TypeScript · CI · MIT License


What is OpenPocket?

Imagine having a second phone that works for you around the clock, replying to messages, handling payments, playing games, and posting on social media, all while your real phone stays safely in your pocket, untouched.

OpenPocket is an open-source framework that lets AI agents operate an Android phone on your behalf. Tell it what you want in plain language, and it figures out the rest (tapping, scrolling, typing, navigating between apps), just like a real person would.

  • Local-first: everything runs on your machine; your data never leaves your computer.
  • Privacy by default: the agent phone and your personal phone are completely isolated. Sensitive actions require explicit human approval.
  • Emulator + real device: connect to Android emulators or physical phones over ADB. Run multiple agents against multiple targets to build your own local phone farm.
  • Extensible: add new capabilities through a single SKILL.md file, or build your own agent workflows.

Demos

  • Social: Auto-manage social media (4.mp4)
  • Gaming: Play mobile games autonomously (2.mp4)
  • Utility Payment: Handle bills and payments (3.mp4)
  • Studying: Assist with learning tasks (1.mp4)
  • Social: Autopilot your X (5.mp4)
  • Utility: Find best price/travel option (6.mp4)

Highlights

  • Multi-model: works with OpenAI GPT-5.x, Claude 4.6, Gemini 3.x, DeepSeek, Qwen, GLM, Kimi, MiniMax, Doubao, and more.
  • Multi-agent: run multiple isolated agents, each with its own config, workspace, target device, and session state.
  • Scheduled jobs: create cron tasks from chat or CLI in natural language (e.g. "Every day at 8am open Slack and check in").
  • Human-auth relay: sensitive actions (camera, payments, location) escalate to you for approval through a private local relay.
  • Channel integrations: receive tasks and results through Telegram, Discord, WhatsApp, or CLI.
  • Skills framework: extend agent capabilities by dropping a SKILL.md into the skills directory; no code changes needed.

Quick Start

Option A: npm (recommended)

npm install -g openpocket
openpocket onboard

Option B: from source (for contributors)

git clone git@github.com:pockebot/openpocket.git
cd openpocket
npm install
npm run build
./openpocket onboard

Then start the agent gateway:

openpocket gateway start

Or run a one-off task directly:

openpocket agent --model gpt-5.2-codex "Open Chrome and search weather"

For full setup details see the Quickstart guide, Device targets, and Configuration.

Usage

Multi-agent management

openpocket create agent review-bot --type physical-phone --device R5CX123456A
openpocket create agent ops-bot --type emulator
openpocket agents list

Target a specific agent with --agent:

openpocket --agent review-bot gateway start
openpocket --agent review-bot config-show
openpocket --agent review-bot target show
openpocket --agent review-bot channels login --channel discord

Scheduled jobs

From chat or CLI, describe a schedule in natural language. OpenPocket confirms before persisting. Jobs run in isolated cron:<jobId> sessions.

openpocket cron list
openpocket cron add --id daily-slack-checkin \
  --name "Daily Slack Check-in" \
  --cron "0 8 * * *" --tz Asia/Shanghai \
  --task "Open Slack and complete check-in" \
  --channel telegram --to 12345
openpocket cron disable --id daily-slack-checkin
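
To make the --cron value above concrete, here is a minimal sketch of how a five-field expression such as "0 8 * * *" can be matched against a wall-clock time. It is an illustration only, not OpenPocket's scheduler: the names CronClock, fieldMatches, and cronMatches are hypothetical, only "*", single numbers, ranges, and comma lists are handled, all five fields are simply ANDed, and the clock values are assumed to be already converted to the job's --tz.

```typescript
// Hypothetical types and helpers for illustration; not part of OpenPocket.
type CronClock = {
  minute: number;
  hour: number;
  dayOfMonth: number;
  month: number;
  dayOfWeek: number; // 0 = Sunday
};

// Does one cron field ("*", "8", "1-5", "0,30") accept the value?
function fieldMatches(field: string, value: number): boolean {
  return field.split(",").some((term) => {
    if (term === "*") return true;
    const m = /^(\d+)(?:-(\d+))?$/.exec(term);
    if (m === null) return false; // steps and names are not supported here
    const lo = Number(m[1]);
    const hi = m[2] === undefined ? lo : Number(m[2]);
    return value >= lo && value <= hi;
  });
}

// Match a five-field expression: minute hour day-of-month month day-of-week.
function cronMatches(expr: string, clock: CronClock): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error("expected 5 cron fields, got " + fields.length);
  }
  const values = [
    clock.minute,
    clock.hour,
    clock.dayOfMonth,
    clock.month,
    clock.dayOfWeek,
  ];
  return fields.every((field, i) => fieldMatches(field, values[i] as number));
}

// 08:00 on a Monday in the job's time zone matches the daily 8am job.
console.log(
  cronMatches("0 8 * * *", { minute: 0, hour: 8, dayOfMonth: 15, month: 6, dayOfWeek: 1 })
); // true
```

Read this way, "0 8 * * *" with --tz Asia/Shanghai fires once a day at 08:00 Shanghai time, whatever the host machine's own time zone is.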

Manager dashboard and shared relay

openpocket dashboard manager
openpocket human-auth-relay start
  • dashboard manager: overview of all agents, targets, channels, and gateway status.
  • human-auth-relay start: shared relay hub for human-auth approval flows, with optional ngrok public URL.
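
The approval flow behind the relay can be pictured as a gate in front of sensitive actions. The sketch below is a simplified illustration, not OpenPocket's relay code: the capability names are taken from the README's own examples (camera, payments, location), while PendingAction, HumanDecision, executeWithHumanAuth, and the synchronous ask callback are hypothetical stand-ins for the real round-trip to a human.

```typescript
// Capability names come from the README's examples; the real list may differ.
const SENSITIVE_CAPABILITIES = new Set<string>(["camera", "payments", "location"]);

// Hypothetical shapes for illustration only.
type PendingAction = { capability: string; description: string };
type HumanDecision = "approved" | "rejected";

// Run the action only if it is non-sensitive or a human approved it.
// `ask` stands in for the relay round-trip; a real one would be asynchronous.
function executeWithHumanAuth(
  action: PendingAction,
  ask: (action: PendingAction) => HumanDecision,
  run: () => string
): string {
  if (SENSITIVE_CAPABILITIES.has(action.capability) && ask(action) !== "approved") {
    return "blocked: " + action.description;
  }
  return run();
}

// A payment is escalated; here the (simulated) human rejects it.
const outcome = executeWithHumanAuth(
  { capability: "payments", description: "pay utility bill" },
  () => "rejected",
  () => "paid"
);
console.log(outcome); // blocked: pay utility bill
```

The point of the design is that the agent never decides on its own whether a sensitive action proceeds; the decision always comes from outside the agent loop.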

Device targets

openpocket target show
openpocket target set --type emulator
openpocket target set --type physical-phone
openpocket target pair --host <device-ip> --pair-port <pair-port> --code <pairing-code> --type physical-phone

Model profiles

Model configuration is per-agent. New agents inherit from the onboard template; each can diverge independently.

openpocket model show
openpocket model list
openpocket model set --name gpt-5.4
openpocket --agent review-bot model set --provider google --model gemini-3.1-pro-preview
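
As a rough mental model of "inherit from the template, then diverge", the sketch below copies a template profile into each new agent so that later changes stay local to that agent. The types ModelProfile and AgentModelConfig, the function createAgentFromTemplate, and the "openai" provider string are assumptions made for illustration; OpenPocket's actual config schema may look different.

```typescript
// Hypothetical config shapes; the real schema may differ.
type ModelProfile = { provider: string; model: string };
type AgentModelConfig = { agentId: string; model: ModelProfile };

// Copy the template instead of sharing it, so each agent can diverge.
function createAgentFromTemplate(agentId: string, template: ModelProfile): AgentModelConfig {
  return {
    agentId: agentId,
    model: { provider: template.provider, model: template.model },
  };
}

// The "openai" provider name is an assumption; the model names are from the README.
const onboardTemplate: ModelProfile = { provider: "openai", model: "gpt-5.4" };
const reviewBot = createAgentFromTemplate("review-bot", onboardTemplate);
const opsBot = createAgentFromTemplate("ops-bot", onboardTemplate);

// Changing one agent's model leaves the template and other agents untouched.
reviewBot.model = { provider: "google", model: "gemini-3.1-pro-preview" };
console.log(opsBot.model.model); // gpt-5.4
```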

Gateway logging

Tune log level, payload redaction, and per-module output in your agent config:

{
  "gatewayLogging": {
    "level": "info",
    "includePayloads": false,
    "maxPayloadChars": 160,
    "modules": {
      "core": true,
      "access": true,
      "task": true,
      "channel": true,
      "cron": true,
      "heartbeat": false,
      "humanAuth": true,
      "chat": false
    }
  }
}
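
One plausible reading of these settings is sketched below: a line is emitted only if its module is enabled and its level is at or above the configured level, and a payload is appended only when includePayloads is true, clipped to maxPayloadChars. This interpretation is an assumption, as are the level names other than "info", the GatewayLogging interface, and the renderLogLine helper; check the Configuration docs for the actual behavior.

```typescript
// Level names other than "info" are assumed for illustration.
type LogLevel = "debug" | "info" | "warn" | "error";
const LEVEL_ORDER: LogLevel[] = ["debug", "info", "warn", "error"];

// Mirrors the JSON shape shown above.
interface GatewayLogging {
  level: LogLevel;
  includePayloads: boolean;
  maxPayloadChars: number;
  modules: Record<string, boolean>;
}

// Hypothetical helper: returns the line to print, or null if it is filtered out.
function renderLogLine(
  cfg: GatewayLogging,
  moduleName: string,
  level: LogLevel,
  message: string,
  payload?: string
): string | null {
  if (cfg.modules[moduleName] !== true) return null; // module switched off
  if (LEVEL_ORDER.indexOf(level) < LEVEL_ORDER.indexOf(cfg.level)) return null; // below threshold
  let line = "[" + moduleName + "] " + level + ": " + message;
  if (cfg.includePayloads && payload !== undefined) {
    const clipped =
      payload.length > cfg.maxPayloadChars
        ? payload.slice(0, cfg.maxPayloadChars) + "..."
        : payload;
    line += " payload=" + clipped;
  }
  return line;
}

const loggingConfig: GatewayLogging = {
  level: "info",
  includePayloads: false,
  maxPayloadChars: 160,
  modules: { core: true, task: true, heartbeat: false, chat: false },
};

console.log(renderLogLine(loggingConfig, "task", "info", "started", "secret")); // [task] info: started
console.log(renderLogLine(loggingConfig, "heartbeat", "info", "tick")); // null
```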

Full CLI reference: CLI and Gateway | Filesystem layout

Architecture

flowchart LR
    U["User Surfaces\nCLI / Telegram / Dashboard"] --> M["Manager Layer\nagent registry + ports + target locks"]
    M --> D0["Default Agent\nconfig + workspace + state"]
    M --> D1["Managed Agent A\nconfig + workspace + state"]
    M --> D2["Managed Agent B\nconfig + workspace + state"]

    D0 --> G0["Gateway + Dashboard"]
    D1 --> G1["Gateway + Dashboard"]
    D2 --> G2["Gateway + Dashboard"]

    G0 --> A0["Agent Runtime"]
    G1 --> A1["Agent Runtime"]
    G2 --> A2["Agent Runtime"]

    A0 --> T0["Target A"]
    A1 --> T1["Target B"]
    A2 --> T2["Target C"]

    A0 --> H0["Private Local Relay"]
    A1 --> H1["Private Local Relay"]
    A2 --> H2["Private Local Relay"]

    H0 --> RH["Shared Relay Hub\noptional ngrok"]
    H1 --> RH
    H2 --> RH

Components

# | Component | What it does | Docs
1 | Multi-agent manager | Registry, port allocation, and target locks for one default + N managed agents | Multi-agent, Filesystem
2 | Gateway orchestration | Per-agent gateway, dashboard, session store, channel credentials, task queue | CLI & Gateway, Runbook
3 | Prompting & model loop | System/user prompt composition, context budgeting, model-driven step execution | Prompting, Prompt templates
4 | Tool execution | ADB phone actions, coding tools, memory tools, and user-defined scripts | Action schema, Scripts, Skills
5 | Human authorization | Capability probe + escalation for sensitive actions, shared relay hub with per-agent state | Remote human auth
6 | Device target abstraction | One target per agent, mutual exclusion enforced by the manager | Device targets
7 | Persistence & audit | Sessions, memory, screenshots, relay state, artifacts, all inside the agent workspace | Filesystem, Session & memory
8 | Runtime operations | Manager dashboard, per-agent dashboards, keep-awake heartbeat, troubleshooting playbooks | Runbook, Troubleshooting, Screen awake
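
The "one target per agent, mutual exclusion" rule in row 6 can be sketched as a small lock table. This is only an illustration of the idea under stated assumptions: the TargetLocks class and its in-memory maps are hypothetical, and the real manager presumably persists its locks and handles crashes, which this sketch does not. The device serial is the one used in the README's own example.

```typescript
// Hypothetical in-memory lock table; not the manager's actual implementation.
class TargetLocks {
  private ownerByTarget = new Map<string, string>(); // targetId -> agentId
  private targetByAgent = new Map<string, string>(); // agentId -> targetId

  // Returns true if the agent now holds the target, false if refused.
  acquire(targetId: string, agentId: string): boolean {
    const owner = this.ownerByTarget.get(targetId);
    if (owner !== undefined && owner !== agentId) return false; // target is taken
    const held = this.targetByAgent.get(agentId);
    if (held !== undefined && held !== targetId) return false; // one target per agent
    this.ownerByTarget.set(targetId, agentId);
    this.targetByAgent.set(agentId, targetId);
    return true;
  }

  // Only the current owner can release a target.
  release(targetId: string, agentId: string): void {
    if (this.ownerByTarget.get(targetId) === agentId) {
      this.ownerByTarget.delete(targetId);
      this.targetByAgent.delete(agentId);
    }
  }
}

const locks = new TargetLocks();
console.log(locks.acquire("R5CX123456A", "review-bot")); // true
console.log(locks.acquire("R5CX123456A", "ops-bot")); // false
locks.release("R5CX123456A", "review-bot");
console.log(locks.acquire("R5CX123456A", "ops-bot")); // true
```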

Contributing

We'd love your help making OpenPocket better. Whether it's fixing a typo, improving docs, adding a new skill, or building a major feature, every contribution matters.

Getting started

  1. Fork the repo and create a feature branch.
  2. Install dependencies and run the dev checks:
npm install
npm run check
npm run test
npm run smoke:dual-side
npm run docs:build
  3. Keep changes focused: one concern per PR.
  4. Add or update tests for any behavioral change.
  5. Open a PR with context on what changed, why, and how you verified it.

Where to help

  • Good first issues: look for issues labeled good first issue in the issue tracker.
  • Skills: write a new SKILL.md to teach the agent a new phone workflow. See the Skills guide.
  • Model adapters: add or improve support for new LLM providers.
  • Channel integrations: improve Telegram, Discord, or WhatsApp support, or add new channels.
  • Docs: fix gaps, add examples, or translate content.

Read the full guidelines before opening a PR: CONTRIBUTING.md | contribution.md

License

This project is licensed under the MIT License; see LICENSE for details.

Acknowledgments

Special thanks to the open-source projects that make OpenPocket possible:

  • pi-mono ecosystem by Mario Zechner: @mariozechner/pi-agent-core, @mariozechner/pi-ai, @mariozechner/pi-coding-agent
  • Messaging SDKs: node-telegram-bot-api, discord.js, baileys
  • Core runtime: openai, @modelcontextprotocol/sdk, zod, @sinclair/typebox, sharp, qrcode
  • Docs & tooling: vitepress, mermaid, typescript, tsx

And thanks to all contributors and community members who report issues, share real-world scenarios, and help shape the project. We're building this together.
