3 changes: 3 additions & 0 deletions .gitignore
@@ -28,6 +28,9 @@ mimi_secrets.h

# Environment
.env
extension/.env
extension/.venv/
extension/__pycache__/
main/mimi_secrets.h

# Cache / Compiled
8 changes: 8 additions & 0 deletions README.md
@@ -242,6 +242,14 @@ This turns MimiClaw into a proactive assistant — write tasks to `HEARTBEAT.md`
- **Heartbeat** — periodically checks a task file and prompts the AI to act autonomously
- **Tool use** — ReAct agent loop with tool calling for both providers

## Browser Extension (Preview)

For browser-control experiments, an optional bridge prototype is available under [`extension/`](extension/).

- Install and run guide: **[extension/README.md](extension/README.md)**
- Scope: desktop browser automation loop (not part of ESP32 firmware runtime path)
- Use it to validate web action flows first, then port the same message protocol into firmware if needed

## For Developers

Technical details live in the `docs/` folder:
8 changes: 8 additions & 0 deletions README_CN.md
@@ -257,6 +257,14 @@ MimiClaw 内置 cron 调度器,让 AI 可以自主安排任务。LLM 可以通
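- **Heartbeat service**: periodically checks the task file and drives the AI to act autonomously
- **Tool calling**: ReAct agent loop; both providers support tool calling

## Browser Extension (Preview)

If you want to run browser-control experiments, you can use the optional bridge prototype under `extension/`.

- Installation and usage guide: **[extension/README.md](extension/README.md)**
- Scope: desktop browser automation loop (not part of the ESP32 firmware runtime path)
- Suggested workflow: first validate the web action flow on the extension side, then port the same message protocol to the firmware as needed

## Developers

Technical details are in the `docs/` folder: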
8 changes: 8 additions & 0 deletions README_JA.md
@@ -242,6 +242,14 @@ MimiClawにはcronスケジューラが内蔵されており、AIが自律的に
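- **Heartbeat**: periodically checks the task file and drives the AI autonomously
- **Tool calling**: ReAct agent loop, with tool calling supported on both providers

## Browser Extension (Preview)

If you want to experiment with browser control, you can use the optional bridge under `extension/`.

- Setup and usage steps: **[extension/README.md](extension/README.md)**
- Scope: desktop browser automation loop (separate from the ESP32 firmware execution path)
- Recommended flow: first validate web actions on the extension side, then port the same message protocol to the firmware as needed

## For Developers

Technical details are in the `docs/` folder: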
5 changes: 5 additions & 0 deletions extension/.env.example
@@ -0,0 +1,5 @@
# Required: API key used by esp32_sim_server.py
OPENAI_API_KEY=sk-xxxx

# Optional: model override (default: gpt-5-nano)
OPENAI_MODEL=gpt-5-nano
87 changes: 87 additions & 0 deletions extension/README.md
@@ -0,0 +1,87 @@
# Browser Extension Bridge (Preview)

This directory contains an experimental browser automation bridge for MimiClaw.

It combines:

- a Chrome MV3 extension (`manifest.json`, `background.js`, `content.js`, `offscreen.js`, `popup.*`), and
- a local Python simulator (`esp32_sim_server.py`) that mimics the ESP32 side over WebSocket.

## What This Is For

Use this when you want to validate browser control workflows before wiring the same protocol into firmware.

| Path | Best for | Runs where |
|------|----------|------------|
| MimiClaw firmware (`main/`) | On-device assistant + Telegram + tools | ESP32-S3 |
| Extension bridge (`extension/`) | Browser interaction experiments and action-loop tuning | Desktop browser + local Python |

The two paths share the same message pattern (`get_dom_snapshot` / `execute_action` / `command_result`) so logic can be migrated incrementally.

## Quick Install

### 1. Start the local simulator

```bash
cd extension
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
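# Edit .env and set OPENAI_API_KEY, then export every variable it defines.
# `set -a` marks sourced assignments for export and also handles quoted values.
set -a; . ./.env; set +a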
python esp32_sim_server.py
```

Default endpoint: `ws://127.0.0.1:8765/ws`

### 2. Load the extension in Chrome

1. Open `chrome://extensions`
2. Enable `Developer mode`
3. Click `Load unpacked`
4. Select this `extension/` directory

### 3. Run a task

1. Open a target webpage.
2. In the extension popup, verify the connection status and keep the listener enabled.
3. In the Python terminal, enter a goal, for example:

```text
user> Click the "Sign in" button on this page.
```

4. The simulator requests DOM state, asks the LLM for one action, sends it to the extension, and repeats until completion.
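The loop in step 4 can be sketched roughly as follows. This is a minimal illustration, not the code in `esp32_sim_server.py`: the function name `run_action_loop`, the three callables, the `max_steps` cap, and the convention that the chooser returns `None` when the goal is complete are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple


def run_action_loop(
    goal: str,
    get_snapshot: Callable[[], dict],
    choose_action: Callable[[str, dict], Optional[dict]],
    execute: Callable[[dict], dict],
    max_steps: int = 10,
) -> List[Tuple[dict, dict]]:
    """Repeat: snapshot the page, pick ONE action, execute it.

    Stops when the chooser returns None (hypothetical "done" signal)
    or when max_steps is reached, whichever comes first.
    """
    history: List[Tuple[dict, dict]] = []
    for _ in range(max_steps):
        snapshot = get_snapshot()                # ask the extension for DOM state
        action = choose_action(goal, snapshot)   # ask the LLM for a single action
        if action is None:                       # chooser reports completion
            break
        result = execute(action)                 # send the action to the extension
        history.append((action, result))
    return history
```

Passing the three steps in as callables keeps the loop testable with stubs, without a browser or an API key.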

## Supported Actions

- `navigate` (`url`)
- `back`
- `forward`
- `click` (`selector` or `text`)
- `fill` (`selector`, `value`)
- `scroll` (`top`)
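As an illustration of the parameter requirements listed above, a validator along these lines could reject malformed actions before they are sent. The names `REQUIRED_PARAMS` and `validate_action` are hypothetical, and the action name is passed separately because the list does not specify how the action payload encodes it.

```python
# Required parameters per action, taken from the list above.
# Each entry holds alternative parameter sets; `click` accepts
# either `selector` or `text`.
REQUIRED_PARAMS = {
    "navigate": [{"url"}],
    "back": [set()],
    "forward": [set()],
    "click": [{"selector"}, {"text"}],
    "fill": [{"selector", "value"}],
    "scroll": [{"top"}],
}


def validate_action(name: str, params: dict) -> bool:
    """Return True if `name` is supported and `params` satisfies one alternative."""
    alternatives = REQUIRED_PARAMS.get(name)
    if alternatives is None:  # unknown action name
        return False
    return any(required.issubset(params) for required in alternatives)
```

For example, `validate_action("click", {"text": "Sign in"})` passes, while `validate_action("fill", {"selector": "#q"})` fails because `value` is missing.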

## Message Protocol (Summary)

Python simulator -> extension:

- `{"type":"get_dom_snapshot","request_id":"..."}`
- `{"type":"execute_action","request_id":"...","action":{...}}`

Extension -> Python simulator:

- `{"type":"command_result","request_id":"...","ok":true,"result":{...}}`
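A small sketch of how the simulator side might build these messages and pair a reply with its request. The helper names are hypothetical, the UUID-based `request_id` format is an assumption (the summary does not specify one), and raising on `"ok": false` is one possible policy, not documented behavior. The `action` payload is treated as opaque.

```python
import json
import uuid


def make_snapshot_request() -> dict:
    """Build a get_dom_snapshot message with a fresh request_id (format assumed)."""
    return {"type": "get_dom_snapshot", "request_id": uuid.uuid4().hex}


def make_action_request(action: dict) -> dict:
    """Build an execute_action message; `action` is passed through unchanged."""
    return {"type": "execute_action", "request_id": uuid.uuid4().hex, "action": action}


def match_result(request: dict, raw_reply: str):
    """Return the reply's `result` if it answers `request`, else None.

    Raises RuntimeError when the matching reply reports ok == false
    (an assumed error-handling choice).
    """
    reply = json.loads(raw_reply)
    if reply.get("type") != "command_result":
        return None
    if reply.get("request_id") != request["request_id"]:
        return None  # reply belongs to a different request
    if not reply.get("ok"):
        raise RuntimeError(f"command failed: {reply}")
    return reply.get("result")
```

Matching on `request_id` lets the simulator ignore stale or out-of-order replies instead of assuming that each reply answers the most recent request.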

## Notes

- This is a prototype bridge. It does not change firmware behavior by itself.
- Some websites enforce CSP/cross-origin limits that can reduce extraction/action reliability.
- An offscreen document plus heartbeat/re-register logic keeps the WebSocket bridge stable under MV3 service-worker lifecycle constraints.

## Troubleshooting

- Popup shows disconnected: verify `esp32_sim_server.py` is running and `ws://127.0.0.1:8765/ws` is reachable.
- Actions do not execute: keep the target tab on an `http://` or `https://` page (not `chrome://` or extension pages).
- No LLM output: confirm `OPENAI_API_KEY` is exported in the simulator shell.
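
For the first item, a quick way to check whether anything is listening on the default port is a plain TCP probe. This is a minimal sketch using only the standard library. The function name `port_is_open` is hypothetical, and the check only confirms that the port accepts TCP connections; it does not perform a WebSocket handshake on `/ws`.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 8765, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, etc.
        return False


if __name__ == "__main__":
    print("simulator port reachable:", port_is_open())
```

If this reports `False` while the simulator is running, the server is probably bound to a different host or port than the default endpoint.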