Conversation
Signed-off-by: Asklv <boironic@gmail.com>
Signed-off-by: Asklv <boironic@gmail.com>
Signed-off-by: Asklv <boironic@gmail.com>
Signed-off-by: Asklv <boironic@gmail.com>
Signed-off-by: Asklv <boironic@gmail.com>
Signed-off-by: Asklv <boironic@gmail.com>
Signed-off-by: Asklv <boironic@gmail.com>
Signed-off-by: Asklv <boironic@gmail.com>
Signed-off-by: Asklv <boironic@gmail.com>
📝 WalkthroughWalkthroughThis PR adds a preview Chrome MV3 extension and a Python ESP32 simulation server enabling LLM-driven browser automation via a WebSocket bridge. It includes extension background/offscreen/service worker logic, content script DOM snapshotting and action execution, popup UI, offscreen WebSocket client, server-side BridgeState/RPC, and documentation updates. Changes
Sequence Diagram(s)sequenceDiagram
participant Agent as ESP32 Agent
participant Bridge as BridgeState<br/>(WebSocket Server)
participant Ext as Extension Background/Offscreen
participant Content as Content Script
participant Browser as Browser DOM
participant LLM as OpenAI API
Agent->>Bridge: rpc(get_dom_snapshot)
Bridge->>Ext: send RPC request (via WS)
Ext->>Content: GET_DOM_SNAPSHOT message
Content->>Browser: Query DOM & elements
Content-->>Ext: domSnapshot (command_result)
Ext-->>Bridge: relay command_result
Bridge-->>Agent: DOM snapshot response
Agent->>LLM: _ask_llm(goal, dom_snapshot)
LLM-->>Agent: action JSON (navigate/click/fill/scroll)
Agent->>Bridge: rpc(execute_action {action})
Bridge->>Ext: send RPC request
Ext->>Content: EXECUTE_ACTION message
Content->>Browser: Perform action
Content-->>Ext: action result
Ext-->>Bridge: command_result
Bridge-->>Agent: action execution result
Agent->>Bridge: notify(step status)
Bridge->>Ext: agent_status broadcast
Ext->>Content: PLAN_UPDATE
Content->>Browser: Update overlay UI
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
bcee638 to
537520c
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (6)
extension/.env.example (1)
2-2: Use a non-secret-looking API key placeholder.A neutral placeholder avoids secret-scanner false positives and reduces accidental misuse.
✏️ Suggested tweak
-OPENAI_API_KEY=sk-xxxx +OPENAI_API_KEY=your_openai_api_key_here🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extension/.env.example` at line 2, Replace the secret-looking placeholder value "sk-xxxx" for the OPENAI_API_KEY entry with a neutral, non-secret placeholder (e.g., "YOUR_OPENAI_API_KEY" or "REPLACE_WITH_OPENAI_KEY") so the environment example no longer resembles a real API key; update the OPENAI_API_KEY line in the .env example accordingly.extension/popup.html (1)
24-27: Associate the hint text with the IP input for accessibility.Linking the hint improves screen-reader context and input guidance.
♿ Proposed change
- <input id="wsIp" type="text" placeholder="127.0.0.1" /> + <input id="wsIp" type="text" placeholder="127.0.0.1" aria-describedby="wsIpHint" /> <button id="reconnect" class="primary">Reconnect</button> - <p class="hint">Fixed format: <code>ws://<IP>:8765/ws</code></p> + <p id="wsIpHint" class="hint">Fixed format: <code>ws://<IP>:8765/ws</code></p>🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extension/popup.html` around lines 24 - 27, The hint paragraph isn't associated with the IP input, hurting screen-reader context; add an id to the hint element (e.g., "wsIpHint") and reference that id from the input with aria-describedby="wsIpHint" on the element with id "wsIp" so assistive tech reads the fixed-format hint when the input is focused; ensure the hint element keeps class "hint" and the code sample remains inside it.extension/requirements.txt (1)
1-2: Consider bounding major versions for runtime stability.Using only
>=can introduce breaking upgrades unexpectedly in fresh installs. Bothopenaiandwebsocketshave documented breaking changes at major version boundaries: openai 1.x→2.x changes the type signature ofResponseFunctionToolCallOutputItem.output, and websockets 12.x→13.x deprecates handler signatures, requires Python ≥3.8, and renames APIs. Prefer bounded ranges (or a lock file) for deterministic behavior.♻️ Proposed change
-openai>=1.40.0 -websockets>=12.0 +openai>=1.40.0,<2.0.0 +websockets>=12.0,<13.0.0🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extension/requirements.txt` around lines 1 - 2, The requirements file allows unbounded major upgrades for openai and websockets which can introduce breaking changes; update the constraints for the two packages (openai and websockets) to bound their major versions (for example change openai to allow 1.x only and websockets to allow 12.x only) or switch to an exact lock file, ensuring installs remain deterministic and avoid automatic upgrades to incompatible major releases.extension/esp32_sim_server.py (1)
58-84: RPC retry logic is robust but naming could be clearer.The retry loop
range(1, retries + 2)withretries=2gives 3 total attempts, which is correct. However, the parameter nameretriesmight be clearer asmax_retriesto indicate total retry count after the first attempt.The use of
asyncio.shield(fut)on line 77 correctly prevents the future from being cancelled during timeout, allowing potential late responses to still resolve the future.Consider renaming for clarity
- async def rpc(self, payload: Dict[str, Any], timeout: float = 15.0, retries: int = 2) -> Dict[str, Any]: + async def rpc(self, payload: Dict[str, Any], timeout: float = 15.0, max_retries: int = 2) -> Dict[str, Any]:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extension/esp32_sim_server.py` around lines 58 - 84, The rpc method's retry parameter name is confusing: rename the parameter retries to max_retries in the async def rpc signature and update all uses inside rpc (including the retry loop range(1, max_retries + 2), the attempt comparison that raises TimeoutError, and any logging/diagnostic text referencing retries) so the intent (number of retries after the first attempt) is clear; ensure PendingRequest handling (self.pending, request_id) and the asyncio.shield(fut) usage remain unchanged.extension/background.js (1)
292-297: Auto-creating a Google tab may be unexpected user behavior.When all tabs are on restricted URLs (like
chrome://), the code automatically creates a new tab tohttps://www.google.com. While this enables the automation to proceed, it could be surprising to users.Consider documenting this behavior or showing a notification, though for a preview/experimental feature this is acceptable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extension/background.js` around lines 292 - 297, The code automatically creates a visible Google tab via chrome.tabs.create (the created variable) when all tabs are restricted; update the logic around chrome.tabs.create and waitForTabComplete to avoid surprising the user by either (a) creating the tab as inactive (set active:false) or (b) showing a user-facing notification via chrome.notifications.create before creating the tab and only proceeding if the user accepts; locate the chrome.tabs.create call and the waitForTabComplete usage and implement one of these behaviors and add a short comment documenting the intentional UX choice.extension/manifest.json (1)
13-17: Remove unnecessary WebSocket host permissions.In Chrome MV3, WebSocket connections from offscreen documents are controlled by the extension's Content Security Policy (
connect-srcdirective), not byhost_permissions. Since this extension doesn't use thechrome.webRequestAPI (which would require explicit host permissions for WebSocket interception), thews://*/*andwss://*/*entries inhost_permissionsare unnecessary. Removing lines 15–16 follows the principle of least privilege.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extension/manifest.json` around lines 13 - 17, Remove the unnecessary WebSocket host permissions from the manifest: delete the "ws://*/*" and "wss://*/*" entries inside the host_permissions array in manifest.json and leave only the required host(s) (e.g., "<all_urls>" if needed); verify any WebSocket allowances are governed by your Content Security Policy's connect-src directive rather than host_permissions and confirm you are not relying on chrome.webRequest for WebSocket interception.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@extension/popup.css`:
- Around line 27-32: The CSS rule for `#detail` uses the deprecated property
"word-break: break-word"; replace that declaration with the standards-based
"overflow-wrap: anywhere" in the `#detail` selector so long words wrap correctly;
update the `#detail` rule (selector "#detail") to remove "word-break: break-word"
and add "overflow-wrap: anywhere" instead.
---
Nitpick comments:
In `@extension/.env.example`:
- Line 2: Replace the secret-looking placeholder value "sk-xxxx" for the
OPENAI_API_KEY entry with a neutral, non-secret placeholder (e.g.,
"YOUR_OPENAI_API_KEY" or "REPLACE_WITH_OPENAI_KEY") so the environment example
no longer resembles a real API key; update the OPENAI_API_KEY line in the .env
example accordingly.
In `@extension/background.js`:
- Around line 292-297: The code automatically creates a visible Google tab via
chrome.tabs.create (the created variable) when all tabs are restricted; update
the logic around chrome.tabs.create and waitForTabComplete to avoid surprising
the user by either (a) creating the tab as inactive (set active:false) or (b)
showing a user-facing notification via chrome.notifications.create before
creating the tab and only proceeding if the user accepts; locate the
chrome.tabs.create call and the waitForTabComplete usage and implement one of
these behaviors and add a short comment documenting the intentional UX choice.
In `@extension/esp32_sim_server.py`:
- Around line 58-84: The rpc method's retry parameter name is confusing: rename
the parameter retries to max_retries in the async def rpc signature and update
all uses inside rpc (including the retry loop range(1, max_retries + 2), the
attempt comparison that raises TimeoutError, and any logging/diagnostic text
referencing retries) so the intent (number of retries after the first attempt)
is clear; ensure PendingRequest handling (self.pending, request_id) and the
asyncio.shield(fut) usage remain unchanged.
In `@extension/manifest.json`:
- Around line 13-17: Remove the unnecessary WebSocket host permissions from the
manifest: delete the "ws://*/*" and "wss://*/*" entries inside the
host_permissions array in manifest.json and leave only the required host(s)
(e.g., "<all_urls>" if needed); verify any WebSocket allowances are governed by
your Content Security Policy's connect-src directive rather than
host_permissions and confirm you are not relying on chrome.webRequest for
WebSocket interception.
In `@extension/popup.html`:
- Around line 24-27: The hint paragraph isn't associated with the IP input,
hurting screen-reader context; add an id to the hint element (e.g., "wsIpHint")
and reference that id from the input with aria-describedby="wsIpHint" on the
element with id "wsIp" so assistive tech reads the fixed-format hint when the
input is focused; ensure the hint element keeps class "hint" and the code sample
remains inside it.
In `@extension/requirements.txt`:
- Around line 1-2: The requirements file allows unbounded major upgrades for
openai and websockets which can introduce breaking changes; update the
constraints for the two packages (openai and websockets) to bound their major
versions (for example change openai to allow 1.x only and websockets to allow
12.x only) or switch to an exact lock file, ensuring installs remain
deterministic and avoid automatic upgrades to incompatible major releases.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (5)
extension/icons/icon128.pngis excluded by!**/*.pngextension/icons/icon16.pngis excluded by!**/*.pngextension/icons/icon32.pngis excluded by!**/*.pngextension/icons/icon48.pngis excluded by!**/*.pngextension/mimibroswer.pngis excluded by!**/*.png
📒 Files selected for processing (16)
.gitignoreREADME.mdREADME_CN.mdREADME_JA.mdextension/.env.exampleextension/README.mdextension/background.jsextension/content.jsextension/esp32_sim_server.pyextension/manifest.jsonextension/offscreen.htmlextension/offscreen.jsextension/popup.cssextension/popup.htmlextension/popup.jsextension/requirements.txt
| #detail { | ||
| margin: 0 0 10px; | ||
| font-size: 12px; | ||
| color: #6c7a89; | ||
| word-break: break-word; | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
word-break break-word deprecated CSS specification
💡 Result:
word-break: break-word is a legacy, deprecated keyword for the word-break property, kept only for compatibility. The CSS Text Module Level 3 spec defines it as behaving the same as word-break: normal plus overflow-wrap: anywhere, regardless of whatever overflow-wrap is otherwise set to. [1][2]
What to use instead (standards-based):
- Prefer
overflow-wrap: anywhere(closest match to the deprecated behavior). [2] - If you want the more traditional “only break long tokens if needed” behavior, use
overflow-wrap: break-word. (This is different fromanywherein some layout/min-content sizing behavior.) [2]
Recommendation: don’t use word-break: break-word; use overflow-wrap: anywhere (or overflow-wrap: break-word, depending on desired behavior). [1][2]
🏁 Script executed:
#!/bin/bash
# Check the file content and verify the code at lines 27-32
if [ -f extension/popup.css ]; then
echo "=== File content at lines 27-32 ==="
sed -n '27,32p' extension/popup.css
echo ""
echo "=== Full file for context ==="
cat -n extension/popup.css
else
echo "File not found"
fiRepository: memovai/mimiclaw
Length of output: 3076
Replace deprecated word-break: break-word with overflow-wrap: anywhere.
word-break: break-word is deprecated per CSS Text Module Level 3 and should be replaced with the standards-based overflow-wrap: anywhere, which provides the same wrapping behavior.
✅ Proposed fix
`#detail` {
margin: 0 0 10px;
font-size: 12px;
color: `#6c7a89`;
- word-break: break-word;
+ overflow-wrap: anywhere;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| #detail { | |
| margin: 0 0 10px; | |
| font-size: 12px; | |
| color: #6c7a89; | |
| word-break: break-word; | |
| } | |
| `#detail` { | |
| margin: 0 0 10px; | |
| font-size: 12px; | |
| color: `#6c7a89`; | |
| overflow-wrap: anywhere; | |
| } |
🧰 Tools
🪛 Stylelint (17.3.0)
[error] 31-31: Unexpected deprecated keyword "break-word" for property "word-break" (declaration-property-value-keyword-no-deprecated)
(declaration-property-value-keyword-no-deprecated)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extension/popup.css` around lines 27 - 32, The CSS rule for `#detail` uses the
deprecated property "word-break: break-word"; replace that declaration with the
standards-based "overflow-wrap: anywhere" in the `#detail` selector so long words
wrap correctly; update the `#detail` rule (selector "#detail") to remove
"word-break: break-word" and add "overflow-wrap: anywhere" instead.
There was a problem hiding this comment.
Actionable comments posted: 7
♻️ Duplicate comments (1)
extension/popup.css (1)
27-32:⚠️ Potential issue | 🟡 MinorReplace deprecated
word-break: break-wordin#detail.Line [31] still uses a deprecated keyword; switch to
overflow-wrap: anywhere.#!/bin/bash rg -n 'word-break:\s*break-word|overflow-wrap:\s*anywhere' extension/popup.css✅ Proposed fix
`#detail` { margin: 0 0 10px; font-size: 12px; color: `#6c7a89`; - word-break: break-word; + overflow-wrap: anywhere; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extension/popup.css` around lines 27 - 32, In the `#detail` CSS rule replace the deprecated "word-break: break-word" with the modern "overflow-wrap: anywhere": locate the "#detail" selector in extension/popup.css, remove the "word-break: break-word" line and add "overflow-wrap: anywhere" to ensure long words break correctly across browsers.
🧹 Nitpick comments (1)
README_CN.md (1)
260-267: Consider relocating the browser extension section for better document flow.Placing the "浏览器扩展(预览)" section after "其他功能" (Other Features) is unconventional, as "其他功能" typically serves as a catch-all section near the document's end. Consider one of these alternatives:
- Option 1 (Recommended): Move this section to appear before "其他功能" (around line 249), making it a peer to other major feature sections.
- Option 2: Convert it to a subsection (###) within "其他功能".
- Option 3: Add a brief mention in the "其他功能" bullet list with a link to
extension/README.md.This would improve the logical flow of the documentation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README_CN.md` around lines 260 - 267, The "浏览器扩展(预览)" section is placed after the "其他功能" section which breaks document flow; relocate or reframe it. Move the entire "## 浏览器扩展(预览)" block (and its two bullets linking to extension/README.md and notes) so it appears before the "其他功能" section (making it a peer feature section), or alternatively convert it into a subsection "### 浏览器扩展(预览)" under the existing "其他功能" header, or replace it with a single bullet entry linking to extension/README.md within the "其他功能" list; update any adjacent headings to preserve heading order and links.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@extension/background.js`:
- Around line 69-76: The fallback DOM snapshot currently uses el.value which
exposes live form data; update the mapping in the anonymous map (the one
producing objects with tag, selector, text, href, placeholder, contentEditable
and which calls selectorFor(el)) to stop reading el.value—use only el.innerText
(or el.textContent) for the text field (e.g., replace (el.innerText || el.value
|| "") with (el.innerText || "") ) so live input values are not captured,
keeping placeholder and contentEditable behaviors unchanged.
In `@extension/content.js`:
- Around line 488-507: The overlay timeline currently builds HTML with untrusted
values (s.action, s.reason, s.step) and assigns it to timeline.innerHTML,
enabling HTML injection; change the rendering to create DOM nodes and set their
textContent (or escape values) instead of interpolating into a
string—specifically replace the stepsHtml/loadingHtml string assembly and the
timeline.innerHTML assignment with code that iterates over steps, creates
elements for ".mb-step", ".mb-step-head" and ".mb-step-reason", sets
element.textContent using s.step/s.action/s.reason (falling back to "?" /
"unknown" / "(no reason)"), append those nodes to timeline, and only use
innerHTML for static, trusted markup if needed.
In `@extension/esp32_sim_server.py`:
- Around line 54-57: When clearing the extension socket in clear_extension, also
fail any in-flight RPC futures so they don't hang: locate the container of
outstanding RPCs (e.g., self.pending_rpcs / self.pending_futures /
self._pending_requests) and for each future set an exception (or cancel) with a
clear ConnectionError or RuntimeError before clearing the container; then set
self.extension_ws = None. Ensure you reference clear_extension and extension_ws
and clear the pending container after failing each future.
- Around line 455-463: Registration is currently unauthenticated: when handling
a incoming message with mtype == "register" and data.get("role") == "extension"
the code calls bridge.set_extension(websocket) and sets registered_once without
validating a token. Add an auth token check (e.g. compare data.get("token") to a
server-side secret/config) before calling bridge.set_extension; if the token is
missing or invalid, send a register_nack (or error) over websocket and close the
connection instead of registering. Update the registration branch that
references mtype, data, bridge.set_extension, registered_once, and the
websocket.send call to perform this validation and early reject unauthorized
attempts.
- Around line 71-83: The current retry loop resends the same payload on timeout,
risking duplicate non-idempotent browser actions (e.g.,
execute_action/click/fill/post). Modify the send/retry logic in the method that
uses self.extension_ws, payload, fut, request_id, retries and timeout so that
you only perform retries for idempotent RPC types: detect the RPC name from
payload (e.g., payload.get("method") or payload["method"]) and if it is a
non-idempotent action like "execute_action", "click", "fill", "post" then send
exactly once and on timeout raise TimeoutError immediately; otherwise keep the
existing retry loop and logging/attempt counting for idempotent methods. Ensure
fut is still awaited with asyncio.shield and that the error message still
includes request_id when raising TimeoutError.
In `@extension/manifest.json`:
- Around line 13-17: The manifest currently grants universal host access via
"host_permissions" and auto-injects on all pages (the "<all_urls>" entries and
content script "matches") — change this to require explicit opt-in: remove
"<all_urls>" from "host_permissions" and instead list only necessary default
domains (or leave empty) and add "<all_urls>" to "optional_permissions"; remove
or narrow the "matches" pattern in "content_scripts" so scripts are not
auto-injected on every site, and implement programmatic injection using
chrome.permissions.request (or browser.permissions.request) followed by
chrome.scripting.executeScript (or equivalent) after the user explicitly enables
the session/feature; update any background/extension UI flow to request the
optional permission when the user opts in and only then perform injection.
---
Duplicate comments:
In `@extension/popup.css`:
- Around line 27-32: In the `#detail` CSS rule replace the deprecated "word-break:
break-word" with the modern "overflow-wrap: anywhere": locate the "#detail"
selector in extension/popup.css, remove the "word-break: break-word" line and
add "overflow-wrap: anywhere" to ensure long words break correctly across
browsers.
---
Nitpick comments:
In `@README_CN.md`:
- Around line 260-267: The "浏览器扩展(预览)" section is placed after the "其他功能"
section which breaks document flow; relocate or reframe it. Move the entire "##
浏览器扩展(预览)" block (and its two bullets linking to extension/README.md and notes)
so it appears before the "其他功能" section (making it a peer feature section), or
alternatively convert it into a subsection "### 浏览器扩展(预览)" under the existing
"其他功能" header, or replace it with a single bullet entry linking to
extension/README.md within the "其他功能" list; update any adjacent headings to
preserve heading order and links.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (5)
extension/icons/icon128.pngis excluded by!**/*.pngextension/icons/icon16.pngis excluded by!**/*.pngextension/icons/icon32.pngis excluded by!**/*.pngextension/icons/icon48.pngis excluded by!**/*.pngextension/mimibroswer.pngis excluded by!**/*.png
📒 Files selected for processing (16)
.gitignoreREADME.mdREADME_CN.mdREADME_JA.mdextension/.env.exampleextension/README.mdextension/background.jsextension/content.jsextension/esp32_sim_server.pyextension/manifest.jsonextension/offscreen.htmlextension/offscreen.jsextension/popup.cssextension/popup.htmlextension/popup.jsextension/requirements.txt
✅ Files skipped from review due to trivial changes (1)
- README_JA.md
🚧 Files skipped from review as they are similar to previous changes (7)
- extension/offscreen.js
- README.md
- extension/popup.html
- extension/popup.js
- extension/offscreen.html
- extension/requirements.txt
- extension/README.md
| return Array.from(document.querySelectorAll(query)).slice(0, maxCount).map((el) => ({ | ||
| tag: el.tagName.toLowerCase(), | ||
| selector: selectorFor(el), | ||
| text: (el.innerText || el.value || "").trim().slice(0, 120), | ||
| href: el.href || null, | ||
| placeholder: el.getAttribute("placeholder") || null, | ||
| contentEditable: Boolean(el.isContentEditable || el.getAttribute("contenteditable") === "true") | ||
| })); |
There was a problem hiding this comment.
Fallback DOM snapshot also exposes live form values.
Line [72] includes el.value for inputs/textareas in the fallback snapshot path, so secrets can still leak even if primary content messaging fails.
🔒 Proposed fix (align fallback redaction with content script)
return Array.from(document.querySelectorAll(query)).slice(0, maxCount).map((el) => ({
+ const tag = el.tagName.toLowerCase();
+ const type = (el.getAttribute("type") || "").toLowerCase();
+ const nameHint = `${el.getAttribute("name") || ""} ${el.id || ""}`.toLowerCase();
+ const isFormField = tag === "input" || tag === "textarea" || tag === "select";
+ const isSensitive =
+ type === "password" ||
+ /pass|pwd|token|otp|secret|ssn|card|cvv|email|phone/i.test(nameHint);
tag: el.tagName.toLowerCase(),
selector: selectorFor(el),
- text: (el.innerText || el.value || "").trim().slice(0, 120),
+ text: isFormField || isSensitive ? "" : (el.innerText || "").trim().slice(0, 120),
href: el.href || null,
placeholder: el.getAttribute("placeholder") || null,
contentEditable: Boolean(el.isContentEditable || el.getAttribute("contenteditable") === "true")
}));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extension/background.js` around lines 69 - 76, The fallback DOM snapshot
currently uses el.value which exposes live form data; update the mapping in the
anonymous map (the one producing objects with tag, selector, text, href,
placeholder, contentEditable and which calls selectorFor(el)) to stop reading
el.value—use only el.innerText (or el.textContent) for the text field (e.g.,
replace (el.innerText || el.value || "") with (el.innerText || "") ) so live
input values are not captured, keeping placeholder and contentEditable behaviors
unchanged.
| function elementBrief(el) { | ||
| return { | ||
| tag: el.tagName.toLowerCase(), | ||
| selector: elementSelector(el), | ||
| text: (el.innerText || el.value || "").trim().slice(0, 120), | ||
| href: el.href || null, | ||
| ariaLabel: el.getAttribute("aria-label") || null, | ||
| role: el.getAttribute("role") || null, | ||
| placeholder: el.getAttribute("placeholder") || null, | ||
| contentEditable: Boolean(el.isContentEditable || el.getAttribute("contenteditable") === "true") | ||
| }; |
There was a problem hiding this comment.
DOM snapshot currently leaks form values (including secrets).
Line [49] reads el.value for interactive elements selected from Line [62], so typed values (passwords, OTPs, emails, tokens) can be exfiltrated in snapshots.
🔒 Proposed fix (redact form field values)
function elementBrief(el) {
+ const tag = el.tagName.toLowerCase();
+ const type = (el.getAttribute("type") || "").toLowerCase();
+ const nameHint = `${el.getAttribute("name") || ""} ${el.id || ""}`.toLowerCase();
+ const isFormField = tag === "input" || tag === "textarea" || tag === "select";
+ const isSensitive =
+ type === "password" ||
+ /pass|pwd|token|otp|secret|ssn|card|cvv|email|phone/i.test(nameHint);
return {
- tag: el.tagName.toLowerCase(),
+ tag,
selector: elementSelector(el),
- text: (el.innerText || el.value || "").trim().slice(0, 120),
+ text: isFormField || isSensitive ? "" : (el.innerText || "").trim().slice(0, 120),
href: el.href || null,
ariaLabel: el.getAttribute("aria-label") || null,
role: el.getAttribute("role") || null,Also applies to: 58-72
| const stepsHtml = steps | ||
| .map( | ||
| (s) => ` | ||
| <div class="mb-step"> | ||
| <div class="mb-step-head">Step ${s.step || "?"} · ${s.action || "unknown"}</div> | ||
| <div class="mb-step-reason">${s.reason || "(no reason)"}</div> | ||
| </div> | ||
| ` | ||
| ) | ||
| .join(""); | ||
| const loadingHtml = | ||
| status === "running" | ||
| ? ` | ||
| <div class="mb-loading"> | ||
| Waiting next step | ||
| <span class="mb-dots"><i></i><i></i><i></i></span> | ||
| </div> | ||
| ` | ||
| : ""; | ||
| timeline.innerHTML = stepsHtml + loadingHtml; |
There was a problem hiding this comment.
Avoid HTML injection in overlay timeline rendering.
Line [507] writes model/page-derived fields into innerHTML (s.action, s.reason), which allows untrusted markup injection. Render with textContent (or escape before interpolation).
🛡️ Proposed fix (escape before interpolation)
const timeline = root.getElementById("mb-timeline");
+ const esc = (v) =>
+ String(v ?? "").replace(/[&<>"']/g, (ch) => ({
+ "&": "&",
+ "<": "<",
+ ">": ">",
+ '"': """,
+ "'": "'"
+ }[ch]));
const stepsHtml = steps
.map(
(s) => `
<div class="mb-step">
- <div class="mb-step-head">Step ${s.step || "?"} · ${s.action || "unknown"}</div>
- <div class="mb-step-reason">${s.reason || "(no reason)"}</div>
+ <div class="mb-step-head">Step ${esc(s.step || "?")} · ${esc(s.action || "unknown")}</div>
+ <div class="mb-step-reason">${esc(s.reason || "(no reason)")}</div>
</div>
`
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extension/content.js` around lines 488 - 507, The overlay timeline currently
builds HTML with untrusted values (s.action, s.reason, s.step) and assigns it to
timeline.innerHTML, enabling HTML injection; change the rendering to create DOM
nodes and set their textContent (or escape values) instead of interpolating into
a string—specifically replace the stepsHtml/loadingHtml string assembly and the
timeline.innerHTML assignment with code that iterates over steps, creates
elements for ".mb-step", ".mb-step-head" and ".mb-step-reason", sets
element.textContent using s.step/s.action/s.reason (falling back to "?" /
"unknown" / "(no reason)"), append those nodes to timeline, and only use
innerHTML for static, trusted markup if needed.
| def clear_extension(self, ws: Any) -> None: | ||
| if self.extension_ws is ws: | ||
| self.extension_ws = None | ||
|
|
There was a problem hiding this comment.
Fail pending RPCs immediately on extension disconnect.
When the extension socket drops, in-flight futures stay pending until timeout. That delays recovery and amplifies cascading retries.
⚙️ Proposed fix
def clear_extension(self, ws: Any) -> None:
if self.extension_ws is ws:
self.extension_ws = None
+ for pending in self.pending.values():
+ if not pending.future.done():
+ pending.future.set_exception(ConnectionError("extension_disconnected"))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def clear_extension(self, ws: Any) -> None: | |
| if self.extension_ws is ws: | |
| self.extension_ws = None | |
| def clear_extension(self, ws: Any) -> None: | |
| if self.extension_ws is ws: | |
| self.extension_ws = None | |
| for pending in self.pending.values(): | |
| if not pending.future.done(): | |
| pending.future.set_exception(ConnectionError("extension_disconnected")) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extension/esp32_sim_server.py` around lines 54 - 57, When clearing the
extension socket in clear_extension, also fail any in-flight RPC futures so they
don't hang: locate the container of outstanding RPCs (e.g., self.pending_rpcs /
self.pending_futures / self._pending_requests) and for each future set an
exception (or cancel) with a clear ConnectionError or RuntimeError before
clearing the container; then set self.extension_ws = None. Ensure you reference
clear_extension and extension_ws and clear the pending container after failing
each future.
| for attempt in range(1, retries + 2): | ||
| if self.extension_ws is None: | ||
| raise RuntimeError("Extension disconnected while waiting for command result.") | ||
|
|
||
| await self.extension_ws.send(json.dumps(payload, ensure_ascii=False)) | ||
| try: | ||
| result = await asyncio.wait_for(asyncio.shield(fut), timeout=timeout) | ||
| return result | ||
| except asyncio.TimeoutError: | ||
| if attempt >= retries + 1: | ||
| raise TimeoutError(f"RPC timeout request_id={request_id} after {attempt} attempts") from None | ||
| print(f"[server] rpc retry request_id={request_id} attempt={attempt + 1}") | ||
| finally: |
There was a problem hiding this comment.
RPC timeout retries can duplicate non-idempotent browser actions.
Line [75] resends the same command on timeout. For execute_action (click/fill/post), this can trigger duplicate side effects.
🧭 Proposed fix (retry only idempotent RPC types)
async def rpc(self, payload: Dict[str, Any], timeout: float = 15.0, retries: int = 2) -> Dict[str, Any]:
@@
- try:
- for attempt in range(1, retries + 2):
+ is_idempotent = payload.get("type") in {"get_dom_snapshot"}
+ max_attempts = (retries + 1) if is_idempotent else 1
+ try:
+ for attempt in range(1, max_attempts + 1):
@@
- except asyncio.TimeoutError:
- if attempt >= retries + 1:
+ except asyncio.TimeoutError:
+ if attempt >= max_attempts:
raise TimeoutError(f"RPC timeout request_id={request_id} after {attempt} attempts") from None🧰 Tools
🪛 Ruff (0.15.2)
[warning] 73-73: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 78-78: Consider moving this statement to an else block
(TRY300)
[warning] 81-81: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extension/esp32_sim_server.py` around lines 71 - 83, The current retry loop
resends the same payload on timeout, risking duplicate non-idempotent browser
actions (e.g., execute_action/click/fill/post). Modify the send/retry logic in
the method that uses self.extension_ws, payload, fut, request_id, retries and
timeout so that you only perform retries for idempotent RPC types: detect the
RPC name from payload (e.g., payload.get("method") or payload["method"]) and if
it is a non-idempotent action like "execute_action", "click", "fill", "post"
then send exactly once and on timeout raise TimeoutError immediately; otherwise
keep the existing retry loop and logging/attempt counting for idempotent
methods. Ensure fut is still awaited with asyncio.shield and that the error
message still includes request_id when raising TimeoutError.
| if mtype == "register" and data.get("role") == "extension": | ||
| bridge.set_extension(websocket) | ||
| if not registered_once: | ||
| print("[server] extension registered") | ||
| registered_once = True | ||
| else: | ||
| print("[server] extension re-registered") | ||
| await websocket.send(json.dumps({"type": "register_ack", "ts": int(asyncio.get_running_loop().time() * 1000)})) | ||
| continue |
There was a problem hiding this comment.
WebSocket extension registration is unauthenticated.
Any local process (or webpage reaching localhost websocket) can send {"type":"register","role":"extension"} and hijack bridge state. Add an auth token check during registration.
🔐 Proposed fix (shared secret on register)
DEFAULT_MODEL = os.getenv("OPENAI_MODEL", "gpt-5-nano")
+BRIDGE_AUTH_TOKEN = os.getenv("BRIDGE_AUTH_TOKEN", "").strip()
@@
if mtype == "register" and data.get("role") == "extension":
+ token = str(data.get("token", "")).strip()
+ if not BRIDGE_AUTH_TOKEN or token != BRIDGE_AUTH_TOKEN:
+ await websocket.close(code=1008, reason="unauthorized")
+ return
bridge.set_extension(websocket)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extension/esp32_sim_server.py` around lines 455 - 463, Registration is
currently unauthenticated: when handling a incoming message with mtype ==
"register" and data.get("role") == "extension" the code calls
bridge.set_extension(websocket) and sets registered_once without validating a
token. Add an auth token check (e.g. compare data.get("token") to a server-side
secret/config) before calling bridge.set_extension; if the token is missing or
invalid, send a register_nack (or error) over websocket and close the connection
instead of registering. Update the registration branch that references mtype,
data, bridge.set_extension, registered_once, and the websocket.send call to
perform this validation and early reject unauthorized attempts.
| "host_permissions": [ | ||
| "<all_urls>", | ||
| "ws://*/*", | ||
| "wss://*/*" | ||
| ], |
There was a problem hiding this comment.
Default all-sites access/injection is too broad for first-run behavior.
Line [14] plus Line [39] gives universal page access and auto-injection, which materially increases sensitive-data exposure risk. Prefer opt-in host permissions and inject only after explicit user action/session enablement.
Also applies to: 37-42
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extension/manifest.json` around lines 13 - 17, The manifest currently grants
universal host access via "host_permissions" and auto-injects on all pages (the
"<all_urls>" entries and content script "matches") — change this to require
explicit opt-in: remove "<all_urls>" from "host_permissions" and instead list
only necessary default domains (or leave empty) and add "<all_urls>" to
"optional_permissions"; remove or narrow the "matches" pattern in
"content_scripts" so scripts are not auto-injected on every site, and implement
programmatic injection using chrome.permissions.request (or
browser.permissions.request) followed by chrome.scripting.executeScript (or
equivalent) after the user explicitly enables the session/feature; update any
background/extension UI flow to request the optional permission when the user
opts in and only then perform injection.
Summary
Changes
Issues
This PR is cherry-picked from #38, and only contains the Chrome extension. cc @crispyberry
Summary by CodeRabbit
New Features
Documentation
Chores