feat: add private notes note-taking agent #231
Closed
14 commits
44e2db8 Add Codex Task Runner community ability (Ju-usc)
c4df336 docs: align webhook example with safe scoped Codex execution (Ju-usc)
b367543 fix: align Codex Task Runner with new validator rules (Ju-usc)
9f4007a style: auto-format Python files with autoflake + autopep8 (github-actions[bot])
067a59f fix: preserve register capability tag format (Ju-usc)
fd1d78e style: auto-format Python files with autoflake + autopep8 (github-actions[bot])
a6125eb fix: address PR feedback on async webhook and docs hardening (Ju-usc)
9a7dff3 fix: preserve register capability tag format (Ju-usc)
32590ab style: auto-format Python files with autoflake + autopep8 (github-actions[bot])
5146102 fix: keep validator register tag stable under auto-format (Ju-usc)
2796cfd refactor: simplify Codex Task Runner flow (Ju-usc)
de41251 rename codex-task-runner → coding-agent-runner (Ju-usc)
3d4d378 feat: add private notes note-taking agent (Ju-usc)
b29d557 style: auto-format Python files with autoflake + autopep8 (github-actions[bot])
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,91 @@ | ||
| # Coding Agent Runner | ||
|
|
||
|  | ||
|  | ||
|
|
||
| ## What It Does | ||
| Runs a coding task through a remote webhook that invokes Claude Code or Codex headlessly, then reads back a short spoken result. | ||
|
|
||
| ## Trigger Words | ||
| - "run coding task" | ||
| - "run a coding agent" | ||
| - "execute coding task" | ||
|
|
||
| ## Setup | ||
| 1. Run any webhook server that accepts `POST /run` with bearer auth (see example below). | ||
| 2. In `main.py`, replace `WEBHOOK_URL` and `WEBHOOK_TOKEN` placeholders. Use the same token on both sides. | ||
| 3. Upload this ability to OpenHome and set trigger words in the dashboard. | ||
|
|
||
| If OpenHome can't reach your server directly, use a tunnel (e.g. `ngrok http 8080`). | ||
|
|
||
| ## Webhook Contract | ||
|
|
||
| The ability sends: | ||
| ``` | ||
| POST /run | ||
| Authorization: Bearer <token> | ||
| {"prompt": "Add tests for the validator script"} | ||
| ``` | ||
|
|
||
| And expects back: | ||
| ```json | ||
| {"ok": true, "summary": "Added tests and they pass."} | ||
| ``` | ||
|
|
||
| Optional response fields: `artifact_path`, `request_id`. | ||
|
|
||
| ## Minimal Webhook Server | ||
|
|
||
| The webhook just needs to run Claude Code or Codex and return the output. Swap the command to match your agent. | ||
|
|
||
| > **Safety note:** Both examples use autonomous execution flags. Only run in a | ||
| > sandboxed environment or a directory you're comfortable modifying. | ||
|
|
||
| ```python | ||
| # Runs on a separate server, not inside OpenHome. | ||
| import subprocess | ||
| from flask import Flask, jsonify, request | ||
|
|
||
| app = Flask(__name__) | ||
| TOKEN = "your-secret-token" | ||
| AGENT = "claude" # "claude" or "codex" | ||
| WORKDIR = "/path/to/your/project" # sandbox / working directory | ||
|
|
||
| def agent_cmd(prompt): | ||
|     if AGENT == "codex": | ||
|         return ["codex", "exec", "--full-auto", prompt] | ||
|     return ["claude", "-p", prompt, "--allowedTools", "Bash,Read,Write,Edit"] | ||
|
|
||
| @app.post("/run") | ||
| def run(): | ||
|     if request.headers.get("Authorization") != f"Bearer {TOKEN}": | ||
|         return jsonify(ok=False, error="unauthorized"), 401 | ||
|
|
||
|     prompt = (request.get_json(silent=True) or {}).get("prompt", "").strip() | ||
|     if not prompt: | ||
|         return jsonify(ok=False, error="prompt required"), 400 | ||
|
|
||
|     result = subprocess.run( | ||
|         agent_cmd(prompt), | ||
|         capture_output=True, text=True, timeout=600, check=False, | ||
|         cwd=WORKDIR, | ||
|     ) | ||
|     if result.returncode != 0: | ||
|         return jsonify(ok=False, error=f"exit code {result.returncode}"), 500 | ||
|
|
||
|     return jsonify(ok=True, summary=result.stdout.strip() or "Done.") | ||
| ``` | ||
|
|
||
| ## Example Conversation | ||
| > **User:** "run coding task" | ||
| > **AI:** "Tell me the coding task you'd like to run." | ||
| > **User:** "Add basic tests for the validator script and run them." | ||
| > **AI:** "Got it. Want me to run that now?" | ||
| > **User:** "Yes" | ||
| > **AI:** "Tests were added and they all pass." | ||
|
|
||
| ## Logs | ||
| Look for `[CodingAgentRunner]` entries in OpenHome Live Editor logs. | ||
|
|
||
| ## Token Hygiene | ||
| For demos, a static token is fine. After testing, rotate the token on both sides. |
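A possible hardening for the token check and rotation described above, sketched under assumptions: `new_webhook_token` and `is_authorized` are hypothetical helper names that do not appear in this PR, and the example server compares the header with a plain `!=`. The sketch uses only the standard library (`secrets` to mint a token, `hmac.compare_digest` for a constant-time comparison).

```python
import hmac
import secrets
from typing import Optional


def new_webhook_token(num_bytes: int = 32) -> str:
    """Return a fresh URL-safe random token to use as the bearer secret."""
    return secrets.token_urlsafe(num_bytes)


def is_authorized(auth_header: Optional[str], expected_token: str) -> bool:
    """Check an Authorization header value against the expected bearer token.

    hmac.compare_digest compares in constant time, so the check does not
    reveal how many leading characters of a guess were correct.
    """
    if not auth_header:
        return False
    expected = f"Bearer {expected_token}"
    return hmac.compare_digest(auth_header.encode(), expected.encode())
```

Rotating then amounts to generating a new value with `new_webhook_token()` and setting it as both `TOKEN` on the server and `WEBHOOK_TOKEN` in `main.py`.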
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
|
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,157 @@ | ||
| """OpenHome ability – voice-triggered coding task execution via webhook. | ||
|
|
||
| Flow: ask → confirm → refine prompt → call webhook → speak result. | ||
| """ | ||
|
|
||
| import asyncio | ||
|
|
||
| import requests | ||
| from src.agent.capability import MatchingCapability | ||
| from src.main import AgentWorker | ||
| from src.agent.capability_worker import CapabilityWorker | ||
|
|
||
| WEBHOOK_URL = "YOUR_WEBHOOK_URL_HERE" | ||
| WEBHOOK_TOKEN = "YOUR_WEBHOOK_TOKEN_HERE" | ||
| REQUEST_TIMEOUT_SECONDS = 180 | ||
| EXIT_WORDS = {"stop", "cancel", "exit", "quit", "never mind"} | ||
|
|
||
| TAG = "[CodingAgentRunner]" | ||
|
|
||
|
|
||
| class CodingAgentRunnerCapability(MatchingCapability): | ||
|     """Voice ability that sends coding tasks to an external webhook.""" | ||
|
|
||
|     worker: AgentWorker = None | ||
|     capability_worker: CapabilityWorker = None | ||
|
|
||
|     # {{register capability}} | ||
|
|
||
|     def call(self, worker: AgentWorker): | ||
|         self.worker = worker | ||
|         self.capability_worker = CapabilityWorker(self.worker) | ||
|         self.worker.session_tasks.create(self.run()) | ||
|
|
||
|     async def run(self): | ||
|         try: | ||
|             # 1) Guard: ensure webhook is configured. | ||
|             if WEBHOOK_URL in ("", "YOUR_WEBHOOK_URL_HERE") \ | ||
|                     or WEBHOOK_TOKEN in ("", "YOUR_WEBHOOK_TOKEN_HERE"): | ||
|                 await self.capability_worker.speak( | ||
|                     "This coding agent runner is not configured yet. " | ||
|                     "Please set the webhook URL and token in the ability code." | ||
|                 ) | ||
|                 return | ||
|
|
||
|             # 2) Ask for the coding task. | ||
|             await self.capability_worker.speak( | ||
|                 "Tell me the coding task you'd like to run." | ||
|             ) | ||
|             task = await self.capability_worker.user_response() | ||
|
|
||
|             if not task: | ||
|                 await self.capability_worker.speak( | ||
|                     "I didn't catch that. Please try again." | ||
|                 ) | ||
|                 return | ||
|
|
||
|             lowered = task.lower().strip() | ||
|             if any(lowered == w or lowered.startswith(f"{w} ") for w in EXIT_WORDS): | ||
|                 await self.capability_worker.speak("Okay, canceled.") | ||
|                 return | ||
|
|
||
|             # 3) Confirm before running. | ||
|             if not await self.capability_worker.run_confirmation_loop( | ||
|                 "Got it. Want me to run that now?" | ||
|             ): | ||
|                 await self.capability_worker.speak("Okay, I won't run it.") | ||
|                 return | ||
|
|
||
|             # 4) Refine transcription → call the webhook. | ||
|             prompt = self._refine_prompt(task) | ||
|             await self.capability_worker.speak( | ||
|                 "Running your coding task now. This may take up to a few minutes." | ||
|             ) | ||
|             result = await self._call_webhook(prompt) | ||
|
|
||
|             if not result or not result.get("ok"): | ||
|                 await self.capability_worker.speak( | ||
|                     "I couldn't complete that coding task. " | ||
|                     "Check your webhook server logs." | ||
|                 ) | ||
|                 return | ||
|
|
||
|             # 5) Speak the result. | ||
|             spoken = self._rewrite_for_voice( | ||
|                 result.get("summary") or "Task finished but returned no summary." | ||
|             ) | ||
|             await self.capability_worker.speak(spoken) | ||
|
|
||
|             if result.get("artifact_path"): | ||
|                 await self.capability_worker.speak( | ||
|                     "I also saved the full output in the run artifacts." | ||
|                 ) | ||
|
|
||
|         except Exception as err: | ||
|             self.worker.editor_logging_handler.error( | ||
|                 f"{TAG} unexpected error: {err}" | ||
|             ) | ||
|             await self.capability_worker.speak( | ||
|                 "Something went wrong while running the coding task." | ||
|             ) | ||
|         finally: | ||
|             self.capability_worker.resume_normal_flow() | ||
|
|
||
|     async def _call_webhook(self, prompt: str) -> dict | None: | ||
|         """POST the task to the webhook; return parsed JSON or None.""" | ||
|         try: | ||
|             resp = await asyncio.to_thread( | ||
|                 requests.post, | ||
|                 WEBHOOK_URL, | ||
|                 headers={ | ||
|                     "Content-Type": "application/json", | ||
|                     "Authorization": f"Bearer {WEBHOOK_TOKEN}", | ||
|                 }, | ||
|                 json={"prompt": prompt}, | ||
|                 timeout=REQUEST_TIMEOUT_SECONDS, | ||
|             ) | ||
|             resp.raise_for_status() | ||
|             payload = resp.json() | ||
|             if not isinstance(payload, dict): | ||
|                 raise ValueError("response is not a JSON object") | ||
|         except Exception as err: | ||
|             self.worker.editor_logging_handler.error( | ||
|                 f"{TAG} webhook failed: {err}" | ||
|             ) | ||
|             return None | ||
|         return payload | ||
|
|
||
|     def _refine_prompt(self, raw: str) -> str: | ||
|         """Use the LLM to clean up a voice transcription into a clear coding task.""" | ||
|         try: | ||
|             text = self.capability_worker.text_to_text_response( | ||
|                 "The following is a voice transcription of a coding task. " | ||
|                 "Clean it up into a clear, actionable prompt for a coding agent. " | ||
|                 "Fix transcription errors, remove filler words, and keep the intent. " | ||
|                 "Return only the refined prompt, nothing else.\n\n" | ||
|                 f"Transcription:\n{raw}", | ||
|                 self.worker.agent_memory.full_message_history, | ||
|             ) | ||
|             return (text or "").strip() or raw | ||
|         except Exception: | ||
|             return raw | ||
|
|
||
|     def _rewrite_for_voice(self, raw: str) -> str: | ||
|         """Use the LLM to rewrite a raw summary into spoken-friendly text.""" | ||
|         try: | ||
|             text = self.capability_worker.text_to_text_response( | ||
|                 "Rewrite this coding result for spoken voice. " | ||
|                 "Use 1-2 short conversational sentences. " | ||
|                 "No list numbers, markdown, file paths, or code snippets. " | ||
|                 "Keep only the key outcome and one optional follow-up.\n\n" | ||
|                 f"Result:\n{raw}", | ||
|                 self.worker.agent_memory.full_message_history, | ||
|             ) | ||
|             cleaned = (text or "").replace("```", "").strip() | ||
|             return cleaned or raw | ||
|         except Exception: | ||
|             return raw | ||
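The response fields from the README's webhook contract (`ok`, `summary`, optional `artifact_path` and `request_id`) could be validated in one place before `run()` reads them. The following is a minimal sketch, not code from this PR: `WebhookResult` and `parse_webhook_payload` are hypothetical names, and the sketch is stricter than the ability as written, since it requires `ok` to be a real boolean rather than any truthy value.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class WebhookResult:
    """Hypothetical typed view of the webhook response."""
    ok: bool
    summary: str = ""
    artifact_path: Optional[str] = None
    request_id: Optional[str] = None


def _optional_str(payload: dict, key: str) -> Optional[str]:
    """Return payload[key] if it is a non-empty string, else None."""
    value = payload.get(key)
    return value if isinstance(value, str) and value else None


def parse_webhook_payload(payload: Any) -> Optional[WebhookResult]:
    """Return a WebhookResult, or None when the payload breaks the contract."""
    if not isinstance(payload, dict):
        return None
    if not isinstance(payload.get("ok"), bool):
        return None
    return WebhookResult(
        ok=payload["ok"],
        summary=(_optional_str(payload, "summary") or "").strip(),
        artifact_path=_optional_str(payload, "artifact_path"),
        request_id=_optional_str(payload, "request_id"),
    )
```

With a helper like this, a malformed response and a well-formed `{"ok": false}` response can be logged differently, which may make the "Check your webhook server logs" path easier to debug.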
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,48 @@ | ||
| # Private Notes | ||
|
|
||
| `Private Notes` is a voice-first note-taking agent for OpenHome. It stores notes in a persistent `private_notes.json` file, so note contents stay out of the Personality prompt and are only spoken when the user explicitly asks. | ||
|
|
||
| ## What It Does | ||
|
|
||
| - saves a new note | ||
| - reads one or more notes | ||
| - overwrites a specific note after confirmation | ||
| - deletes one or more notes after confirmation | ||
|
|
||
| The ability uses a single LLM tool loop with conversation history. Python owns all note reads and writes. | ||
|
|
||
| ## Example Phrases | ||
|
|
||
| - `take a note` | ||
| - `note this down: call Sarah after lunch` | ||
| - `read my notes` | ||
| - `read my last note` | ||
| - `update my grocery note` | ||
| - `delete my last note` | ||
| - `delete my notes` | ||
|
|
||
| ## Storage | ||
|
|
||
| - File: `private_notes.json` | ||
| - Persistence: `temp=False` | ||
| - Each save deletes any existing file before writing, because `write_file()` appends by default and an append would leave invalid JSON | ||
| - No `.md` files are written, so the Memory Watcher does not inject note contents into the Personality prompt | ||
|
|
||
| ## Voice UX | ||
|
|
||
| - if no request is captured, the ability asks what the user wants to do | ||
| - reads are capped to the 3 most recent matches to avoid long voice dumps | ||
| - overwrite and delete actions always require confirmation | ||
| - final responses stay short, warm, and conversational | ||
|
|
||
| ## Suggested Trigger Words | ||
|
|
||
| Configure these in the OpenHome dashboard: | ||
|
|
||
| - `private note` | ||
| - `private notes` | ||
| - `take a note` | ||
| - `note this down` | ||
| - `write this down` | ||
| - `read my notes` | ||
| - `delete my notes` |
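The Storage and Voice UX notes in this README (delete before write because `write_file()` appends, and reads capped at the 3 most recent matches) can be illustrated with a small self-contained sketch. Everything here is an assumption for illustration: `FakeFileStore` is a hypothetical in-memory stand-in, not the OpenHome file API, and the helper names do not come from this PR.

```python
import json


class FakeFileStore:
    """Hypothetical in-memory stand-in for an append-by-default file API."""

    def __init__(self):
        self._files = {}

    def exists(self, name):
        return name in self._files

    def delete_file(self, name):
        self._files.pop(name, None)

    def write_file(self, name, content):
        # Appends to existing content, mirroring the behavior the README describes.
        self._files[name] = self._files.get(name, "") + content

    def read_file(self, name):
        return self._files[name]


NOTES_FILE = "private_notes.json"
MAX_SPOKEN_NOTES = 3


def save_notes(store, notes):
    """Overwrite the notes file: delete it first, then write the full JSON."""
    if store.exists(NOTES_FILE):
        store.delete_file(NOTES_FILE)
    store.write_file(NOTES_FILE, json.dumps(notes))


def load_notes(store):
    """Return the stored list of notes, or an empty list if nothing is saved."""
    if not store.exists(NOTES_FILE):
        return []
    return json.loads(store.read_file(NOTES_FILE))


def recent_notes(notes, limit=MAX_SPOKEN_NOTES):
    """Return at most `limit` notes, newest first (assumes oldest-first storage)."""
    return list(reversed(notes[-limit:]))
```

Without the delete step, a second `write_file()` call would concatenate two JSON documents and the next `json.loads` would fail, which is the failure mode the Storage section guards against.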
PR metadata describes only adding the `private-notes` ability, but this change set also introduces a full new `coding-agent-runner` ability. If this is intentional, it should be called out explicitly in the PR title/description; otherwise, consider splitting into a separate PR to keep review and release scope clear.