[codex] Add Google Workspace OAuth auth flow by furukama · Pull Request #280 · HybridAIOne/hybridclaw

furukama · 2026-04-11T11:35:34Z

What changed

add a first-party Google Workspace OAuth module that uses HybridClaw's encrypted runtime secret store instead of ad hoc token files
wire `hybridclaw auth login/status/logout google-workspace` into the CLI, including a Hermes-style stepwise PKCE flow
document the new auth flow and update the bundled Google Workspace skill guidance
add focused unit coverage for token storage, PKCE setup, auth-code exchange, state validation, and refresh behavior
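
For readers unfamiliar with PKCE setup, the following is a minimal sketch of how a verifier, S256 challenge, and state value could be generated with Node's built-in `crypto` module. The `PkcePair` type and `createPkcePair` function are hypothetical names for illustration; they are not taken from `src/auth/google-workspace-auth.ts`.

```typescript
import { createHash, randomBytes } from 'node:crypto';

// Hypothetical shape for illustration; not the PR's actual type.
interface PkcePair {
  verifier: string;
  challenge: string;
  state: string;
}

// RFC 7636 S256: challenge = BASE64URL(SHA256(ASCII(verifier))).
function challengeFor(verifier: string): string {
  return createHash('sha256').update(verifier).digest('base64url');
}

function createPkcePair(): PkcePair {
  // 32 random bytes encode to 43 base64url characters, the RFC minimum length.
  const verifier = randomBytes(32).toString('base64url');
  // The state value is an unguessable token echoed back in the redirect.
  const state = randomBytes(16).toString('base64url');
  return { verifier, challenge: challengeFor(verifier), state };
}

const pair = createPkcePair();
console.log(pair.verifier.length, pair.challenge.length);
```

In a stepwise flow the verifier and state would need to be persisted between the "create auth URL" step and the "exchange code" step, which is presumably why the PR keeps pending auth data in the secret store.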

Why

HybridClaw already had browser-login reuse for Google properties, but it did not have a built-in OAuth flow for API-style Google Workspace access. The missing piece was reusable auth/session plumbing that fits the existing secret-store model.

Impact

users can store a Google OAuth desktop client JSON, create an auth URL, exchange a pasted redirect URL or code, inspect status, and clear the stored session with first-party commands
Google Workspace OAuth state now lives in ~/.hybridclaw/credentials.json alongside other encrypted runtime secrets
the current built-in scope bundle covers Gmail, Calendar, Drive, Docs, Sheets, and Contacts
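
As an illustration of the "create an auth URL" step, here is a rough sketch that assembles a Google authorization URL with PKCE parameters. The scope strings are an assumed bundle (the PR text names the products, not the exact scopes), and `buildAuthUrl`, `AuthUrlInput`, and the example redirect URI are hypothetical.

```typescript
// Assumed scope bundle for Gmail, Calendar, Drive, Docs, Sheets, and Contacts.
// The PR may use narrower or different scopes.
const ASSUMED_SCOPES: string[] = [
  'https://www.googleapis.com/auth/gmail.modify',
  'https://www.googleapis.com/auth/calendar',
  'https://www.googleapis.com/auth/drive',
  'https://www.googleapis.com/auth/documents',
  'https://www.googleapis.com/auth/spreadsheets',
  'https://www.googleapis.com/auth/contacts.readonly',
];

// Hypothetical input shape for illustration.
interface AuthUrlInput {
  clientId: string;
  redirectUri: string;
  state: string;
  codeChallenge: string;
}

function buildAuthUrl(input: AuthUrlInput): string {
  const url = new URL('https://accounts.google.com/o/oauth2/v2/auth');
  url.search = new URLSearchParams({
    client_id: input.clientId,
    redirect_uri: input.redirectUri,
    response_type: 'code',
    scope: ASSUMED_SCOPES.join(' '),
    state: input.state,
    code_challenge: input.codeChallenge,
    code_challenge_method: 'S256',
    access_type: 'offline', // request a refresh token
  }).toString();
  return url.toString();
}

const exampleUrl = buildAuthUrl({
  clientId: 'example-client-id',
  redirectUri: 'http://127.0.0.1:8085/', // placeholder, not the PR's value
  state: 'example-state',
  codeChallenge: 'example-challenge',
});
console.log(exampleUrl);
```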

Notes

this PR adds auth/session plumbing only; it does not yet add first-party Google Calendar/Docs/Drive tool execution in the runtime
container-mode consumers still need a follow-up bridge if they are going to use these stored credentials directly

Validation

npm run lint
/Users/bkoehler/src/hybridclaw/node_modules/.bin/vitest run --configLoader runner --config vitest.unit.config.ts tests/google-workspace-auth.test.ts tests/cli.test.ts

Copilot

Pull request overview

Adds first-party Google Workspace OAuth credential management to HybridClaw’s CLI/runtime secrets, and expands the eval/memory subsystem with a native LOCOMO benchmark harness plus new internal memory documentation.

Changes:

Introduce src/auth/google-workspace-auth.ts implementing a PKCE OAuth flow with token persistence in the encrypted runtime secret store, and wire it into hybridclaw auth login/status/logout google-workspace.
Add a native LOCOMO eval suite (/eval locomo ...) including dataset setup/download, QA + retrieval modes, scoring, managed-run orchestration, and comprehensive tests.
Refactor/extend semantic memory writing (new MemoryService.storeSemanticMemory() entry point) and add internal docs for memory layering/limits.

Reviewed changes

Copilot reviewed 28 out of 30 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
tests/tui-slash-menu.test.ts	Updates TUI slash menu expectations to include `/eval locomo`.
tests/memory-service.test.ts	Adds unit coverage asserting default embeddings are plain `number[]`.
tests/locomo-native.test.ts	Adds extensive LOCOMO native harness tests (setup, run, retrieval, scoring, concurrency).
tests/google-workspace-auth.test.ts	Adds unit coverage for Google Workspace OAuth secret storage, PKCE, exchange, refresh, and state validation.
tests/gateway-service.eval-command.test.ts	Updates gateway eval help expectations to include LOCOMO managed commands.
tests/eval-command.test.ts	Adds managed LOCOMO eval command tests and helper fixtures.
tests/command-registry.test.ts	Ensures `/eval locomo` is registered and canonically mapped.
tests/cli.test.ts	Adds CLI routing tests for `auth login/status/logout google-workspace`.
src/memory/memory-service.ts	Changes hashed embedding provider to return a plain array and adds `storeSemanticMemory()` wrapper.
src/evals/locomo-types.ts	Introduces shared LOCOMO constants/types (marker filename, dataset filename, aggregates).
src/evals/locomo-official-scoring.ts	Adds an official LoCoMo scoring port (stemming-based F1 + category handling).
src/evals/locomo-native.ts	Implements native LOCOMO CLI harness (setup/download, QA/retrieval modes, summaries/progress).
src/evals/eval-command.ts	Wires LOCOMO into managed eval suite framework (setup/run/status/results rendering + internal launcher).
src/command-registry.ts	Adds slash command catalog entries for LOCOMO eval commands.
src/cli/help.ts	Updates help text for eval and auth to include LOCOMO and Google Workspace OAuth usage.
src/cli/auth-command.ts	Adds `google-workspace` as a first-class auth provider with login/status/logout handling.
src/cli.ts	Adds internal entrypoint `__eval-locomo-native` to run the LOCOMO harness.
src/auth/google-workspace-auth.ts	New Google Workspace OAuth module using runtime secrets for client/pending/token storage and refresh.
skills/google-workspace/SKILL.md	Updates skill guidance to prefer built-in Google Workspace OAuth flow for API access.
package.json	Adds `stemmer` dependency for LOCOMO official scoring.
package-lock.json	Locks `stemmer` and includes updated lockfile changes.
docs/static/docs.js	Adds “Memory” to internals docs navigation.
docs/development/reference/configuration.md	Documents Google Workspace OAuth state living in `~/.hybridclaw/credentials.json`.
docs/development/reference/commands.md	Documents `/eval locomo ...` and `auth ... google-workspace` usage.
docs/development/internals/session-routing.md	Shifts sidebar position to accommodate new Memory internals doc.
docs/development/internals/README.md	Links to the new Memory internals documentation.
docs/development/internals/memory.md	New internal documentation describing memory layers, prompt injection, and default limits.
docs/development/getting-started/authentication.md	Adds Google Workspace OAuth flow to onboarding/auth docs.
docs/development/agents.md	Adds Memory internals link to docs landing page.
.gitignore	Ignores `.tmp` temporary files.


src/evals/locomo-native.ts

+  const datasetPath = getDatasetPath(options.installDir);
+  if (!fs.existsSync(datasetPath)) {
+    console.log(`Downloading dataset from ${LOCOMO_DATASET_URL}`);
+    const response = await fetchWithTimeout(
+      LOCOMO_DATASET_URL,
+      undefined,
+      LOCOMO_DATASET_DOWNLOAD_TIMEOUT_MS,
+      'LOCOMO dataset download',
+    );
+    if (!response.ok) {
+      throw new Error(
+        `Failed to download LOCOMO dataset: HTTP ${response.status}`,
+      );
+    }
+    const rawBuffer = Buffer.from(await response.arrayBuffer());
+    verifyDownloadedDataset(rawBuffer);
+    const raw = rawBuffer.toString('utf-8');
+    if (!raw.trim().startsWith('[')) {
+      throw new Error('Downloaded LOCOMO dataset is not valid JSON.');
+    }
+    fs.writeFileSync(datasetPath, rawBuffer);
+  } else {
+    console.log(`Dataset already present at ${datasetPath}`);
+  }
+
+  const sampleCount = loadSamples(datasetPath).length;
+  fs.writeFileSync(getMarkerPath(options.installDir), 'ok\n', 'utf-8');


src/auth/google-workspace-auth.ts

+export async function exchangeGoogleWorkspaceAuthCode(
+  codeOrUrl: string,
+): Promise<ExchangeGoogleWorkspaceAuthCodeResult> {
+  const clientSecret = requireStoredClientSecret();
+  const pending = requireStoredPendingAuth();
+  const existingToken = readStoredToken();
+  const { code, state } = extractCodeAndState(codeOrUrl);
+  if (state && state !== pending.state) {
+    throw new GoogleWorkspaceAuthError(
+      'google_workspace_state_mismatch',
+      'Google Workspace authorization response state mismatch. Run `hybridclaw auth login google-workspace --auth-url` again.',
+    );
+  }
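
The excerpt above calls `extractCodeAndState`, whose body is not shown. The sketch below is one plausible way to accept either a pasted redirect URL or a bare code; `parseCodeOrUrl` and `stateIsAcceptable` are hypothetical names, and the real helper may behave differently.

```typescript
// Hypothetical result shape; the PR's actual return type is not shown.
interface CodeAndState {
  code: string;
  state: string | null;
}

function parseCodeOrUrl(input: string): CodeAndState {
  const trimmed = input.trim();
  // Anything that is not an http(s) URL is treated as a bare authorization code.
  if (!/^https?:\/\//i.test(trimmed)) {
    return { code: trimmed, state: null };
  }
  const params = new URL(trimmed).searchParams;
  const error = params.get('error');
  if (error) {
    throw new Error(`Authorization failed: ${error}`);
  }
  const code = params.get('code');
  if (!code) {
    throw new Error('Redirect URL has no `code` parameter.');
  }
  return { code, state: params.get('state') };
}

// Mirrors the shape of the check in the diff: a missing state is accepted,
// a present state must equal the pending one.
function stateIsAcceptable(state: string | null, expected: string): boolean {
  return state === null || state === expected;
}

const parsed = parseCodeOrUrl('http://127.0.0.1:8085/?code=4%2F0Abc&state=xyz');
console.log(parsed.code, stateIsAcceptable(parsed.state, 'xyz'));
```

One consequence of this shape is that a bare code carries no state and therefore skips the comparison. Whether that is acceptable for a paste-based desktop flow is a design decision for the PR rather than something this sketch settles.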


src/memory/memory-service.ts

+  storeSemanticMemory(params: {
+    sessionId: string;
+    role: string;
+    source?: string | null;
+    scope?: string | null;
+    metadata?: Record<string, unknown> | string | null;
+    content: string;
+    confidence?: number;
+    embedding?: number[] | null;
+    sourceMessageId?: number | null;
+  }): number {
+    const content = params.content.trim();
+    if (!content) {
+      throw new Error('Cannot store empty semantic memory content.');
+    }
+
+    return this.backend.storeSemanticMemory({
+      ...params,
+      content,
+      embedding:
+        params.embedding === undefined
+          ? this.embeddingProvider.embed(content)
+          : params.embedding,
+    });
+  }
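
The `embedding` handling above distinguishes `undefined` (compute a default embedding) from `null` (pass `null` through to the backend). A small standalone sketch of that rule, using a toy embedder and a hypothetical `resolveEmbedding` helper that is not part of the PR:

```typescript
// Toy embedder for illustration only; the real provider is a hashed embedding.
const toyEmbed = (text: string): number[] => [text.length];

// Hypothetical helper isolating the undefined-vs-null rule from the diff.
function resolveEmbedding(
  provided: number[] | null | undefined,
  content: string,
  embed: (text: string) => number[],
): number[] | null {
  // Omitted -> compute from content; explicit null or array -> use as given.
  return provided === undefined ? embed(content) : provided;
}

console.log(resolveEmbedding(undefined, 'hello', toyEmbed)); // [ 5 ]
console.log(resolveEmbedding(null, 'hello', toyEmbed)); // null
console.log(resolveEmbedding([0.1, 0.2], 'hello', toyEmbed)); // [ 0.1, 0.2 ]
```

Note that in the diff the default embedding is computed from the trimmed content, so leading and trailing whitespace does not influence it. What the backend does with a `null` embedding is not shown in this excerpt.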


src/evals/eval-command.ts

+  {
+    id: 'locomo',
+    title: 'LOCOMO',
+    summary:
+      'Native HybridClaw LoCoMo QA benchmark over the official long-conversation dataset.',
+    aliases: ['lo-co-mo', 'locomo-memory'],
+    prereqs: [
+      'Node.js 22',
+      'network access during `setup` to download `locomo10.json`',
+    ],
+    starter: [
+      '/eval locomo setup',
+      '/eval locomo run --budget 4000 --max-questions 20',
+      '/eval locomo run --mode retrieval --budget 4000 --max-questions 20',
+    ],
+    notes: [
+      'The default `qa` mode generates LoCoMo answers through HybridClaw’s local OpenAI-compatible gateway and scores the model outputs directly.',
+      '`--mode retrieval` skips model generation, ingests each conversation into an isolated native memory session, and scores evidence hit-rate from recalled semantic memories.',
+      'The `qa` prompt shape follows the upstream `evaluate_gpts` flow: truncated conversation context plus a short-answer QA prompt for each LoCoMo question.',
+      '`--num-samples` limits conversation records. Use `--max-questions` for quick smoke runs over a small number of LoCoMo questions.',
+      'By default, LOCOMO creates one fresh template-seeded agent per conversation sample. Use `--current-agent` to reuse the current agent workspace.',
+      'Prompt/profile eval flags flow through `HYBRIDCLAW_EVAL_MODEL`, so agent/workspace mode and prompt ablations affect the benchmarked run.',
+    ],
+  },
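
To give a sense of how short-answer QA outputs can be scored, here is a simplified token-level F1 sketch. It is not the scoring port from `src/evals/locomo-official-scoring.ts`: it deliberately omits stemming and category handling, and `normalize` and `tokenF1` are hypothetical names.

```typescript
const ARTICLES = new Set(['a', 'an', 'the']);

// Lowercase, drop punctuation, split on whitespace, and remove articles.
function normalize(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, ' ')
    .split(/\s+/)
    .filter((token) => token.length > 0 && !ARTICLES.has(token));
}

// Token-level F1 with multiset overlap between prediction and reference.
function tokenF1(prediction: string, reference: string): number {
  const pred = normalize(prediction);
  const ref = normalize(reference);
  if (pred.length === 0 || ref.length === 0) return 0;

  const remaining = new Map<string, number>();
  for (const token of ref) {
    remaining.set(token, (remaining.get(token) ?? 0) + 1);
  }

  let overlap = 0;
  for (const token of pred) {
    const left = remaining.get(token) ?? 0;
    if (left > 0) {
      overlap += 1;
      remaining.set(token, left - 1);
    }
  }
  if (overlap === 0) return 0;

  const precision = overlap / pred.length;
  const recall = overlap / ref.length;
  return (2 * precision * recall) / (precision + recall);
}

console.log(tokenF1('The red car', 'red car')); // 1
console.log(tokenF1('blue car', 'red car')); // 0.5
```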


Benedikt Koehler and others added 8 commits April 10, 2026 19:09

feat: add locomo eval suite

a03ecbf

Merge branch 'main' into codex/locomo-eval

c7ca8fc

feat: expand locomo eval modes

84741c9

fix: address locomo review feedback

a0d6d24

fix: address remaining locomo review feedback

5aac4d4

test: format locomo native coverage

cd6e11f

feat: add memory documentation and update navigation links

5bd0425

feat: add google workspace oauth auth flow

6031eb3

furukama marked this pull request as ready for review April 11, 2026 11:53

Copilot AI review requested due to automatic review settings April 11, 2026 11:53

Copilot started reviewing on behalf of furukama April 11, 2026 11:53

furukama wants to merge 8 commits into main from codex/google-workspace-oauth
