@2witstudios 2witstudios commented Jan 15, 2026

Summary
Adds a new AI tool that allows agents to query recent workspace activity, enabling context-aware assistance and pulse/welcome messages. The tool returns activities grouped by drive with compact, token-efficient output.

Key features:

- Per-drive activity grouping with drive context (name, prompt/description)
- Time window filtering (1h, 24h, 7d, 30d, or since last visit)
- Option to exclude own activity (see only what others changed)
- AI attribution tracking (which changes were AI-generated)
- Compact delta format (content shows length changes only, not full text)
- Hard output limit with progressive degradation to prevent context bloat

Changes
- apps/web/src/lib/ai/tools/activity-tools.ts - New tool implementation
- apps/web/src/lib/ai/core/ai-tools.ts - Register activity tools
- apps/web/src/lib/ai/core/tool-filtering.ts - Add to tool summary list
- apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts - Auth/authz tests

Output Format
{
  "ok": true,
  "actors": [{ "email": "...", "name": "...", "isYou": false, "count": 5 }],
  "drives": [{
    "drive": { "id": "...", "name": "Backend", "context": "API workspace" },
    "activities": [{
      "ts": "2026-01-15T10:00:00Z",
      "op": "update",
      "res": "page",
      "title": "API Docs",
      "actor": 0,
      "fields": ["content", "title"],
      "delta": { "content": { "len": { "from": 5000, "to": 5200 } }, "title": { "from": "Draft", "to": "API Docs" } }
    }],
    "stats": { "total": 1, "byOp": { "update": 1 }, "aiCount": 0 }
  }],
  "meta": { "total": 1, "window": "24h", "truncated": null }
}
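
For readers consuming this payload, the shape above can be written down as TypeScript types. This is only a sketch inferred from the sample: the field names come from the JSON, while the type names, the nullability of `name` and `context`, and the `actorOf` helper are assumptions rather than the tool's actual definitions.

```typescript
// Hypothetical types inferred from the sample output; not the tool's real definitions.
interface CompactActor {
  email: string;
  name: string | null; // assumed nullable
  isYou: boolean;
  count: number;
}

interface CompactActivity {
  ts: string; // ISO 8601 timestamp
  op: string; // operation, e.g. "update"
  res: string; // resource type, e.g. "page"
  title: string;
  actor: number; // index into the top-level actors array
  fields?: string[];
  delta?: Record<string, unknown>;
}

interface CompactDriveGroup {
  drive: { id: string; name: string; context: string | null };
  activities: CompactActivity[];
  stats: { total: number; byOp: Record<string, number>; aiCount: number };
}

interface ActivityResponse {
  ok: boolean;
  actors: CompactActor[];
  drives: CompactDriveGroup[];
  meta: { total: number; window: string; truncated: Record<string, unknown> | null };
}

// Illustrative helper: expand an activity's actor index back into the actor record.
function actorOf(
  response: ActivityResponse,
  activity: CompactActivity,
): CompactActor | undefined {
  return response.actors[activity.actor];
}
```

Because each activity carries only an integer in `actor`, a consumer resolves it against `actors` once instead of reading a repeated object on every activity.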

Context Efficiency
- Actor deduplication: Activities reference actors by index, not repeated objects
- Compact field names: ts, op, res instead of verbose names
- Smart deltas: Content fields show length change only (not full text)
- Hard output cap: Default 20k chars (~5k tokens) with progressive truncation
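
The "smart delta" idea can be illustrated with a minimal sketch. It assumes that only a field named `content` is treated as large text and that previous and new values arrive as plain records; `compactDelta`, `FieldDelta`, and `LARGE_TEXT_FIELDS` are hypothetical names, not the PR's implementation.

```typescript
// Sketch only: large text fields report a length change, other fields keep full values.
type FieldDelta =
  | { len: { from: number; to: number } }
  | { from: unknown; to: unknown };

// Assumption: "content" is the only field treated as large text.
const LARGE_TEXT_FIELDS = new Set<string>(["content"]);

function compactDelta(
  previous: Record<string, unknown>,
  next: Record<string, unknown>,
): Record<string, FieldDelta> {
  const delta: Record<string, FieldDelta> = {};
  for (const field of Object.keys(next)) {
    const from = previous[field];
    const to = next[field];
    if (LARGE_TEXT_FIELDS.has(field)) {
      // Report only how the text length changed, never the text itself.
      delta[field] = {
        len: { from: String(from ?? "").length, to: String(to ?? "").length },
      };
    } else {
      // Small fields (titles, booleans, numbers) are cheap enough to show in full.
      delta[field] = { from, to };
    }
  }
  return delta;
}
```

With a 5000-character document growing to 5200 characters, the `content` entry costs a few dozen characters in the output instead of roughly ten thousand.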
Test Plan
- Verify tool appears in agent tool list
- Test with since: "24h" returns recent activity
- Test with excludeOwnActivity: true filters out current user
- Test with driveIds filter scopes to specific drives
- Verify output stays under maxOutputChars limit with large activity sets
- Verify meta.truncated populated when data is dropped

Summary by CodeRabbit

  • New Features

    • Added an activity retrieval tool to the AI toolkit: recent workspace activity grouped by drive with actor deduplication, per-drive stats, and metadata.
    • Supports time-window filters, drive scoping, operation-type filters, exclude-own and AI-change options, plus progressive truncation with truncation metadata.
  • Integration

    • Tool is now included in the public toolkit and among available read/write tools.
  • Tests

    • Added tests for authentication, access control, input schema, and tool metadata.


Adds a new AI tool that allows agents to query recent workspace activity,
enabling context-aware assistance and pulse/welcome messages.

Features:
- Per-drive activity grouping with drive metadata (name, prompt/description)
- Time window filtering (1h, 24h, 7d, 30d, or since last visit)
- Operation category filtering (content, permissions, membership)
- Exclude own activity option for collaboration awareness
- AI attribution tracking (which changes were AI-generated)
- Detailed change diffs (previousValues/newValues) for understanding what changed
- Contributor breakdown per drive
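
As an illustration of the time window filter, the sketch below turns a `since` value into a cutoff date. The option strings match the ones listed above (`last_visit` is the spelling used in the review's suggested test); the function name, the caller-supplied `lastVisit`, and the 24h fallback when no previous visit is known are assumptions.

```typescript
// Sketch only: map a `since` option to the earliest timestamp to include.
type Since = "1h" | "24h" | "7d" | "30d" | "last_visit";

const HOUR_MS = 60 * 60 * 1000;

const WINDOW_MS: Record<Exclude<Since, "last_visit">, number> = {
  "1h": HOUR_MS,
  "24h": 24 * HOUR_MS,
  "7d": 7 * 24 * HOUR_MS,
  "30d": 30 * 24 * HOUR_MS,
};

// `lastVisit` is looked up by the caller; the 24h fallback is an assumption.
function windowStart(since: Since, now: Date, lastVisit: Date | null): Date {
  if (since === "last_visit") {
    return lastVisit ?? new Date(now.getTime() - WINDOW_MS["24h"]);
  }
  return new Date(now.getTime() - WINDOW_MS[since]);
}
```

Passing `now` in explicitly keeps the helper deterministic, which makes it straightforward to unit test.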

The tool returns rich, contextualized data allowing the AI to:
- Form intuition about ongoing work patterns
- Generate informed welcome/pulse messages
- Provide contextually relevant assistance

Key optimizations:
- Deduplicate actors: activities reference actors by index instead of
  repeating full actor info on each activity
- Compact field names: ts, op, res, ai instead of verbose names
- Smart delta: content fields show length change only (not full text),
  title/boolean/number fields show full values
- Remove verbose nextSteps/message fields (AI interprets raw data)
- Flatten response structure

Before: Each activity repeated full actor object
After: actors[] array at top, activities use actorIdx

Before: content diffs included full document text
After: content diffs show { len: { from: 5000, to: 5200 } }
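
The actor deduplication described above might look roughly like the following. This is a sketch under assumptions: `RawActivity` and its field names are hypothetical input shapes, and the function returns the per-activity indices as a separate array purely for illustration.

```typescript
// Sketch only: build a deduplicated actors[] array plus one index per activity.
interface CompactActor {
  email: string;
  name: string | null;
  isYou: boolean;
  count: number;
}

// Hypothetical input row; the real activity records may differ.
interface RawActivity {
  actorEmail: string;
  actorName: string | null;
  userId: string;
}

function dedupeActors(
  activities: RawActivity[],
  currentUserId: string,
): { actors: CompactActor[]; actorIdx: number[] } {
  const actors: CompactActor[] = [];
  const byEmail = new Map<string, { idx: number; actor: CompactActor }>();
  const actorIdx: number[] = [];

  for (const activity of activities) {
    let entry = byEmail.get(activity.actorEmail);
    if (!entry) {
      const actor: CompactActor = {
        email: activity.actorEmail,
        name: activity.actorName,
        isYou: activity.userId === currentUserId,
        count: 0,
      };
      entry = { idx: actors.length, actor };
      actors.push(actor);
      byEmail.set(activity.actorEmail, entry);
    }
    // The map holds the same object as actors[], so one increment updates both.
    entry.actor.count++;
    actorIdx.push(entry.idx);
  }
  return { actors, actorIdx };
}
```

Storing the actor object itself in the map means the count lives in one place, so no second pass is needed to copy counts into the list.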
- Add maxOutputChars param (default 20k chars ≈ 5k tokens)
- Lower default activity limit from 100 to 50
- Progressive truncation when over limit:
  1. Drop all deltas first
  2. Drop oldest activities (from largest drives)
  3. Drop entire drives if still over
- Response includes truncated info so AI knows data was reduced

Prevents runaway context consumption while preserving most useful data.
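
The three truncation steps can be sketched as below. This is a simplified illustration, not the PR's code: the payload shape and `fitToLimit` are hypothetical, and it re-serializes after every drop rather than batching. Like the steps above, it stops at one activity per drive and one drive, so a single very large remaining item can still exceed the limit.

```typescript
// Sketch only: progressive degradation (deltas -> oldest activities -> drives).
interface Activity { ts: string; op: string; delta?: unknown }
interface DriveGroup { name: string; activities: Activity[] }
interface Truncated { droppedDeltas: boolean; droppedActivities: number; droppedDrives: number }
interface Payload { drives: DriveGroup[]; truncated: Truncated | null }

const sizeOf = (payload: Payload): number => JSON.stringify(payload).length;

function fitToLimit(payload: Payload, maxOutputChars: number): Payload {
  if (sizeOf(payload) <= maxOutputChars) return payload;

  // Record truncation info up front so its own size is counted in every check.
  const info: Truncated = { droppedDeltas: true, droppedActivities: 0, droppedDrives: 0 };
  payload.truncated = info;

  // Step 1: drop all deltas.
  for (const group of payload.drives) {
    for (const activity of group.activities) delete activity.delta;
  }

  // Step 2: drop the oldest activity from the largest drive, keeping at least one.
  while (sizeOf(payload) > maxOutputChars) {
    const largest = payload.drives.reduce<DriveGroup | null>(
      (max, group) => (!max || group.activities.length > max.activities.length ? group : max),
      null,
    );
    if (!largest || largest.activities.length <= 1) break;
    largest.activities.pop(); // assumes activities are sorted newest first
    info.droppedActivities++;
  }

  // Step 3: drop the smallest drives, keeping at least one.
  while (sizeOf(payload) > maxOutputChars && payload.drives.length > 1) {
    payload.drives.sort((a, b) => b.activities.length - a.activities.length);
    payload.drives.pop();
    info.droppedDrives++;
  }
  return payload;
}
```

Ordering the steps this way discards the cheapest-to-lose information first: deltas are detail, old activities are less relevant than new ones, and whole drives are removed only as a last step.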
- Add get_activity to tool filtering summary list
- Add activity-tools.test.ts with auth/authz error path tests
- Follows existing test patterns (scaffold note for happy path)

coderabbitai bot commented Jan 15, 2026

📝 Walkthrough

Adds a new get_activity AI tool (exported via activityTools) that fetches, filters, groups, and truncates workspace activity by drive with auth and drive-membership checks; registers the tool in the tools registry and adds unit tests for auth/authorization and schema validations.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| AI tools registry: apps/web/src/lib/ai/core/ai-tools.ts, apps/web/src/lib/ai/core/tool-filtering.ts | Import and spread activityTools into PageSpace tools; add "get_activity" to tool-filtering list. Review for registration order and any naming collisions. |
| Activity tool implementation: apps/web/src/lib/ai/tools/activity-tools.ts | New exported activityTools with get_activity: zod input schema, auth & drive-membership checks, dynamic activity query, grouping by drive, actor deduplication, per-drive stats, optional diffs, and staged output-size degradation (drop diffs → drop oldest → drop drives); returns actors, drives, meta, truncation history. Inspect error paths, access checks, and truncation consistency. |
| Unit tests: apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts | New tests mocking auth/membership: tool presence/description, schema shape (ZodObject), unauthenticated error, access-denied error, and description content assertions. Validate mocks for isUserDriveMember and consider adding integration/happy-path tests later. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as AI/System
    participant Tool as get_activity
    participant Auth as Auth Service
    participant Drives as Drive Service
    participant Activity as Activity Store

    Client->>Tool: invoke get_activity(filters)
    Tool->>Auth: verify user authentication
    Auth-->>Tool: user context
    Tool->>Drives: verify membership for driveIds
    Drives-->>Tool: accessible drive list
    Tool->>Activity: fetch activity for drives & time window
    Activity-->>Tool: raw activity records
    Tool->>Tool: filter, group by drive, dedupe actors, compute stats
    Tool->>Tool: apply output-size degradation (drop diffs → drop oldest → drop drives)
    Tool-->>Client: response (actors, drives, meta, truncation history)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I hopped through drives to fetch the noisy trail,

Trimmed the diffs and numbered each small tale.
Actors tucked aside so duplicates hide,
Auth kept doors closed, stats marched side by side.
A tidy bundle of activity — ready to compile!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The PR title 'Claude/activity tool exploration n5 lkm' is vague and uses generic/unclear terms that don't convey the actual changeset: adding a workspace activity retrieval AI tool. | Revise the title to clearly describe the main change, e.g., 'Add activity retrieval AI tool for workspace queries' or 'Implement get_activity tool for agent workspace activity summaries'. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@apps/web/src/lib/ai/tools/activity-tools.ts`:
- Around line 313-329: The current logic selects all non-trashed drives into
userDrives and then calls isUserDriveMember(drive.id) in a loop causing an N+1
query; replace this with a single DB query to load accessible drive IDs for the
user: query driveMembers (or join driveMembers with drives) to select drive IDs
where driveMembers.userId = userId and drives.isTrashed = false, and also
include any drives the user owns (ownerId = userId) in the same query so
targetDriveIds is built from that single result set instead of per-drive calls
to isUserDriveMember; update references to userDrives, isUserDriveMember, and
targetDriveIds accordingly.
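
The set logic behind that suggestion can be sketched in memory. This is only an illustration of which drives count as accessible; the actual fix would express the same condition as a database query through the project's Drizzle client. The row shapes are hypothetical: `ownerId`, `isTrashed`, and `userId` are named in the comment above, while `driveId` and the function name are assumptions.

```typescript
// Sketch only: compute accessible drive IDs from already-loaded rows in one pass,
// instead of calling a membership check once per drive.
interface DriveRow { id: string; ownerId: string; isTrashed: boolean }
interface DriveMemberRow { driveId: string; userId: string } // driveId is an assumed column name

function accessibleDriveIds(
  userId: string,
  drives: DriveRow[],
  members: DriveMemberRow[],
): string[] {
  // Drives the user belongs to through a membership row.
  const memberOf = new Set<string>();
  for (const member of members) {
    if (member.userId === userId) memberOf.add(member.driveId);
  }

  // Non-trashed drives the user owns or is a member of; the Set removes duplicates.
  const ids = new Set<string>();
  for (const drive of drives) {
    if (drive.isTrashed) continue;
    if (drive.ownerId === userId || memberOf.has(drive.id)) ids.add(drive.id);
  }
  return Array.from(ids);
}
```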
🧹 Nitpick comments (4)
apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts (1)

68-79: Schema validation tests lack meaningful assertions.

These tests only verify the schema is defined but don't actually validate the schema behavior. Consider adding actual schema validation tests or removing these placeholders to avoid false confidence in test coverage.

💡 Suggested improvement
     it('accepts valid time window options', () => {
-      // Verify the schema accepts all documented time windows
       const schema = activityTools.get_activity.inputSchema;
       expect(schema).toBeDefined();
-      // Schema validation happens at runtime via Zod
+      // Verify schema parses valid time windows
+      const validWindows = ['1h', '24h', '7d', '30d', 'last_visit'];
+      for (const window of validWindows) {
+        expect(() => schema.parse({ since: window })).not.toThrow();
+      }
     });

     it('has output size limit parameter', () => {
       const schema = activityTools.get_activity.inputSchema;
       expect(schema).toBeDefined();
-      // maxOutputChars should be part of the schema
+      // Verify maxOutputChars constraints
+      expect(() => schema.parse({ maxOutputChars: 999 })).toThrow(); // Below min
+      expect(() => schema.parse({ maxOutputChars: 50001 })).toThrow(); // Above max
+      expect(() => schema.parse({ maxOutputChars: 20000 })).not.toThrow();
     });
apps/web/src/lib/ai/tools/activity-tools.ts (3)

32-37: Unused constants and types.

AUTH_OPERATIONS and the associated AuthOperation type are defined but never used. The other operation types (ContentOperation, PermissionOperation, MembershipOperation) are also unused since the constants are used directly. Consider removing unused code or adding a TODO if these are planned for future use.

🧹 Remove unused code
 const CONTENT_OPERATIONS = ['create', 'update', 'delete', 'restore', 'move', 'trash', 'reorder'] as const;
 const PERMISSION_OPERATIONS = ['permission_grant', 'permission_update', 'permission_revoke'] as const;
 const MEMBERSHIP_OPERATIONS = ['member_add', 'member_remove', 'member_role_change', 'ownership_transfer'] as const;
-const AUTH_OPERATIONS = ['login', 'logout', 'signup'] as const;
-
-type ContentOperation = typeof CONTENT_OPERATIONS[number];
-type PermissionOperation = typeof PERMISSION_OPERATIONS[number];
-type MembershipOperation = typeof MEMBERSHIP_OPERATIONS[number];
-type AuthOperation = typeof AUTH_OPERATIONS[number];

422-425: Minor redundancy in count updates.

The count is already tracked in actorMap during the first loop. The second loop to update actorsList is necessary because objects are shared by reference, but this could be simplified.

💡 Simplified approach
         // Build actor index for deduplication (saves tokens by not repeating actor info)
         const actorMap = new Map<string, { idx: number; name: string | null; isYou: boolean; count: number }>();
         const actorsList: CompactActor[] = [];

         for (const activity of activities) {
           const email = activity.actorEmail;
           if (!actorMap.has(email)) {
             const idx = actorsList.length;
             const actor: CompactActor = {
               email,
               name: activity.actorDisplayName || activity.user?.name || null,
               isYou: activity.userId === userId,
               count: 0,
             };
             actorsList.push(actor);
             actorMap.set(email, { idx, name: actor.name, isYou: actor.isYou, count: 0 });
           }
-          actorMap.get(email)!.count++;
+          const actorEntry = actorMap.get(email)!;
+          actorEntry.count++;
+          actorsList[actorEntry.idx].count = actorEntry.count;
         }

-        // Update counts in actorsList
-        for (const actor of actorsList) {
-          actor.count = actorMap.get(actor.email)!.count;
-        }

548-564: Repeated JSON.stringify in loop may cause performance issues.

The while loop calls JSON.stringify(response) on each iteration to check size. For large responses near the limit, this could involve many iterations with expensive serialization. Consider estimating size reduction per activity or batching drops.

💡 Suggested optimization
         // Step 2: If still over limit, drop oldest activities from each drive
         if (outputSize > maxOutputChars) {
           let droppedCount = 0;
           const targetSize = maxOutputChars * 0.9; // Leave 10% buffer
+          
+          // Estimate average activity size to batch drops
+          const avgActivitySize = outputSize / Math.max(activities.length, 1);
+          const estimatedDropCount = Math.ceil((outputSize - targetSize) / avgActivitySize);

-          while (outputSize > targetSize) {
+          for (let i = 0; i < estimatedDropCount && outputSize > targetSize; i++) {
             // Find drive with most activities and drop oldest
             let maxDrive: CompactDriveGroup | null = null;
             for (const group of response.drives) {
               if (!maxDrive || group.activities.length > maxDrive.activities.length) {
                 maxDrive = group;
               }
             }

             if (!maxDrive || maxDrive.activities.length <= 1) break;

             // Drop oldest (last in array since sorted desc by timestamp)
             maxDrive.activities.pop();
             maxDrive.stats.total = maxDrive.activities.length;
             droppedCount++;
-            outputSize = JSON.stringify(response).length;
           }
+          
+          // Verify final size
+          outputSize = JSON.stringify(response).length;
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4b5dd39 and 1f040fc.

📒 Files selected for processing (4)
  • apps/web/src/lib/ai/core/ai-tools.ts
  • apps/web/src/lib/ai/core/tool-filtering.ts
  • apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts
  • apps/web/src/lib/ai/tools/activity-tools.ts
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx}: Never use any types - always use proper TypeScript types
Use camelCase for variable and function names
Use UPPER_SNAKE_CASE for constants
Use PascalCase for type and enum names
Use kebab-case for filenames, except React hooks (camelCase with use prefix), Zustand stores (camelCase with use prefix), and React components (PascalCase)
Lint with Next/ESLint as configured in apps/web/eslint.config.mjs
Message content should always use the message parts structure with { parts: [{ type: 'text', text: '...' }] }
Use centralized permission functions from @pagespace/lib/permissions (e.g., getUserAccessLevel, canUserEditPage) instead of implementing permission logic locally
Always use Drizzle client from @pagespace/db package for database access
Use ESM modules throughout the codebase

**/*.{ts,tsx}: Never use any types - always use proper TypeScript types
Write code that is explicit over implicit and self-documenting

Files:

  • apps/web/src/lib/ai/core/ai-tools.ts
  • apps/web/src/lib/ai/tools/activity-tools.ts
  • apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts
  • apps/web/src/lib/ai/core/tool-filtering.ts
**/*.ts

📄 CodeRabbit inference engine (AGENTS.md)

**/*.ts: React hook files should use camelCase matching the exported hook name (e.g., useAuth.ts)
Zustand store files should use camelCase with use prefix (e.g., useAuthStore.ts)

Files:

  • apps/web/src/lib/ai/core/ai-tools.ts
  • apps/web/src/lib/ai/tools/activity-tools.ts
  • apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts
  • apps/web/src/lib/ai/core/tool-filtering.ts
**/*.{ts,tsx,js,jsx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Format code with Prettier

Files:

  • apps/web/src/lib/ai/core/ai-tools.ts
  • apps/web/src/lib/ai/tools/activity-tools.ts
  • apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts
  • apps/web/src/lib/ai/core/tool-filtering.ts
**/*ai*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use Vercel AI SDK for AI integrations

Files:

  • apps/web/src/lib/ai/core/ai-tools.ts
apps/web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

apps/web/src/**/*.{ts,tsx}: Use message parts structure for message content: { parts: [{ type: 'text', text: '...' }] }
For database access, always use Drizzle client from @pagespace/db: import { db, pages } from '@pagespace/db';
Use centralized Drizzle ORM with PostgreSQL for all database operations - no direct SQL or other ORMs
Use Socket.IO for real-time collaboration features - imported from the realtime service at port 3001
Use Vercel AI SDK with async/await for all AI operations and streaming
Use Next.js 15 App Router and TypeScript for all routes and components

Files:

  • apps/web/src/lib/ai/core/ai-tools.ts
  • apps/web/src/lib/ai/tools/activity-tools.ts
  • apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts
  • apps/web/src/lib/ai/core/tool-filtering.ts
🧠 Learnings (2)
📚 Learning: 2025-12-22T20:04:40.910Z
Learnt from: CR
Repo: 2witstudios/PageSpace PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-22T20:04:40.910Z
Learning: Applies to **/*ai*.{ts,tsx} : Use Vercel AI SDK for AI integrations

Applied to files:

  • apps/web/src/lib/ai/core/ai-tools.ts
📚 Learning: 2025-12-14T14:54:45.713Z
Learnt from: CR
Repo: 2witstudios/PageSpace PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-14T14:54:45.713Z
Learning: Applies to packages/lib/**/*.test.ts : Write unit tests for shared utilities in `packages/lib` with test files named `*.test.ts` alongside source or in `__tests__/` directory

Applied to files:

  • apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts
🧬 Code graph analysis (1)
apps/web/src/lib/ai/core/ai-tools.ts (1)
apps/web/src/lib/ai/tools/activity-tools.ts (1)
  • activityTools (172-593)
🔇 Additional comments (10)
apps/web/src/lib/ai/core/ai-tools.ts (1)

9-9: LGTM!

The import and integration of activityTools follows the established pattern used by other tool modules. The spread operator correctly merges the new tools into the aggregated pageSpaceTools object.

Also applies to: 25-25

apps/web/src/lib/ai/core/tool-filtering.ts (1)

110-110: LGTM!

Correctly placed under read tools since get_activity only fetches data without modifications. The tool will remain available in read-only mode, which is the expected behavior.

apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts (1)

40-49: LGTM on auth test!

The authentication test correctly verifies that missing userId in context throws the expected error. The minimal context object is sufficient for this negative test case.

apps/web/src/lib/ai/tools/activity-tools.ts (7)

40-58: LGTM!

The time window calculation is correct and handles all documented options with appropriate fallbacks.


60-96: LGTM with a note on edge cases.

The implementation correctly retrieves the previous session's timestamp with a fallback to last login. The assumption that the first result is the "current session" may not hold for concurrent sessions across devices, but this is an acceptable approximation for generating "since last visit" summaries.


134-170: LGTM!

Smart token optimization - showing length changes for large content fields while preserving full values for small, meaningful fields like titles and booleans. The design appropriately balances context richness with output efficiency.


204-258: LGTM!

Well-structured Zod schema with sensible defaults, clear constraints, and descriptive documentation that helps the AI understand when and how to use each parameter.


387-400: LGTM!

Query construction is correct with proper use of Drizzle's relational queries. The conditional filter application ensures empty arrays don't cause issues with inArray.


427-495: LGTM!

Activity grouping is well-implemented with efficient use of Maps for O(1) lookups. The compact format with actor indices provides good token efficiency.


585-590: LGTM!

Proper error handling with logging and descriptive error messages that include the original error context.


Fixes based on code review:

1. N+1 query fix (lines 313-329):
   - Replace loop calling isUserDriveMember per drive
   - Single query: join driveMembers + drives for membership
   - Parallel query for owned drives (ownerId = userId)
   - Combine with Set deduplication

2. Remove unused code (lines 32-37):
   - Remove AUTH_OPERATIONS constant (never used)
   - Remove unused type aliases

3. Simplify actor count tracking (lines 422-425):
   - Store actor reference directly in map
   - Count updates shared via reference, no second loop

4. Optimize JSON.stringify in truncation (lines 548-564):
   - Estimate avg activity size from initial serialization
   - Batch drops and only re-serialize periodically
   - Reduces O(n) serializations to O(n/batchSize)

5. Improve test assertions (line 79):
   - Replace placeholder tests with meaningful assertions
   - Verify schema is ZodObject type
   - Verify description contains expected use cases

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@apps/web/src/lib/ai/tools/activity-tools.ts`:
- Around line 553-605: When truncating activities in the loop that pops from
maxDrive.activities (response.drives, CompactDriveGroup), update the derived
counters so output stays consistent: decrement the drive/group stats.total
(already set), decrement response.meta.stats.byOp for the popped activity's op,
decrement response.meta.stats.aiCount when the popped activity is AI-generated
(or use a boolean like CompactActivity.isAI if aiModel can be null), and
decrement the corresponding actor entry actors[].count for the popped
activity.actor; alternatively, after truncation recompute response.meta.stats
and all actors[].count by iterating remaining response.drives and their
activities to ensure counts are accurate before setting response.meta.truncated.
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f040fc and 812b1c8.

📒 Files selected for processing (2)
  • apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts
  • apps/web/src/lib/ai/tools/activity-tools.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx}: Never use any types - always use proper TypeScript types
Use camelCase for variable and function names
Use UPPER_SNAKE_CASE for constants
Use PascalCase for type and enum names
Use kebab-case for filenames, except React hooks (camelCase with use prefix), Zustand stores (camelCase with use prefix), and React components (PascalCase)
Lint with Next/ESLint as configured in apps/web/eslint.config.mjs
Message content should always use the message parts structure with { parts: [{ type: 'text', text: '...' }] }
Use centralized permission functions from @pagespace/lib/permissions (e.g., getUserAccessLevel, canUserEditPage) instead of implementing permission logic locally
Always use Drizzle client from @pagespace/db package for database access
Use ESM modules throughout the codebase

**/*.{ts,tsx}: Never use any types - always use proper TypeScript types
Write code that is explicit over implicit and self-documenting

Files:

  • apps/web/src/lib/ai/tools/activity-tools.ts
**/*.ts

📄 CodeRabbit inference engine (AGENTS.md)

**/*.ts: React hook files should use camelCase matching the exported hook name (e.g., useAuth.ts)
Zustand store files should use camelCase with use prefix (e.g., useAuthStore.ts)

Files:

  • apps/web/src/lib/ai/tools/activity-tools.ts
**/*.{ts,tsx,js,jsx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Format code with Prettier

Files:

  • apps/web/src/lib/ai/tools/activity-tools.ts
apps/web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

apps/web/src/**/*.{ts,tsx}: Use message parts structure for message content: { parts: [{ type: 'text', text: '...' }] }
For database access, always use Drizzle client from @pagespace/db: import { db, pages } from '@pagespace/db';
Use centralized Drizzle ORM with PostgreSQL for all database operations - no direct SQL or other ORMs
Use Socket.IO for real-time collaboration features - imported from the realtime service at port 3001
Use Vercel AI SDK with async/await for all AI operations and streaming
Use Next.js 15 App Router and TypeScript for all routes and components

Files:

  • apps/web/src/lib/ai/tools/activity-tools.ts
🧬 Code graph analysis (1)
apps/web/src/lib/ai/tools/activity-tools.ts (4)
packages/db/src/index.ts (4)
  • db (20-20)
  • and (8-8)
  • eq (8-8)
  • ne (9-9)
apps/web/src/lib/ai/core/types.ts (1)
  • ToolExecutionContext (8-37)
packages/db/src/schema/members.ts (1)
  • driveMembers (52-70)
packages/db/src/schema/core.ts (1)
  • drives (7-22)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Unit Tests
🔇 Additional comments (4)
apps/web/src/lib/ai/tools/activity-tools.ts (4)

35-54: Time-window helper looks solid and deterministic.

Clear, easy to reason about, and defaults are sensible.


56-92: Last-visit lookup logic is clear with a sensible fallback.

No concerns in this flow.


129-166: Compact delta generation is token-efficient and readable.

Nice balance between detail and size.


286-339: Remove this suggestion—isUserDriveMember already checks owner access.

The isUserDriveMember function in packages/lib/src/permissions/permissions.ts (lines 205–209) explicitly checks if the user is the drive owner before checking membership. Therefore, owners will not be denied access. The current per-drive approach correctly handles owner access and is functionally sound, though a combined query approach could reduce database round trips when checking multiple drives.


Lint fixes:
- Remove unused 'or' import from @pagespace/db
- Change 'let operationFilter' to 'const' (array is mutated, not reassigned)

Truncation counter consistency:
- After truncation, recompute all derived counters:
  - Reset and recalculate actor counts from remaining activities
  - Recompute drive stats (total, byOp, aiCount) from remaining activities
  - Update meta.total and meta.aiTotal to reflect actual remaining count
- Remove actors with zero count (all their activities were truncated)
- Remap actor indices in activities when actors are removed
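
A sketch of that recomputation pass follows. It assumes a simplified response shape: `ai` is the compact flag mentioned in the earlier commit message and `meta.aiTotal` is named above, but the interfaces and the `recomputeCounters` name are hypothetical.

```typescript
// Sketch only: rebuild all derived counters from the activities that survived truncation.
interface Actor { email: string; count: number }
interface Activity { op: string; actor: number; ai?: boolean }
interface DriveGroup {
  activities: Activity[];
  stats: { total: number; byOp: Record<string, number>; aiCount: number };
}
interface ActivityPayload {
  actors: Actor[];
  drives: DriveGroup[];
  meta: { total: number; aiTotal: number };
}

function recomputeCounters(payload: ActivityPayload): void {
  for (const actor of payload.actors) actor.count = 0;

  let total = 0;
  let aiTotal = 0;
  for (const group of payload.drives) {
    const byOp: Record<string, number> = {};
    let aiCount = 0;
    for (const activity of group.activities) {
      byOp[activity.op] = (byOp[activity.op] ?? 0) + 1;
      if (activity.ai) aiCount++;
      const actor = payload.actors[activity.actor];
      if (actor) actor.count++;
    }
    group.stats = { total: group.activities.length, byOp, aiCount };
    total += group.activities.length;
    aiTotal += aiCount;
  }
  payload.meta.total = total;
  payload.meta.aiTotal = aiTotal;

  // Drop actors whose activities were all truncated, then remap the indices.
  const remap = new Map<number, number>();
  const kept: Actor[] = [];
  payload.actors.forEach((actor, oldIdx) => {
    if (actor.count > 0) {
      remap.set(oldIdx, kept.length);
      kept.push(actor);
    }
  });
  payload.actors = kept;
  for (const group of payload.drives) {
    for (const activity of group.activities) {
      const newIdx = remap.get(activity.actor);
      if (newIdx !== undefined) activity.actor = newIdx;
    }
  }
}
```

Recomputing from the remaining activities is simpler to keep correct than decrementing each counter at every drop, at the cost of one extra pass over the data.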
Test file:
- Add type assertion for execute params (Zod defaults applied at runtime)

Main tool:
- Fix inArray type: cast operationFilter to column's enum type

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@apps/web/src/lib/ai/tools/activity-tools.ts`:
- Around line 553-616: The truncation currently refuses to remove the last
activity/drive so a large drive.title or drive.context (or a single
activity.content) can still exceed maxOutputChars; modify the post‑Step 3 logic
in activity-tools.ts to enforce the hard cap by adding a last‑resort pass that
(a) if response.drives.length === 1 and that drive.activities.length === 1
either allows dropping that final activity/drive (increment droppedCount and set
meta.truncated) or (b) progressively truncates large string fields (drive.title,
drive.context, activity.content) from the end across the remaining objects until
JSON.stringify(response).length <= maxOutputChars, updating maxDrive.stats.total
and response.meta.truncated as appropriate; recompute outputSize after each
truncation and ensure meta.droppedActivities or a new
meta.truncated.truncatedChars field reflects the change.
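
Option (b) could be sketched as follows. This is an illustration under assumptions, not the reviewer's or the author's code: the payload shape is simplified, `enforceHardCap` is a hypothetical name, and it trims drive.name, drive.context, and activity titles (the real fields to trim may differ). The `truncatedChars` field follows the name proposed in the comment above.

```typescript
// Sketch only: last-resort pass that trims the longest remaining string until the cap holds.
interface Drive { id: string; name: string; context: string | null }
interface Group { drive: Drive; activities: { title: string }[] }
interface CappedPayload {
  drives: Group[];
  meta: { truncated: { droppedActivities?: number; truncatedChars?: number } | null };
}
interface Slot { read(): string; write(value: string): void }

function enforceHardCap(payload: CappedPayload, maxOutputChars: number): CappedPayload {
  if (JSON.stringify(payload).length <= maxOutputChars) return payload;

  // Register the counter first so its own serialized size is included in every check.
  const info = { ...(payload.meta.truncated ?? {}), truncatedChars: 0 };
  payload.meta.truncated = info;

  // Collect every string field that may be trimmed.
  const slots: Slot[] = [];
  for (const group of payload.drives) {
    const drive = group.drive;
    slots.push({ read: () => drive.name, write: (v) => { drive.name = v; } });
    if (drive.context !== null) {
      slots.push({ read: () => drive.context ?? "", write: (v) => { drive.context = v; } });
    }
    for (const activity of group.activities) {
      slots.push({ read: () => activity.title, write: (v) => { activity.title = v; } });
    }
  }
  if (slots.length === 0) return payload;

  let size = JSON.stringify(payload).length;
  while (size > maxOutputChars) {
    const longest = slots.reduce((a, b) => (b.read().length > a.read().length ? b : a));
    const text = longest.read();
    if (text.length === 0) break; // nothing left to trim
    const cut = Math.min(text.length, size - maxOutputChars);
    longest.write(text.slice(0, text.length - cut)); // trim from the end
    info.truncatedChars += cut;
    size = JSON.stringify(payload).length;
  }
  return payload;
}
```

If the structural JSON alone exceeds the limit, the loop exits once every string is empty, so a caller that needs a strict guarantee would still have to handle that case separately.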
🧹 Nitpick comments (3)
apps/web/src/lib/ai/tools/__tests__/activity-tools.test.ts (2)

14-27: Track the deferred integration tests.

Consider converting the TODOs into a tracked issue so the happy‑path coverage doesn’t get lost.

If you want, I can draft an integration test plan or scaffolding for the activity repository seam.


70-77: Avoid relying on Zod internal _def in tests.

_def is internal and can change; prefer instanceof z.ZodObject.

♻️ Proposed refactor
-import { describe, it, expect, vi, beforeEach } from 'vitest';
+import { describe, it, expect, vi, beforeEach } from 'vitest';
+import { z } from 'zod';
@@
-      const def = (schema as { _def?: { typeName?: string } })._def;
-      expect(def?.typeName).toBe('ZodObject');
+      expect(schema).toBeInstanceOf(z.ZodObject);
apps/web/src/lib/ai/tools/activity-tools.ts (1)

309-338: Prefer centralized access helpers for drive visibility.

This block re-implements membership/ownership logic. If @pagespace/lib has (or should have) a centralized “accessible drives” helper, consider using it to keep access rules consistent.

As per coding guidelines, prefer centralized permission functions over local permission logic.

Comment on lines +553 to +616
// Step 2: If still over limit, drop oldest activities using batched approach
// to avoid expensive JSON.stringify on every single drop
if (outputSize > maxOutputChars) {
  let droppedCount = 0;
  const targetSize = maxOutputChars * 0.9; // Leave 10% buffer
  const totalActivityCount = response.drives.reduce((sum, g) => sum + g.activities.length, 0);

  // Estimate avg chars per activity (avoid divide by zero)
  const avgActivitySize = totalActivityCount > 0
    ? Math.ceil(outputSize / totalActivityCount)
    : 200;

  // Estimate how many activities to drop
  const excessChars = outputSize - targetSize;
  const estimatedDrops = Math.ceil(excessChars / avgActivitySize);
  const batchSize = Math.max(1, Math.min(10, Math.ceil(estimatedDrops / 5)));

  let dropsSinceLastCheck = 0;
  while (outputSize > targetSize) {
    // Find drive with most activities and drop oldest
    let maxDrive: CompactDriveGroup | null = null;
    for (const group of response.drives) {
      if (!maxDrive || group.activities.length > maxDrive.activities.length) {
        maxDrive = group;
      }
    }

    if (!maxDrive || maxDrive.activities.length <= 1) break;

    // Drop oldest (last in array since sorted desc by timestamp)
    maxDrive.activities.pop();
    maxDrive.stats.total = maxDrive.activities.length;
    droppedCount++;
    dropsSinceLastCheck++;

    // Only re-serialize periodically to check actual size
    if (dropsSinceLastCheck >= batchSize) {
      outputSize = JSON.stringify(response).length;
      dropsSinceLastCheck = 0;
    }
  }

  // Final size check
  if (dropsSinceLastCheck > 0) {
    outputSize = JSON.stringify(response).length;
  }

  if (droppedCount > 0) {
    response.meta.truncated = {
      ...response.meta.truncated,
      droppedActivities: droppedCount,
    };
  }
}

// Step 3: If STILL over limit after dropping activities, drop entire drives
if (outputSize > maxOutputChars && response.drives.length > 1) {
  while (outputSize > maxOutputChars && response.drives.length > 1) {
    // Keep the drive with most activity, drop smallest
    response.drives.sort((a, b) => b.stats.total - a.stats.total);
    response.drives.pop();
    outputSize = JSON.stringify(response).length;
  }
}
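
To make the batch-size arithmetic in Step 2 concrete, the sketch below isolates the same formulas in a standalone function. `estimateBatch` and `BatchEstimate` are illustrative names that do not appear in the PR. For a 30,000-char response with a 20,000-char cap and 100 activities, the target is 18,000 chars, the average activity size is 300 chars, the excess is 12,000 chars, the estimated drop count is 40, and the batch size is 8.

```typescript
interface BatchEstimate {
  targetSize: number;
  avgActivitySize: number;
  estimatedDrops: number;
  batchSize: number;
}

// Mirrors the Step 2 estimate: how many drops to perform between size checks.
function estimateBatch(
  outputSize: number,
  maxOutputChars: number,
  totalActivityCount: number,
): BatchEstimate {
  const targetSize = maxOutputChars * 0.9; // leave a 10% buffer
  // Fall back to 200 chars when there are no activities (avoids divide by zero).
  const avgActivitySize =
    totalActivityCount > 0 ? Math.ceil(outputSize / totalActivityCount) : 200;
  const excessChars = outputSize - targetSize;
  const estimatedDrops = Math.ceil(excessChars / avgActivitySize);
  // Re-serialize roughly five times over the whole run, at most every 10 drops.
  const batchSize = Math.max(1, Math.min(10, Math.ceil(estimatedDrops / 5)));
  return { targetSize, avgActivitySize, estimatedDrops, batchSize };
}

const estimate = estimateBatch(30_000, 20_000, 100);
```

Because `outputSize` also counts the actors list and drive metadata, the per-activity average is an overestimate and the drop count an underestimate; the periodic re-serialization in the loop is what corrects for that.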

⚠️ Potential issue | 🟠 Major

Hard cap can be exceeded when only one activity/drive remains.

The truncation loop stops at 1 activity/drive, so a large drive.context or title can still exceed maxOutputChars, violating the hard cap. Add a last‑resort truncation (or allow dropping the final activity/drive) to guarantee the limit.
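
A minimal reproduction of the failure mode, under simplifying assumptions: the types below are cut-down stand-ins for the PR's response shape, and Step 2 is shown without batching or the 90% target. With a single drive whose `context` is 5,000 chars and a cap of 1,000 chars, both loops exit immediately and the response stays over the cap.

```typescript
interface Activity {
  ts: string;
  title: string;
}

interface DriveGroup {
  drive: { id: string; name: string; context?: string };
  activities: Activity[];
  stats: { total: number };
}

interface ActivityResponse {
  drives: DriveGroup[];
}

// Simplified Steps 2 and 3: both refuse to go below one activity / one drive.
function dropUntilFloor(response: ActivityResponse, maxOutputChars: number): number {
  let outputSize = JSON.stringify(response).length;

  // Step 2 (simplified): drop the oldest activity from the busiest drive.
  while (outputSize > maxOutputChars) {
    const maxDrive = response.drives.reduce<DriveGroup | null>(
      (best, g) => (!best || g.activities.length > best.activities.length ? g : best),
      null,
    );
    if (!maxDrive || maxDrive.activities.length <= 1) break;
    maxDrive.activities.pop();
    maxDrive.stats.total = maxDrive.activities.length;
    outputSize = JSON.stringify(response).length;
  }

  // Step 3: drop the smallest drives, but always keep one.
  while (outputSize > maxOutputChars && response.drives.length > 1) {
    response.drives.sort((a, b) => b.stats.total - a.stats.total);
    response.drives.pop();
    outputSize = JSON.stringify(response).length;
  }
  return outputSize;
}

const oversized: ActivityResponse = {
  drives: [
    {
      drive: { id: 'd1', name: 'Backend', context: 'x'.repeat(5000) },
      activities: [{ ts: '2026-01-15T10:00:00Z', title: 'API Docs' }],
      stats: { total: 1 },
    },
  ],
};

const finalSize = dropUntilFloor(oversized, 1000);
const capExceeded = finalSize > 1000; // the "hard" cap is not enforced here
```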

🛠️ Possible last‑resort truncation pass
         // Step 3: If STILL over limit after dropping activities, drop entire drives
         if (outputSize > maxOutputChars && response.drives.length > 1) {
           while (outputSize > maxOutputChars && response.drives.length > 1) {
             // Keep the drive with most activity, drop smallest
             response.drives.sort((a, b) => b.stats.total - a.stats.total);
             response.drives.pop();
             outputSize = JSON.stringify(response).length;
           }
         }
+
+        // Step 4: Last-resort string trimming to enforce hard cap
+        if (outputSize > maxOutputChars) {
+          for (const group of response.drives) {
+            // Only trim values that exceed the limit so short strings never grow
+            if (group.drive.context && group.drive.context.length > 500) {
+              group.drive.context = group.drive.context.slice(0, 500) + '…';
+            }
+            for (const activity of group.activities) {
+              if (activity.title && activity.title.length > 200) {
+                activity.title = activity.title.slice(0, 200) + '…';
+              }
+            }
+          }
+          outputSize = JSON.stringify(response).length;
+        }
🤖 Prompt for AI Agents
In `@apps/web/src/lib/ai/tools/activity-tools.ts` around lines 553 - 616, the
truncation currently refuses to remove the last activity/drive, so a large
drive.context or activity.title can still exceed maxOutputChars. Modify the
post‑Step 3 logic in activity-tools.ts to enforce the hard cap by adding a
last‑resort pass that either (a) allows dropping the final activity/drive when
response.drives.length === 1 and that drive.activities.length === 1
(incrementing droppedCount and setting meta.truncated), or (b) progressively
truncates large string fields (drive.context, activity.title) from the end
across the remaining objects until JSON.stringify(response).length <=
maxOutputChars, updating the affected drive's stats.total and
response.meta.truncated as appropriate. Recompute outputSize after each
truncation and ensure meta.truncated.droppedActivities or a new
meta.truncated.truncatedChars field reflects the change.
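
Option (b) from the prompt above could be sketched roughly as follows. This is one possible shape, not the PR's implementation: `enforceHardCap` and `clip` are hypothetical names, the types are simplified stand-ins, and the halving schedule (512, 256, ... 1 chars per field) is an arbitrary choice. The pass cannot shrink the response below the size of its structural skeleton, so a very small cap could still be missed.

```typescript
interface Activity {
  title?: string;
}

interface DriveGroup {
  drive: { name: string; context?: string };
  activities: Activity[];
  stats: { total: number };
}

interface ActivityResponse {
  drives: DriveGroup[];
  meta: { truncated: { truncatedChars?: number } | null };
}

// Shorten only strings longer than the limit, so short values never grow.
function clip(value: string, limit: number): string {
  return value.length > limit ? value.slice(0, limit) + '…' : value;
}

// Halve the per-field limit each pass until the serialized response fits.
function enforceHardCap(response: ActivityResponse, maxOutputChars: number): number {
  let outputSize = JSON.stringify(response).length;
  let truncatedChars = 0;

  for (
    let limit = 512;
    outputSize > maxOutputChars && limit >= 1;
    limit = Math.floor(limit / 2)
  ) {
    for (const group of response.drives) {
      if (group.drive.context !== undefined) {
        const clipped = clip(group.drive.context, limit);
        truncatedChars += group.drive.context.length - clipped.length;
        group.drive.context = clipped;
      }
      for (const activity of group.activities) {
        if (activity.title !== undefined) {
          const clipped = clip(activity.title, limit);
          truncatedChars += activity.title.length - clipped.length;
          activity.title = clipped;
        }
      }
    }
    // Record the truncation before measuring, since the metadata adds chars too.
    if (truncatedChars > 0) {
      response.meta.truncated = { ...response.meta.truncated, truncatedChars };
    }
    outputSize = JSON.stringify(response).length;
  }
  return outputSize;
}

const capped: ActivityResponse = {
  drives: [
    {
      drive: { name: 'Backend', context: 'x'.repeat(5000) },
      activities: [{ title: 'T'.repeat(3000) }],
      stats: { total: 1 },
    },
  ],
  meta: { truncated: null },
};

const cappedSize = enforceHardCap(capped, 1000);
```

Updating `meta.truncated` inside the loop, before each measurement, matters because the metadata itself contributes to the serialized size.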
