Skip to content

Conversation

@Dingway98
Copy link
Contributor

@Dingway98 Dingway98 commented Dec 8, 2025

Note

Introduces first-class action cache capture and replay, adds HyperPage perform helpers, script generation from cache, XPath→CDP resolution, and extends types/outputs to include cache metadata.

  • Agent/Core:
    • Capture per-step action cache in runAgentTask and persist to debug; store by task via getActionCache.
    • Add runFromActionCache to replay cached steps (with XPath/CDP resolution, retries, optional LLM fallback) and write replay debug.
    • Attach cached-action helper methods on HyperPage (performClick/fill/type/press/selectOptionFromDropdown/check/uncheck/hover/scrollToElement/scrollToPercentage/nextChunk/prevChunk).
    • Add createScriptFromActionCache to emit executable scripts from cached steps.
    • New utilities: shared/run-cached-action.ts and shared/xpath-cdp-resolver.ts.
  • Types:
    • Add ActionCacheEntry, ActionCacheOutput, ActionCacheReplayResult, ActionCacheReplayStepResult, ReplayStepMeta, PerformOptions, AgentTaskOutput.
    • Extend TaskOutput with taskId and optional actionCache; HyperPage.ai now returns AgentTaskOutput; Task includes id.
  • Misc:
    • Bump version to 1.0.9.

Written by Cursor Bugbot for commit caa1e94. This will update automatically on new commits. Configure here.

createdAt: startTime,
status: TaskStatus.COMPLETED,
steps: [],
},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Single action returns empty actionCache steps array

The executeSingleAction method returns an actionCache with an empty steps array, even though the action was successfully executed and all necessary data (element ID, method, arguments, xpath, instruction) is available. This makes the returned action cache unusable for replay via runFromActionCache, which iterates over the steps array. Unlike runAgentTask which properly builds cache entries using buildActionCacheEntry, this method doesn't populate the steps.

Fix in Cursor Fix in Web

@Dingway98
Copy link
Contributor Author

@claude can you review

@claude
Copy link

claude bot commented Dec 9, 2025

Claude finished @Dingway98's task —— View job


Code Review Complete

I've thoroughly reviewed the action cache implementation. Here are my findings:


Critical Issues (Confirmed from Cursor Reviews)

1. Action cache written unconditionally bypassing debug check ⚠️

Location: src/agent/tools/agent.ts:679-683

The action cache is written to disk even when ctx.debug is false, which is inconsistent with other debug file writes in the same function:

// Lines 679-683: Written unconditionally
fs.mkdirSync(debugDir, { recursive: true });
fs.writeFileSync(
  `${debugDir}/action-cache.json`,
  JSON.stringify(actionCache, null, 2)
);

// Lines 692-696: Properly guarded by ctx.debug
if (ctx.debug) {
  fs.writeFileSync(
    `${debugDir}/taskOutput.json`,
    JSON.stringify(taskOutput, null, 2)
  );
}

Impact: Creates debug/ directories and action-cache.json files for every task execution in production, causing unwanted filesystem side effects.

Recommendation: Wrap lines 679-683 in an if (ctx.debug) check, similar to the taskOutput.json write.


2. Script generation throws on common action types ⚠️

Location: src/agent/shared/action-cache-script.ts:101

The formatCall function throws an error for any step.method not in the METHOD_TO_CALL map:

throw new Error(`Unknown method: ${step.method}`);

However, cached steps can have action types like extract, wait, refreshPage, or pdf which aren't complete or goToUrl and don't have methods in the map.

Impact: Calling createScriptFromActionCache with steps containing these action types will crash with "Unknown method: null" instead of gracefully handling unsupported actions.

Recommendation: Either:

  • Skip unsupported actions with a comment in the generated script
  • Add handling for more action types
  • Return a more graceful error or warning

3. Numeric action arguments silently dropped ⚠️

Location: src/agent/shared/action-cache.ts:47-53

The extractArguments function only extracts arguments if ALL elements are strings:

const isStringArray = (value: unknown): value is string[] =>
  Array.isArray(value) && value.every((item) => typeof item === "string");

const extractArguments = (action: ActionType): string[] => {
  const params = action.params as Record<string, unknown>;
  if (isStringArray(params.arguments)) {
    return params.arguments;
  }
  return [];
};

Impact: When action arguments contain numbers (e.g., scroll percentage values), the function returns an empty array, silently dropping important data. This causes cached actions with numeric arguments to fail during replay.

Recommendation: Accept mixed arrays and convert all values to strings:

const extractArguments = (action: ActionType): string[] => {
  const params = action.params as Record<string, unknown>;
  if (Array.isArray(params.arguments)) {
    return params.arguments.map(arg => String(arg));
  }
  return [];
};

Note: The type signature ActionCacheEntry.arguments: string[] already expects strings, so converting numbers to strings makes sense.


4. Single action returns empty actionCache steps array ⚠️

Location: src/agent/index.ts:1124-1129

The executeSingleAction method returns an actionCache with an empty steps array:

actionCache: {
  taskId,
  createdAt: startTime,
  status: TaskStatus.COMPLETED,
  steps: [],  // ❌ Empty!
},

Impact: The returned action cache is unusable for replay via runFromActionCache, which iterates over the steps array. Unlike runAgentTask which properly builds cache entries using buildActionCacheEntry, this method doesn't populate the steps.

Recommendation: Build and populate the action cache entry similar to how runAgentTask does it:

const actionCacheEntry = buildActionCacheEntry({
  stepIndex: 0,
  action: { type: 'actElement', params: { elementId: element.elementId, method, arguments: args, instruction } },
  actionOutput: { success: true, message: `Successfully executed: ${instruction}`, debug: {} },
  domState,
});

actionCache: {
  taskId,
  createdAt: startTime,
  status: TaskStatus.COMPLETED,
  steps: [actionCacheEntry],
},

Additional Issues Found

5. Missing error handling in XPath resolution

Location: src/agent/shared/run-cached-action.ts:249-255

The XPath resolution doesn't handle errors gracefully:

const resolved = await resolveXPathWithCDP({
  xpath: cachedAction.xpath!,
  frameIndex: cachedAction.frameIndex ?? 0,
  cdpClient,
  frameContextManager,
  debug,
});

If the XPath is invalid or the element doesn't exist, this will throw an error that gets caught in the retry loop, but the error message might not be clear to users.

Recommendation: Add try-catch with a more descriptive error message about XPath resolution failure.


6. Type inconsistency in arguments

Location: src/agent/shared/run-cached-action.ts:17 and usage sites

The CachedActionInput interface defines:

arguments?: Array<string | number>;

But ActionCacheEntry defines:

arguments: string[];

This inconsistency could lead to type confusion. The code in run-cached-action.ts:288-290 handles this by converting to strings, but it's not clear from the types alone.

Recommendation: Either:

  • Make both interfaces use the same type
  • Add a comment explaining the conversion
  • Use a type alias for clarity

7. Potential race condition in action cache storage

Location: src/agent/index.ts:85

The actionCacheByTaskId is stored in memory on the HyperAgent instance, but there's no cleanup mechanism. Over time, this could grow unbounded if many tasks are executed.

Recommendation: Consider adding:

  • A maximum cache size with LRU eviction
  • A method to clear old caches
  • Documentation about the lifecycle of cached actions

Minor Issues

8. Missing null check in script generation

Location: src/agent/shared/action-cache-script.ts:60

args.push(JSON.stringify(step.xpath));

If step.xpath is null, this will generate "null" as a string in the script, which might not be the intended behavior.


9. Inconsistent frameIndex handling

Location: Multiple files

Some places check step.frameIndex !== null && step.frameIndex !== undefined && step.frameIndex !== 0 while others just check step.frameIndex !== null && step.frameIndex !== undefined. Consider standardizing this logic.


Positive Observations

  1. Good separation of concerns - The feature is well-modularized with separate files for cache building, execution, and script generation
  2. Comprehensive type definitions - The TypeScript types are well-defined and documented
  3. Retry logic - The replay functionality includes good retry logic with fallback to LLM
  4. Debug support - Good debug file writing for troubleshooting
  5. Frame support - Proper handling of iframe contexts

Summary

The action cache feature is a valuable addition that enables action replay and script generation. However, there are 4 critical bugs that should be addressed before merging:

  1. Action cache written unconditionally (bypasses debug flag)
  2. Script generation crashes on common action types
  3. Numeric arguments silently dropped
  4. Single actions return empty cache steps

The additional issues are less critical but should be considered for better robustness and maintainability.



if (step.actionType === "extract") {
return `${indent}// Step ${step.stepIndex}
${indent}await page.extract("${step.instruction}");`;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Unescaped string interpolation breaks generated scripts

The step.instruction value is directly interpolated into the generated script string using "${step.instruction}" instead of JSON.stringify(step.instruction). If the instruction contains double quotes, backslashes, or newlines, the generated script will have invalid JavaScript syntax. Other string values in the same file (like urlArg, step.xpath, and value) are correctly escaped with JSON.stringify(), but this case was missed for the extract action type.

Fix in Cursor Fix in Web

❌ Cached XPath Failed: "${cachedXPath}"
✅ LLM Resolved New XPath: "${resolvedXPath}"
`
);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Unconditional console.log ignores debug flag

The console.log statement is executed unconditionally when a cached action falls back to LLM, even though a debug parameter is available in scope (destructured at line 44). The rest of the codebase consistently gates console logging behind debug checks. This will cause verbose output in production when users have debug mode disabled.

Fix in Cursor Fix in Web

public async runFromActionCache(
cache: ActionCacheOutput,
pageOrGetter: Page | (() => Page),
params?: RunFromActionCacheParams
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Method signature accepts Page but requires HyperPage internally

The public runFromActionCache method accepts pageOrGetter: Page | (() => Page) but internally casts the page to HyperPage and calls methods that only exist on HyperPage, not on the base Playwright Page. Specifically, hyperPage.extract() and hyperPage.perform() are called, as well as dispatchPerformHelper which invokes performClick, performFill, etc. If a caller passes a plain Playwright Page directly to agent.runFromActionCache(), they'll get runtime errors like "extract is not a function". The type signature is misleading.

Additional Locations (2)

Fix in Cursor Fix in Web

@Dingway98 Dingway98 merged commit ab07e50 into main Dec 9, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants