
Conversation

@giordano-lucas (Member) commented Nov 3, 2025

Summary by CodeRabbit

  • New Features

    • Introduced a unified CLI ("notte") with workflow commands: create, update, run (local/cloud), and benchmark (iterations, parallelism, summary).
    • Added a @workflow decorator to mark functions as workflows and enable auto-discovery.
    • CLI auto-detects workflow files, injects/updates metadata, and safely handles main blocks.
    • Benchmarks show progress and produce aggregated summaries and per-run details.
  • Improvements

    • Viewer URLs are now built dynamically and logged.
    • Error messages include exception details for clearer diagnostics.
  • Dependencies

    • Added typer and tqdm.

@coderabbitai bot (Contributor) commented Nov 3, 2025

Walkthrough

Adds a Typer-based CLI for managing Notte workflows (create, update, run, benchmark), a decorator-based workflow marker, metadata parsing/serialization and file manipulation utilities, and a CLI entrypoint. Adds dependencies (typer, tqdm) and exposes new SDK public symbols. Also updates allowed imports in the script validator to include "notte_llm" and "tqdm", refines an ExecutionResult error message to include exception content, and changes test Makefile targets to run with 4 pytest workers.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Multiple new CLI modules and commands with interdependent flows (file loading, metadata injection, upload, local vs cloud execution) — review control flow, subprocess handling, and restoration of file state.
  • Metadata parsing and main-block comment/uncomment logic — review text parsing edge cases and git author heuristics.
  • Decorator implementation — verify correct metadata attachment for sync and async functions, and no runtime import cycles.
  • SDK exports and pyproject changes — check entrypoint wiring and dependency spec.
  • Minor SDK endpoint change (viewer URL) and ExecutionResult error-message tweak.

Files/areas needing extra attention:

  • packages/notte-sdk/src/notte_sdk/cli/workflow.py
  • packages/notte-sdk/src/notte_sdk/cli/metadata.py
  • packages/notte-sdk/src/notte_sdk/cli/workflow_cli.py and __init__.py
  • packages/notte-sdk/src/notte_sdk/decorators.py
  • packages/notte-core/src/notte_core/ast.py (ALLOWED_IMPORTS change)
  • Makefile test targets and pyproject.toml entrypoint/dependencies

Possibly related PRs

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title 'Add workflow cli' accurately summarizes the main change in the changeset. The PR introduces a comprehensive CLI system for workflow management, including new modules for CLI infrastructure (cli/__init__.py, workflow_cli.py), workflow management commands (workflow.py), metadata handling (metadata.py), decorators for marking workflows (decorators.py), and supporting dependencies. The title clearly identifies this as the addition of a workflow CLI feature. |
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch add-workflow-cli

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 12db86e and fa8d53f.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • makefile (1 hunks)
  • packages/notte-core/src/notte_core/ast.py (2 hunks)
  • packages/notte-core/src/notte_core/browser/observation.py (1 hunks)
  • packages/notte-sdk/pyproject.toml (1 hunks)
  • packages/notte-sdk/src/notte_sdk/__init__.py (2 hunks)
  • packages/notte-sdk/src/notte_sdk/cli/__init__.py (1 hunks)
  • packages/notte-sdk/src/notte_sdk/cli/metadata.py (1 hunks)
  • packages/notte-sdk/src/notte_sdk/cli/workflow_cli.py (1 hunks)
  • packages/notte-sdk/src/notte_sdk/decorators.py (1 hunks)
  • packages/notte-sdk/src/notte_sdk/endpoints/workflows.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
packages/notte-sdk/src/notte_sdk/__init__.py (2)
packages/notte-sdk/src/notte_sdk/cli/workflow_cli.py (1)
  • workflow_cli (42-60)
packages/notte-sdk/src/notte_sdk/decorators.py (1)
  • workflow (13-69)
packages/notte-sdk/src/notte_sdk/cli/workflow_cli.py (1)
packages/notte-sdk/src/notte_sdk/cli/__init__.py (1)
  • main (639-648)
packages/notte-sdk/src/notte_sdk/cli/metadata.py (1)
packages/notte-sdk/src/notte_sdk/cli/__init__.py (1)
  • run (277-329)
packages/notte-sdk/src/notte_sdk/cli/__init__.py (3)
packages/notte-sdk/src/notte_sdk/cli/metadata.py (6)
  • WorkflowMetadata (11-154)
  • comment_main_block (250-300)
  • get_git_author (157-210)
  • insert_metadata_block (213-247)
  • uncomment_main_block (303-366)
  • from_file (38-53)
packages/notte-sdk/src/notte_sdk/client.py (1)
  • NotteClient (26-184)
packages/notte-sdk/src/notte_sdk/endpoints/workflows.py (7)
  • workflow_id (620-621)
  • create (287-300)
  • response (597-602)
  • update (314-332)
  • update (644-660)
  • run (470-537)
  • run (726-811)
🪛 Ruff (0.14.2)
packages/notte-sdk/src/notte_sdk/cli/metadata.py

51-51: Do not catch blind exception: Exception

(BLE001)


170-170: Starting a process with a partial executable path

(S607)


179-179: Starting a process with a partial executable path

(S607)


197-197: subprocess call: check for execution of untrusted input

(S603)


198-198: Starting a process with a partial executable path

(S607)

packages/notte-core/src/notte_core/browser/observation.py

251-251: Avoid specifying long messages outside the exception class

(TRY003)

packages/notte-sdk/src/notte_sdk/cli/__init__.py

39-39: Avoid specifying long messages outside the exception class

(TRY003)


83-83: Do not call getattr with a constant attribute value. It is not any safer than normal property access.

Replace getattr with attribute access

(B009)


115-115: Avoid specifying long messages outside the exception class

(TRY003)


165-165: Avoid specifying long messages outside the exception class

(TRY003)


168-168: Avoid specifying long messages outside the exception class

(TRY003)


189-189: Avoid specifying long messages outside the exception class

(TRY003)


214-216: Avoid specifying long messages outside the exception class

(TRY003)


252-252: Avoid specifying long messages outside the exception class

(TRY003)


298-298: subprocess call: check for execution of untrusted input

(S603)


302-302: Avoid specifying long messages outside the exception class

(TRY003)


315-315: Avoid specifying long messages outside the exception class

(TRY003)


395-395: Avoid specifying long messages outside the exception class

(TRY003)


414-414: Avoid specifying long messages outside the exception class

(TRY003)


432-432: Avoid specifying long messages outside the exception class

(TRY003)


445-445: subprocess call: check for execution of untrusted input

(S603)


474-481: Consider moving this statement to an else block

(TRY300)


482-482: Do not catch blind exception: Exception

(BLE001)


553-553: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: tests
🔇 Additional comments (28)
makefile (1)

16-17: Clarify the rationale and address inconsistency in test parallelism.

The PR changes test parallelism from -n logical (dynamic, based on CPU count) to -n 4 (fixed) for test-sdk, test-docs, and test-agent targets. However, the main test (line 8) and test-cicd (line 12) targets still use -n logical. This creates two concerns:

  1. Inconsistency: Different test targets now have different parallelism strategies, which complicates maintenance and could lead to surprises in different environments.
  2. Unclear motivation: The PR objective is to add workflow CLI; the connection to test parallelism changes is not evident from the commit context.

Please clarify:

  • Why was the parallelism fixed to 4 workers specifically for these targets?
  • Should the main test and test-cicd targets also be updated for consistency?
  • Is there a documented reason (e.g., test flakiness, resource constraints, performance optimization) for this change?

If this change is intentional and necessary (e.g., to work around test flakiness introduced by new CLI/metadata tooling), consider updating all test targets to use consistent parallelism and adding a comment in the makefile explaining the rationale.

Also applies to: 21-21, 25-26

packages/notte-core/src/notte_core/browser/observation.py (1)

247-251: LGTM! Improved error reporting.

Including the actual exception in the error message enhances debugging and makes it easier to identify the inconsistent state. This is a good improvement.

Note: Ruff flags TRY003 for this line, suggesting that long messages should be defined outside the exception. However, given that this message is relatively short, includes dynamic content, and is only used in one location, the current approach is reasonable.

packages/notte-sdk/src/notte_sdk/endpoints/workflows.py (1)

523-529: The concern about self.server_url being None is unfounded. In BaseClient.__init__, self.server_url is assigned as server_url or os.getenv("NOTTE_API_URL") or self.DEFAULT_NOTTE_API_URL, which guarantees it always has a non-None string value. The type annotation confirms self.server_url: str (not optional). The code is safe as-is.
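For reference, the fallback chain described above reduces to a standard `or` cascade; a minimal sketch (with a placeholder default URL, not the real constant's value) looks like this:

```python
import os

DEFAULT_NOTTE_API_URL = "https://api.example.com"  # placeholder value for this sketch only

class BaseClientSketch:
    def __init__(self, server_url: str | None = None) -> None:
        # Each `or` falls through on None or empty string, so the attribute always ends up
        # as a non-None str: explicit argument > NOTTE_API_URL env var > package default.
        self.server_url: str = server_url or os.getenv("NOTTE_API_URL") or DEFAULT_NOTTE_API_URL
```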

packages/notte-core/src/notte_core/ast.py (1)

84-84: LGTM - appropriate for CLI progress tracking.

The addition of tqdm is well-suited for the workflow CLI's progress tracking needs mentioned in the PR summary. It's a widely-used library that only writes to stdout/stderr and doesn't violate any of the security constraints outlined in lines 62-66.
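To illustrate why tqdm is benign here (this is not the CLI's actual code): it only wraps an iterable and renders a progress bar to stderr.

```python
from tqdm import tqdm

def run_iterations(iterations: int) -> int:
    successes = 0
    # tqdm performs no file or network I/O of its own; it just draws progress to stderr
    for _ in tqdm(range(iterations), desc="benchmark runs"):
        successes += 1  # stand-in for a single workflow run
    return successes
```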

packages/notte-sdk/src/notte_sdk/decorators.py (3)

72-82: LGTM!

The implementation correctly checks for the workflow marker attribute.


85-95: LGTM!

The helper function correctly retrieves the workflow name.


98-108: LGTM!

The helper function correctly retrieves the workflow description.
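Given that the decorator (shown in a later review comment) stores __is_workflow__, __workflow_name__, and __workflow_description__ on the returned function, these helpers presumably reduce to attribute lookups along these lines (a sketch, not the reviewed source):

```python
from typing import Any, Callable

def is_workflow(func: Callable[..., Any]) -> bool:
    # True only when the @workflow decorator has marked the function
    return getattr(func, "__is_workflow__", False) is True

def get_workflow_name(func: Callable[..., Any]) -> str | None:
    return getattr(func, "__workflow_name__", None)

def get_workflow_description(func: Callable[..., Any]) -> str | None:
    return getattr(func, "__workflow_description__", None)
```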

packages/notte-sdk/pyproject.toml (2)

18-19: LGTM!

The dependency additions are appropriate for the new CLI functionality.


27-28: LGTM!

The console script entry point is correctly configured.

packages/notte-sdk/src/notte_sdk/__init__.py (2)

31-33: LGTM!

The imports correctly expose the new workflow decorator and CLI helper.


47-48: LGTM!

The public API exports are correctly updated.

packages/notte-sdk/src/notte_sdk/cli/workflow_cli.py (1)

42-60: LGTM!

The function correctly intercepts CLI arguments and delegates to the main CLI entry point when appropriate.

packages/notte-sdk/src/notte_sdk/cli/metadata.py (5)

11-154: LGTM!

The WorkflowMetadata class correctly handles parsing, serialization, and updates. The broad exception catch on line 51 is acceptable here as it logs the error and returns None gracefully.


157-210: LGTM!

The git author retrieval logic is well-implemented with appropriate timeouts and fallbacks. The static analysis warnings about subprocess are false positives in this context.
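A sketch of the kind of lookup described here (a time-boxed `git config` call with a fallback); the actual metadata.py implementation may differ in its details:

```python
import subprocess

def get_git_author(fallback: str = "unknown") -> str:
    try:
        # Ask git for the configured author name; the timeout keeps the CLI from hanging
        result = subprocess.run(
            ["git", "config", "user.name"],
            capture_output=True,
            text=True,
            timeout=2,
        )
        name = result.stdout.strip()
        return name or fallback
    except (OSError, subprocess.TimeoutExpired):
        return fallback
```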


213-247: LGTM!

The metadata block insertion logic correctly handles both updating existing metadata and inserting new metadata, with proper handling of existing shebangs.


250-300: LGTM!

The function correctly identifies and comments out the __main__ block using proper indentation analysis.
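A rough sketch of the commenting pass being described, assuming it prefixes each non-empty line of the `if __name__ == "__main__":` block with `# ` and reports whether anything changed (the reviewed code also does finer-grained indentation analysis):

```python
def comment_main_block(content: str) -> tuple[str, bool]:
    out: list[str] = []
    in_block = False
    changed = False
    for line in content.splitlines(keepends=True):
        stripped = line.strip()
        if stripped.startswith("if __name__") and "__main__" in stripped:
            in_block = True
        elif in_block and stripped and not line[0].isspace():
            in_block = False  # dedented back to top level, so the block is over
        if in_block and stripped:
            out.append("# " + line)
            changed = True
        else:
            out.append(line)
    return "".join(out), changed
```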


303-366: LGTM!

The function correctly reverses the commenting operation by identifying and uncommenting the __main__ block.

packages/notte-sdk/src/notte_sdk/cli/__init__.py (11)

35-39: LGTM!

The function correctly determines the workflow file path from the executing script.


69-104: LGTM with minor suggestion.

The workflow function discovery logic correctly implements the documented priority: decorated function → 'run' function → single callable. The filtering logic is reasonable, though you could consider additional type checks if issues arise with imported functions.
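The documented priority (decorated function, then a function named 'run', then the single callable defined in the file) could look roughly like this; it is a sketch for orientation, not the code under review:

```python
import inspect
from types import ModuleType
from typing import Any, Callable

def find_workflow_function(module: ModuleType) -> Callable[..., Any] | None:
    funcs = [
        obj
        for _, obj in inspect.getmembers(module, inspect.isfunction)
        if obj.__module__ == module.__name__  # ignore functions imported from other modules
    ]
    # 1. A function explicitly marked with the @workflow decorator
    for func in funcs:
        if getattr(func, "__is_workflow__", False):
            return func
    # 2. A function literally named "run"
    for func in funcs:
        if func.__name__ == "run":
            return func
    # 3. Otherwise, the only function defined in the file
    return funcs[0] if len(funcs) == 1 else None
```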


107-130: LGTM!

The function correctly loads workflow files while preventing __main__ block execution by setting module.__name__ to the spec name.
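The trick referred to here is standard importlib usage: load the file under a module name other than "__main__" so the guard never fires. A minimal sketch:

```python
import importlib.util
from pathlib import Path
from types import ModuleType

def load_workflow_module(path: Path) -> ModuleType:
    spec = importlib.util.spec_from_file_location("user_workflow", path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load workflow file: {path}")
    module = importlib.util.module_from_spec(spec)
    # module.__name__ is "user_workflow", not "__main__", so an
    # `if __name__ == "__main__":` block in the file is skipped on import
    spec.loader.exec_module(module)
    return module
```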


133-159: Verify the restore logic in the finally block.

The context manager's finally block (lines 155-158) reads the original file and uncomments the __main__ block, but the original file was never modified - only the temp file was created with the commented version. This restore operation appears unnecessary since the original file is unchanged.

Consider simplifying to:

```diff
     try:
         yield temp_file
     finally:
         # Clean up temp file
         if temp_file.exists():
             temp_file.unlink()
-        # Restore __main__ block if it was commented
-        if was_commented:
-            content = file_path.read_text(encoding="utf-8")
-            content, _ = uncomment_main_block(content)
-            _ = file_path.write_text(content, encoding="utf-8")
```

Or clarify if there's a scenario where the original file gets modified that I'm missing.


161-171: LGTM!

The function correctly retrieves and validates workflow metadata with appropriate error messages.


174-178: LGTM!

The helper function correctly updates metadata in the workflow file.


182-238: LGTM!

The create command correctly handles workflow creation with good error checking, interactive prompts for missing metadata, and helpful user feedback.


242-273: LGTM!

The update command correctly updates workflows and refreshes metadata timestamps.


277-329: LGTM!

The run command correctly handles both local and cloud execution with proper variable loading and error handling. The subprocess security warning is a false positive in this context.


333-636: LGTM!

The benchmark command is well-implemented with comprehensive features including interactive prompts, parallel execution, progress tracking, timeout handling, and detailed results reporting. The broad exception catches (flagged by static analysis) are appropriate here to ensure benchmark resilience and complete reporting even when individual runs fail.
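For orientation, parallel iterations with progress reporting of the kind described can be sketched with concurrent.futures and tqdm; the real benchmark command layers timeouts, summaries, and per-run details on top of this:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable

from tqdm import tqdm

def run_benchmark(run_once: Callable[[], bool], iterations: int, parallelism: int) -> float:
    successes = 0
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        futures = [pool.submit(run_once) for _ in range(iterations)]
        for future in tqdm(as_completed(futures), total=iterations, desc="benchmark"):
            try:
                if future.result():
                    successes += 1
            except Exception:
                # a failed run counts against the success rate but must not abort the benchmark
                pass
    return successes / iterations if iterations else 0.0
```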


639-648: LGTM!

The main entry point correctly delegates to the Typer app with clear documentation about the compatibility parameter.

```python
    "notte_sdk",
    "notte_agent",
    "notte_core",
    "notte_llm",
```

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify that notte_llm aligns with the security model.

The security comments (lines 62-66) specify that allowed modules must not provide file system access, process control, network access beyond basic parsing, system introspection, or code execution capabilities. Please confirm that notte_llm adheres to these constraints.


🏁 Script executed:

```shell
#!/bin/bash
# Search for potentially unsafe operations in notte_llm module
# Looking for file operations, subprocess calls, network requests, etc.

fd -e py . -x rg -l "notte_llm" | head -5
```

Length of output: 173



---

🏁 Script executed:

```shell
#!/bin/bash
# Find notte_llm module source directory
fd -type d -name "notte*llm*" 2>/dev/null

# Also search for where notte_llm is defined/imported from
rg -n "^from notte_llm import|^import notte_llm" tests/llms/test_engine.py -C2
```

Length of output: 65



---

🏁 Script executed:

```shell
#!/bin/bash
# Map repository structure
ls -la | head -20

# Search for notte_llm package or module definition
rg -n "notte_llm" --type=py -g "!test*" --max-count=5

Length of output: 4153



---

🏁 Script executed:

```shell
#!/bin/bash
# Examine notte_llm package structure and main exports
cat packages/notte-llm/src/notte_llm/__init__.py 2>/dev/null || echo "No __init__.py"

# Check main engine.py for unsafe operations
rg -n "subprocess|import os|open\(|requests\.|eval\(|exec\(|__import__|compile\(" packages/notte-llm/src/notte_llm/ --type=py
```

Length of output: 422



---

🏁 Script executed:

```shell
#!/bin/bash
# Get full __init__.py contents
cat packages/notte-llm/src/notte_llm/__init__.py

# Check what's exported and the public API surface
rg -n "^class |^def |^from .* import|__all__" packages/notte-llm/src/notte_llm/engine.py | head -20

# Check service.py public API
rg -n "^class |^def " packages/notte-llm/src/notte_llm/service.py | head -20
```

Length of output: 1092



---

🏁 Script executed:

```shell
#!/bin/bash
# Find where ALLOWED_IMPORTS is used in code
rg -n "ALLOWED_IMPORTS" packages/notte-core/src/notte_core/ -A3 -B3

# Look at the execution context to understand how RestrictedPython uses it
grep -r "safe_globals\|exec_result\|compile_restricted" packages/notte-core/src/notte_core/ast.py -A5 -B5
```

Length of output: 3964



---

🏁 Script executed:

```shell
#!/bin/bash
# Get complete __init__.py to see all exports
cat packages/notte-llm/src/notte_llm/__init__.py

# Check if tracer classes are exported
rg -n "from.*tracer import|LlmTracer|LlmUsageFileTracer" packages/notte-llm/src/notte_llm/__init__.py

# See how LLMEngine uses tracers
rg -n "tracer|Tracer" packages/notte-llm/src/notte_llm/engine.py | head -10
```

Length of output: 386


Critical: notte_llm exposes file system access, violating the stated security model.

Adding notte_llm to ALLOWED_IMPORTS allows user scripts to import and instantiate LlmUsageFileTracer, which internally uses open() for file I/O. This bypasses RestrictedPython's restrictions—while RestrictedPython blocks direct open() calls in user code, it cannot prevent allowed modules from using unrestricted builtins. User scripts can therefore write arbitrary files:

```python
from notte_llm.tracer import LlmUsageFileTracer
tracer = LlmUsageFileTracer(file_path="/tmp/exfil.txt")
```

This violates lines 62-64 which explicitly forbid "File system access". Either remove notte_llm from ALLOWED_IMPORTS or create a restricted wrapper that excludes tracer functionality.

🤖 Prompt for AI Agents
packages/notte-core/src/notte_core/ast.py around line 74: ALLOWED_IMPORTS
currently includes "notte_llm", which exposes file-system access via
LlmUsageFileTracer and thus violates the stated "File system access"
restriction; either remove "notte_llm" from ALLOWED_IMPORTS, or replace it with
a restricted wrapper module (or granular allowed submodules) that exports only
safe, non-I/O APIs (explicitly excluding tracer classes/functions that call
open()); update ALLOWED_IMPORTS to reference the safe wrapper or the vetted
submodule list, add or update unit tests to assert that tracer-related names are
not importable from the sandbox, and add a brief comment explaining the change
for future reviews.
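To make the "restricted wrapper" option concrete: an allowlist re-export module (hypothetical path and names below, assuming the re-exported symbols are themselves vetted as I/O-free) would keep the tracer out of the sandbox, and ALLOWED_IMPORTS would then reference the wrapper instead of "notte_llm".

```python
# hypothetical module, e.g. notte_core/sandbox/safe_llm.py
# Re-export only names vetted as free of file/process/network side effects.
# Tracer classes that call open() are deliberately not imported here.
from notte_llm.engine import LLMEngine  # assumption: vetted as safe to expose

__all__ = ["LLMEngine"]
```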

Comment on lines +13 to +69
````python
def workflow(name: str, description: str | None = None) -> Callable[[Callable[P, R]], Callable[P, R]]:
    """
    Decorator to mark a function as a Notte workflow.

    This decorator is optional. If present, it stores metadata on the function that can be used
    by the CLI to manage the workflow lifecycle (create, update, run). If not present, the CLI
    will prompt for workflow name and description during creation.

    Args:
        name: The name of the workflow.
        description: Optional description of the workflow.

    Example:
        ```python
        from notte_sdk import workflow

        @workflow(name="My Workflow", description="Does something useful")
        def run(url: str, query: str) -> str:
            # workflow code here
            return "result"
        ```

    Note:
        The decorator is optional. You can also create workflows without it - the CLI will
        prompt for name and description interactively, or detect the workflow function by
        looking for a function named 'run' or the only function in the file.
    """

    def decorator(func: Callable[P, R]) -> Callable[P, R]:
        # Store metadata on the function
        func.__workflow_name__ = name  # type: ignore[attr-defined]
        func.__workflow_description__ = description  # type: ignore[attr-defined]
        func.__is_workflow__ = True  # type: ignore[attr-defined]

        # Preserve function signature and behavior
        @functools.wraps(func)
        def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
            return func(*args, **kwargs)

        # Handle async functions
        import inspect

        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
                return await func(*args, **kwargs)

            # Copy metadata to async wrapper
            async_wrapper.__workflow_name__ = name  # type: ignore[attr-defined]
            async_wrapper.__workflow_description__ = description  # type: ignore[attr-defined]
            async_wrapper.__is_workflow__ = True  # type: ignore[attr-defined]
            return async_wrapper  # type: ignore[return-value]

        return wrapper

    return decorator
````

⚠️ Potential issue | 🟠 Major

Fix metadata propagation for sync functions and consider moving import to module level.

The decorator has two issues:

  1. Critical: For synchronous functions, metadata attributes are set on the original function (lines 43-45) but not copied to the wrapper that is returned (line 67). This means is_workflow(), get_workflow_name(), and get_workflow_description() will fail when called on the decorated function.

  2. Optional: The inspect module is imported inside the decorator (line 53), which is inefficient when the decorator is applied multiple times.

Apply this diff to fix the metadata propagation issue:

```diff
         # Preserve function signature and behavior
         @functools.wraps(func)
         def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
             return func(*args, **kwargs)
+
+        # Copy metadata to wrapper
+        wrapper.__workflow_name__ = name  # type: ignore[attr-defined]
+        wrapper.__workflow_description__ = description  # type: ignore[attr-defined]
+        wrapper.__is_workflow__ = True  # type: ignore[attr-defined]

         # Handle async functions
         import inspect
```

To address the optional refactor, move the import to the top of the file:

```diff
 from __future__ import annotations

+import inspect
 import functools
 from typing import TYPE_CHECKING, Any, Callable, ParamSpec, TypeVar
```

And remove it from line 53.

🤖 Prompt for AI Agents
In packages/notte-sdk/src/notte_sdk/decorators.py around lines 13 to 69, the
decorator sets metadata on the original sync function but returns a wrapper
without copying those attributes (so
is_workflow/get_workflow_name/get_workflow_description fail), and it imports
inspect inside the decorator; fix by copying __workflow_name__,
__workflow_description__, and __is_workflow__ from func onto the returned
wrapper for synchronous functions (same as done for async_wrapper), move the
inspect import to the module top and remove the inline import on line 53, and
ensure the wrapper preserves and returns the same metadata and behavior.

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fa8d53f and a2c2df4.

📒 Files selected for processing (5)
  • packages/notte-sdk/pyproject.toml (1 hunks)
  • packages/notte-sdk/src/notte_sdk/cli/README.md (1 hunks)
  • packages/notte-sdk/src/notte_sdk/cli/__init__.py (1 hunks)
  • packages/notte-sdk/src/notte_sdk/cli/workflow.py (1 hunks)
  • packages/notte-sdk/src/notte_sdk/cli/workflow_cli.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • packages/notte-sdk/src/notte_sdk/cli/README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/notte-sdk/pyproject.toml
🧰 Additional context used
🧬 Code graph analysis (3)
packages/notte-sdk/src/notte_sdk/cli/workflow_cli.py (1)
packages/notte-sdk/src/notte_sdk/cli/__init__.py (1)
  • main (40-49)
packages/notte-sdk/src/notte_sdk/cli/__init__.py (1)
packages/notte-sdk/src/notte_sdk/decorators.py (1)
  • workflow (13-69)
packages/notte-sdk/src/notte_sdk/cli/workflow.py (4)
packages/notte-sdk/src/notte_sdk/cli/metadata.py (6)
  • WorkflowMetadata (11-154)
  • comment_main_block (250-300)
  • get_git_author (157-210)
  • insert_metadata_block (213-247)
  • uncomment_main_block (303-366)
  • from_file (38-53)
packages/notte-sdk/src/notte_sdk/client.py (1)
  • NotteClient (26-184)
packages/notte-sdk/src/notte_sdk/decorators.py (3)
  • get_workflow_description (98-108)
  • get_workflow_name (85-95)
  • is_workflow (72-82)
packages/notte-sdk/src/notte_sdk/endpoints/workflows.py (2)
  • workflow_id (620-621)
  • response (597-602)
🪛 Ruff (0.14.2)
packages/notte-sdk/src/notte_sdk/cli/workflow.py

84-84: Avoid specifying long messages outside the exception class

(TRY003)


135-135: Do not call getattr with a constant attribute value. It is not any safer than normal property access.

Replace getattr with attribute access

(B009)


167-167: Avoid specifying long messages outside the exception class

(TRY003)


217-217: Avoid specifying long messages outside the exception class

(TRY003)


220-220: Avoid specifying long messages outside the exception class

(TRY003)


248-248: Avoid specifying long messages outside the exception class

(TRY003)


274-276: Avoid specifying long messages outside the exception class

(TRY003)


319-319: Avoid specifying long messages outside the exception class

(TRY003)


375-375: subprocess call: check for execution of untrusted input

(S603)


379-379: Avoid specifying long messages outside the exception class

(TRY003)


392-392: Avoid specifying long messages outside the exception class

(TRY003)


480-480: Avoid specifying long messages outside the exception class

(TRY003)


499-499: Avoid specifying long messages outside the exception class

(TRY003)


517-517: Avoid specifying long messages outside the exception class

(TRY003)


530-530: subprocess call: check for execution of untrusted input

(S603)


559-566: Consider moving this statement to an else block

(TRY300)


567-567: Do not catch blind exception: Exception

(BLE001)


638-638: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: tests

Comment on lines +61 to +70
```python
# If first arg is a workflow command (not "workflow"), prepend "workflow" subcommand
if first_arg in ["create", "update", "run", "benchmark"]:
    sys.argv.insert(1, "workflow")
    # After inserting "workflow", auto-detect file path if not provided
    # Check if a file path is already provided (after "workflow" and the command)
    if len(sys.argv) <= 3:  # Only ["script.py", "workflow", "command"] - no file path yet
        # Insert the current script path as the file argument
        sys.argv.insert(3, sys.argv[0])
# Note: file_path is auto-detected from sys.argv by typer, so we don't need to pass it
main()
```

⚠️ Potential issue | 🔴 Critical

CLI blows up on positional workflow paths

These inserts add the script path as a positional argument, but every Typer subcommand only declares --workflow-path. When python my_flow.py create (or notte workflow create my_flow.py) hits this code, Typer sees the additional argument and exits with Error: Got unexpected extra argument 'my_flow.py', so none of the commands actually run. Please accept positional paths (e.g., add a typer.Argument and feed that into get_workflow_path_from_context) or stop injecting the extra argv entirely and require --workflow-path. As it stands the documented usage is broken.
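One way to implement the fix, sketched with hypothetical parameter names: declare the path as an optional typer.Argument so that both `notte workflow create my_flow.py` and the bare auto-detected form parse cleanly.

```python
from pathlib import Path
from typing import Optional

import typer

app = typer.Typer()

@app.command()
def create(
    # Positional path; None means "fall back to auto-detection from sys.argv"
    workflow_path: Optional[Path] = typer.Argument(None, help="Workflow file to register"),
) -> None:
    path = workflow_path or Path("workflow.py")  # placeholder fallback for this sketch
    typer.echo(f"creating workflow from {path}")

if __name__ == "__main__":
    app()
```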

Comment on lines +431 to +469
```python
timeout_seconds = timeout * 60  # Convert minutes to seconds

# Interactive prompts if running without flags
# Check if any benchmark-specific flags were provided
benchmark_flags = ["--local", "--iterations", "--timeout", "--variables", "--parallelism"]
has_flags = any(flag in sys.argv for flag in benchmark_flags)

if not has_flags:
    logger.info("Running benchmark interactively. Press Enter to use defaults.")
    logger.info("")

    # Prompt for local vs cloud
    while True:
        local_input = (
            typer.prompt(
                "Run locally or on cloud? [local/cloud]",
                default="cloud",
            )
            .strip()
            .lower()
        )
        if local_input in ["local", "cloud"]:
            local = local_input == "local"
            break
        logger.error("Please enter 'local' or 'cloud'")

    # Prompt for iterations
    iterations = typer.prompt("Number of iterations", default=10, type=int)

    # Prompt for timeout
    timeout = typer.prompt("Timeout in minutes", default=20, type=int)

    # Prompt for parallelism
    parallelism = typer.prompt("Parallelism level (number of parallel runs)", default=1, type=int)
    if parallelism < 1:
        parallelism = 1
    if parallelism > iterations:
        parallelism = iterations
        logger.warning(f"Parallelism reduced to {iterations} (cannot exceed number of iterations)")
```

⚠️ Potential issue | 🟠 Major

Interactive timeout ignores the user’s input

timeout_seconds is captured before the interactive prompts. When a user types a different timeout during the prompts, timeout changes but timeout_seconds stays at the old value, so the benchmark still stops at the default 20 minutes. Move the timeout_seconds = timeout * 60 calculation to after the interactive section (or recompute it once inputs are finalized) so the chosen timeout actually applies.

🤖 Prompt for AI Agents
In packages/notte-sdk/src/notte_sdk/cli/workflow.py around lines 431 to 469, the
timeout_seconds variable is computed before the interactive prompts so any
timeout the user enters is ignored; move the timeout_seconds = timeout * 60
calculation to after the interactive prompt block (or recompute timeout_seconds
immediately after all prompts/flag parsing are finished) so timeout_seconds
reflects the final timeout value, and ensure any downstream code uses that
recomputed timeout_seconds.
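The fix amounts to converting to seconds only after every source of the value has been consulted; a hypothetical helper makes the ordering explicit:

```python
import typer

def resolve_timeout_seconds(timeout_minutes: int, interactive: bool) -> int:
    # Prompts run first and may overwrite the flag/default value...
    if interactive:
        timeout_minutes = typer.prompt("Timeout in minutes", default=timeout_minutes, type=int)
    # ...the minutes-to-seconds conversion happens only once the value is final,
    # so whatever the user typed actually applies.
    return timeout_minutes * 60
```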
