
Version and compatibility issues #4

@usik

Hi Team,
This is Luke.

I've been interested in trying out and benchmarking Droidrun, so I started with this repo. In the midst of running a task, I came across a few issues that can be alleviated rather quickly.

Error messages:

```
💬 Preparing chat for task execution...
Error during task execution: 'Context' object has no attribute 'get'
2025-10-14 17:59:55,024 - eval.runner - ERROR - Error completing task ContactsNewContactDraft 0: 1 validation error for CodeActResultEvent
steps
  Input should be a valid integer [type=int_type, input_value=[], input_type=list]
    For further information visit https://errors.pydantic.dev/2.11/v/int_type
2025-10-14 17:59:55,024 - eval.runner - DEBUG - Tearing down task ContactsNewContactDraft 0
2025-10-14 17:59:55,461 - eval.cli - INFO - Task ContactsNewContactDraft 0 completed successfully
2025-10-14 17:59:55,462 - tracker - ERROR - DISCORD_WEBHOOK_URL is not set
```

Issues shown:

  1. Context API mismatch
  2. Pydantic validation
  3. Missing environment variable `DISCORD_WEBHOOK_URL`

Temporary workarounds:

Issue 1:

Updated the following files to replace `ctx.get` with `ctx.store.get` and `ctx.set` with `ctx.store.set`, matching the current `llama_index.core.workflow` Context API (a minimal sketch follows the file list):

```
droidrun/agent/utils/executer.py
droidrun/agent/planner/planner_agent.py
droidrun/agent/codeact/codeact_agent.py
droidrun/agent/droid/droid_agent.py
```
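For anyone hitting the same `'Context' object has no attribute 'get'` error, here is a minimal before/after sketch of the rename inside a workflow step. The workflow and step names are illustrative, not the actual Droidrun code:

```python
# Minimal sketch of the Context API rename in llama_index.core.workflow.
from llama_index.core.workflow import Context, StartEvent, StopEvent, Workflow, step


class CounterWorkflow(Workflow):
    @step
    async def count(self, ctx: Context, ev: StartEvent) -> StopEvent:
        # Old API (removed in newer llama-index releases):
        #   calls = await ctx.get("calls", default=0)
        #   await ctx.set("calls", calls + 1)
        # New API: workflow state lives behind ctx.store.
        calls = await ctx.store.get("calls", default=0)
        await ctx.store.set("calls", calls + 1)
        return StopEvent(result=calls + 1)
```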

Issue 2:

Updated "droid_agent.py" to replace `steps=[]` with `steps=0`:

```python
CodeActResultEvent(success=False, reason=f"Error: {str(e)}", task=task, steps=0)
```
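The Pydantic error in the log is easy to reproduce in isolation. The model below is an illustrative stand-in with field names assumed from the log, not the actual `CodeActResultEvent` definition:

```python
# Reproduces the int_type validation error from the log.
from pydantic import BaseModel, ValidationError


class CodeActResult(BaseModel):
    success: bool
    reason: str
    steps: int  # declared as int, so passing a list is rejected


try:
    CodeActResult(success=False, reason="Error: ...", steps=[])
except ValidationError as e:
    print(e)
    # 1 validation error for CodeActResult
    # steps
    #   Input should be a valid integer [type=int_type, input_value=[], input_type=list]
```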

Issue 3:

Updated "eval/tracker.py" to replace `logger.error` with `logger.debug`, making the webhook optional:

```python
def send_discord_embed(embed: dict):
    webhook_url = os.getenv("DISCORD_WEBHOOK_URL")
    if webhook_url is None:
        logger.debug("DISCORD_WEBHOOK_URL is not set, skipping Discord notification")
        return
```
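For completeness, here is a self-contained sketch of the guard plus the actual send, assuming the `requests` package for the HTTP call; the real send logic in `eval/tracker.py` may differ:

```python
# Hedged sketch: early return when the webhook is unset, otherwise POST
# the embed using Discord's standard webhook payload format.
import logging
import os

import requests

logger = logging.getLogger("tracker")


def send_discord_embed(embed: dict) -> None:
    webhook_url = os.getenv("DISCORD_WEBHOOK_URL")
    if webhook_url is None:
        # Downgraded from logger.error: a missing webhook is expected in
        # local runs and should not surface as an error.
        logger.debug("DISCORD_WEBHOOK_URL is not set, skipping Discord notification")
        return
    # Discord webhooks accept a JSON body with an "embeds" list.
    requests.post(webhook_url, json={"embeds": [embed]}, timeout=10)
```

With this change, setting `DISCORD_WEBHOOK_URL` in the environment re-enables notifications, and leaving it unset is silently skipped at debug level.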

I am pretty sure your team has an updated version internally, but I thought this would be helpful for others who run into the same issues when reproducing the benchmark results.
