Popular repositories
- gorilla Public, forked from ShishirPatil/gorilla
  Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
  Python
- tau2-bench Public, forked from sierra-research/tau2-bench
  τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment
  Python
- openai-cua-sample-app Public, forked from openai/openai-cua-sample-app
  Learn how to use CUA (our Computer Using Agent) via the API on multiple computer environments.
  Python
Repositories
- sequrity-ai/benchmarked-free-ride-ci Public
- sequrity-ai/openclawbench Public
- sequrity-ai/inference-benchmark Public
- sequrity-ai/agentdojo-benchmark Public, forked from ethz-spylab/agentdojo
  A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
- sequrity-ai/tau2-bench Public, forked from sierra-research/tau2-bench
  τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment
- sequrity-ai/gorilla-tool-call Public, forked from ShishirPatil/gorilla
  Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
People
This organization has no public members. You must be a member to see who’s a part of this organization.