
Pinned

  1. terminal-bench Public

    A benchmark for LLMs on complicated tasks in the terminal

    Python, 1.7k stars, 484 forks

  2. harbor Public

    Harbor is a framework for running agent evaluations and creating and using RL environments.

    Python, 938 stars, 698 forks

  3. terminal-bench-2 Public

    Shell, 114 stars, 44 forks

  4. terminal-bench-3 Public

    🚧 Accepting Task Submissions 🚧

    Python, 64 stars, 72 forks

  5. terminal-bench-science Public

    Terminal-Bench-Science: Evaluating AI Agents on Complex Real-World Scientific Workflows in the Terminal

    Python, 32 stars, 24 forks

  6. awesome-harbor Public

    A curated list of awesome Harbor ecosystem projects

    18 stars, 1 fork

Repositories

Showing 10 of 10 repositories
  • harbor Public

    Harbor is a framework for running agent evaluations and creating and using RL environments.

    harbor-framework/harbor’s past year of commit activity
    Python, 938 stars, Apache-2.0 license, 698 forks, 61 issues, 153 pull requests, Updated Mar 11, 2026
  • terminal-bench-3 Public

    🚧 Accepting Task Submissions 🚧

    Python, 64 stars, 72 forks, 0 issues, 39 pull requests, Updated Mar 10, 2026
  • terminal-bench-science Public

    Terminal-Bench-Science: Evaluating AI Agents on Complex Real-World Scientific Workflows in the Terminal

    Python, 32 stars, Apache-2.0 license, 24 forks, 0 issues, 15 pull requests, Updated Mar 10, 2026
  • t-bench-docs Public
    TypeScript, 6 stars, 12 forks, 2 issues, 1 pull request, Updated Mar 9, 2026
  • terminal-bench-challenge
    0 stars, 1 fork, 0 issues, 1 pull request, Updated Mar 6, 2026
  • benchmark-template Public template

    Harbor Benchmark Template

    Python, 6 stars, 4 forks, 6 issues, 2 pull requests, Updated Mar 6, 2026
  • awesome-harbor Public

    A curated list of awesome Harbor ecosystem projects

    18 stars, 1 fork, 0 issues, 0 pull requests, Updated Mar 3, 2026
  • terminal-bench-2 Public
    Shell, 114 stars, Apache-2.0 license, 43 forks, 10 issues, 16 pull requests, Updated Feb 27, 2026
  • harbor-docs Public
    MDX, 2 stars, 6 forks, 0 issues, 3 pull requests, Updated Feb 24, 2026
  • terminal-bench Public

    A benchmark for LLMs on complicated tasks in the terminal

    Python, 1,683 stars, Apache-2.0 license, 484 forks, 104 issues, 184 pull requests, Updated Jan 22, 2026
