msa -> yab -> geak_v3_features #10

Open
sdubagun-amd wants to merge 60 commits into geak_v3_features from yab
Conversation

@sdubagun-amd

Moving to a common repository.

Umangatamd and others added 30 commits February 9, 2026 11:38
Decoupled additions (copied as-is from MSA):
- mcp_tools/: 6 MCP servers (automated-test-discovery, kernel-evolve,
  kernel-ercs, kernel-profiler, metrix-mcp, openevolve-mcp) + mcp-client
- Dockerfile, entrypoint.sh, scripts/run-docker.sh
- runtime_env.py (local/Docker auto-detection)
- optimizer/ (unified OpenEvolve + Autotune interface)
- benchmark.py (standardized benchmarking framework)
- kernel_profile.py (GPU profiling CLI)
- mcp_tools/metrix.py (AMD Metrix API tool)
- reference/ (50+ GPU optimization strategies database + state machine)
- test_suite/ (10-kernel AITER regression suite)
- examples/add_kernel/
- docs: DISCOVERY_PIPELINE.md, METRIX_TOOL.md, GETTING_STARTED.md,
  RUNTIME_ENV.md, RUNTIME_QUICKSTART.md

Integrated changes (best-of-both-worlds):
- Test discovery: MSA's content-based pipeline runs first (fast, free),
  with its results fed into v3's UnitTestAgent as context. The subagent
  always runs, but starts informed rather than exploring from scratch.
- mini.py: Added --runtime, --docker-image, --workspace CLI flags
- pyproject.toml: Added geak/kernel-profile scripts, mcp[cli] dep
- README.md: Added MCP servers, Docker, architecture sections

All geakagent -> minisweagent import references fixed in ported files.
Analysis/comparison docs moved to ~/geak_analysis_docs/ (not needed in repo).
Cherry-picked 6 geak_v3_features commits (167fc13..18853fb):
- Model refactor: amd_base, amd_claude, amd_openai, amd_gemini
- Tool-call message protocol in default.py
- Trajectory saving, test_profiling_tool.py deletion
- Parallel worktree fixes, unit test prompt

Ported 15 msa post-squash-merge commits (path-translated geakagent->minisweagent):
- baseline_metrics.py, protected_files.py (new)
- resolve_kernel_url.py, test discovery injection (new)
- OpenEvolve COMMANDMENT-based evaluation refactor
- kernel_profile: remove --filter, always profile all kernels
- optimizer/core.py: new MCP API (gpu, output_dir, commandment_path)
- mini.py: --kernel-url flag, discovery injection, INSTRUCTIONS.md loading
- default.py: summary_on_cost_limit feature
- Dockerfile: OpenEvolve installation
- openevolve-mcp/server.py: refactored

Doc rejects (README, METRIX_TOOL, RUNTIME_QUICKSTART) deferred to cleanup phase.

Co-authored-by: Cursor <cursoragent@cursor.com>
Replace double quotes inside f-string expressions with single quotes.
Python 3.10 does not support reusing the outer quote character inside
f-string braces (PEP 701 landed in 3.12).
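The incompatibility is easy to reproduce; a minimal sketch of the failing pattern and the portable form:

```python
# On Python 3.10/3.11 the following line is a SyntaxError, because the
# outer double quote is reused inside the f-string braces (allowed only
# since PEP 701 landed in 3.12):
#   msg = f"value: {d["key"]}"
d = {"key": 42}
msg = f"value: {d['key']}"  # portable: single quotes inside the braces
print(msg)  # → value: 42
```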

Co-authored-by: Cursor <cursoragent@cursor.com>
New MCP server `profiler-mcp` wraps both profiling backends behind a
single `profile_kernel` tool with a `backend` parameter:
- backend="metrix": AMD Metrix API (structured JSON, bottleneck classification)
- backend="rocprof-compute": rocprof-compute CLI (deep roofline, instruction mix)

Files:
- mcp_tools/profiler-mcp/src/profiler_mcp/server.py - unified MCP server
- mcp_tools/profiler-mcp/tests/test_profiler_unit.py - 14 mock-based tests
- mcp_tools/profiler-mcp/tests/test_profiler_integration.py - 4 GPU tests
- mcp_tools/profiler-mcp/examples/profile_kernel.py - Python API example

All tests pass (14 unit, 4 integration on MI300X). Ruff-clean.
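A hypothetical sketch of the dispatch behind the single `profile_kernel` tool — function and field names are assumptions, and both backends are stubbed out rather than calling the real Metrix API or rocprof-compute CLI:

```python
def profile_kernel(kernel_path: str, backend: str = "metrix") -> dict:
    """Route one tool call to the selected profiling backend.

    Sketch only: the real server wraps the AMD Metrix API and the
    rocprof-compute CLI; here each backend returns a placeholder result.
    """
    backends = {
        "metrix": lambda p: {"backend": "metrix", "kernel": p,
                             "format": "structured-json"},
        "rocprof-compute": lambda p: {"backend": "rocprof-compute",
                                      "kernel": p, "format": "roofline"},
    }
    if backend not in backends:
        raise ValueError(f"unknown backend {backend!r}; "
                         f"choose one of {sorted(backends)}")
    return backends[backend](p=kernel_path)
```

Keeping one tool with a `backend` parameter (rather than two tools) means agents only need to learn a single profiling interface.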

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…(Phase 4)

Co-authored-by: Cursor <cursoragent@cursor.com>
…Worker (Phase 5)

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Delete geak_agent/ legacy package; move resolve_kernel_url to src/minisweagent/tools/
- Deduplicate MetrixTool (delete src copy, keep mcp_tools/metrix-mcp canonical)
- Delete 66 inherited mini-swe-agent docs + mkdocs.yml + assets
- Delete 23 dead tests (missing upstream modules), fix 3 remaining
- Rename test_suite/ -> eval_suite/, reference/ -> knowledge_base/
- Consolidate examples under examples/
- Fix all stale geakagent/sdubagun/yueliu14 references
- Remove 16 ghost __pycache__-only directories
- Deprecate mcp_tools/kernel-profiler/ (superseded by profiler-mcp)
- Remove mcp_tools/kernel-from-url-mcp/ (dead, only __pycache__)
- Remove mkdocs dependencies from pyproject.toml
- Update git remote to AMD-AGI/GEAK.git

Tests: 96 passed, 58 skipped, 1 xfailed, 0 failures
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…plumbing tests

- Refactor MCPToolBridge to use a single persistent asyncio event loop on a
  background daemon thread per instance, resolving "Future attached to a
  different loop" errors that broke MCP server calls during e2e runs.

- Fix discovery scoping for --kernel-url flows: when the kernel lives inside
  a .geak_resolved clone, scope both discovery calls (in mini.py and
  unit_test_agent.py) to the clone root instead of the entire workspace.
  Add a hard boundary in discovery._expand_workspace_for_file to prevent
  walking above .geak_resolved.

- Export RESOLVED_DIR_NAME constant and add find_resolved_clone_root() helper
  in resolve_kernel_url_impl.py so the directory convention is defined in
  one place.

- Fix mini.py second discovery call: prioritize _resolved_kernel_path (from
  --kernel-url resolution) when determining _kernel_path, instead of falling
  through to None when --task is not provided.

- Add new test files: test_discovery_scope.py (scope boundary + mini.py
  wiring), test_e2e_pipeline_smoke.py, test_mcp_server_smoke.py,
  test_plumbing_contracts.py, test_toolruntime_dispatch.py, and extend
  test_mcp_bridge.py with event loop lifecycle tests.

- Apply ruff formatting fixes across modified files.
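The persistent-loop pattern behind the MCPToolBridge fix can be sketched as follows (class and method names are illustrative, not the actual bridge API): all coroutines are submitted to one long-lived loop, so a future is never attached to a loop that has since been closed.

```python
import asyncio
import threading

class LoopBridge:
    """One persistent event loop on a background daemon thread."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(
            target=self._loop.run_forever, daemon=True
        )
        self._thread.start()

    def call(self, coro, timeout=30):
        # Thread-safe handoff: the coroutine always runs on self._loop,
        # never on whatever loop the calling thread happens to have.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self):
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
```

The alternative — creating a fresh loop per call — is exactly what produces "Future attached to a different loop" once any awaitable outlives its call.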
…nt instructions

- File-based MCP transport for large profiler results; bump StreamReader limit to 16MB
- Auto-detect num_parallel from gpu_ids when not explicitly set
- Externalize discovery patterns into discovery_defaults.toml with per-project overrides
- Detect Triton wrapper files; fall back to kernel_file stem for test/bench matching
- Strengthen test_perf mandatory usage in agent prompts
- Fix default task to defer to INSTRUCTIONS.md instead of banning OpenEvolve
- Add confirm_exit to base AgentConfig for --exit-immediately support
- Extend discovery to detect HIP/CK/ASM kernels, trace cross-language
  call chains (Python→torch.ops→pybind11→.cu), and build dependency
  graphs with fusion opportunity detection
- Add GPU pool scheduler: M tasks on N GPUs with dynamic slot assignment
  via ThreadPoolExecutor and thread-safe GPU queue
- Add dynamic task planner generating language-aware optimization tasks
  (OpenEvolve, CK template tuning, HIP launch config, fusion, etc.)
- Add COMMANDMENT.md validation (required sections, shell built-in
  detection) with auto-validation hook in str_replace_editor
- Fix baseline_metrics NaN/inf sanitization and post-write JSON roundtrip
- Update SelectPatchAgent to handle task_*/parallel_* directories and
  prefer per-kernel latency metrics
- Consolidate duplicated extension lists into shared constants
- Add 25 unit tests for validate_commandment and task_planner
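The GPU pool scheduler bullet above describes a classic slot-queue pattern; a minimal sketch (names assumed, each task a callable taking the assigned GPU id):

```python
import queue
from concurrent.futures import ThreadPoolExecutor

def run_on_gpu_pool(tasks, gpu_ids):
    """Run M tasks on N GPUs: a worker blocks until a slot is free,
    runs its task pinned to that GPU, then returns the slot."""
    slots = queue.Queue()
    for gid in gpu_ids:          # thread-safe pool of free GPU slots
        slots.put(gid)

    def worker(task):
        gid = slots.get()        # block until a GPU is available
        try:
            return task(gid)
        finally:
            slots.put(gid)       # release the slot for the next task

    with ThreadPoolExecutor(max_workers=len(gpu_ids)) as pool:
        return list(pool.map(worker, tasks))
```

Capping `max_workers` at the number of GPUs means the queue never needs a separate semaphore; the executor itself bounds concurrency.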

Co-authored-by: Cursor <cursoragent@cursor.com>
- Add GPU/profiler rules to task planner and strategy yaml to prevent
  agents from using inline env vars or HIP_VISIBLE_DEVICES prefixes
- Detect inline env var prefixes (VAR=val cmd) in COMMANDMENT validator
- Add COMMANDMENT.md validation hook to bash tool (agents bypass editor)
- Update INSTRUCTIONS.md with anti-patterns and wrapper script template
- Add 4 new tests for inline env var detection
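Detecting inline env var prefixes reduces to a small regex check; a hypothetical sketch of the validator rule (the real COMMANDMENT validator's pattern may differ):

```python
import re

# Flag commands like "HIP_VISIBLE_DEVICES=0 python bench.py": one or more
# UPPER_CASE=value prefixes followed by an actual command. Such prefixes
# bypass the GPU pool's env-based isolation.
_INLINE_ENV_RE = re.compile(r"^\s*(?:[A-Z_][A-Z0-9_]*=\S+\s+)+\S")

def has_inline_env_prefix(command: str) -> bool:
    return bool(_INLINE_ENV_RE.match(command))
```

Note the trailing `\S`: a bare assignment like `FOO=bar` with no command is not flagged, and lowercase words such as `export` never match the variable-name class.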

Co-authored-by: Cursor <cursoragent@cursor.com>
Discovery was broken when given a directory instead of a single file:
- MCP server crashed on read_text() for directories, _expand_workspace
  started from parent instead of the directory itself
- mini.py missed .git inside the directory (kp.parents excludes kp)
- DiscoveryPipeline.run() skipped workspace expansion for directories

Fixes:
- Add directory mode to MCP discover() with recursive kernel scanning
- Fix _expand_workspace to check the path itself when it's a directory
- Add _expand_workspace_for_dir() that scopes workspace to the given
  directory, preventing unrelated sibling files from polluting results
- Use parent directory name as kernel_name when file has a generic name
  (kernel.py, main.py) so test-name matching works properly
- Fix mini.py to check kp itself for .git before walking kp.parents

Co-authored-by: Cursor <cursoragent@cursor.com>
- Eliminate double discovery: mini.py reuses stashed _run_discovery._last_result
  instead of calling run_discovery_pipeline() a second time
- Enrich discovery context: new format_discovery_for_agent() includes kernel
  analysis, language-specific testing guidance (triton/hip/ck/asm), and
  extracted test patterns (tolerances, shapes, dtypes, imports)
- Extract test patterns: _extract_test_patterns() in discovery.py pulls
  atol/rtol, input shapes, dtypes, reference impls, and import patterns from
  top-confidence test files
- Upgrade UnitTestAgent to TestHarnessAgent: creates a fixed test harness
  with --correctness/--profile/--benchmark modes. Reads INSTRUCTIONS.md for
  harness rules. The harness is an immutable evaluation contract.
- Update INSTRUCTIONS.md: section 1a references pre-scanned discovery results
  (no re-discovery needed), section 1b notes pre-built harness from UTA

Co-authored-by: Cursor <cursoragent@cursor.com>
Umangatamd and others added 30 commits February 17, 2026 04:59
In large repos like aiter (161 test files), content-based scoring alone
gives all tests confidence=1.0, making results effectively random.
test_moe_dp_share_expert.py ranked above test_rope.py for a rope kernel.

Fixes:
- Add _relevance_score() that combines name matching + path proximity
- Kernel name in filename: +3.0, in path: +2.0, partial match: +0.5*n
- Path proximity: tests near the kernel in directory tree get +1.0
- Remove confidence cap at 1.0 so relevant tests visibly outrank generic
- Add generic stem detection (kernel.py -> parent dir name) to MCP server

Before: test_moe_sorting_mxfp4.py (conf=1.0) for rope kernel
After: test_rope.py (conf=5.85) for rope kernel
Co-authored-by: Cursor <cursoragent@cursor.com>
When multiple test files share the same name (e.g. test_rope.py in
op_tests/ and triton_tests/rope/), the summary now shows the full
path instead of just the filename.

Co-authored-by: Cursor <cursoragent@cursor.com>
Phase 1 (automated): Now accepts kernel_function param from
resolve_kernel_url (e.g. "_rope_fwd"). Uses function name to boost
test files that actually reference the target function in their source.
Also extracts @triton.jit and __global__ function names from kernel.

Phase 2 (LLM finisher): After Phase 1 ranking, calls the AMD LLM
gateway to validate whether the top test actually exercises the target
kernel functions. If it does, isolates relevant test functions. If not,
generates a focused test script from scratch that directly imports and
tests the kernel functions. Writes the script to output_dir.

Tested on aiter rope kernel:
- Phase 1: test_rope.py ranked #1 (correct)
- Phase 2: LLM correctly identified that test_rope.py tests high-level
  wrappers, not the low-level _rope_fwd kernel, and generated a focused
  test that directly exercises the Triton helper functions

Co-authored-by: Cursor <cursoragent@cursor.com>
When discover() is given a directory with multiple kernels, each kernel
now gets its own recommended_test and recommended_benchmark based on
per-kernel relevance scoring (using _relevance_score). Previously all
tests were scored globally with one flat "Recommended test" for the
whole directory.

Also fixes directory mode parity with single-file mode:
- Apply generic stem fix (kernel.py -> parent dir name)
- Remove confidence cap (min(score, 1.0)) so relevant tests visibly
  outrank generic ones
- Use _relevance_score for per-kernel matching

Tested on aiter (124 kernels) and geak_eval (8 kernels): each kernel
gets its own correct recommendation (e.g. rope -> test_rope.py,
topk -> test_moe_topk_sigmoid.py).

Co-authored-by: Cursor <cursoragent@cursor.com>
…ndling

28 tests covering all scenarios discovered during development:

Single-file mode (8 tests):
- Triton kernel finds matching test (not random)
- Generic kernel.py uses parent dir name
- kernel_function param boosts matching tests
- Function names extracted from kernel source
- Non-existent path returns error
- No confidence cap (relevant tests score > 1.0)
- Irrelevant tests don't rank above relevant ones
- Full path in summary (not ambiguous filename)

Directory mode (7 tests):
- Directory with .git used as workspace
- Directory without .git expands upward
- Per-kernel recommendations for multi-kernel dirs
- Generic kernel.py names resolved in directory mode
- Empty directory returns zero
- Single kernel collapses to dict
- Per-kernel summary in output

Relevance scoring (3 tests):
- Name in filename scores highest
- Name in path scores second
- Path proximity boosts nearby tests

Main pipeline directory handling (4 tests):
- _expand_workspace_for_dir uses dir itself
- Fallback uses dir (not parent)
- Kernel name uses parent dir for generic names
- Nested kernels found recursively

Pattern extraction (2 tests):
- Extracts atol/rtol tolerances
- Extracts torch dtype references

Enriched context formatter (4 tests):
- Includes kernel analysis
- Includes language-specific guidance
- None result returns empty
- Includes extracted patterns

Co-authored-by: Cursor <cursoragent@cursor.com>
For kernel gemm_a8w8, test_gemm_a8w8.py (exact match) now scores higher
than test_gemm_a8w8_blockscale.py (substring containment).

Scoring tiers:
- Exact stem match (test_<kernel>.py): +4.0
- Substring in filename: +2.5
- Name in path: +2.0
- Partial parts match: +0.5 * n

This prevents confusion between similar kernels like:
- gemm_a8w8 vs gemm_a8w8_blockscale
- gemm_a16w16 vs gemm_a16w16_gated
- fp8 vs fp16 kernel variants

Added unit test: test_exact_stem_beats_substring (29 tests total)
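The tiers above can be sketched as a small scoring function (a simplification: the real _relevance_score also includes the path-proximity terms from the earlier commit):

```python
from pathlib import Path

def relevance_score(kernel_name: str, test_path: str) -> float:
    """Tiered filename/path scoring so exact matches beat substrings."""
    stem = Path(test_path).stem              # e.g. "test_gemm_a8w8"
    score = 0.0
    if stem == f"test_{kernel_name}":
        score += 4.0                         # exact stem match
    elif kernel_name in stem:
        score += 2.5                         # substring in filename
    if kernel_name in str(Path(test_path).parent):
        score += 2.0                         # kernel name in the path
    matched = [p for p in kernel_name.split("_") if p in stem]
    score += 0.5 * len(matched)              # partial token matches
    return score
```

Because the exact-stem tier (+4.0) outweighs substring containment (+2.5), `test_gemm_a8w8.py` now outranks `test_gemm_a8w8_blockscale.py` even though both contain the kernel name.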

Co-authored-by: Cursor <cursoragent@cursor.com>
Make every optimization pipeline step independently callable from the CLI
with chainable --from-* flags and -o output options:

New modules:
- task_file.py: shared YAML-frontmatter task file I/O + git worktree helpers
  (extracted from ParallelAgent to eliminate duplication)
- task_generator.py: LLM-assisted task generation with -o DIR for Markdown
  task files, --from-results for iterative round-over-round refinement
- commandment.py: programmatic COMMANDMENT.md generation with built-in
  validation loop

CLI enhancements:
- openevolve-worker --from-task: reads task .md, auto-creates worktree
- geak --from-task: reads task .md, populates --task/--repo, auto-worktree
- All pipeline CLIs support --from-discovery/--from-resolved/--from-profile
  for seamless chaining via intermediate JSON files
- resolve-kernel-url --json/-o, test-discovery --from-resolved/-o,
  kernel-profile --from-discovery/--json/-o, baseline-metrics --from-profile

Infrastructure:
- pyproject.toml: new entry points for all modular CLI tools
- entrypoint.sh: health checks for new tools
- scripts/run-docker.sh: updated docs with full pipeline, iterative
  refinement, and --from-task examples
- tests/run/test_task_generator.py: unit tests for LLM fallback logic
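The YAML-frontmatter task file convention can be illustrated with a minimal splitter (a sketch only: task_file.py presumably uses a real YAML parser, while this handles just simple `key: value` lines):

```python
def split_task_file(text: str):
    """Split a task .md into (frontmatter dict, Markdown body)."""
    if not text.startswith("---\n"):
        return {}, text                      # no frontmatter block
    header, _, body = text[4:].partition("\n---\n")
    meta = {}
    for line in header.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta, body.lstrip("\n")
```

With this shape, `--from-task` consumers can read `kernel_file`, `test_command`, and repo metadata from the header while passing the body through as the agent task text.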

Co-authored-by: Cursor <cursoragent@cursor.com>
Modular task pipeline: task files, --from-task, iterative refinement
- Fix agent working directory: set env.config.cwd to repo/worktree path
  so bash commands and test_perf run in the correct location instead of
  the container root (/workspace)
- Fix env_factory lambda to propagate cwd for parallel agent respawns
- Set agent.base_repo_path for single-agent --from-task runs, enabling
  correct patch generation (diff between original repo and worktree)
- Suppress config auto-detection conflicts for --from-task by not
  passing task body to load_and_merge_configs
- Skip redundant baseline profiling: prepend note to task body when
  running from task files so agent skips re-profiling
- Inject task metadata (KERNEL FILE, TEST COMMAND, REPO ROOT) into
  agent prompt so it knows exactly which files to edit
- Print agent log path prominently with tail -f hint
- Skip test harness creation for --from-task runs
- Extract test_command from discovery JSON into task file metadata
- Guard Path() checks against long task body strings (OSError fix)
- Fix empty tools list causing litellm BadRequestError in amd_claude
- Add run-tasks CLI and task_runner module for batch task execution
- Add model registry utility
- Agent-based task generator with tool-calling (replaces monolithic prompt)
- Backend-neutral kernel profiler output (metrix + rocprof-compute)
- Streamlined README, docs, and code cleanup across the codebase

Co-authored-by: Cursor <cursoragent@cursor.com>
- Fix orchestrator and task_generator to prefer focused_test command
  from discovery.json instead of the original repo test, which was
  causing all test_perf validations to fail with argument parsing errors
- Add auto-finalization when orchestrator exhausts step limit: scans all
  rounds for best_results.json, picks highest speedup, writes final_report.json
- Inject COMMANDMENT, baseline metrics, profiling data, and explicit
  kernel/repo/test paths into sub-agent task bodies via dispatch.py
- Exclude traj.json and *.log from git diff patches in test_perf
- Add .ruff_cache to .dockerignore
- Run ruff check + format on entire src/
- Update README with high-level and low-level command reference,
  output directory structure, and architecture overview

Co-authored-by: Cursor <cursoragent@cursor.com>
- Fix dispatch.py: run_parallel returns (task_id, agent, exit_status,
  result) but was unpacked as (idx, result, patches, exit), causing
  wrong success checks and crashes on len(string). Now counts patches
  from disk instead.
- Fix select_patch_agent.py: log exceptions instead of silently
  swallowing them in run_select_patch().
- Fix orchestrator.py: copy original model tools list before mutation
  so nested callers (task_generator) don't corrupt the saved reference.
- Fix orchestrator.py: wrap _dispatch_tool_call in try/except so tool
  failures (e.g. API outage during task generation) return JSON error
  payloads to the LLM instead of crashing the orchestrator loop.

Co-authored-by: Cursor <cursoragent@cursor.com>
The MCP tool packages (automated-test-discovery, kernel-ercs, etc.) are
installed via pip in the Dockerfile but not declared in pyproject.toml.
This causes 18+ test failures when running outside Docker (local dev, CI).

Add sys.path entries for all mcp_tools/*/src/ directories in conftest.py
so tests can import these packages without requiring a separate pip install.

Co-authored-by: Cursor <cursoragent@cursor.com>
Tests were passing {"command": "..."} (a dict) to execute() which
expects a plain string. This caused TypeError in subprocess since the
dict was forwarded as the cwd argument. Fixed all 22 call sites to pass
the command string directly.
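The failure mode is a classic positional mix-up; a sketch under an assumed `execute()` signature (the real method's signature is not shown in this PR):

```python
import subprocess

def execute(command: str, cwd=None):
    """Assumed environment API: `command` must be a plain string."""
    return subprocess.run(command, shell=True, cwd=cwd,
                          capture_output=True, text=True)

# Buggy call-site pattern the commit fixed:
#   execute({"command": "echo hi"})
# The dict lands where subprocess expects a string/path and raises.
result = execute("echo hi")   # fixed: pass the command string directly
```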

Co-authored-by: Cursor <cursoragent@cursor.com>
kernel-ercs and kernel-evolve MCP servers require fastmcp>=2.0.0 to
start. Without it, the server subprocess crashes on import causing
7 test failures. Adding it to dev dependencies ensures MCP smoke tests
pass outside Docker.

Co-authored-by: Cursor <cursoragent@cursor.com>
The task-generation agent was hitting LimitsExceeded on complex kernels
(e.g. RoPE with 30+ functions). Defaults raised from 30/2.0 to 75/10.0
and made configurable via GEAK_TASKGEN_STEP_LIMIT / GEAK_TASKGEN_COST_LIMIT
environment variables.

Co-authored-by: Cursor <cursoragent@cursor.com>
Catch subprocess.TimeoutExpired and return a structured dict with
returncode=-1 and exception_info instead of letting it propagate as
an unhandled exception. Matches the contract expected by tests and
the pattern used by other environments.
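The contract described above, sketched with assumed field names:

```python
import subprocess

def run_with_timeout(command: str, timeout: float) -> dict:
    """Return a structured result instead of raising on timeout."""
    try:
        proc = subprocess.run(
            command, shell=True, capture_output=True, text=True,
            timeout=timeout,
        )
        return {"returncode": proc.returncode,
                "output": proc.stdout + proc.stderr}
    except subprocess.TimeoutExpired as exc:
        # Sentinel returncode + context, matching the test contract.
        return {
            "returncode": -1,
            "output": "",
            "exception_info": f"TimeoutExpired after {exc.timeout}s: {command}",
        }
```

Callers can then branch on `returncode == -1` uniformly instead of wrapping every environment call in its own try/except.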

Co-authored-by: Cursor <cursoragent@cursor.com>
Fix task pipeline: agent cwd, config conflicts, and task execution
resolve_kernel_url stored local_repo_path as a relative path while
local_file_path was absolute. The parallel agent resolved the relative
path against the task file directory, producing a doubled nonsense path
that didn't exist. Now all three layers ensure absolute paths: the
source (resolve_kernel_url_impl), the orchestrator context loader, and
the dispatch batch runner.
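The doubled-path failure comes from resolving a relative path against the wrong base directory; a minimal illustration (paths are made up):

```python
from pathlib import Path

recorded = Path(".geak_resolved/aiter")        # relative, as stored
task_dir = Path("/workspace/out/tasks")        # where the task file lives

# Bug: joining against the task-file directory fabricates a path that
# never existed.
wrong = task_dir / recorded   # /workspace/out/tasks/.geak_resolved/aiter

# Fix: make the path absolute at the source, so every downstream
# consumer (orchestrator, dispatch) sees the same location.
right = recorded.resolve()    # anchored to the cwd where it was recorded
```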

Co-authored-by: Cursor <cursoragent@cursor.com>
The Full Pipeline Mode (preprocessor → orchestrator) was skipping the
UnitTestAgent, relying on a single-shot LLM finisher in the MCP
discovery server for harness creation. That approach consistently failed
because a single LLM call can't reliably generate correct test harnesses
(wrong tensor shapes, wrong tolerances, wrong imports).

The UnitTestAgent is a multi-turn agent with bash/editor tools that can
read the kernel, read existing tests, run them, see errors, and iterate
until the harness works. It was already built for this purpose but
wasn't wired into the new pipeline.

Changes:
- preprocessor.py: Add model/model_factory params to run_preprocessor().
  After MCP discovery (Step 2), run UnitTestAgent (Step 2b) with
  discovery context to create a validated harness. Extract absolute path
  to the harness script for the profiler. Fall back to raw discovery
  test command if UnitTestAgent fails.
- mini.py: Pass model and model_factory to run_preprocessor().

Tested on ROCm/aiter RoPE kernel: UnitTestAgent creates a working
harness, profiling succeeds (48.44 us baseline), orchestrator generates
tasks, optimization agent produces 18+ patches with ~13% speedup.

Co-authored-by: Cursor <cursoragent@cursor.com>
Wire UnitTestAgent into Full Pipeline Mode preprocessor
…ntext passing

Co-authored-by: Cursor <cursoragent@cursor.com>
- Fix GPU isolation: propagate HIP_VISIBLE_DEVICES through BashCommand,
  MCPToolBridge, ProfilingAnalyzer, and OpenEvolve subprocess env.
  Prevent shallow-copy race in ParallelAgent by creating new env dicts
  per thread. Add defensive copy in ToolRuntime.set_env().

- GPU-aware task generation: extend AgentTask with num_gpus, teach
  task-generator LLM to allocate GPUs per task, ParallelAgent acquires
  N GPU slots from pool for multi-GPU tasks (e.g. OpenEvolve).

- Docker: remove hardcoded HIP_VISIBLE_DEVICES=0 from Dockerfile,
  unset it in entrypoint.sh so geak --gpu-ids controls isolation.

- Fix profiler integration tests: add __main__ to examples/add_kernel
  so rocprofv3 captures GPU activity, fix MetrixTool empty
  HIP_VISIBLE_DEVICES handling, update test assertions to match
  add_kernel (not rope), mark rocprof-compute roofline as xfail.

- Add developer docs: gpu-isolation.md (invariants, how-to, pitfalls),
  update architecture/flow/tools diagrams with SweAgent, codebase
  context passing chain, multi-GPU dispatch, and --gpu-ids flags.
  Remove redundant diagrams.md.
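The shallow-copy-race fix for GPU isolation boils down to giving each worker thread its own env dict; a sketch (helper name is an assumption):

```python
import threading

def launch_worker(base_env: dict, gpu_id: int, target):
    """Start a worker with a private env dict: mutating one thread's
    HIP_VISIBLE_DEVICES can no longer leak into a sibling thread's env."""
    env = dict(base_env)                        # defensive copy per thread
    env["HIP_VISIBLE_DEVICES"] = str(gpu_id)    # pin this worker's GPU
    t = threading.Thread(target=target, args=(env,))
    t.start()
    return t
```

A shared dict (or a shallow copy of a structure that still aliases the dict) is exactly how two parallel agents end up profiling on the same GPU.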

Co-authored-by: Cursor <cursoragent@cursor.com>
- Fix 1: profiler-mcp no longer mutates os.environ; passes clean env
  via _env_override to ProfilingAnalyzer subprocess instead.
- Fix 2: Centralize agent-type ↔ class mappings into agent_spec.py
  (_agent_type_to_class / _agent_class_to_type) eliminating 4 duplicate
  definitions across dispatch, orchestrator, task_generator, task_runner.
- Fix 3: Replace silent `except Exception: pass` in
  OpenEvolveWorker._save_result_artifacts with logger.warning().
- Fix 4: Add public set_tools() to AmdLlmModelBase and AmdLlmModel
  router; SweAgent and task_generator use it instead of reaching into
  model._impl.
- Fix 5: Remove duplicate `cfg: dict` type annotation in dispatch.py
  else-branch.
- Fix 6: Harden _derive_test_command_from_commandment to support
  fenced code blocks, add fallback for raw .py commands, and log
  debug messages on parse outcomes.

Co-authored-by: Cursor <cursoragent@cursor.com>
The previous _env_override approach didn't actually remove the empty
key from the subprocess env (dict merge brings it back from os.environ).
Switch to save/restore of os.environ, which is safe here because
profiler-mcp runs as a dedicated single-threaded MCP server process.

Co-authored-by: Cursor <cursoragent@cursor.com>
SweAgent, OpenEvolve fixes and context
The test harness had no control over how many shapes were used for
profiling vs testing, causing OOM during GPU profiling.

Changes:
- Add select_shapes_uniform() utility in discovery.py for programmatic
  shape selection (dedup, sort by element count, uniform sampling)
- UnitTestAgent system prompt now instructs the LLM to read discovered
  test files, extract ALL shapes (variables, loops, configs — not just
  literal tuples), and build two lists:
  HARNESS_SHAPES (20-25) for correctness/benchmark
  PROFILE_SHAPES (5) for --profile mode only
- format_discovery_for_agent() cleaned up: passes all extracted patterns
  without truncation so the LLM has full shape context
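A sketch of `select_shapes_uniform()` as described (dedup, sort by element count, uniform sampling), including the count<=1 guards added by a later fix in this PR:

```python
import math

def select_shapes_uniform(shapes, count):
    """Dedup shapes, sort by element count, pick `count` of them spread
    uniformly across the size range."""
    pool = sorted(set(shapes), key=math.prod)
    if count <= 0 or not pool:
        return []
    if count >= len(pool):
        return pool
    if count == 1:
        return [pool[len(pool) // 2]]   # median shape
    # Uniform indices across [0, len(pool)-1]; dividing by count-1 is
    # why count == 1 needs the early return above.
    step = (len(pool) - 1) / (count - 1)
    return [pool[round(i * step)] for i in range(count)]
```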

Co-authored-by: Cursor <cursoragent@cursor.com>
The harness now supports four CLI modes with distinct shape sets:
  --profile        → PROFILE_SHAPES (5)
  --benchmark      → HARNESS_SHAPES (20-25 sampled)
  --correctness    → HARNESS_SHAPES
  --full-benchmark → ALL_SHAPES (every discovered shape)

--full-benchmark runs all discovered shapes and is intended for use
only at the start and end of optimization to get the complete picture.
--benchmark uses the sampled subset for fast iteration loops.
If ALL_SHAPES has ≤25 entries, HARNESS_SHAPES = ALL_SHAPES and both
benchmark modes behave identically.

Updated INSTRUCTIONS.md and UTA system prompt accordingly.
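The four-mode mapping can be sketched with argparse (the shape lists here are illustrative placeholders, not the discovered sets):

```python
import argparse

ALL_SHAPES = [(2**i, 64) for i in range(12)]   # every discovered shape
# Per the rule above: if ALL_SHAPES has <= 25 entries, the sampled set
# equals the full set and both benchmark modes behave identically.
HARNESS_SHAPES = ALL_SHAPES if len(ALL_SHAPES) <= 25 else ALL_SHAPES[::2]
PROFILE_SHAPES = HARNESS_SHAPES[:5]            # small set to avoid OOM

def shapes_for(argv):
    p = argparse.ArgumentParser()
    mode = p.add_mutually_exclusive_group(required=True)
    for flag in ("--profile", "--benchmark",
                 "--correctness", "--full-benchmark"):
        mode.add_argument(flag, action="store_true")
    args = p.parse_args(argv)
    if args.profile:
        return PROFILE_SHAPES
    if args.full_benchmark:
        return ALL_SHAPES
    return HARNESS_SHAPES   # --benchmark and --correctness share one set
```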

Co-authored-by: Cursor <cursoragent@cursor.com>
The baseline must record BOTH --benchmark (reduced, 20-25 shapes) and
--full-benchmark (all shapes) results. During iterations the agent
compares reduced vs reduced; at the end it compares full vs full.
Mixing modes in a comparison produces meaningless speedup numbers
because the shape sets differ.

Co-authored-by: Cursor <cursoragent@cursor.com>
Fix OOM in profiling: LLM-driven shape extraction from discovery
The uniform index calculation divides by (count-1), which crashes when
count=1. Add early returns for count<=0 (empty) and count==1 (median
shape).

Co-authored-by: Cursor <cursoragent@cursor.com>