Drafting some developer-facing agent skills #1027
- Add SessionStart hook to install Julia for Claude Code on the web
- Update Julia version to 1.12.5 (latest stable)
- Add /ir-inspect skill with SKILL.md and ir_inspect.jl script
- Add tests for the ir-inspect skill alongside the skill directory
Teaching skill that explains Mooncake's AD pipeline, tangent types, BBCode IR, rule system, and Julia compiler prerequisites. Integrates with ir-inspect to offer live demonstrations of each concept.
Replace encyclopedia-style inline summaries with actionable routing: each topic now directs Claude to read specific docs/source files before answering, then run a concrete ir-inspect demo. Add debugging and user-facing API topics.
Point to raw GitHub URLs for Julia's IRCode, BasicBlock, and AbstractInterpreter source so Claude can fetch them on demand.
Detects specialization widening (e.g. typeof(sum) → Function in Vararg), SROA/allocation failures from inlining barriers, and non-inlined small callees. Includes compare mode for side-by-side analysis of slow vs fast call variants. Motivated by issue #1020 where ir-inspect showed identical Mooncake IR but the entry-point had a 13x slowdown. Also adds a Limitations section to ir-inspect pointing users to perf-diagnose when IR looks correct but performance is poor.
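The widening mentioned above follows a documented Julia heuristic: the compiler may avoid specializing on a `Function` argument when a method only passes it through instead of calling it. The following is a minimal sketch adapted from the Julia performance tips, not code from this PR; `pass_through` and `pass_through_forced` are hypothetical names.

```julia
# Hypothetical illustration; these functions are not part of Mooncake.

# `f` is only passed through to `ntuple` and never called here, so Julia may
# compile this method for a generic `Function` rather than for `typeof(f)`.
pass_through(f, n) = ntuple(f, n)

# Adding a type parameter forces specialization on the concrete function type.
pass_through_forced(f::F, n) where {F} = ntuple(f, n)

# Both variants compute the same result; only the specialization differs.
a = pass_through(abs2, 3)
b = pass_through_forced(abs2, 3)
```

The two definitions are interchangeable for callers, which is why this kind of slowdown can be invisible in the IR of the differentiated function itself and only show up at the entry point.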
Framework-agnostic skill that traces performance issues from observable symptoms down to decision points in Julia's compiler source code. Uses Codex for deep source analysis and produces structured root cause reports. Includes:
- Source map routing 6 issue categories to 27 Julia compiler files
- 8-step investigation playbook with Codex collaboration patterns
- MWE templates for each category (specialization, inlining, SROA, etc.)
- Version policy handling pre/post-1.12 compiler restructuring
Make the skill tool-agnostic — it now describes reading Julia source via local clone (Grep/Read) or WebFetch, without assuming any particular external AI tool is available.
Remove content specific to issue #1020 (Vararg+Function specialization):
- source-map.md: replace detailed notcalled_func walkthrough and called bitmask chain with concise category-level file listings and search terms
- investigation-playbook.md: replace hardcoded source chain example with generic format template, add all 6 categories to decision point hints
- mwe-patterns.md: replace Mooncake function name in reduction example
Move IR inspection and performance diagnostics code from .claude/skills/ scripts into src/skill_utils.jl (included as @unstable in Mooncake module). Move compiler-rootcause reference docs to docs/src/developer_documentation/, rewritten for human readers. Move tests to test/ with new "skill_utils" test group. Slim down SKILL.md files to reference src/ and docs/ instead of embedding code and docs inline. Deduplicate _count_allocs by reusing TestUtils.__count_allocs.
No longer needed — Claude Code on web now supports setup scripts directly.
- Consolidate 4 skills into 1 (ir-inspect), remove debug-compiler, mooncake-reference, perf-diagnose, compiler-rootcause
- Consolidate 4 developer docs into 1 (advanced_debugging.md)
- Convert skill_utils.jl into SkillUtils module following TestUtils pattern
- Cut unreliable perf diagnostics code (~850 lines) and CFG/DOT generation
- Remove unused InteractiveUtils and Printf deps
- Fix world_age_info stale detection, show_ir notes display
- Update tests and docs to use Mooncake.SkillUtils
Remove OpaqueClosure/MistyClosure and compiler boundary investigation sections — not actionable debugging guidance. Revert unintended Project.toml formatting changes.
- Add skill_utils to CI matrix (Julia 1.11)
- Remove `nothing # hide` from doc @example blocks
Pull request overview
Adds a new internal developer tool module (Mooncake.SkillUtils) for inspecting/diffing IR across Mooncake’s AD pipeline stages, wires it into docs, tests, and CI, and provides an accompanying Claude skill document.
Changes:
- Introduce `src/skill_utils.jl` with IR inspection, stage rendering, diffs, and world-age reporting utilities.
- Add a dedicated test group (`skill_utils`) and a CI matrix entry to run it.
- Add “Advanced Debugging” docs explaining how to use IR inspection utilities.
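To make "IR across stages" concrete, here is a small sketch that uses only Julia's built-in reflection, not the `SkillUtils` API introduced in this PR. It captures the typed IR of a function before and after Julia's optimization passes and renders both to text. The function `square_sin` is a hypothetical example.

```julia
# Sketch using Base reflection only; this is not the SkillUtils API.
square_sin(x) = sin(x) * x   # hypothetical example function

# Typed IR before Julia's optimization passes have run.
unopt_ci, unopt_rt = only(code_typed(square_sin, (Float64,); optimize=false))

# Typed IR after optimization.
opt_ci, opt_rt = only(code_typed(square_sin, (Float64,); optimize=true))

# Render both stages to strings so they can be compared or written to a file.
unopt_text = sprint(show, unopt_ci)
opt_text = sprint(show, opt_ci)

println("--- unoptimized ---\n", unopt_text)
println("--- optimized ---\n", opt_text)
```

A stage-by-stage inspection tool generalizes this idea by also capturing the intermediate representations that the AD transformation itself produces.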
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| `src/skill_utils.jl` | New internal IR inspection utilities and display/export helpers. |
| `src/Mooncake.jl` | Includes skill_utils.jl behind @unstable. |
| `test/skill_utils.jl` | New test coverage for SkillUtils functionality. |
| `test/runtests.jl` | Adds skill_utils test-group dispatch. |
| `.github/workflows/CI.yml` | Adds a CI job for the new skill_utils test group. |
| `docs/src/developer_documentation/advanced_debugging.md` | New developer-facing documentation for IR/world-age debugging. |
| `docs/make.jl` | Adds the new page to navigation and size-threshold ignore list. |
| `.claude/skills/ir-inspect/SKILL.md` | Adds a Claude Code skill guide for the new IR inspection tooling. |
- Throw ArgumentError for invalid mode values instead of silent fallback
- Document that world kwarg is diagnostic-only, not used for IR generation
- Clarify that optimize/do_inline only affect the final optimization pass
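The first item above, failing loudly on an invalid mode, can be sketched as follows. The helper name `check_mode` and the set of accepted modes are assumptions made for illustration; the PR's actual function names and accepted values are not shown in this conversation.

```julia
# Hypothetical helper; the name and the accepted modes are assumptions.
const VALID_MODES = (:reverse, :forward)

function check_mode(mode::Symbol)
    # Reject unknown modes up front instead of silently picking a default.
    mode in VALID_MODES ||
        throw(ArgumentError("invalid mode $(repr(mode)); expected one of $(VALID_MODES)"))
    return mode
end
```

Throwing early means a typo such as `:foward` surfaces at the call site rather than producing IR for a mode the caller did not ask for.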
I'll try to take a look over the weekend.
yebai left a comment
Looks good overall. I looked at the skill but skipped the code.
One minor suggestion: to avoid the layers of redirection, can we rename `.claude/skills/ir-inspect/SKILL.md` to `.claude/skills/ir_inspect.md`?
> ### Forward mode stages
>
> `:raw` → `:normalized` → `:bbcode` → `:dual_ir` → `:optimized`
Double-check this. IIRC, forward mode only works with IRCode, not BBCode.
> - **`LazyDerivedRule` / `DynamicDerivedRule`**: these handle world age transitions by recompiling on demand.
Double-check this; I'm not sure it's true. `LazyDerivedRule` is only an optimisation and does not try to handle world-age issues, whereas `DynamicDerivedRule` does help.
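For readers unfamiliar with the term, a world-age transition can be illustrated in plain Julia without any Mooncake types. In this sketch (the function `answer` is hypothetical), a method is redefined while a caller is still running: the direct call keeps using the world the caller started in, while `Base.invokelatest` dispatches in the newest world.

```julia
# Generic Julia illustration; no Mooncake types are involved.
answer() = 1   # hypothetical function that gets redefined below

function redefine_and_call()
    world_before = Base.get_world_counter()
    @eval answer() = 2                  # the new method lives in a newer world
    world_after = Base.get_world_counter()
    stale = answer()                    # runs in the caller's (older) world
    fresh = Base.invokelatest(answer)   # dispatches in the latest world
    return (; world_before, world_after, stale, fresh)
end

result = redefine_and_call()
```

Code that caches compiled rules faces the same question of which world a cached rule was built in, which is what the review comment above is probing.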
    elseif test_group == "rules/high_order_derivative_patches"
        include(joinpath("rules", "high_order_derivative_patches.jl"))
    elseif test_group == "skill_utils"
        include("skill_utils.jl")
This can probably be merged into the basic group to reduce the number of CI jobs.
* Fix IR inspection staging
* Fix skill utils world guard
Co-authored-by: Hong Ge <3279477+yebai@users.noreply.github.com>
Signed-off-by: Xianda Sun <5433119+sunxd3@users.noreply.github.com>
This is the general convention, I am afraid.
- Clarify forward-mode BBCode is inspection-only in docs and skill
- Distinguish LazyDerivedRule vs DynamicDerivedRule (static vs dynamic dispatch)
- Fold skill_utils into basic test group, keep standalone selector
- Add io kwarg to write_ir
Import set_valid_world! directly instead of qualifying with Mooncake, which is not in scope in submodules on Julia 1.12. Also remove the standalone skill_utils test group since it's already part of basic.
The function is only defined on 1.12+, so importing it unconditionally breaks precompilation on LTS/1.11.
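The two fixes above, importing by relative path inside a submodule and guarding the import by Julia version, can be sketched with stand-in modules. `Outer` and `Inner` are hypothetical, and the `set_valid_world!` defined here is a placeholder with an invented signature, not Mooncake's function.

```julia
# Hypothetical modules standing in for a package and one of its submodules.
module Outer
    # Placeholder: pretend this helper only exists on Julia 1.12 and later.
    @static if VERSION >= v"1.12-"
        set_valid_world!(world::UInt) = world
    end

    module Inner
        # Import by relative path, and only on versions where the name exists,
        # so that older Julia versions do not fail on an undefined binding.
        @static if VERSION >= v"1.12-"
            import ..set_valid_world!
        end

        # Reports whether the helper is available in this submodule.
        has_helper() = isdefined(@__MODULE__, :set_valid_world!)
    end
end
```

Using `@static` resolves the version check when the code is loaded, so the unavailable branch is never evaluated on older releases.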
* Respect primitive dispatch in inspect_ir
* Format skill_utils helper signature
* Relax extract_meta edge-count assertion
* format
Thanks, @yebai!
These are set up for Claude Code at the moment. For other coding agents, you can just ask the agent to migrate the skills for itself (for Codex CLI and many others, the `.agents` folder is the destination).

Evaluating the usefulness of Skills is still a very actively debated topic, and I don't claim the skills here are critical or in prime state yet. But give them a try; we can iterate fast.
CI Summary — GitHub Actions
Documentation Preview
Mooncake.jl documentation for PR #1027 is available at:
https://chalk-lab.github.io/Mooncake.jl/previews/PR1027/
Performance
Performance Ratio:
Ratio of time to compute gradient and time to compute function.
Warning: results are very approximate!