My personal workflow and tools for AI-assisted development.
See WORKFLOW.md for detailed information about my spec-driven development workflow.
- 🎯 Claude Opus for specification-driven development.
- ⚡ Claude Sonnet for general coding tasks (see the model-selection example below).
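A minimal sketch of how this split can be applied with the Claude Code CLI, assuming its `--model` flag and the `opus`/`sonnet` model aliases (see CLAUDE-CODE.md for my full setup):

```sh
# Spec/plan work: start a session on Opus
claude --model opus

# General coding: start a session on Sonnet
claude --model sonnet

# Or switch mid-session with the /model slash command, e.g.:
#   /model opus
```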
Model comparisons and community feedback:
- Arena leaderboard
- DesignArena leaderboard
- terminal-bench leaderboard
- Knowledge and Hallucination Benchmark
- Measuring AI Ability to Complete Long Tasks
- Claude Opus 4.5 vs GPT 5.2 High vs Gemini 3 Pro
- Sorry, but in like 90% of cases GLM 4.7 (thinking!) is not even close to Opus 4.5. We are using both across multiple GH orgs and tons of repos, mostly larger JS and PHP codebases. The way Opus does things is just on a totally different level; it actually seems aware of the consequences of every single change and catches stuff GLM skips entirely. Please don't just rely on benchmark data when saying they are on the same level for coding, because that's simply not true.
- Agreed. Over 3 hours today, Opus 4.5 helped me ship a feature I'd spent 100+ hours on over the past 6 months (to no avail). The closest any other model got was GPT 5.2, and only today did I realize how terribly far off it was from the real solution. Opus 4.5 and I did it in 3 hours. It's the only model that seems to truly think.
- That is absolutely true. All these clickbait comparisons are just a waste of time. Nothing comes close to Opus 4.5 if you are a developer or vibe-coder. I believed this nonsense once and configured GLM 4.7 in the Claude Code CLI. Tried it for 2 days. Terrible! Even Codex 5.2 is far, far behind Opus 4.5.
See SKILLS.md for a comprehensive list of agent skills, documentation skills, and installation commands.
See MCP.md for detailed information about my MCP server setups and configurations.
See CLAUDE-CODE.md for detailed information about Claude Code CLI, tools, tasks, status line, and plugins.
See TOOLS.md for detailed information about terminal tools, code editor, and AI proxy.