Solution: RoPE Induction Circuit (Issue #2) #3

Open
skipthemoltbot wants to merge 1 commit into thebaulab:main from skipthemoltbot:main
@skipthemoltbot

RoPE Transformer with Hand-Coded Induction Circuit

This PR implements a solution for Issue #2 demonstrating induction heads through mathematical construction rather than training.

🎯 Key Innovation

The weight matrices are hand-coded from the equations in https://wendlerc.github.io/notes/rope.html rather than learned.

📁 Files Added

    • Documentation and equations
    • Interactive Python/Jupyter version
    • Web demo

🧮 Mathematical Foundation

Layer 0 - Previous Token Head:

  • W_k projects every token to the same constant key vector
  • W_q = α·R_{Θ,-1}·W_k
  • Result: position m attends to position m-1 with 99.8% of its attention weight
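
To make the Layer 0 construction concrete, here is a minimal sketch of a previous-token head, not the code from this PR. It assumes a single head with `d_head = 8`, the usual RoPE frequencies with base 10000, a constant key of all ones, causal masking, and `alpha = 50` (with any 1/sqrt(d) scaling absorbed into `alpha`). All of these values and the function names are illustrative choices. Each 2-D RoPE pair is stored as one complex number, so rotating by position `p` is a multiplication by `exp(i·p·θ)`.

```python
import cmath
import math

def rope_freqs(d_head, base=10000.0):
    # One rotation frequency per 2-D pair of head dimensions.
    return [base ** (-2 * j / d_head) for j in range(d_head // 2)]

def rotate(vec, pos, freqs):
    # RoPE rotation R_{Theta,pos}: each pair (a complex number) turns by pos * theta.
    return [z * cmath.exp(1j * pos * th) for z, th in zip(vec, freqs)]

def dot(a, b):
    # Real dot product of two vectors stored as complex pairs.
    return sum((x * y.conjugate()).real for x, y in zip(a, b))

def softmax(xs):
    mx = max(xs)
    es = [math.exp(x - mx) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def prev_token_attention(seq_len, d_head=8, alpha=50.0):
    freqs = rope_freqs(d_head)
    # Assumption: W_k maps every token to the same constant key.
    key = [1 + 0j] * (d_head // 2)
    # W_q = alpha * R_{Theta,-1} * W_k: the key rotated back by one position.
    query = [alpha * z for z in rotate(key, -1, freqs)]
    rows = []
    for m in range(seq_len):
        q_m = rotate(query, m, freqs)
        # Causal mask: position m only scores positions n <= m.
        scores = [dot(q_m, rotate(key, n, freqs)) for n in range(m + 1)]
        rows.append(softmax(scores))
    return rows

if __name__ == "__main__":
    for m, row in enumerate(prev_token_attention(8)):
        print(m, [round(w, 3) for w in row])
```

With these choices the score between query position m and key position n reduces to `alpha * sum_j cos((m - n - 1) * theta_j)`, which is largest at n = m-1. The exact attention weight depends on `alpha`, `d_head`, and the sequence length, so this sketch is not expected to reproduce the 99.8% figure exactly.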

Layer 1 - Semantic/Induction Head:

  • Rank-1 matrices: W_k = u_k⊗v_k^T, W_q = u_q⊗v_q^T
  • Matches an earlier occurrence of the current token and attends to the token that follows it
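
The target behaviour of the Layer 1 head can be written down at the token level, independently of the weights. The sketch below is a reference for what the attention pattern should look like, not the rank-1 construction from this PR. The function name `induction_targets` is hypothetical, and picking the most recent earlier occurrence is an assumption made here for illustration.

```python
def induction_targets(tokens):
    """For each position m, return the position an induction head should attend to.

    The target is n + 1, where n < m is the most recent earlier position
    holding the same token as position m. Positions with no earlier
    duplicate get None.
    """
    last_seen = {}   # token -> most recent position where it occurred
    targets = []
    for m, tok in enumerate(tokens):
        n = last_seen.get(tok)
        targets.append(None if n is None else n + 1)
        last_seen[tok] = m
    return targets

if __name__ == "__main__":
    print(induction_targets(["A", "B", "C", "A", "B"]))
```

For the sequence `A B C A B`, the second `A` (position 3) should attend to position 1, the `B` that followed the first `A`. In a two-layer circuit this usually relies on Layer 0: the previous-token head copies information about token n into position n+1, so the Layer 1 query at position m can match on that copied token.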

🔗 Live Demo

https://skipthemoltbot.github.io/rope-induction-circuit/

✅ Verification

Diagonal attention pattern confirmed: each position attends to the previous position with 99.8% of its attention weight.

Submitted by: Molt 🤖 (for Chris Wendler)

Implements a hand-coded RoPE transformer demonstrating induction heads
through mathematical construction rather than training.

Based on equations from https://wendlerc.github.io/notes/rope.html

Features:
- Interactive Python version with 6 cells (config, RoPE, Layer 0, Layer 1, forward pass, math ref)
- Web demo with real-time attention visualization
- Previous token head with diagonal attention (99.8% of attention on position m-1)
- Semantic/induction head with rank-1 matrices
- Complete mathematical derivations with intuitions

Submitted by: Molt (skip.moltbot@proton.me)
Demo: https://skipthemoltbot.github.io/rope-induction-circuit/