diff --git a/.claude/commands/createpr.md b/.claude/commands/createpr.md
index ad83b1c..ffb999e 100644
--- a/.claude/commands/createpr.md
+++ b/.claude/commands/createpr.md
@@ -3,12 +3,13 @@ Create a pull request using the repository-local PR workflow.
Workflow:
1. Re-read `README.md`, `AGENTS.md`, and the referenced `ai/*.md` rule files.
-2. Review docs consistency across `README.md`, `docs/design/**`, and public rustdoc for the current diff.
-3. Confirm the repository still has auto-merge enabled and the required branch protection checks configured.
-4. Draft a concise PR title and body.
-5. Run `bash scripts/create-pr.sh --ai-tool-name "Claude Code" --ai-tool-url "https://claude.com/claude-code" --title "
" --body-file "$@"`.
-6. If the monitor reports a failed check, inspect that failure immediately. Do not wait for other jobs to finish.
-7. Fix the failure locally, rerun the relevant local verification, push, and resume with `bash scripts/monitor-pr-checks.sh --interval 30` until all required checks pass.
-8. Use `--allow-stale` only when the user explicitly accepts creating a PR with stale or unverified agent assets.
+2. Self-review the diff against all rules in `ai/numerical-rust-rules.md` and `ai/common-agent-rules.md`. Read those files, then check every changed `.rs` file for violations. Fix any violations before proceeding.
+3. Review docs consistency across `README.md`, `docs/design/**`, and public rustdoc for the current diff.
+4. Confirm the repository still has auto-merge enabled and the required branch protection checks configured.
+5. Draft a concise PR title and body.
+6. Run `bash scripts/create-pr.sh --ai-tool-name "Claude Code" --ai-tool-url "https://claude.com/claude-code" --title "" --body-file "$@"`.
+7. If the monitor reports a failed check, inspect that failure immediately. Do not wait for other jobs to finish.
+8. Fix the failure locally, rerun the relevant local verification, push, and resume with `bash scripts/monitor-pr-checks.sh --interval 30` until all required checks pass.
+9. Use `--allow-stale` only when the user explicitly accepts creating a PR with stale or unverified agent assets.
Do not skip the script's verification steps. The script is responsible for freshness checks, formatting, release-mode coverage verification, docs, PR creation, optional auto-merge, and fail-fast PR polling.
diff --git a/AGENTS.md b/AGENTS.md
index 6ac9c49..bc0d728 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -7,6 +7,8 @@ Read `README.md` before starting work.
- Always think/reason in English (set thinking language to English)
- Source code and docs in English
- **Bug fixing**: When a bug is discovered, always check related files for similar bugs and propose to the user to inspect them
+- **Plan-time rule review**: Before creating any new plan, reload and review the full coding ruleset (`README.md`, `AGENTS.md`, and the shared rule files under `ai/`). Do this every time a new plan is written, even within the same session; do not rely on memory of an earlier read.
+- **Planning mindset**: When writing a plan, prefer approaches that leave the code cleaner than before. Avoid ad hoc extensions to the current structure; ask what the design would look like from scratch and steer the implementation toward that end state within the task's scope.
## Context-Efficient Exploration
diff --git a/ai/common-agent-rules.md b/ai/common-agent-rules.md
index 620e096..f26c63a 100644
--- a/ai/common-agent-rules.md
+++ b/ai/common-agent-rules.md
@@ -1,5 +1,24 @@
# Common Agent Rules
+## Software Engineering Best Practices
+
+**Follow established software engineering best practices at all times.** The
+rules in this file are concrete examples and reminders, not an exhaustive list.
+When a situation is not explicitly covered, reason from first principles.
+
+Key principles (illustrative, not exhaustive):
+
+- **KISS** — write the simplest code that correctly solves the problem. Do not add abstraction or generality until a second concrete use case requires it.
+- **DRY / Single Source of Truth** — every piece of knowledge has one authoritative representation. When logic, data, or structure appears in more than one place, extract it. This applies to production code, tests, and documentation equally.
+- **Abstraction** — introduce an abstraction when it genuinely hides complexity or enables reuse. Do not abstract speculatively. Once introduced, maintain it: callers must not reach through it to touch internals.
+- **Layering** — each crate exposes a deliberate public API. Downstream crates use high-level APIs; they must not bypass layers to access internals. If a detail is needed, evolve the upstream API instead of leaking it.
+- **Separation of Concerns** — each module, file, and function has one clear responsibility.
+- **Fix root causes, not symptoms** — prefer fundamental redesign over ad-hoc patches. When a proper fix requires changing more code, do it.
+- **Design toward a cleaner end state** — when planning a change, prefer an approach that leaves the codebase clearer than it was before. Do not just bolt new behavior onto the current structure because it is already there; step back, ask how you would design that area from scratch, and move the implementation toward that shape within the task's scope.
+- **No speculative backward compatibility** — do not preserve old interfaces or add shims unless the user explicitly asks. Clean up call sites instead.
+
+When in doubt, ask: *"Would an experienced software engineer consider this clean, maintainable, and easy to reason about?"* If not, simplify.
+
## General
- Think and write in English.
@@ -9,6 +28,7 @@
## Startup Context
- At session start, read `README.md`, `AGENTS.md`, and the shared rule files under `ai/`.
+- Before creating any new plan, reload and review the full coding ruleset: `README.md`, `AGENTS.md`, and the shared rule files under `ai/`. Do this every time a new plan is written, even within the same session; do not rely on memory of an earlier read.
- If the repository contains local AI workflow files, inspect them before acting. This includes:
- repo-local skill files such as `ai/**/SKILL.md`
- project-local command docs such as `.claude/commands/*.md`
@@ -47,12 +67,6 @@
Keep source files small and focused. Split by behavior or abstraction boundary, not by arbitrary line count.
-Benefits:
-
-- smaller public/private surfaces to review
-- easier parallel editing
-- faster navigation for humans and AI
-
## Unit Test Organization
- Keep inline `#[cfg(test)]` blocks only in genuinely tiny leaf modules.
@@ -72,9 +86,3 @@ Benefits:
- Use `[workspace.dependencies]` for dependencies shared across crates in the same workspace.
- Do not commit sibling local `path` dependencies for repositories meant to build in CI.
- Prefer reproducible sources for cross-repository dependencies.
-
-## Layering
-
-- Keep public APIs small and deliberate.
-- Downstream crates should use high-level APIs rather than reaching into lower-level internals.
-- Tests should follow the same layering rules as production code.
diff --git a/ai/numerical-rust-rules.md b/ai/numerical-rust-rules.md
index e215892..f324235 100644
--- a/ai/numerical-rust-rules.md
+++ b/ai/numerical-rust-rules.md
@@ -1,5 +1,8 @@
# Numerical Rust Rules
+Domain-specific elaborations of `common-agent-rules.md`. The examples below
+are illustrative — apply the same reasoning to any new pattern.
+
## Testing
- Use small deterministic inputs for correctness tests.
@@ -9,19 +12,113 @@
## Generic Test Patterns
-- Share test logic across scalar types with generic helpers or macros instead of duplicating bodies.
+Test logic should be written once and instantiated for each scalar type, not
+copy-pasted (DRY).
+
+- Write test functions generic over `T: Scalar` and instantiate with a macro:
+ ```rust
+ fn test_foo_generic() { ... }
+ macro_rules! scalar_tests {
+ ($name:ident, $fn:ident) => { paste::paste! {
+ #[test] fn [<$name _f64>]() { $fn::(); }
+ #[test] fn [<$name _c64>]() { $fn::(); }
+ }};
+ }
+ scalar_tests!(test_foo, test_foo_generic);
+ ```
- Centralize approximate equality helpers so tolerance policy lives in one place.
- Parameterize over algorithms when the validation logic is the same.
## Performance Habits
-- Avoid duplicate typed implementations when generic code or a macro is enough.
- Avoid allocating dense temporary buffers when the operation can work over strided or borrowed views.
- Avoid zero-filling buffers that are immediately overwritten.
- Avoid repeated index multiplication inside hot loops when incremental offsets suffice.
- Avoid allocations inside hot loops.
- Precompute plans and reusable metadata outside execution loops.
+## Backend and Scalar Type Genericity
+
+Every time you write a function, ask: "Can I replace this concrete type with a
+type parameter and a trait bound?" If yes, do it.
+
+### No TypeId / type_name dispatch in library code
+
+Never use `std::any::TypeId` or `std::any::type_name` to branch on scalar or
+backend types in production code. Use trait dispatch instead.
+
+**Bad:**
+```rust
+if TypeId::of::() == TypeId::of::() { ... }
+if type_name::() == type_name::() { ... } // also unstable
+```
+
+**Good — trait with associated constant:**
+```rust
+trait HasCDtype { const C_TYPE: CDataType; }
+impl HasCDtype for f64 { const C_TYPE: CDataType = C_F64; }
+fn get_dtype() -> CDataType { T::C_TYPE }
+```
+
+If a closed set of types must be mapped to external C constants and a trait is
+truly impractical, isolate the TypeId dispatch in one private function and
+document why. No new TypeId checks elsewhere.
+
+### No hardcoded concrete backend types in APIs
+
+**Bad:**
+```rust
+pub(crate) type ActiveBackend = CpuBackend;
+pub fn run(ctx: &mut CpuContext, ...) { ... }
+```
+
+**Good:**
+```rust
+pub fn run(ctx: &mut B::Context, ...) { ... }
+```
+
+### Scalar type conversions: use From/Into or T::from_xxx
+
+Never embed type names in conversion function names (`from_f64`, `to_c64`,
+`dense_f64_to_tensor`). Use standard traits or trait methods instead.
+
+**Bad:**
+```rust
+fn dense_f64_to_tensor(s: &Storage) -> Tensor { ... }
+fn dense_c64_to_tensor(s: &Storage) -> Tensor { ... }
+```
+
+**Good:**
+```rust
+fn storage_to_tensor(s: &Storage) -> Tensor { ... }
+// T::from_real(x) — nalgebra ComplexField
+// T::from_f64(x) — num_traits FromPrimitive
+// Complex64::from(1.0_f64) — std From
+```
+
+### Deduplicate scalar-type impls with macros
+
+When the same impl block, match arm, or function body repeats for multiple
+scalar types with only the type name changed, use a generic helper or a
+declarative macro.
+
+**Bad — repeated match arms:**
+```rust
+match storage {
+ Storage::DenseF64(d) => d.iter().map(|x| x.abs()).sum::(),
+ Storage::DenseF32(d) => d.iter().map(|x| x.abs()).sum::(),
+ ...
+}
+```
+
+**Good:**
+```rust
+fn sum_abs(slice: &[T]) -> f64 { slice.iter().map(|x| x.abs_real()).sum() }
+```
+
+For C-API boilerplate that cannot be made generic, use `macro_rules!` with
+`paste::paste!` to generate repeated struct and function names.
+
## Numerical API Design
- Keep examples small enough to read in rustdoc.
diff --git a/ai/pr-workflow-rules.md b/ai/pr-workflow-rules.md
index 19b1d97..ae0db73 100644
--- a/ai/pr-workflow-rules.md
+++ b/ai/pr-workflow-rules.md
@@ -7,6 +7,11 @@
## Required Checks
+Before pushing or creating a PR, reload and review the full coding ruleset:
+`README.md`, `AGENTS.md`, and the shared rule files under `ai/`. Do this every
+time you prepare a PR, even within the same session; do not rely on memory of
+an earlier read.
+
Before pushing or creating a PR, all of these must pass:
```bash