diff --git a/.claude/commands/createpr.md b/.claude/commands/createpr.md
index ad83b1c..ffb999e 100644
--- a/.claude/commands/createpr.md
+++ b/.claude/commands/createpr.md
@@ -3,12 +3,13 @@ Create a pull request using the repository-local PR workflow.
 Workflow:
 
 1. Re-read `README.md`, `AGENTS.md`, and the referenced `ai/*.md` rule files.
-2. Review docs consistency across `README.md`, `docs/design/**`, and public rustdoc for the current diff.
-3. Confirm the repository still has auto-merge enabled and the required branch protection checks configured.
-4. Draft a concise PR title and body.
-5. Run `bash scripts/create-pr.sh --ai-tool-name "Claude Code" --ai-tool-url "https://claude.com/claude-code" --title "<title>" --body-file <temp-file> "$@"`.
-6. If the monitor reports a failed check, inspect that failure immediately. Do not wait for other jobs to finish.
-7. Fix the failure locally, rerun the relevant local verification, push, and resume with `bash scripts/monitor-pr-checks.sh <pr-url-or-number> --interval 30` until all required checks pass.
-8. Use `--allow-stale` only when the user explicitly accepts creating a PR with stale or unverified agent assets.
+2. Self-review the diff against all rules in `ai/numerical-rust-rules.md` and `ai/common-agent-rules.md`. Read those files, then check every changed `.rs` file for violations. Fix any violations before proceeding.
+3. Review docs consistency across `README.md`, `docs/design/**`, and public rustdoc for the current diff.
+4. Confirm the repository still has auto-merge enabled and the required branch protection checks configured.
+5. Draft a concise PR title and body.
+6. Run `bash scripts/create-pr.sh --ai-tool-name "Claude Code" --ai-tool-url "https://claude.com/claude-code" --title "<title>" --body-file <temp-file> "$@"`.
+7. If the monitor reports a failed check, inspect that failure immediately. Do not wait for other jobs to finish.
+8. Fix the failure locally, rerun the relevant local verification, push, and resume with `bash scripts/monitor-pr-checks.sh <pr-url-or-number> --interval 30` until all required checks pass.
+9. Use `--allow-stale` only when the user explicitly accepts creating a PR with stale or unverified agent assets.
 
 Do not skip the script's verification steps. The script is responsible for freshness checks, formatting, release-mode coverage verification, docs, PR creation, optional auto-merge, and fail-fast PR polling.
diff --git a/AGENTS.md b/AGENTS.md
index 6ac9c49..bc0d728 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -7,6 +7,8 @@ Read `README.md` before starting work.
 - Always think/reason in English (set thinking language to English)
 - Source code and docs in English
 - **Bug fixing**: When a bug is discovered, always check related files for similar bugs and propose to the user to inspect them
+- **Plan-time rule review**: Before creating any new plan, reload and review the full coding ruleset (`README.md`, `AGENTS.md`, and the shared rule files under `ai/`). Do this every time a new plan is written, even within the same session; do not rely on memory of an earlier read.
+- **Planning mindset**: When writing a plan, prefer approaches that leave the code cleaner than before. Avoid ad hoc extensions to the current structure; ask what the design would look like from scratch and steer the implementation toward that end state within the task's scope.
 
 ## Context-Efficient Exploration
 
diff --git a/ai/common-agent-rules.md b/ai/common-agent-rules.md
index 620e096..f26c63a 100644
--- a/ai/common-agent-rules.md
+++ b/ai/common-agent-rules.md
@@ -1,5 +1,24 @@
 # Common Agent Rules
 
+## Software Engineering Best Practices
+
+**Follow established software engineering best practices at all times.** The
+rules in this file are concrete examples and reminders, not an exhaustive list.
+When a situation is not explicitly covered, reason from first principles.
+
+Key principles (illustrative, not exhaustive):
+
+- **KISS** — write the simplest code that correctly solves the problem. Do not add abstraction or generality until a second concrete use case requires it.
+- **DRY / Single Source of Truth** — every piece of knowledge has one authoritative representation. When logic, data, or structure appears in more than one place, extract it. This applies to production code, tests, and documentation equally.
+- **Abstraction** — introduce an abstraction when it genuinely hides complexity or enables reuse. Do not abstract speculatively. Once introduced, maintain it: callers must not reach through it to touch internals.
+- **Layering** — each crate exposes a deliberate public API. Downstream crates use high-level APIs; they must not bypass layers to access internals. If a detail is needed, evolve the upstream API instead of leaking it.
+- **Separation of Concerns** — each module, file, and function has one clear responsibility.
+- **Fix root causes, not symptoms** — prefer fundamental redesign over ad-hoc patches. When a proper fix requires changing more code, do it.
+- **Design toward a cleaner end state** — when planning a change, prefer an approach that leaves the codebase clearer than it was before. Do not just bolt new behavior onto the current structure because it is already there; step back, ask how you would design that area from scratch, and move the implementation toward that shape within the task's scope.
+- **No speculative backward compatibility** — do not preserve old interfaces or add shims unless the user explicitly asks. Clean up call sites instead.
+
+When in doubt, ask: *"Would an experienced software engineer consider this clean, maintainable, and easy to reason about?"* If not, simplify.
+
 ## General
 
 - Think and write in English.
@@ -9,6 +28,7 @@
 ## Startup Context
 
 - At session start, read `README.md`, `AGENTS.md`, and the shared rule files under `ai/`.
+- Before creating any new plan, reload and review the full coding ruleset: `README.md`, `AGENTS.md`, and the shared rule files under `ai/`. Do this every time a new plan is written, even within the same session; do not rely on memory of an earlier read.
 - If the repository contains local AI workflow files, inspect them before acting. This includes:
   - repo-local skill files such as `ai/**/SKILL.md`
   - project-local command docs such as `.claude/commands/*.md`
@@ -47,12 +67,6 @@
 
 Keep source files small and focused. Split by behavior or abstraction boundary, not by arbitrary line count.
 
-Benefits:
-
-- smaller public/private surfaces to review
-- easier parallel editing
-- faster navigation for humans and AI
-
 ## Unit Test Organization
 
 - Keep inline `#[cfg(test)]` blocks only in genuinely tiny leaf modules.
@@ -72,9 +86,3 @@ Benefits:
 - Use `[workspace.dependencies]` for dependencies shared across crates in the same workspace.
 - Do not commit sibling local `path` dependencies for repositories meant to build in CI.
 - Prefer reproducible sources for cross-repository dependencies.
-
-## Layering
-
-- Keep public APIs small and deliberate.
-- Downstream crates should use high-level APIs rather than reaching into lower-level internals.
-- Tests should follow the same layering rules as production code.
diff --git a/ai/numerical-rust-rules.md b/ai/numerical-rust-rules.md
index e215892..f324235 100644
--- a/ai/numerical-rust-rules.md
+++ b/ai/numerical-rust-rules.md
@@ -1,5 +1,8 @@
 # Numerical Rust Rules
 
+Domain-specific elaborations of `common-agent-rules.md`. The examples below
+are illustrative — apply the same reasoning to any new pattern.
+
 ## Testing
 
 - Use small deterministic inputs for correctness tests.
@@ -9,19 +12,113 @@
 
 ## Generic Test Patterns
 
-- Share test logic across scalar types with generic helpers or macros instead of duplicating bodies.
+Test logic should be written once and instantiated for each scalar type, not
+copy-pasted (DRY).
+
+- Write test functions generic over `T: Scalar` and instantiate with a macro:
+  ```rust
+  fn test_foo_generic<T: Scalar>() { ... }
+  macro_rules! scalar_tests {
+      ($name:ident, $fn:ident) => { paste::paste! {
+          #[test] fn [<$name _f64>]() { $fn::<f64>(); }
+          #[test] fn [<$name _c64>]() { $fn::<Complex64>(); }
+      }};
+  }
+  scalar_tests!(test_foo, test_foo_generic);
+  ```
 - Centralize approximate equality helpers so tolerance policy lives in one place.
 - Parameterize over algorithms when the validation logic is the same.
 
 ## Performance Habits
 
-- Avoid duplicate typed implementations when generic code or a macro is enough.
 - Avoid allocating dense temporary buffers when the operation can work over strided or borrowed views.
 - Avoid zero-filling buffers that are immediately overwritten.
 - Avoid repeated index multiplication inside hot loops when incremental offsets suffice.
 - Avoid allocations inside hot loops.
 - Precompute plans and reusable metadata outside execution loops.
 
+## Backend and Scalar Type Genericity
+
+Every time you write a function, ask: "Can I replace this concrete type with a
+type parameter and a trait bound?" If yes, do it.
+
+### No TypeId / type_name dispatch in library code
+
+Never use `std::any::TypeId` or `std::any::type_name` to branch on scalar or
+backend types in production code. Use trait dispatch instead.
+
+**Bad:**
+```rust
+if TypeId::of::<T>() == TypeId::of::<f64>() { ... }
+if type_name::<B>() == type_name::<CpuBackend>() { ... }  // also unstable
+```
+
+**Good — trait with associated constant:**
+```rust
+trait HasCDtype { const C_TYPE: CDataType; }
+impl HasCDtype for f64 { const C_TYPE: CDataType = C_F64; }
+fn get_dtype<T: HasCDtype>() -> CDataType { T::C_TYPE }
+```
+
+If a closed set of types must be mapped to external C constants and a trait is
+truly impractical, isolate the TypeId dispatch in one private function and
+document why. No new TypeId checks elsewhere.
+
+### No hardcoded concrete backend types in APIs
+
+**Bad:**
+```rust
+pub(crate) type ActiveBackend = CpuBackend;
+pub fn run(ctx: &mut CpuContext, ...) { ... }
+```
+
+**Good:**
+```rust
+pub fn run<B: TensorBackend>(ctx: &mut B::Context, ...) { ... }
+```
+
+### Scalar type conversions: use From/Into or T::from_xxx
+
+Never embed type names in conversion function names (`from_f64`, `to_c64`,
+`dense_f64_to_tensor`). Use standard traits or trait methods instead.
+
+**Bad:**
+```rust
+fn dense_f64_to_tensor(s: &Storage) -> Tensor<f64> { ... }
+fn dense_c64_to_tensor(s: &Storage) -> Tensor<Complex64> { ... }
+```
+
+**Good:**
+```rust
+fn storage_to_tensor<T: Scalar>(s: &Storage) -> Tensor<T> { ... }
+// T::from_real(x)  — nalgebra ComplexField
+// T::from_f64(x)   — num_traits FromPrimitive
+// Complex64::from(1.0_f64)  — std From
+```
+
+### Deduplicate scalar-type impls with macros
+
+When the same impl block, match arm, or function body repeats for multiple
+scalar types with only the type name changed, use a generic helper or a
+declarative macro.
+
+**Bad — repeated match arms:**
+```rust
+match storage {
+    Storage::DenseF64(d) => d.iter().map(|x| x.abs()).sum::<f64>(),
+    Storage::DenseF32(d) => d.iter().map(|x| x.abs()).sum::<f64>(),
+    ...
+}
+```
+
+**Good:**
+```rust
+fn sum_abs<T: Scalar>(slice: &[T]) -> f64 { slice.iter().map(|x| x.abs_real()).sum() }
+```
+
+For C-API boilerplate that cannot be made generic, use `macro_rules!` with
+`paste::paste!` to generate repeated struct and function names.
+
 ## Numerical API Design
 
 - Keep examples small enough to read in rustdoc.
diff --git a/ai/pr-workflow-rules.md b/ai/pr-workflow-rules.md
index 19b1d97..ae0db73 100644
--- a/ai/pr-workflow-rules.md
+++ b/ai/pr-workflow-rules.md
@@ -7,6 +7,11 @@
 
 ## Required Checks
 
+Before pushing or creating a PR, reload and review the full coding ruleset:
+`README.md`, `AGENTS.md`, and the shared rule files under `ai/`. Do this every
+time you prepare a PR, even within the same session; do not rely on memory of
+an earlier read.
+
 Before pushing or creating a PR, all of these must pass:
 
 ```bash