6 changes: 6 additions & 0 deletions .gitignore
Expand Up @@ -4,5 +4,11 @@
# Personal coaching data — contains stories, scores, interview targets, and concerns
coaching_state.md

# Personal interview materials (resume, transcripts, company folders)
materials/

# Writing style reference (personal)
references/writing-style.md

# Active skill file (created by copying SKILL.md during setup)
CLAUDE.md
444 changes: 24 additions & 420 deletions SKILL.md

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions references/archival-rules.md
@@ -0,0 +1,21 @@
# Archival Rules

## Score History Archival

When Score History exceeds 15 rows, summarize the oldest entries into a Historical Summary narrative and keep only the most recent 10 rows as individual entries. The summary should preserve: trend direction per dimension, inflection points (what caused jumps or drops), and which coaching changes triggered shifts. Run this check during `progress` or at session start when the file is large. Apply the same archival pattern to the Session Log when it exceeds 15 rows: compress old sessions into a brief narrative and keep recent ones detailed. The goal is to keep the file readable and within reasonable context limits over months-long coaching engagements.
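If this archival pass were ever automated, the row-count rule reduces to a simple split. A minimal sketch, assuming rows are stored oldest-first; `summarize` is a hypothetical stand-in for the narrative summarization step, not part of the schema:

```python
ARCHIVE_THRESHOLD = 15  # archive once the table exceeds this many rows
KEEP_RECENT = 10        # rows kept as individual entries after archival

def archive_score_history(rows, summarize):
    """Split score-history rows per the archival rule.

    rows: list of row dicts, oldest first.
    summarize: callable turning the oldest rows into a Historical
        Summary narrative (hypothetical hook for the coach's summary).
    Returns (summary_text_or_None, recent_rows).
    """
    if len(rows) <= ARCHIVE_THRESHOLD:
        return None, rows  # under threshold: nothing to archive
    old, recent = rows[:-KEEP_RECENT], rows[-KEEP_RECENT:]
    return summarize(old), recent

# Example: 18 rows exceeds the threshold of 15, so the oldest 8
# are summarized and the most recent 10 are kept as-is.
rows = [{"date": f"2024-01-{d:02d}"} for d in range(1, 19)]
summary, recent = archive_score_history(
    rows, lambda old: f"{len(old)} older sessions summarized"
)
```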

## Interview Intelligence Archival Thresholds

Check during `progress` or session start:

- Question Bank: exceeds 30 rows → summarize questions older than 3 months into the Historical Intelligence Summary, keep the 20 most recent
- Effective/Ineffective Patterns: exceeds 10 entries → consolidate to 3-5 summary patterns in the Historical Intelligence Summary
- Recruiter/Interviewer Feedback: exceeds 15 rows → summarize older feedback into Company Patterns, keep the 10 most recent
- Company Patterns for closed loops (Status: Archived or Closed) → compress to 2-3 lines
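The thresholds above can be captured as plain data so the session-start check lives in one place. A sketch; the section keys and helper name are illustrative, not part of the schema:

```python
# Illustrative threshold table for the Interview Intelligence check.
# Counts mirror the bullets above; section keys are informal labels.
INTEL_THRESHOLDS = {
    "question_bank":      {"max_rows": 30, "keep_recent": 20},
    "pattern_entries":    {"max_rows": 10, "keep_recent": 5},  # consolidate to 3-5
    "recruiter_feedback": {"max_rows": 15, "keep_recent": 10},
}

def needs_archival(section, row_count):
    """True when a section has exceeded its archival threshold."""
    return row_count > INTEL_THRESHOLDS[section]["max_rows"]
```

A Question Bank with 31 rows trips the check; one with exactly 30 does not.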

## JD Analysis Archival Thresholds

Check during `progress` or session start:

- When JD Analysis sections exceed 10 entries, archive analyses for roles the candidate chose not to pursue (no corresponding Interview Loop entry, or Loop status is Closed/Archived). Compress archived analyses into a `Past JD Analyses` summary section preserving only: company, role, fit verdict, date. Keep full analyses only for active/recent decodes.
- Presentation Prep sections for completed interview rounds (corresponding Interview Loop round is past) can be compressed to 1-2 lines preserving: topic, framework used, key adjustment. Full sections only needed for upcoming or active presentations.
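The JD pruning rule above is likewise a simple partition. A sketch under stated assumptions: each analysis records its company, and the set of companies with active Interview Loops is known; all field and function names here are hypothetical:

```python
ARCHIVE_AFTER = 10  # prune only once JD Analysis sections exceed 10 entries

def split_jd_analyses(analyses, active_loop_companies):
    """Partition analyses into (kept_full, archived) per the rule above.

    An analysis is archived when the candidate did not pursue the role,
    i.e. it has no corresponding active Interview Loop entry. Archived
    entries keep only company, role, fit verdict, and date.
    """
    if len(analyses) <= ARCHIVE_AFTER:
        return analyses, []  # below threshold: keep everything in full
    kept, archived = [], []
    for a in analyses:
        if a["company"] in active_loop_companies:
            kept.append(a)
        else:
            archived.append(
                {k: a[k] for k in ("company", "role", "fit_verdict", "date")}
            )
    return kept, archived
```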
237 changes: 237 additions & 0 deletions references/coaching-state-schema.md
@@ -0,0 +1,237 @@
# coaching_state.md Schema

This is the full template for `coaching_state.md`. Use this when creating a new coaching state (during `kickoff`) or when migrating/validating an existing one.

```markdown
# Coaching State — [Name]
Last updated: [date]

## Profile
- Target role(s):
- Seniority band:
- Track: Quick Prep / Full System
- Feedback directness: [1-5]
- Interview timeline: [date or "ongoing"]
- Time-aware coaching mode: [triage / focused / full]
- Interview history: [first-time / active but not advancing / experienced but rusty]
- Biggest concern:
- Known interview formats: [e.g., "behavioral screen, system design (verbal walkthrough)" — updated by Format Discovery Protocol during prep/mock]
- Anxiety profile: [confident-underprepared / anxious-specific / generalized / post-rejection / impostor — set by hype, reused in subsequent sessions]
- Career transition: [none / function change / domain shift / IC↔management / industry pivot / career restart — set by kickoff]
- Transition narrative status: [not started / in progress / solid — set by kickoff, updated by pitch/stories]

## Resume Analysis
- Positioning strengths: [the 2-3 signals a hiring manager sees in 30 seconds]
- Likely interviewer concerns: [flagged from resume — gaps, short tenures, domain switches, seniority mismatches]
- Career narrative gaps: [transitions that need a story ready]
- Story seeds: [resume bullets with likely rich stories behind them]

## Storybank
| ID | Title | Primary Skill | Secondary Skill | Earned Secret | Strength | Use Count | Last Used |
|----|-------|---------------|-----------------|---------------|----------|-----------|-----------|
[rows — compact index. Use Count tracks total times used in real interviews (incremented via debrief). Full column spec in references/storybank-guide.md — the guide adds Impact, Domain, Risk/Stakes, and Notes. Add extra columns as stories are enriched.]

### Story Details
#### S001 — [Title]
- Situation:
- Task:
- Action:
- Result:
- Earned Secret:
- Deploy for: [one-line use case — e.g., "leadership under ambiguity questions"]
- Version history: [date — what changed]

[repeat for each story]

## Score History
### Historical Summary (when table exceeds 15 rows, summarize older entries here)
[Narrated trend summary of older sessions — direction per dimension, inflection points, what caused shifts]

### Recent Scores
| Date | Type | Context | Sub | Str | Rel | Cred | Diff | Hire Signal | Self-Δ |
|------|------|---------|-----|-----|-----|------|------|-------------|--------|
[rows — Type: interview/practice/mock. Sub=Substance, Str=Structure, Rel=Relevance, Cred=Credibility, Diff=Differentiation — each 1-5 numeric. Hire Signal: Strong Hire/Hire/Mixed/No Hire (from analyze/mock only — leave blank for practice). Self-Δ: over/under/accurate (>0.5 delta from coach scores = over or under; within 0.5 = accurate). Keep most recent 10-15 rows.]

## Outcome Log
| Date | Company | Role | Round | Result | Notes |
|------|---------|------|-------|--------|-------|
[rows — Result: advanced/rejected/pending/offer/withdrawn]

## Interview Intelligence

### Question Bank
| Date | Company | Role | Round Type | Question | Competency | Score | Outcome |
|------|---------|------|------------|----------|------------|-------|---------|
[Round Type: behavioral/technical/system-design/case-study/bar-raiser/culture-fit.
Score: average across 5 dims (e.g., 3.4), or "recall-only" for debrief-captured questions.
Outcome: advanced/rejected/pending/unknown — updated when known.]

### Effective Patterns (what works for this candidate)
- [date]: [pattern + evidence — e.g., "Leading with counterintuitive choice in prioritization stories scores 4+ on Differentiation (CompanyA R1, CompanyB R2)"]

### Ineffective Patterns (what keeps not working)
- [date]: [pattern + evidence — e.g., "Billing migration story has scored below 3 on Differentiation across 3 uses. Retire or rework."]

### Recruiter/Interviewer Feedback
| Date | Company | Source | Feedback | Linked Dimension |
|------|---------|--------|----------|------------------|
[Source: recruiter/interviewer/hiring-manager. Keep verbatim when possible.]

### Company Patterns (learned from real experience)
#### [Company Name]
- Questions observed: [types and frequency]
- What seems to matter: [observations from real data]
- Stories that landed / didn't: [S### IDs]
- Last updated: [date]

### Historical Intelligence Summary
[Narrated summary when subsections exceed archival thresholds]

## Drill Progression
- Current stage: [1-8]
- Gates passed: [list]
- Revisit queue: [weaknesses to resurface]

## Interview Loops (active)
### [Company Name]
- Status: [Decoded / Researched / Applied / Interviewing / Offer / Closed]
- Rounds completed: [list with dates]
- Round formats:
- Round 1: [format, duration, interviewer type — e.g., "Behavioral screen, 45min, recruiter"]
- Round 2: [format, duration, interviewer type]
- Stories used: [S### per round]
- Concerns surfaced: [ranked list from `concerns` — severity + counter strategy, or from analyze/rejection feedback]
- Interviewer intel: [LinkedIn URLs + key insights, linked to rounds]
- Prepared questions: [top 3 from `questions` if run]
- Next round: [date, format if known]
- Fit verdict: [from research or prep — Strong / Investable Stretch / Long-Shot Stretch / Weak]
- Fit confidence: [Limited — no JD / Medium — JD + resume / High — JD + resume + storybank]
- Fit signals: [1-2 lines on what drove the verdict]
- Structural gaps: [gaps that can't be bridged with narrative, if any]
- Date researched: [date, if `research` was run]

## Active Coaching Strategy
- Primary bottleneck: [dimension]
- Current approach: [what we're working on and how]
- Rationale: [why this approach — links to decision tree / data]
- Pivot if: [conditions that would trigger a strategy change]
- Root causes detected: [list]
- Self-assessment tendency: [over-rater / under-rater / well-calibrated]
- Previous approaches: [list of abandoned strategies with brief reason — e.g., "Structure drills — ceiling at 3.5, diminishing returns"]

## Calibration State

### Calibration Status
- Current calibration: [uncalibrated / calibrating / calibrated / miscalibrated]
- Last calibration check: [date]
- Data points available: [N] real interviews with outcomes

### Scoring Drift Log
| Date | Dimension | Direction | Evidence | Adjustment |
|------|-----------|-----------|----------|------------|

### Calibration Adjustments
| Date | Trigger | What Changed | Rationale |
|------|---------|--------------|-----------|

### Cross-Dimension Root Causes (active)
| Root Cause | Affected Dimensions | First Detected | Status | Treatment |
|------------|---------------------|----------------|--------|-----------|

### Unmeasured Factor Investigations
| Date | Trigger | Hypothesis | Investigation | Finding | Action |
|------|---------|------------|---------------|---------|--------|

## LinkedIn Analysis
- Date: [date]
- Depth: [Quick Audit / Standard / Deep Optimization]
- Overall: [Strong / Needs Work / Weak]
- Recruiter discoverability: [Strong / Moderate / Weak]
- Credibility on visit: [Strong / Moderate / Weak]
- Differentiation: [Strong / Moderate / Weak]
- Top fixes pending: [1-3 line items]
- Positioning gaps: [resume ↔ LinkedIn inconsistencies, if assessed]

## Resume Optimization
- Date: [date]
- Depth: [Quick Audit / Standard / Deep Optimization]
- Overall: [Strong / Needs Work / Weak]
- ATS compatibility: [ATS-Ready / ATS-Risky / ATS-Broken]
- Recruiter scan: [Strong / Moderate / Weak]
- Bullet quality: [Strong / Moderate / Weak]
- Seniority calibration: [Aligned / Mismatched]
- Keyword coverage: [Strong / Moderate / Weak]
- Top fixes pending: [1-3 line items]
- JD-targeted: [yes — which JD / no]
- Cross-surface gaps: [resume ↔ LinkedIn inconsistencies, if assessed]

## Positioning Statement
- Date: [date]
- Depth: [Quick Draft / Standard / Deep Positioning]
- Core statement: [the full hook + context + bridge — 30-45 second version]
- Hook (10s): [the curiosity-gap opener alone]
- Key differentiator: [one sentence]
- Earned secret anchor: [the earned secret or spiky POV powering the positioning]
- Target audience: [primary audience this was optimized for]
- Variant status: [which variants were produced]
- Consistency status: [aligned / gaps identified — brief summary]

## Outreach Strategy
- Date: [date]
- Depth: [Quick / Standard / Deep]
- Positioning source: [Positioning Statement / Resume Analysis fallback]
- Message types coached: [list]
- Targets contacted: [people/companies]
- Channel strategy: [primary channels]
- Follow-up status: [pending follow-ups with timing]
- LinkedIn profile flagged: [yes/no]
- Key hooks identified: [1-2 reusable positioning hooks]

## JD Analysis: [Company] — [Role]
- Date: [date]
- Depth: [Quick Scan / Standard / Deep Decode]
- Fit verdict: [Strong Fit / Investable Stretch / Long-Shot Stretch / Weak Fit]
- Top competencies: [top 3 in priority order]
- Frameable gaps: [list]
- Structural gaps: [list]
- Unverified assumptions: [count of LOW/UNKNOWN items]
- Batch triage rank: [rank/total, if applicable]

[Multiple JD Analysis sections can exist — one per company+role]

### Past JD Analyses (archived — when 10+ analyses exist, non-active decodes compress here)
| Date | Company | Role | Fit Verdict |
|------|---------|------|-------------|
[rows — brief archive of decoded JDs the candidate didn't pursue]

## Presentation Prep: [Topic / Company]
- Date: [date]
- Depth: [Quick Structure / Standard / Deep Prep]
- Framework: [selected narrative arc]
- Time target: [X min presentation + Y min Q&A]
- Content status: [outline only / full content / talk track reviewed]
- Top predicted questions: [top 3]
- Key adjustment: [single biggest change recommended]

## Comp Strategy
- Date: [date]
- Depth: [Quick Script / Standard / Deep Strategy]
- Target range: [bottom / target / stretch — or "not yet researched"]
- Range basis: [sources used]
- Research completeness: [none / partial / thorough]
- Stage coached: [application / recruiter screen / mid-process / general]
- Jurisdiction notes: [relevant info, if applicable]
- Scripts provided: [which stages covered]
- Key principle: [the most important takeaway]

## Meta-Check Log
| Session | Candidate Feedback | Adjustment Made |
|---------|-------------------|-----------------|
[rows — record every meta-check response and any coaching adjustment]

## Session Log
### Historical Summary (when log exceeds 15 rows, summarize older entries here)
[Brief narrative of earlier sessions]

### Recent Sessions
| Date | Commands Run | Key Outcomes |
|------|-------------|--------------|
[rows — brief, 1-line per session]

## Coaching Notes
[Freeform observations that don't fit structured fields — things the coach should remember between sessions]
- [date]: [observation — e.g., "candidate freezes in panel formats," "gets defensive about short tenure at X," "prefers morning interviews," "mentioned they interview better after coffee"]
```
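The Self-Δ rule in the Score History spec (over- or under-rating when the candidate's self-score differs from the coach's by more than 0.5, accurate within 0.5) reduces to one comparison. A minimal sketch; the function name is illustrative:

```python
def self_delta(self_score, coach_score, tolerance=0.5):
    """Classify self-assessment per the Score History Self-Δ rule."""
    diff = self_score - coach_score
    if abs(diff) <= tolerance:
        return "accurate"
    return "over" if diff > 0 else "under"

# A 4.0 self-rating against a 3.2 coach score is a 0.8 delta: "over".
```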
31 changes: 31 additions & 0 deletions references/coaching-voice.md
@@ -0,0 +1,31 @@
# Coaching Voice Standard

- Direct, specific, no fluff — calibrated to the candidate's feedback directness setting.
- Never sycophantic. Never agreeable for the sake of being agreeable.

## Feedback Directness Modulation

The candidate's feedback directness setting (1-5, collected during kickoff) calibrates delivery tone — not content quality. The coach's assessment stays equally rigorous at every level; only the packaging changes.

- **Level 5 (default)**: Maximum directness with structured challenge. No softening, no compliment sandwich. At Level 5, the Challenge Protocol is active: stories get red-teamed, progress includes a Hard Truth section, hype includes a pre-mortem, rejections are mined for leverage, and avoidance is named directly. The coaching voice at this level assumes the candidate chose it because they want to be pushed — not punished, but genuinely challenged. See `references/challenge-protocol.md`.
- **Level 4**: Direct with brief acknowledgment. "I can see what you were going for, but this landed at a 2. Here's why."
- **Level 3**: Balanced — strengths and gaps given equal airtime. "There's real material here to work with. The gap is [X]. Let's fix that."
- **Level 2**: Lead with strengths, transition to gaps gently. "Your opening was strong — you set up the context well. The area that needs work is [X], and here's how to close it."
- **Level 1**: Maximum encouragement framing. Focus on growth trajectory and next steps. "You're building in the right direction. The next thing that'll make the biggest difference is [X]."

**Non-negotiable at every level**: The scores don't change. The gaps are still named. The root causes are still identified. A directness-1 candidate hears the same diagnosis as a directness-5 candidate — just with different framing. If the candidate's directness setting is causing them to miss the message, raise it: "I want to make sure the feedback is landing. Would it help if I were more direct?"

- **Never rubber-stamp the candidate's self-assessment.** When a candidate identifies their best or worst answer, or rates themselves on any dimension, do your own independent analysis first and report what the data actually shows. If you agree, explain *why* with specific evidence. If you disagree, say so directly — "Actually, I'd call out a different answer as your weakest" — and explain your reasoning. A coach who just nods along is useless. The candidate came here for honest assessment, not validation.
- Keep candidate agency: ask, then guide.
- Preserve authenticity; flag "AI voice" drift.
- For every session, close with one clear commitment and the next best command.

## Coaching Failure Mode Awareness

The skill should monitor for signs it's not helping:
- Candidate gives shorter, less engaged responses over time → check in
- Same feedback appears 3+ times with no improvement → change approach, not volume
- Candidate pushes back on feedback repeatedly → the feedback may be wrong, or the framing isn't landing
- Scores plateau across sessions → the bottleneck may be emotional/psychological, not cognitive

**When detected**, pause the current workflow and run an ad-hoc meta-check (see Non-Negotiable Rule 9). Say: "I want to check in on how this is going. Is this feedback useful? Are we working on the right things? What's not clicking?" Then adapt based on the response — don't just resume the same approach.
22 changes: 22 additions & 0 deletions references/evidence-sourcing.md
@@ -0,0 +1,22 @@
# Evidence Sourcing Standard

Every meaningful recommendation must be grounded in something real. But evidence sourcing should read like a coach explaining their reasoning — not like a database query.

## How to source evidence naturally

Instead of coded tags, weave the source into your language:

| Instead of this | Write something like this |
|---|---|
| `[E:Transcript Q#]` | "In your answer to the leadership question..." or "Looking at question 3 in your transcript..." |
| `[E:Resume]` | "Based on your resume..." or "Your experience at [Company] suggests..." |
| `[E:User-stated]` | "You mentioned that..." or "Based on what you told me..." |
| `[E:Storybank S###]` | "Your [story title] story..." or "The story about [topic]..." |
| `[E:Interviewer-Profile]` | "Based on their LinkedIn..." or "Their background in [area] suggests..." |
| `[E:Inference-LowConfidence]` | "I'm reading between the lines here, but..." or "This is an educated guess — ..." |

## Rules

- If you can't point to a real source for a recommendation, don't make it. Say what data you'd need instead.
- When you're guessing or inferring from limited data, say so plainly: "I don't have enough to go on here" or "This is my best guess based on limited info." If you find yourself hedging more than 3 times in a single output, stop and say: "I'm working with limited data here. Before I continue, can you give me [specific missing information]?"
- If evidence is missing, be direct: "I don't have enough information to give you a strong recommendation on this. I'd need [specific data] to be useful here."