diff --git a/.github/skills/add-disc-fixture/SKILL.md b/.github/skills/add-disc-fixture/SKILL.md new file mode 100644 index 0000000..79258b4 --- /dev/null +++ b/.github/skills/add-disc-fixture/SKILL.md @@ -0,0 +1,221 @@ +--- +name: add-disc-fixture +description: 'Add a new Blu-ray disc fixture for testing bdpl analysis. Use when asked to add a new disc, mount an ISO, create test fixtures, or analyze a new BDMV directory. Covers: (1) Surveying disc structure, (2) Running analysis and comparing against expected counts, (3) Debugging mismatches in episode/special detection, (4) Extracting ICS and copying metadata-only fixture files, (5) Creating integration tests and updating conftest/matrix.' +--- + +# Add Disc Fixture + +## Overview + +Step-by-step workflow for adding a new Blu-ray disc test fixture to bdpl. +Each fixture captures metadata-only files from a real disc, enabling +integration tests without copyrighted media content. + +## When to Use This Skill + +- User asks to "add a new disc" or "add disc N" +- User mounts an ISO and wants to create a test fixture +- User provides expected episode/special counts for a disc +- Analysis results don't match expected counts and need debugging + +## Prerequisites + +- Disc must be accessible (mounted ISO or network path) +- BDMV directory with PLAYLIST/, CLIPINF/, index.bdmv, MovieObject.bdmv +- Python environment with bdpl installed (`pip install -e ".[dev]"`) + +## Step-by-Step Workflow + +### 1. Survey Disc Structure + +Check file counts and sizes to understand disc complexity: + +```python +# List MPLS, CLPI files with sizes +Get-ChildItem D:\BDMV\PLAYLIST -Filter "*.mpls" | Select-Object Name, Length +Get-ChildItem D:\BDMV\CLIPINF -Filter "*.clpi" | Select-Object Name, Length +``` + +### 2. Run Analysis + +Create a temporary analysis script or run inline: + +```python +from bdpl.analyze import scan_disc +from bdpl.bdmv.mpls import parse_mpls_dir +from bdpl.bdmv.clpi import parse_clpi_dir +from pathlib import Path + +p = Path("D:/BDMV") +playlists = parse_mpls_dir(p / "PLAYLIST") +clips = parse_clpi_dir(p / "CLIPINF") +result = scan_disc(p, playlists, clips) + +print(f"Episodes: {len(result.episodes)}") +for ep in result.episodes: + print(f" ep{ep.episode}: {ep.playlist} dur={ep.duration_ms/60000:.1f}min") +print(f"Specials: {len(result.special_features)}") +for sf in result.special_features: + print(f" #{sf.index}: {sf.playlist} cat={sf.category} dur={sf.duration_ms/60000:.1f}min") +print(f"Classifications: {result.analysis.get('classifications')}") +``` + +### 3. Compare Against Expected Counts + +If counts match → proceed to step 5 (extract fixture). +If counts don't match → proceed to step 4 (debug). + +### 4. Debug Mismatches + +See [debugging guide](./references/debug-analysis.md) for systematic +investigation of episode/special count mismatches. + +### 5. Extract ICS Menu Data + +Find the menu clip (usually a short m2ts with IG streams, often clip 00003 +or similar — check CLPI sizes, menu clips have small CLPIs ~292 bytes): + +```python +from bdpl.bdmv.ig_stream import demux_ig_stream, _extract_ics_data +from pathlib import Path + +# Try likely menu clips +for clip in ['00003', '00004', '00005']: + m2ts = Path(f'D:/BDMV/STREAM/{clip}.m2ts') + if not m2ts.exists(): + continue + raw = demux_ig_stream(m2ts) + if raw: + ics_data = _extract_ics_data(raw) + if ics_data: + print(f'{clip}: ICS data {len(ics_data)} bytes') +``` + +Save the largest ICS as `ics_menu.bin`: + +```python +out = Path('tests/fixtures/discN/ics_menu.bin') +out.parent.mkdir(parents=True, exist_ok=True) +out.write_bytes(ics_data) +``` + +### 6. Create Fixture Directory + +**CRITICAL**: Files go directly under `disc{N}/`, NOT under `disc{N}/BDMV/`. +The `_fixture_path()` helper checks for `(path / "PLAYLIST").is_dir()`. + +``` +tests/fixtures/disc{N}/ +├── CLIPINF/*.clpi # All clip info files +├── PLAYLIST/*.mpls # All playlist files +├── META/DL/bdmt_eng.xml # Generic disc title +├── index.bdmv # Title table +├── MovieObject.bdmv # Navigation commands +└── ics_menu.bin # Pre-extracted ICS data +``` + +Copy files: + +```powershell +$src = "D:\BDMV" +$dst = "tests\fixtures\disc{N}" +New-Item -ItemType Directory -Force "$dst\PLAYLIST", "$dst\CLIPINF", "$dst\META\DL" +Copy-Item "$src\PLAYLIST\*.mpls" "$dst\PLAYLIST\" +Copy-Item "$src\CLIPINF\*.clpi" "$dst\CLIPINF\" +Copy-Item "$src\index.bdmv" "$dst\" +Copy-Item "$src\MovieObject.bdmv" "$dst\" +``` + +Create generic disc title XML (never use real disc titles): + +```xml + + + +TEST DISC {N} + + +``` + +Verify total size is under 100KB. + +### 7. Create Integration Test + +Create `tests/test_disc{N}_scan.py` following this pattern: + +```python +"""Integration tests for the disc{N} fixture scan results.""" + +import pytest +from bdpl.model import DiscAnalysis + +pytestmark = pytest.mark.integration + +class TestDisc{N}Episodes: + def test_episode_count(self, disc{N}_analysis: DiscAnalysis) -> None: + assert len(disc{N}_analysis.episodes) == EXPECTED_COUNT + + def test_episodes_are_ordered(self, disc{N}_analysis: DiscAnalysis) -> None: + assert [ep.episode for ep in disc{N}_analysis.episodes] == list(range(1, EXPECTED_COUNT + 1)) + + # Add playlist source, duration range checks as appropriate + +class TestDisc{N}Specials: + def test_special_count(self, disc{N}_analysis: DiscAnalysis) -> None: + assert len(disc{N}_analysis.special_features) == EXPECTED_SPECIAL_COUNT + + # Add category breakdown tests as appropriate + +class TestDisc{N}Metadata: + def test_disc_title(self, disc{N}_analysis: DiscAnalysis) -> None: + assert disc{N}_analysis.disc_title == "TEST DISC {N}" +``` + +### 8. Update Shared Test Infrastructure + +**conftest.py** — Add path + analysis fixtures: + +```python +@pytest.fixture(scope="session") +def disc{N}_path() -> Path: + """Return path to bundled disc{N} fixture.""" + return _fixture_path("disc{N}") + +@pytest.fixture(scope="session") +def disc{N}_analysis(disc{N}_path): + """Run and cache full analysis for the bundled disc{N} fixture.""" + return _analyze_fixture(disc{N}_path) +``` + +**test_disc_matrix.py** — Add to ALL 6 parametrizations: + +1. Episode count + playlists +2. Special visibility (total + visible) +3. Episode segment boundaries (bare list) +4. Special boundary semantics (bare list) +5. Chapter-split special count +6. Disc title extraction + +### 9. Validate + +```bash +python -m pytest tests/ -x -q # All tests pass +python -m ruff check . # No lint issues +python -m ruff format --check . # No format issues +``` + +### 10. Clean Up + +- Delete any temporary analysis scripts +- Delete any stray screenshots/images +- Verify no copyrighted content in the fixture + +## Common Disc Patterns + +| Pattern | Description | Example | +|---------|-------------|---------| +| Play_all decomposition | Single long playlist split by IG chapter marks | disc10, disc11 | +| Individual episodes | Each episode has its own MPLS playlist | disc6, disc7 | +| Chapter splitting | Single m2ts with chapter boundaries | disc14 | +| Commentary specials | Individual playlists duplicating play_all clips | disc10, disc12, disc13 | +| Creditless OP/ED | Short playlists (~1.5-2 min) | disc8, disc13 | diff --git a/.github/skills/add-disc-fixture/references/debug-analysis.md b/.github/skills/add-disc-fixture/references/debug-analysis.md new file mode 100644 index 0000000..e19652e --- /dev/null +++ b/.github/skills/add-disc-fixture/references/debug-analysis.md @@ -0,0 +1,123 @@ +# Debugging Analysis Mismatches + +When bdpl analysis returns wrong episode or special feature counts, +follow this systematic debugging approach. + +## Episode Count Wrong + +### Too few episodes + +1. **Check play_all detection**: Is the longest playlist classified as + `play_all`? Look at `classifications` in the analysis output. + +2. **Check IG chapter marks**: Play_all decomposition uses IG menu hints + to find episode boundaries. Verify `ig_menu.chapter_marks` in the + analysis output matches expected episode start chapters. + +3. **Play_all-subset ordering**: If individual playlists exist alongside + a play_all, check whether the individual playlists' clips are a subset + of the play_all clips. If `len(pa_episodes) > len(individual_eps)` and + individual clips ⊆ play_all clips, play_all should be preferred. + See `ordering.py` ~line 271. + +### Too many episodes + +1. **Duplicate detection**: Check `duplicate_groups` in analysis output. + Playlists with identical segment signatures should be deduplicated. + +2. **Stream variant filtering**: Variant playlists (same content, different + audio codec) should be in `variant_mpls` set and excluded. + +## Special Feature Count Wrong + +### Missing specials + +1. **Check IG hints for the specials page**: Extract and inspect IG menu + hints, particularly page 4+ (specials pages): + + ```python + from bdpl.bdmv.ig_stream import parse_ig_from_m2ts, extract_menu_hints + m2ts = Path('D:/BDMV/STREAM/00003.m2ts') # menu clip + ics = parse_ig_from_m2ts(m2ts) + hints = extract_menu_hints(ics) + for h in hints: + if h.jump_title is not None: + print(f'page={h.page_id} btn={h.button_id} JT({h.jump_title})') + ``` + +2. **Chapter-selection page filtering**: Pages where ALL buttons have the + same JumpTitle value are treated as chapter-selection pages. Their + buttons are added to `chapter_selection_jt` and skipped during + commentary detection. This can cause false negatives when specials + pages also happen to route through a single title. + +3. **Register-indirect JumpTitle**: `extract_menu_hints` reads `cmd.operand1` + as the JumpTitle value, but does NOT resolve register-indirect values + (when `imm_op1=False`, operand1 is a GPR number, not the title). All + buttons may show the same JT value even though they route differently + at runtime via MovieObject register checks. + +4. **Title-hint supplement**: After IG-based detection, the code supplements + with title-hint entries. But playlists classified as `"episode"` or + `"play_all"` are skipped (line ~368 in `__init__.py`). If a playlist is + classified as `"episode"` but NOT actually used as an episode source + (episodes came from play_all), it should be treated as `"commentary"`. + +5. **Variant filtering**: Check if the missing special's playlist is in the + `variant_mpls` set. Variant playlists are excluded from special detection. + +### Extra specials + +1. **Duplicate keys**: Check if multiple IG buttons reference the same + (playlist, chapter_start) combination. The `seen` set should deduplicate. + +2. **Bumper/logo playlists**: Very short playlists (<30s) classified as + `bumper` should not appear as specials. + +## Tracing the Detection Pipeline + +For detailed tracing, inspect intermediate state: + +```python +result = scan_disc(p, playlists, clips) + +# Key analysis fields +hints = result.analysis.get('disc_hints', {}) +cls = result.analysis.get('classifications', {}) +ig = hints.get('ig_menu', {}) +title_pl = hints.get('title_playlists', {}) + +# Episode source playlists +ep_playlists = {ep.playlist for ep in result.episodes} + +# What the IG buttons target +ig_raw = hints.get('ig_hints_raw', []) +for h in sorted(ig_raw, key=lambda x: (x.page_id, x.button_id)): + if h.jump_title: + title_idx = h.jump_title - 1 + target_pl = title_pl.get(title_idx) + print(f'p{h.page_id} b{h.button_id}: JT({h.jump_title}) -> title {title_idx} -> pl {target_pl}') +``` + +## Key Code Locations + +| Component | File | Key Lines | +|-----------|------|-----------| +| Special feature detection | `analyze/__init__.py` | `_detect_special_features()` | +| Chapter-selection heuristic | `analyze/__init__.py` | `chapter_selection_jt` set | +| Title-hint supplement | `analyze/__init__.py` | After IG hints loop | +| Commentary relabeling | `analyze/__init__.py` | Post-loop `f.category == "episode"` | +| Play_all-subset ordering | `analyze/ordering.py` | `indiv_clips <= pa_clips` check | +| IG hint extraction | `bdmv/ig_stream.py` | `extract_menu_hints()` | +| ICS fallback loading | `analyze/__init__.py` | `_parse_ig_hints()` | + +## Known Limitations + +- **Register-indirect JumpTitle** is not resolved. When all IG buttons use + `SET GPR[n] = value; JumpTitle(GPR[n])`, the code reads the register + number as the title, not the value. The MovieObject handles routing via + PSR[10] (selected button ID) at runtime. + +- **Multi-feature playlists** with register-based chapter selection + (SET reg2 before JumpTitle) are supported, but only when `imm_op2=True` + (immediate value). Register-indirect chapter indices are not resolved.