Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
221 changes: 221 additions & 0 deletions .github/skills/add-disc-fixture/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,221 @@
---
name: add-disc-fixture
description: 'Add a new Blu-ray disc fixture for testing bdpl analysis. Use when asked to add a new disc, mount an ISO, create test fixtures, or analyze a new BDMV directory. Covers: (1) Surveying disc structure, (2) Running analysis and comparing against expected counts, (3) Debugging mismatches in episode/special detection, (4) Extracting ICS and copying metadata-only fixture files, (5) Creating integration tests and updating conftest/matrix.'
---

# Add Disc Fixture

## Overview

Step-by-step workflow for adding a new Blu-ray disc test fixture to bdpl.
Each fixture captures metadata-only files from a real disc, enabling
integration tests without copyrighted media content.

## When to Use This Skill

- User asks to "add a new disc" or "add disc N"
- User mounts an ISO and wants to create a test fixture
- User provides expected episode/special counts for a disc
- Analysis results don't match expected counts and need debugging

## Prerequisites

- Disc must be accessible (mounted ISO or network path)
- BDMV directory with PLAYLIST/, CLIPINF/, index.bdmv, MovieObject.bdmv
- Python environment with bdpl installed (`pip install -e ".[dev]"`)

## Step-by-Step Workflow

### 1. Survey Disc Structure

Check file counts and sizes to understand disc complexity:

```python
# List MPLS, CLPI files with sizes
Get-ChildItem D:\BDMV\PLAYLIST -Filter "*.mpls" | Select-Object Name, Length
Get-ChildItem D:\BDMV\CLIPINF -Filter "*.clpi" | Select-Object Name, Length
```

### 2. Run Analysis

Create a temporary analysis script or run inline:

```python
from bdpl.analyze import scan_disc
from bdpl.bdmv.mpls import parse_mpls_dir
from bdpl.bdmv.clpi import parse_clpi_dir
from pathlib import Path

p = Path("D:/BDMV")
playlists = parse_mpls_dir(p / "PLAYLIST")
clips = parse_clpi_dir(p / "CLIPINF")
result = scan_disc(p, playlists, clips)

print(f"Episodes: {len(result.episodes)}")
for ep in result.episodes:
print(f" ep{ep.episode}: {ep.playlist} dur={ep.duration_ms/60000:.1f}min")
print(f"Specials: {len(result.special_features)}")
for sf in result.special_features:
print(f" #{sf.index}: {sf.playlist} cat={sf.category} dur={sf.duration_ms/60000:.1f}min")
print(f"Classifications: {result.analysis.get('classifications')}")
```

### 3. Compare Against Expected Counts

If counts match → proceed to step 5 (extract fixture).
If counts don't match → proceed to step 4 (debug).

### 4. Debug Mismatches

See [debugging guide](./references/debug-analysis.md) for systematic
investigation of episode/special count mismatches.

### 5. Extract ICS Menu Data

Find the menu clip (usually a short m2ts with IG streams, often clip 00003
or similar — check CLPI sizes, menu clips have small CLPIs ~292 bytes):

```python
from bdpl.bdmv.ig_stream import demux_ig_stream, _extract_ics_data
from pathlib import Path

# Try likely menu clips
for clip in ['00003', '00004', '00005']:
m2ts = Path(f'D:/BDMV/STREAM/{clip}.m2ts')
if not m2ts.exists():
continue
raw = demux_ig_stream(m2ts)
if raw:
ics_data = _extract_ics_data(raw)
if ics_data:
print(f'{clip}: ICS data {len(ics_data)} bytes')
```

Save the largest ICS as `ics_menu.bin`:

```python
out = Path('tests/fixtures/discN/ics_menu.bin')
out.parent.mkdir(parents=True, exist_ok=True)
out.write_bytes(ics_data)
```

### 6. Create Fixture Directory

**CRITICAL**: Files go directly under `disc{N}/`, NOT under `disc{N}/BDMV/`.
The `_fixture_path()` helper checks for `(path / "PLAYLIST").is_dir()`.

```
tests/fixtures/disc{N}/
├── CLIPINF/*.clpi # All clip info files
├── PLAYLIST/*.mpls # All playlist files
├── META/DL/bdmt_eng.xml # Generic disc title
├── index.bdmv # Title table
├── MovieObject.bdmv # Navigation commands
└── ics_menu.bin # Pre-extracted ICS data
```

Copy files:

```powershell
$src = "D:\BDMV"
$dst = "tests\fixtures\disc{N}"
New-Item -ItemType Directory -Force "$dst\PLAYLIST", "$dst\CLIPINF", "$dst\META\DL"
Copy-Item "$src\PLAYLIST\*.mpls" "$dst\PLAYLIST\"
Copy-Item "$src\CLIPINF\*.clpi" "$dst\CLIPINF\"
Copy-Item "$src\index.bdmv" "$dst\"
Copy-Item "$src\MovieObject.bdmv" "$dst\"
```

Create generic disc title XML (never use real disc titles):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<disclib>
<di:discinfo xmlns:di="urn:BDA:bdmv;discinfo">
<di:title><di:name>TEST DISC {N}</di:name></di:title>
</di:discinfo>
</disclib>
```

Verify total size is under 100KB.

### 7. Create Integration Test

Create `tests/test_disc{N}_scan.py` following this pattern:

```python
"""Integration tests for the disc{N} fixture scan results."""

import pytest
from bdpl.model import DiscAnalysis

pytestmark = pytest.mark.integration

class TestDisc{N}Episodes:
def test_episode_count(self, disc{N}_analysis: DiscAnalysis) -> None:
assert len(disc{N}_analysis.episodes) == EXPECTED_COUNT

def test_episodes_are_ordered(self, disc{N}_analysis: DiscAnalysis) -> None:
assert [ep.episode for ep in disc{N}_analysis.episodes] == list(range(1, EXPECTED_COUNT + 1))

# Add playlist source, duration range checks as appropriate

class TestDisc{N}Specials:
def test_special_count(self, disc{N}_analysis: DiscAnalysis) -> None:
assert len(disc{N}_analysis.special_features) == EXPECTED_SPECIAL_COUNT

# Add category breakdown tests as appropriate

class TestDisc{N}Metadata:
def test_disc_title(self, disc{N}_analysis: DiscAnalysis) -> None:
assert disc{N}_analysis.disc_title == "TEST DISC {N}"
```

### 8. Update Shared Test Infrastructure

**conftest.py** — Add path + analysis fixtures:

```python
@pytest.fixture(scope="session")
def disc{N}_path() -> Path:
"""Return path to bundled disc{N} fixture."""
return _fixture_path("disc{N}")

@pytest.fixture(scope="session")
def disc{N}_analysis(disc{N}_path):
"""Run and cache full analysis for the bundled disc{N} fixture."""
return _analyze_fixture(disc{N}_path)
```

**test_disc_matrix.py** — Add to ALL 6 parametrizations:

1. Episode count + playlists
2. Special visibility (total + visible)
3. Episode segment boundaries (bare list)
4. Special boundary semantics (bare list)
5. Chapter-split special count
6. Disc title extraction

### 9. Validate

```bash
python -m pytest tests/ -x -q # All tests pass
python -m ruff check . # No lint issues
python -m ruff format --check . # No format issues
```

### 10. Clean Up

- Delete any temporary analysis scripts
- Delete any stray screenshots/images
- Verify no copyrighted content in the fixture

## Common Disc Patterns

| Pattern | Description | Example |
|---------|-------------|---------|
| Play_all decomposition | Single long playlist split by IG chapter marks | disc10, disc11 |
| Individual episodes | Each episode has its own MPLS playlist | disc6, disc7 |
| Chapter splitting | Single m2ts with chapter boundaries | disc14 |
| Commentary specials | Individual playlists duplicating play_all clips | disc10, disc12, disc13 |
| Creditless OP/ED | Short playlists (~1.5-2 min) | disc8, disc13 |
123 changes: 123 additions & 0 deletions .github/skills/add-disc-fixture/references/debug-analysis.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,123 @@
# Debugging Analysis Mismatches

When bdpl analysis returns wrong episode or special feature counts,
follow this systematic debugging approach.

## Episode Count Wrong

### Too few episodes

1. **Check play_all detection**: Is the longest playlist classified as
`play_all`? Look at `classifications` in the analysis output.

2. **Check IG chapter marks**: Play_all decomposition uses IG menu hints
to find episode boundaries. Verify `ig_menu.chapter_marks` in the
analysis output matches expected episode start chapters.

3. **Play_all-subset ordering**: If individual playlists exist alongside
a play_all, check whether the individual playlists' clips are a subset
of the play_all clips. If `len(pa_episodes) > len(individual_eps)` and
individual clips ⊆ play_all clips, play_all should be preferred.
See `ordering.py` ~line 271.

### Too many episodes

1. **Duplicate detection**: Check `duplicate_groups` in analysis output.
Playlists with identical segment signatures should be deduplicated.

2. **Stream variant filtering**: Variant playlists (same content, different
audio codec) should be in `variant_mpls` set and excluded.

## Special Feature Count Wrong

### Missing specials

1. **Check IG hints for the specials page**: Extract and inspect IG menu
hints, particularly page 4+ (specials pages):

```python
from bdpl.bdmv.ig_stream import parse_ig_from_m2ts, extract_menu_hints
m2ts = Path('D:/BDMV/STREAM/00003.m2ts') # menu clip
ics = parse_ig_from_m2ts(m2ts)
hints = extract_menu_hints(ics)
for h in hints:
if h.jump_title is not None:
print(f'page={h.page_id} btn={h.button_id} JT({h.jump_title})')
```

2. **Chapter-selection page filtering**: Pages where ALL buttons have the
same JumpTitle value are treated as chapter-selection pages. Their
buttons are added to `chapter_selection_jt` and skipped during
commentary detection. This can cause false negatives when specials
pages also happen to route through a single title.

3. **Register-indirect JumpTitle**: `extract_menu_hints` reads `cmd.operand1`
as the JumpTitle value, but does NOT resolve register-indirect values
(when `imm_op1=False`, operand1 is a GPR number, not the title). All
buttons may show the same JT value even though they route differently
at runtime via MovieObject register checks.

4. **Title-hint supplement**: After IG-based detection, the code supplements
with title-hint entries. But playlists classified as `"episode"` or
`"play_all"` are skipped (line ~368 in `__init__.py`). If a playlist is
classified as `"episode"` but NOT actually used as an episode source
(episodes came from play_all), it should be treated as `"commentary"`.

5. **Variant filtering**: Check if the missing special's playlist is in the
`variant_mpls` set. Variant playlists are excluded from special detection.

### Extra specials

1. **Duplicate keys**: Check if multiple IG buttons reference the same
(playlist, chapter_start) combination. The `seen` set should deduplicate.

2. **Bumper/logo playlists**: Very short playlists (<30s) classified as
`bumper` should not appear as specials.

## Tracing the Detection Pipeline

For detailed tracing, inspect intermediate state:

```python
result = scan_disc(p, playlists, clips)

# Key analysis fields
hints = result.analysis.get('disc_hints', {})
cls = result.analysis.get('classifications', {})
ig = hints.get('ig_menu', {})
title_pl = hints.get('title_playlists', {})

# Episode source playlists
ep_playlists = {ep.playlist for ep in result.episodes}

# What the IG buttons target
ig_raw = hints.get('ig_hints_raw', [])
for h in sorted(ig_raw, key=lambda x: (x.page_id, x.button_id)):
if h.jump_title:
title_idx = h.jump_title - 1
target_pl = title_pl.get(title_idx)
print(f'p{h.page_id} b{h.button_id}: JT({h.jump_title}) -> title {title_idx} -> pl {target_pl}')
```

## Key Code Locations

| Component | File | Key Lines |
|-----------|------|-----------|
| Special feature detection | `analyze/__init__.py` | `_detect_special_features()` |
| Chapter-selection heuristic | `analyze/__init__.py` | `chapter_selection_jt` set |
| Title-hint supplement | `analyze/__init__.py` | After IG hints loop |
| Commentary relabeling | `analyze/__init__.py` | Post-loop `f.category == "episode"` |
| Play_all-subset ordering | `analyze/ordering.py` | `indiv_clips <= pa_clips` check |
| IG hint extraction | `bdmv/ig_stream.py` | `extract_menu_hints()` |
| ICS fallback loading | `analyze/__init__.py` | `_parse_ig_hints()` |

## Known Limitations

- **Register-indirect JumpTitle** is not resolved. When all IG buttons use
`SET GPR[n] = value; JumpTitle(GPR[n])`, the code reads the register
number as the title, not the value. The MovieObject handles routing via
PSR[10] (selected button ID) at runtime.

- **Multi-feature playlists** with register-based chapter selection
(SET reg2 before JumpTitle) are supported, but only when `imm_op2=True`
(immediate value). Register-indirect chapter indices are not resolved.