5 changes: 5 additions & 0 deletions .dockerignore
@@ -4,6 +4,7 @@
.ruff_cache
.uv-cache
.venv
.python-version
__pycache__/
*.pyc
*.pyo
@@ -33,3 +34,7 @@ Output/
CorridorKeyModule/checkpoints/
gvm_core/weights/
VideoMaMaInferenceModule/checkpoints/

# Frontend dependencies and dev cache (the build output itself is not ignored)
web/frontend/node_modules/
web/frontend/.svelte-kit/
8 changes: 8 additions & 0 deletions .gitignore
@@ -25,6 +25,14 @@ CorridorKey_remote.bat
.ipynb_checkpoints/
.DS_Store

# Projects / user content
Projects/

# WebUI
web/frontend/node_modules/
web/frontend/build/
web/frontend/.svelte-kit/

# IDE
.vscode/
.idea/
48 changes: 44 additions & 4 deletions README.md
@@ -28,6 +28,7 @@
* **Resolution Independent:** The engine dynamically scales inference to handle 4K plates while predicting using its native 2048x2048 high-fidelity backbone.
* **VFX Standard Outputs:** Natively reads and writes 16-bit and 32-bit Linear float EXR files, preserving true color math for integration in Nuke, Fusion, or Resolve.
* **Auto-Cleanup:** Includes a morphological cleanup system to automatically prune any tracking markers or tiny background features that slip through CorridorKey's detection.
* **WebUI:** Browser-based interface with drag-and-drop upload, one-click full pipeline, real-time progress, video playback, A/B comparison, and project management. Run via `docker compose --profile web up -d` and open `http://localhost:3000`.

## Hardware Requirements

@@ -104,14 +105,53 @@ Perhaps in the future, I will implement other generators for the AlphaHint!

Please give feedback and share your results!

### Docker (Linux + NVIDIA GPU)
### WebUI (Browser-based)

If you prefer not to install dependencies locally, you can run CorridorKey in Docker.
CorridorKey includes a full web interface for managing clips, running inference, and previewing results — no terminal required.

**Quick start with Docker Compose:**
```bash
docker compose --profile web up -d --build # first run builds the image (~5 min)
# Open http://localhost:3000

# Subsequent runs (no rebuild needed unless code changes):
docker compose --profile web up -d
```

**Quick start without Docker:**
```bash
uv sync --group dev --extra web # install web dependencies
uv sync --group dev --extra web --extra cuda # with CUDA GPU support
uv run uvicorn web.api.app:create_app --factory --port 3000
# Open http://localhost:3000
```

**WebUI Features:**
- **Upload & organize** — drag-and-drop videos or zipped frame sequences, organize into projects
- **Full pipeline** — one-click processing: extract frames → generate alpha hints (GVM/VideoMaMa) → run inference
- **Real-time progress** — WebSocket-driven progress bars with ETA and fps counter
- **Frame viewer** — scrub through frames, play as video (ffmpeg-stitched MP4), A/B comparison mode
- **Download outputs** — download any pass (FG, Matte, Comp, Processed) as ZIP
- **Job queue** — parallel CPU jobs (extraction) + GPU jobs with configurable VRAM limits
- **Weight management** — download CorridorKey, GVM, and VideoMaMa weights from HuggingFace directly in Settings
- **VRAM monitoring** — system-wide GPU memory usage (via nvidia-smi)
- **Right-click context menus** — rename projects, move clips, batch process, delete
- **Keyboard shortcuts** — press `?` to see all shortcuts

**Important notes:**
- **Clip storage:** The WebUI manages clips under `Projects/`, while the CLI wizard uses `ClipsForInference/`. These directories are independent — clips created in the WebUI won't appear in the CLI and vice versa. Set `CK_CLIPS_DIR` to point at `ClipsForInference/` if you want both to use the same directory.
- **Mac / MLX:** The WebUI has not been validated on Mac with MLX inference. The server will start and the UI will work, but the VRAM meter will show N/A (nvidia-smi is not available on Mac) and the VRAM concurrency limit for parallel jobs is not enforced on non-CUDA systems.
- Model weights are volume-mounted and persist across Docker rebuilds.
- The web service uses the `web` Docker Compose profile.
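
The `CK_CLIPS_DIR` override from the first note is just an environment variable read at server startup; a minimal sketch for the non-Docker launch (the path shown is an example, adjust it to your checkout):

```shell
# Make the WebUI and the CLI wizard share one clip directory:
# CK_CLIPS_DIR overrides the WebUI's default Projects/ location.
export CK_CLIPS_DIR="$PWD/ClipsForInference"
uv run uvicorn web.api.app:create_app --factory --port 3000
```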

### Docker CLI (Linux + NVIDIA GPU)

If you prefer the command-line interface in Docker:

Prerequisites:
- Docker Engine + Docker Compose plugin installed.
- NVIDIA driver installed on the host (Linux), with CUDA compatibility for the PyTorch CUDA 12.6 wheels used by this project.
- NVIDIA Container Toolkit installed and configured for Docker (`nvidia-smi` should work on host, and `docker run --rm --gpus all nvidia/cuda:12.6.3-runtime-ubuntu22.04 nvidia-smi` should succeed).
- NVIDIA driver installed on the host (Linux), with CUDA compatibility for the PyTorch CUDA 12.8 wheels used by this project.
- NVIDIA Container Toolkit installed and configured for Docker (`nvidia-smi` should work on host, and `docker run --rm --gpus all nvidia/cuda:12.8.0-runtime-ubuntu22.04 nvidia-smi` should succeed).

1. Build the image:
```bash
103 changes: 55 additions & 48 deletions backend/job_queue.py
@@ -13,6 +13,7 @@
- Jobs have stable IDs assigned at creation time
- Deduplication prevents double-submit of same clip+job_type
- Job history preserved for UI display (cancelled/completed/failed)
- Multiple jobs can run simultaneously (local + remote nodes)
"""

from __future__ import annotations
@@ -85,7 +86,7 @@ def check_cancelled(self) -> None:


class GPUJobQueue:
"""Thread-safe GPU job queue with mutual exclusion.
"""Thread-safe GPU job queue supporting multiple concurrent running jobs.

Usage (CLI mode):
queue = GPUJobQueue()
@@ -103,15 +104,15 @@ class GPUJobQueue:
except Exception as e:
queue.fail_job(job, str(e))

Usage (GUI mode):
The GPU worker QThread calls next_job() / start_job() / complete_job()
in its run loop. The UI submits jobs from the main thread.
Usage (distributed):
Multiple workers (local + remote nodes) can claim and run jobs
simultaneously. All running jobs are tracked and visible in the API.
"""

def __init__(self):
self._queue: deque[GPUJob] = deque()
self._lock = threading.Lock()
self._current_job: GPUJob | None = None
self._running_jobs: list[GPUJob] = []
self._history: list[GPUJob] = [] # completed/cancelled/failed jobs for UI display

# Callbacks (set by UI or CLI)
@@ -143,17 +144,17 @@ def submit(self, job: GPUJob) -> bool:
f"(already queued as {existing.id})"
)
return False
if (
self._current_job
and self._current_job.clip_name == job.clip_name
and self._current_job.job_type == job.job_type
and self._current_job.status == JobStatus.RUNNING
):
logger.warning(
f"Duplicate job rejected: {job.job_type.value} for '{job.clip_name}' "
f"(already running as {self._current_job.id})"
)
return False
for running in self._running_jobs:
if (
running.clip_name == job.clip_name
and running.job_type == job.job_type
and running.status == JobStatus.RUNNING
):
logger.warning(
f"Duplicate job rejected: {job.job_type.value} for '{job.clip_name}' "
f"(already running as {running.id})"
)
return False

job.status = JobStatus.QUEUED
self._queue.append(job)
@@ -173,15 +174,15 @@ def start_job(self, job: GPUJob) -> None:
if job in self._queue:
self._queue.remove(job)
job.status = JobStatus.RUNNING
self._current_job = job
self._running_jobs.append(job)
logger.info(f"Job started [{job.id}]: {job.job_type.value} for '{job.clip_name}'")

def complete_job(self, job: GPUJob) -> None:
"""Mark a job as successfully completed."""
with self._lock:
job.status = JobStatus.COMPLETED
if self._current_job is job:
self._current_job = None
if job in self._running_jobs:
self._running_jobs.remove(job)
self._history.append(job)
logger.info(f"Job completed [{job.id}]: {job.job_type.value} for '{job.clip_name}'")
# Emit AFTER lock release (Codex: no deadlock risk)
@@ -193,25 +194,20 @@ def fail_job(self, job: GPUJob, error: str) -> None:
with self._lock:
job.status = JobStatus.FAILED
job.error_message = error
if self._current_job is job:
self._current_job = None
if job in self._running_jobs:
self._running_jobs.remove(job)
self._history.append(job)
logger.error(f"Job failed [{job.id}]: {job.job_type.value} for '{job.clip_name}': {error}")
# Emit AFTER lock release
if self.on_error:
self.on_error(job.clip_name, error)

def mark_cancelled(self, job: GPUJob) -> None:
"""Mark a running job as cancelled AND clear _current_job.

This is the cancel-safe path that was missing — calling
job.request_cancel() alone doesn't clear _current_job, which
poisons queue state for subsequent jobs.
"""
"""Mark a running job as cancelled AND remove from running list."""
with self._lock:
job.status = JobStatus.CANCELLED
if self._current_job is job:
self._current_job = None
if job in self._running_jobs:
self._running_jobs.remove(job)
self._history.append(job)
logger.info(f"Job cancelled [{job.id}]: {job.job_type.value} for '{job.clip_name}'")

@@ -230,17 +226,19 @@ def cancel_job(self, job: GPUJob) -> None:
logger.info(f"Job cancel requested [{job.id}]: {job.job_type.value} for '{job.clip_name}'")

def cancel_current(self) -> None:
"""Cancel the currently running job, if any."""
"""Cancel all currently running jobs."""
with self._lock:
if self._current_job and self._current_job.status == JobStatus.RUNNING:
self._current_job.request_cancel()
for job in self._running_jobs:
if job.status == JobStatus.RUNNING:
job.request_cancel()

def cancel_all(self) -> None:
"""Cancel current job and clear the queue."""
"""Cancel all running jobs and clear the queue."""
with self._lock:
# Cancel current
if self._current_job and self._current_job.status == JobStatus.RUNNING:
self._current_job.request_cancel()
# Cancel running
for job in self._running_jobs:
if job.status == JobStatus.RUNNING:
job.request_cancel()
# Clear queue — preserve in history
for job in self._queue:
job.status = JobStatus.CANCELLED
@@ -249,10 +247,13 @@ def cancel_all(self) -> None:
logger.info("All jobs cancelled")

def report_progress(self, clip_name: str, current: int, total: int) -> None:
"""Report progress for the current job. Called by processing code."""
if self._current_job:
self._current_job.current_frame = current
self._current_job.total_frames = total
"""Report progress for a job by clip name. Called by processing code."""
with self._lock:
for job in self._running_jobs:
if job.clip_name == clip_name and job.status == JobStatus.RUNNING:
job.current_frame = current
job.total_frames = total
break
if self.on_progress:
self.on_progress(clip_name, current, total)

@@ -263,10 +264,11 @@ def report_warning(self, message: str) -> None:
self.on_warning(message)

def find_job_by_id(self, job_id: str) -> GPUJob | None:
"""Find a job by ID in queue, current, or history."""
"""Find a job by ID in running, queue, or history."""
with self._lock:
if self._current_job and self._current_job.id == job_id:
return self._current_job
for job in self._running_jobs:
if job.id == job_id:
return job
for job in self._queue:
if job.id == job_id:
return job
@@ -292,8 +294,15 @@ def has_pending(self) -> bool:

@property
def current_job(self) -> GPUJob | None:
"""Return the first running job (backward compat). Use running_jobs for all."""
with self._lock:
return self._running_jobs[0] if self._running_jobs else None

@property
def running_jobs(self) -> list[GPUJob]:
"""Return a copy of all currently running jobs."""
with self._lock:
return self._current_job
return list(self._running_jobs)

@property
def pending_count(self) -> int:
@@ -314,11 +323,9 @@ def history_snapshot(self) -> list[GPUJob]:

@property
def all_jobs_snapshot(self) -> list[GPUJob]:
"""Return current + queued + history for full queue panel display."""
"""Return running + queued + history for full queue panel display."""
with self._lock:
result = []
if self._current_job:
result.append(self._current_job)
result = list(self._running_jobs)
result.extend(self._queue)
result.extend(self._history)
return result
21 changes: 21 additions & 0 deletions docker-compose.yml
Original file line number Diff line number Diff line change
Expand Up @@ -35,3 +35,24 @@ services:
- ./VideoMaMaInferenceModule/checkpoints:/app/VideoMaMaInferenceModule/checkpoints
stdin_open: true
tty: true

corridorkey-web:
profiles: ["web"]
build:
context: .
dockerfile: web/Dockerfile.web
image: corridorkey-web:latest
user: "${UID:-1000}:${GID:-1000}"
gpus: ${CK_GPUS:-all}
ports:
- "3000:3000"
environment:
- OPENCV_IO_ENABLE_OPENEXR=1
- NVIDIA_VISIBLE_DEVICES=${NVIDIA_VISIBLE_DEVICES:-all}
- NVIDIA_DRIVER_CAPABILITIES=${NVIDIA_DRIVER_CAPABILITIES:-compute,utility,video}
- CK_CLIPS_DIR=/app/Projects
volumes:
- ./Projects:/app/Projects
- ./CorridorKeyModule/checkpoints:/app/CorridorKeyModule/checkpoints
- ./gvm_core/weights:/app/gvm_core/weights
- ./VideoMaMaInferenceModule/checkpoints:/app/VideoMaMaInferenceModule/checkpoints
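
The service runs as `${UID:-1000}:${GID:-1000}` and exposes GPUs via `${CK_GPUS:-all}`, so these can be pinned per machine through a `.env` file that Docker Compose reads automatically. A sketch (the values shown mirror the compose defaults; a `.env` file is used because `UID` is a read-only variable in some shells):

```shell
# docker-compose.yml interpolates UID/GID and CK_GPUS from the
# environment or .env. Matching your own user means files written
# to ./Projects are owned by you, not by uid 1000.
printf 'UID=%s\nGID=%s\nCK_GPUS=all\n' "$(id -u)" "$(id -g)" > .env
docker compose --profile web up -d
```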
5 changes: 5 additions & 0 deletions pyproject.toml
@@ -50,6 +50,11 @@ cuda = [
mlx = [
"corridorkey-mlx ; python_version >= '3.11'",
]
web = [
"fastapi>=0.115",
"uvicorn[standard]>=0.34",
"python-multipart>=0.0.9",
]

[dependency-groups]
dev = ["pytest", "pytest-cov", "ruff", "hypothesis"]