Ryukijano/Nvidia-Cosmos-Cookoff

Cosmos Sentinel 🚦

Cosmos Sentinel is an agentic, demo-first traffic safety pipeline. It evaluates dashcam and traffic videos by combining early-warning collision prediction with high-level multimodal reasoning and future-state video generation.

Live Gradio Space

🏗️ Architecture

Cosmos Sentinel runs a three-stage intelligent pipeline:

  1. Gate: BADAS (ego-centric collision prediction) acts as a high-frequency predictive gate. It processes the video using V-JEPA2 to locate the time window with the highest collision risk.
  2. Reason: NVIDIA Cosmos Reason 2 provides incident understanding. It takes the full video and the BADAS-identified high-risk clip as input and generates a structured analysis (severity, actor behavior, environmental hazards).
  3. Predict: NVIDIA Cosmos Predict 2.5 acts as a world simulator. Based on the Reason narrative, it performs "what-if" rollouts (e.g., generating one future in which the collision is prevented and one that continues the observed trajectory).
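The gating step can be illustrated with a minimal sketch. It assumes the detector yields one collision probability per sampled time step at a fixed rate; the function name `find_alert_window`, the `threshold` default, and the `pre_alert_s` default are hypothetical and are not taken from `badas_detector.py` or `extract_clip.py`.

```python
from typing import Optional, Sequence, Tuple


def find_alert_window(
    probs: Sequence[float],
    rate_hz: float,
    threshold: float = 0.5,    # hypothetical risk threshold
    pre_alert_s: float = 3.0,  # hypothetical seconds of context kept before the alert
) -> Optional[Tuple[float, float]]:
    """Return (clip_start_s, alert_s) for the first probability that meets
    the threshold, or None if the threshold is never met."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    for index, prob in enumerate(probs):
        if prob >= threshold:
            alert_s = index / rate_hz
            # Clamp so the clip never starts before the beginning of the video.
            clip_start_s = max(0.0, alert_s - pre_alert_s)
            return (clip_start_s, alert_s)
    return None


if __name__ == "__main__":
    window = find_alert_window([0.1, 0.2, 0.4, 0.7, 0.9], rate_hz=1.0, pre_alert_s=2.0)
    print(window)
```

A `None` result corresponds to the "safe state, keep monitoring" branch; a tuple gives the time span of a focused pre-alert clip that could be passed on to the Reason stage.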

Flow Diagram

graph TD
    A[Input Dashcam Video] -->|Raw Frames| B[BADAS Detector V-JEPA2]
    B -->|Collision Probabilities| C{Risk Threshold Met?}
    C -->|No| D[Log: Safe state, Keep monitoring]
    C -->|Yes| E[Extract Pre-Alert Focused Clip]
    A --> F[NVIDIA Cosmos Reason 2 8B]
    E --> F
    F -->|Risk Analysis & Bounding Boxes| G[Structured Payload Generation]
    G --> H{Run Predict Rollout?}
    H -->|Yes| I[NVIDIA Cosmos Predict 2.5 2B]
    I -->|Prompt: Prevented Collision| J[Counterfactual Video]
    I -->|Prompt: Observed Trajectory| K[Continuation Video]
    G --> L[Streamlit UI Dashboard]
    J --> L
    K --> L
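The branching in the diagram can be sketched as plain Python control flow. This is an illustration only: `RiskReport`, `PipelineResult`, `run_pipeline`, and the callables passed in are hypothetical stand-ins, not the interfaces of `main_pipeline.py` or of the Cosmos models.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class RiskReport:
    """Hypothetical structured payload produced by the Reason stage."""
    severity: str
    actor_behavior: str
    environmental_hazards: List[str] = field(default_factory=list)


@dataclass
class PipelineResult:
    alert: bool
    report: Optional[RiskReport] = None
    rollouts: Dict[str, Any] = field(default_factory=dict)


def run_pipeline(
    video: Any,
    gate: Callable[[Any], Optional[Any]],
    extract_clip: Callable[[Any, Any], Any],
    reason: Callable[[Any, Any], RiskReport],
    predict: Optional[Callable[[RiskReport, str], Any]] = None,
) -> PipelineResult:
    """Mirror the diagram: gate, then reason, then optional rollouts."""
    window = gate(video)
    if window is None:
        # Risk threshold not met: log a safe state and keep monitoring.
        return PipelineResult(alert=False)
    clip = extract_clip(video, window)
    # The Reason stage sees both the full video and the focused clip.
    report = reason(video, clip)
    rollouts: Dict[str, Any] = {}
    if predict is not None:
        # One rollout per prompt: prevented collision and observed trajectory.
        for scenario in ("prevented", "observed"):
            rollouts[scenario] = predict(report, scenario)
    return PipelineResult(alert=True, report=report, rollouts=rollouts)
```

Passing `predict=None` models the "Run Predict Rollout?" branch being skipped, in which case only the structured report would reach the dashboard.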

🚀 Features

  • End-to-End Pipeline: Fully orchestrated from raw MP4 video to intelligent analysis and generated video continuations.
  • Streamlit Dashboard: A rich local dashboard for debugging, visualizing timelines, and reviewing logs.
  • Visual Diagnostics: Dynamically generates gradient saliency maps, bounding-box overlays, risk gauges, and artifact heatmaps.

📂 Repository Structure

.
├── demo_streamlit.py         # Streamlit local dashboard
├── main.py                   # Entry point (launches Streamlit)
├── badas_detector.py         # BADAS model loading and sliding-window inference
├── cosmos_risk_narrator.py   # Cosmos Reason 2 prompt building and inference
├── cosmos_predict_runner.py  # Cosmos Predict 2.5 generation logic
├── extract_clip.py           # Focused clip extraction utility
└── main_pipeline.py          # CLI orchestration for the full pipeline

💻 Quickstart (Local Streamlit)

1. Requirements

  • NVIDIA GPU (Ampere or newer, e.g., RTX 3090, A100, H100)
  • Linux (Ubuntu 22.04+)
  • Python 3.10+

2. Install Dependencies

You need to install the dependencies for both the pipeline and the vendored Cosmos Predict package.

pip install -r requirements.txt

Note: To run the Cosmos Predict module locally, you must also follow the Cosmos Predict 2.5 Setup Guide to install its uv workspace dependencies.

3. Authentication

You need a Hugging Face token to download the gated models (BADAS and Cosmos).

export HF_TOKEN="your_hugging_face_token"
# Optional: Set a persistent cache directory to avoid re-downloading models
export HF_HOME="/path/to/your/large/storage/.huggingface"
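Because the gated models cannot be downloaded without a token, it can help to fail fast before any model loading starts. The helper below is a minimal sketch using only the standard library; `require_hf_token` is a hypothetical name and is not part of this repository.

```python
import os
from typing import Mapping


def require_hf_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the Hugging Face token from the environment, or raise a
    clear error if HF_TOKEN is missing or empty."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Export it before running, e.g. "
            'export HF_TOKEN="your_hugging_face_token"'
        )
    return token
```

Calling this once at startup (for example, at the top of an entry-point script) would turn a missing token into an immediate, readable error instead of a failed download later on.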

4. Run the Streamlit Dashboard

streamlit run demo_streamlit.py

☁️ Hugging Face Space

A Gradio-based version of this app lives on the huggingface-spaces branch and is deployed as Cosmos Sentinel on Hugging Face Spaces.

📚 Acknowledgements & References
