# video-reason-experiments

Wrapper for running video reasoning experiments using VMEvalKit.

## Setup

```bash
# Clone and set up the repository
git clone https://github.com/hokindeng/video-reason-experiments.git
cd video-reason-experiments
git submodule update --init --recursive --remote --merge

# Environment
python3 -m venv env && source env/bin/activate
pip install -r requirements.txt
export OPENAI_API_KEY="your-openai-api-key"  # Required for the GPT-4o evaluator

# AWS credentials (optional)
cp env.template .env  # Edit with your AWS credentials
```
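A forgotten evaluator key is a common cause of failed runs. A quick sanity check before launching anything (a hypothetical snippet, not shipped with the repo; the placeholder export keeps it self-contained):

```bash
# Fail fast if the key required by the GPT-4o evaluator is missing.
# (Hypothetical check, not part of the repo; remove the placeholder
# export and rely on your real environment in practice.)
export OPENAI_API_KEY="your-openai-api-key"
if [ -n "${OPENAI_API_KEY:-}" ]; then
  echo "OPENAI_API_KEY is set"
else
  echo "OPENAI_API_KEY is missing" >&2
  exit 1
fi
```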

## Usage

```bash
# Generate videos for a single model
./scripts/run_inference.sh --model hunyuan-video-i2v --gpu 0 --questions-dir ./data/questions

# Generate all videos automatically (e.g. videocrafter2-512)
bash run_all_parallel.sh videocrafter2-512
```
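Several models can be queued back to back with a plain loop. A sketch: the model names are examples and must match entries in VMEvalKit/docs/MODELS.md, and `echo` is kept so the commands are printed rather than launched:

```bash
# Print the generation command for each model in turn;
# drop the `echo` to actually start the runs.
for model in videocrafter2-512 hunyuan-video-i2v; do
  echo "bash run_all_parallel.sh $model"
done
```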

```bash
# Evaluate videos: combine an evaluation method (frame-sampling strategy)
# with an evaluator (VLM backend).
# Methods: last_frame, multi_frame_uniform, keyframe_detection, hybrid_sampling
# Evaluators: gpt4o, internvl, qwen (must be specified)
./scripts/run_evaluation.sh --eval-method last_frame --evaluator gpt4o
./scripts/run_evaluation.sh --eval-method multi_frame_uniform --evaluator internvl
```
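All four sampling methods can be swept against a single evaluator in one loop. A sketch, assuming the `run_evaluation.sh` flags shown above; `echo` prints the commands instead of executing them:

```bash
# Cross every sampling method with one evaluator;
# drop the `echo` to execute the sweep for real.
for method in last_frame multi_frame_uniform keyframe_detection hybrid_sampling; do
  echo "./scripts/run_evaluation.sh --eval-method $method --evaluator gpt4o"
done
```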

```bash
# S3 sync
python data/s3_sync.py upload ./data/outputs s3://your-bucket/outputs
python data/s3_sync.py download s3://your-bucket/questions ./data/questions
```

## Models

Supported models: all 29+ models in VMEvalKit (see `VMEvalKit/docs/MODELS.md` for the full list).

Evaluation methods:

- `last_frame` - fast single-frame evaluation (recommended)
- `multi_frame_uniform`, `keyframe_detection`, `hybrid_sampling` - multi-frame evaluation

Evaluators: `gpt4o`, `internvl`, `qwen` (an evaluator must be specified explicitly).

## License

Apache License 2.0 - see the LICENSE file for details.