VJEPA2

Self-supervised visual representation learning from video. Part of the Zen LM ecosystem.

Overview

VJEPA2 implements the Video Joint-Embedding Predictive Architecture (V-JEPA), which learns visual representations from unlabeled video without relying on hand-crafted augmentations.
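As a rough illustration of the joint-embedding predictive idea, the sketch below compares predicted embeddings with target embeddings at masked positions only, so the loss lives in representation space rather than pixel space. It is not taken from this repository: the function name `masked_embedding_loss`, the toy vectors, and the choice of L1 distance are all assumptions made for the example.

```python
def masked_embedding_loss(predicted, target, masked_idx):
    """Mean L1 distance between predicted and target embeddings,
    computed over the masked positions only (hypothetical helper)."""
    if not masked_idx:
        raise ValueError("masked_idx must not be empty")
    total = 0.0
    count = 0
    for i in masked_idx:
        # Compare the two embedding vectors element by element.
        for p, t in zip(predicted[i], target[i]):
            total += abs(p - t)
            count += 1
    return total / count


# Toy data: 3 patch positions with 2-dimensional embeddings (made-up numbers).
predicted = [[0.0, 0.0], [1.0, 2.0], [0.5, 0.5]]
target = [[9.0, 9.0], [1.0, 4.0], [0.5, 1.5]]

# Position 0 is visible context, so it does not contribute to the loss.
loss = masked_embedding_loss(predicted, target, masked_idx=[1, 2])
print(loss)
```

In a real training loop the predicted embeddings would come from a predictor network and the targets from a separate target encoder; both are replaced by fixed lists here to keep the example self-contained.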

Features

  • Self-supervised learning from video
  • No hand-crafted augmentations required
  • Pre-trained visual encoder for downstream tasks
  • Efficient training with masking strategies
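To make the masking point more concrete, here is a minimal sketch of one possible strategy: a random "tube" mask that hides the same spatial patches in every frame. This is an assumption chosen for illustration, not necessarily the strategy this project uses; `tube_mask` and its parameters are hypothetical names.

```python
import random


def tube_mask(frames, height, width, mask_ratio, seed=None):
    """Return a frames x height x width grid of booleans (True = masked patch).

    The same spatial positions are masked in every frame ("tube" masking).
    Hypothetical helper for illustration only.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = random.Random(seed)
    num_spatial = height * width
    num_masked = round(mask_ratio * num_spatial)
    # Pick the spatial patch indices to hide, without replacement.
    masked = set(rng.sample(range(num_spatial), num_masked))
    frame_mask = [
        [(r * width + c) in masked for c in range(width)]
        for r in range(height)
    ]
    # Copy the single-frame mask along the time axis.
    return [[row[:] for row in frame_mask] for _ in range(frames)]


mask = tube_mask(frames=4, height=6, width=6, mask_ratio=0.75, seed=0)
visible = sum(not cell for row in mask[0] for cell in row)
print(f"visible patches per frame: {visible} of {6 * 6}")
```

With a high mask ratio, only a small fraction of patches stays visible, which is where the efficiency gain in masked training typically comes from when the encoder processes visible patches only.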

Related

  • jin — Multimodal understanding framework
  • Zen LM — Full model family

License

See LICENSE file.

Part of the Zen LM ecosystem by Hanzo AI