AcT2I

Official code release for "AcT2I: Evaluating and Improving Action Depiction in Text-to-Image Models".

Installation

pip install -e .            # core dependencies
pip install -e ".[analysis]" # + spacy, matplotlib, seaborn, plotly
pip install -e ".[all]"      # everything including dev tools

Overview

The act2i package provides:

Prompt Enhancement (act2i.prompt) — LLM-based knowledge distillation to enrich T2I prompts along emotional, spatial, and temporal dimensions.
Image Generation (act2i.generate) — Diffusers-based T2I generation across multiple seeds and prompt variants.
Feature Extraction (act2i.features) — DINOv2 and SigLIP feature extraction for reference image sets.
Evaluation (act2i.evaluate) — CLIPScore, DINOv2 similarity scoring, OWLv2 zero-shot object detection, and classification metrics.
Analysis (act2i.analysis) — Structural NLP analysis of prompt quality.

See scripts/ for the CLI entry points.
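
To make the scoring metrics listed under Evaluation more concrete, here is a minimal sketch of how an embedding-based score can be computed once features are available. It is not the act2i implementation: the function names (cosine, clip_score, reference_similarity) are hypothetical, plain Python lists stand in for real CLIP or DINOv2 embeddings, and averaging over the reference set is an assumption about how reference similarity might be aggregated. The CLIPScore form w * max(cos, 0) with w = 2.5 follows the commonly used definition of that metric.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def clip_score(text_emb, image_emb, w=2.5):
    """CLIPScore-style value: rescaled cosine similarity, clipped at zero."""
    return w * max(cosine(text_emb, image_emb), 0.0)


def reference_similarity(image_feat, reference_feats):
    """Mean cosine similarity of one image feature to a reference set.

    Averaging is an illustrative assumption, not the package's method.
    """
    if not reference_feats:
        raise ValueError("reference_feats must not be empty")
    total = sum(cosine(image_feat, ref) for ref in reference_feats)
    return total / len(reference_feats)


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real model embeddings.
    print(clip_score([1.0, 0.0], [1.0, 0.0]))
    print(reference_similarity([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

In practice the embeddings would come from the feature extractors named above, and the package's own scoring code should be treated as authoritative.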

Citation

@article{malaviya2025act2i,
  author    = {Malaviya, Vatsal and Chatterjee, Agneet and Patel, Maitreya and Yang, Yezhou and Baral, Chitta},
  title     = {AcT2I: Evaluating and Improving Action Depiction in Text-to-Image Models},
  journal   = {arXiv preprint arXiv:2509.16141},
  year      = {2025}
}

License

This project is licensed under the MIT License. See the LICENSE file for details.