iWISDM, short for instructed-Virtual VISual Decision Making, is a virtual environment capable of generating a limitless array of vision-language tasks of varying complexity.
It is a toolkit designed to evaluate the ability of multimodal models to follow instructions in visual tasks. It builds on the compositional nature of human behavior and on the fact that complex tasks are often constructed by combining smaller task units in time.
iWISDM encompasses a broad spectrum of subtasks that engage executive functions such as inhibition of action, working memory, attentional set, task switching, and schema generalization.
It is also a scalable and extensible framework which allows users to easily define their own task space and stimuli dataset.
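To illustrate this compositional principle, here is a toy sketch in plain Python (not the iWISDM API; all names below are made up for illustration): a temporal task is built by chaining small task units across the frames of a trial.

```python
# Toy illustration of composing task units in time (NOT the iWISDM API).
# Each "unit" is a function over a frame; a temporal task chains units
# across frames and combines their outputs into a single decision.

frames = [
    {"category": "car", "location": "left"},     # frame 0
    {"category": "plane", "location": "right"},  # frame 1
    {"category": "car", "location": "right"},    # frame 2
]

def attend(attribute):
    """Unit: read one attribute of a frame."""
    return lambda frame: frame[attribute]

def same(a, b):
    """Unit: compare two attended values."""
    return a == b

# Composite task: "is the category of frame 0 the same as that of frame 2?"
task = lambda fs: same(attend("category")(fs[0]), attend("category")(fs[2]))

print(task(frames))  # True: both frames show a car
```

Larger tasks follow the same pattern: units such as attending, remembering, and comparing are wired into a graph whose nodes are evaluated over the frames of a trial.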
Using iWISDM, we have compiled four distinct benchmarks of increasing complexity for evaluating large multimodal models.
Below is an example of the generated tasks:
These datasets can be generated with the scripts in /benchmarking or downloaded at: iWISDM_benchsets.tar.gz
iWISDM inherits several classes from COG (github.com/google/cog) to build task graphs. For convenience, we have also pre-implemented several commonly used cognitive tasks in task_bank.py.
To install the iWISDM package, simply run:

pip install iwisdm

If you would like to install the package from source, clone the repository and follow the instructions below:

curl -sSL https://install.python-poetry.org | python3 -
conda create --name iwisdm python=3.11
poetry install

To initialize the ShapeNet environment, you will need to download the ShapeNet dataset, which is used for rendering the trials.
To replicate our experiments, you also need to download the benchmarking configurations.
ShapeNet is a large-scale repository of shapes represented by 3D CAD models of objects (Chang et al., 2015).
# imports
import json

from iwisdm import make
from iwisdm import read_write
# environment initialization
with open('your/path/to/env_config', 'r') as f:
config = json.load(f) # using pre-defined AutoTask configuration
env = make(
env_id='ShapeNet',
dataset_fp='your/path/to/shapenet_handpicked',
)
env.set_env_spec(
env.init_env_spec(
auto_gen_config=config,
)
)
# AutoTask procedural task generation and saving trial
tasks = env.generate_tasks(10) # generate 10 random task graphs and tasks
_, (_, temporal_task) = tasks[0]
trials = env.generate_trials(tasks=[temporal_task]) # generate a trial
imgs, _, info_dict = trials[0]
read_write.write_trial(imgs, info_dict, 'output/trial_0')

See /tutorials for more examples.
This repository builds upon the foundational work presented in the COG paper (Yang et al.).
Yang, Guangyu Robert, et al. "A dataset and architecture for visual reasoning with a working memory." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
If you find iWISDM useful in your research, please cite it using the following BibTeX:
@InProceedings{pmlr-v274-lei25a,
title = {iWISDM: Assessing instruction following in multimodal models at scale},
author = {Lei, Xiaoxuan and Gomez, Lucas and Bai, Hao Yuan and Bashivan, Pouya},
booktitle = {Proceedings of The 3rd Conference on Lifelong Learning Agents},
pages = {457--480},
year = {2025},
editor = {Lomonaco, Vincenzo and Melacci, Stefano and Tuytelaars, Tinne and Chandar, Sarath and Pascanu, Razvan},
volume = {274},
series = {Proceedings of Machine Learning Research},
month = {29 Jul--01 Aug},
publisher = {PMLR},
pdf = {https://raw.githubusercontent.com/mlresearch/v274/main/assets/lei25a/lei25a.pdf},
url = {https://proceedings.mlr.press/v274/lei25a.html}
}

