Context
NNsight provides explicit support and convenience operations for certain model types under src/nnsight/modeling/. As of this writing, the supported model types are:
- LanguageModel — causal language models (GPT-2, LLaMA, etc.)
- VisionLanguageModel — vision-language models (LLaVA, Qwen2-VL, etc.)
- DiffusionModel — diffusion pipelines (Stable Diffusion, etc.)
These wrappers handle model-specific concerns like tokenization, input preprocessing, batching, and generation — so users can pass natural inputs (strings, images, etc.) directly into .trace() and .invoke().
Many model types are not yet supported with dedicated wrappers: audio models (Whisper, EnCodec), speech-to-text, text-to-speech, video models, multimodal models beyond vision-language, and more.
The Bounty
Pick one of the following:
Option A: Create a new model type wrapper
Write a new class under src/nnsight/modeling/ for a currently unsupported model type. Good candidates include:
- Audio models — e.g., Whisper (speech-to-text), EnCodec (audio codec), MusicGen (music generation)
- Text-to-speech models — e.g., Bark, SpeechT5
- Video models — e.g., CogVideo, VideoMAE
- Image generation models (non-diffusion) — e.g., VAE-based generators
- Encoder-only models — e.g., a dedicated BERT/ViT wrapper with task-specific conveniences
- Or any other model type you find interesting
Your wrapper should:
- Inherit from the appropriate base class (HuggingFaceModel, TransformersModel, etc.)
- Handle input preprocessing (e.g., audio file → features, text → tokens)
- Implement _prepare_input() and _batch() for multi-input batching via invokers
- Include a working example demonstrating a trace with at least one intervention (e.g., saving or modifying an intermediate activation)
Look at language.py and vlm.py as references for how existing wrappers are structured.
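To make the _prepare_input()/_batch() contract concrete, here is a rough, library-free sketch of the shape such a wrapper takes. Everything here is illustrative: the real class would inherit NNsight's actual base class (the stub below only exists so the sketch runs standalone), and the exact method signatures should be copied from language.py, not from this sketch.

```python
class HuggingFaceModelStub:
    """Stand-in for nnsight's real base class, so this sketch runs alone."""


class AudioModel(HuggingFaceModelStub):
    """Hypothetical wrapper for speech models such as Whisper."""

    def __init__(self, processor):
        # In a real wrapper this would be e.g. a transformers WhisperProcessor.
        self.processor = processor

    def _prepare_input(self, audio, **kwargs):
        # Raw waveform (or audio file) -> model-ready features.
        features = self.processor(audio)
        return (features,), kwargs

    def _batch(self, batched, features):
        # Merge one invoke's prepared input into the running batch, so that
        # several .invoke() calls execute as a single forward pass.
        if batched is None:
            batched = []
        batched.append(features)
        return batched


# Toy "processor" standing in for real feature extraction.
model = AudioModel(processor=lambda wav: [s * 2 for s in wav])
(features,), _ = model._prepare_input([0.5, 1.0])
batch = model._batch(None, features)
batch = model._batch(batch, features)
print(len(batch))  # two invokes merged into one batch
```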
Option B: Extend an existing model type
Add meaningful new features or helpers to one of the existing model type wrappers. For example:
- Better generation support or sampling controls
- New convenience methods for common intervention patterns
- Support for additional input formats or preprocessing options
- Improved handling of model-specific outputs (e.g., attention masks, cross-attention)
Submission
Add your submission to submissions/<issue-number>-<your-github-username>/ with:
- A README.md explaining your approach, what model type you chose and why, and how to test it
- Your code (the new or modified wrapper class, plus any utilities)
- A working example script or notebook demonstrating the wrapper in action
- Either include the code directly in your submission, or link to an open NNsight PR — both are accepted
Resources