- Separate results from clip benchmark repo
- Update the default output file to `{dataset}_{pretrained}_{model}_{language}_{task}.json`, so the same `result.json` file is no longer overwritten on every run
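As a rough illustration of the change above, the per-run output filename can be built from the placeholder fields by ordinary string formatting. This is a hedged sketch, not the repo's code; the field values and the `_` separator are illustrative assumptions.

```python
# Illustrative sketch (not the actual clip_benchmark implementation):
# expand a placeholder-based output template so each run gets its own file.
template = "{dataset}_{pretrained}_{model}_{language}_{task}.json"

out = template.format(
    dataset="cifar10",              # illustrative values, not real defaults
    pretrained="laion400m_e32",
    model="ViT-B-32",
    language="en",
    task="zeroshot_classification",
)
print(out)  # cifar10_laion400m_e32_ViT-B-32_en_zeroshot_classification.json
```

Because every field that distinguishes a run appears in the name, parallel evaluations no longer clobber each other's results.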
- Fix dead ImageNetV2 URLs
- Add XTD_10 dataset
- Fix issues (#131, #125, #138)
- Fix missing SugarCrepe example (#119), thanks to @samarth4149
- Fix an issue where zero-shot templates were overwritten (#109)
- Support new multilingual retrieval datasets: Crossmodal-3600, XTD10, Flickr30k-200, and XTD200
- Support tuning linear probing on a validation set
- Support custom classnames and templates
- Support WebDataset (wds) for captioning evaluation
- Support ImageNet-W
- Support Babel-ImageNet
- Support Chinese Flickr30k/Flickr8k
- Support SugarCrepe (compositionality evaluation)
- Support optional sharded evaluation based on rank, for parallel runs
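To make the rank-based sharding entry above concrete, here is a minimal sketch of one common scheme: each parallel run takes every `world_size`-th task starting at its rank. This is an assumption about the approach, not the repo's actual code; the function name and task list are illustrative.

```python
# Hedged sketch (hypothetical helper, not clip_benchmark's implementation):
# assign a disjoint round-robin slice of the evaluation tasks to each rank.
def shard_tasks(tasks, rank, world_size):
    """Return the subset of tasks this rank should evaluate."""
    return tasks[rank::world_size]

tasks = ["cifar10", "cifar100", "imagenet1k", "flickr30k", "mscoco_captions"]

# With 2 parallel runs, ranks 0 and 1 cover disjoint subsets of all tasks:
print(shard_tasks(tasks, rank=0, world_size=2))  # ['cifar10', 'imagenet1k', 'mscoco_captions']
print(shard_tasks(tasks, rank=1, world_size=2))  # ['cifar100', 'flickr30k']
```

Round-robin slicing keeps the shards balanced to within one task and requires no coordination between runs.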
- fix many issues
- Fix silent WebDataset error handling
- Add support for wds/voc2007_multilabel
- Default to float32
- Add MS-COCO generative (captioning) benchmark
- Update Flickr8k results and fix issue #48, thanks to @orchidmajumder
- Evaluate multiple models/datasets/languages using the CLI directly
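Conceptually, the multi-model/dataset/language evaluation above amounts to running one evaluation per element of the Cartesian product of the three lists. A small sketch under that assumption (the names below are illustrative, and the real CLI handles the looping itself):

```python
from itertools import product

# Hedged sketch: enumerate every (model, dataset, language) combination
# that a multi-target evaluation would run. Values are illustrative.
models = ["ViT-B-32", "ViT-L-14"]
datasets = ["cifar10", "flickr30k"]
languages = ["en", "jp"]

runs = list(product(models, datasets, languages))
print(len(runs))  # 8 evaluation runs in total
for model, dataset, language in runs:
    # A real runner would launch one evaluation per combination here.
    pass
```

Combined with per-run output filenames, each combination writes its own result file.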
- Support Japanese CLIP by rinna
- Add Arabic ImageNet
- Update CuPL prompts with more generated sentences, ensembled with the OpenAI prompts
- Put the model in eval mode before evaluation
- Webdataset updates
- Make verbose output the default
- Added support for loading webdatasets
- Added better support for multilingual eval
- Added better support for linear probing
- Added support for CuPL prompts
- Render the PyPI description as Markdown
- Actual first release on PyPI.
- First release on PyPI.