RuntimeError: Sizes of tensors must match except in dimension 1. [WSL on Windows] #15

@tjennings

Hey! Thanks for your hard work creating captionr. Unfortunately, I'm seeing the error below on both native Windows and WSL. Any help is appreciated!

Python version is 3.8.10.

Command:

python captionr.py /mnt/d/model_training/deltron/images/500px/people --blip2_question_file /mnt/d/model_training/deltron/captions/blip2/question.txt --prepend_text "a photo of " --existing=skip --cap_length=75 --blip_pass --use_blip2 --blip2_model blip2_opt/pretrain_opt6.7b --clip_model_name=ViT-L-14/openai --uniquify_tags --device=cuda --extension=txt

Exception:

ERROR:root:Exception during BLIP captioning
Traceback (most recent call last):
  File "/mnt/d/model_training/code/captionr/captionr/captionr_class.py", line 139, in process_img
    new_caption = config._blip.caption(img)
  File "/mnt/d/model_training/code/captionr/captionr/blip2_cap.py", line 22, in caption
    return self.model.generate({"image": image})[0]
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/lavis/models/blip2_models/blip2_opt.py", line 213, in generate
    outputs = self.opt_model.generate(
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/transformers/generation/utils.py", line 1490, in generate
    return self.beam_search(
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/transformers/generation/utils.py", line 2749, in beam_search
    outputs = self(
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/lavis/models/blip2_models/modeling_opt.py", line 1037, in forward
    outputs = self.model.decoder(
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/lavis/models/blip2_models/modeling_opt.py", line 703, in forward
    inputs_embeds = torch.cat([query_embeds, inputs_embeds], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 25 but got size 5 for tensor number 1 in the list.
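
To make the message easier to read: `torch.cat(..., dim=1)` requires every dimension other than 1 to agree across its inputs, and here the batch dimension (dimension 0) is 25 for `query_embeds` but 5 for `inputs_embeds`. The sketch below restates that rule in plain Python on shape tuples, so it runs without torch. The helper `cat_shape` and the sequence and hidden sizes (32, 4, 4096) are illustrative assumptions; only the batch sizes 25 and 5 come from the traceback. Since 25 = 5 × 5, one possible reading (an assumption, not confirmed in this report) is that `query_embeds` was expanded for beam search twice while `inputs_embeds` was expanded once.

```python
from typing import Sequence, Tuple


def cat_shape(shapes: Sequence[Tuple[int, ...]], dim: int) -> Tuple[int, ...]:
    """Hypothetical helper: return the shape a concatenation along `dim`
    would produce, or raise if any other dimension disagrees."""
    first = shapes[0]
    out = list(first)
    for idx, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(first):
            raise ValueError("All tensors must have the same number of dimensions.")
        for axis, (expected, got) in enumerate(zip(first, shape)):
            # Only the concatenation axis is allowed to differ.
            if axis != dim and expected != got:
                raise ValueError(
                    f"Sizes of tensors must match except in dimension {dim}. "
                    f"Expected size {expected} but got size {got} "
                    f"for tensor number {idx} in the list."
                )
        out[dim] += shape[dim]
    return tuple(out)


# Batch sizes 25 and 5 are from the traceback; the other sizes are assumed.
query_embeds = (25, 32, 4096)
inputs_embeds = (5, 4, 4096)

try:
    cat_shape([query_embeds, inputs_embeds], dim=1)
except ValueError as exc:
    message = str(exc)
    print(message)

# With matching batch sizes, the concatenation along dim=1 goes through.
fixed = cat_shape([(5, 32, 4096), (5, 4, 4096)], dim=1)
print(fixed)  # (5, 36, 4096)
```

If the double-expansion reading is right, the mismatch would come from how the installed `transformers` and `salesforce-lavis` versions interact during beam search, not from the images or the command-line flags.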

Pip packages:

```
Package Version
altair 4.2.2
antlr4-python3-runtime 4.9.3
asttokens 2.2.1
attrs 22.2.0
backcall 0.2.0
backports.zoneinfo 0.2.1
blinker 1.5
blip-vit 0.0.3
blis 0.7.9
braceexpand 0.1.7
cachetools 5.3.0
catalogue 2.0.8
certifi 2022.12.7
cfgv 3.3.1
charset-normalizer 3.1.0
click 8.1.3
cmake 3.26.0
confection 0.0.4
contexttimer 0.3.3
contourpy 1.0.7
cycler 0.11.0
cymem 2.0.7
decorator 5.1.1
decord 0.6.0
distlib 0.3.6
einops 0.6.0
entrypoints 0.4
executing 1.2.0
fairscale 0.4.4
filelock 3.10.0
fonttools 4.39.2
ftfy 6.1.1
gitdb 4.0.10
GitPython 3.1.31
huggingface-hub 0.13.2
identify 2.5.21
idna 3.4
imageio 2.26.0
importlib-metadata 6.0.0
importlib-resources 5.12.0
iopath 0.1.10
ipython 8.11.0
jedi 0.18.2
Jinja2 3.1.2
jsonschema 4.17.3
kaggle 1.5.13
kiwisolver 1.4.4
langcodes 3.3.0
lazy-loader 0.1
Levenshtein 0.20.9
lit 15.0.7
markdown-it-py 2.2.0
MarkupSafe 2.1.2
matplotlib 3.7.1
matplotlib-inline 0.1.6
mdurl 0.1.2
mpmath 1.3.0
murmurhash 1.0.9
networkx 3.0
nodeenv 1.7.0
numpy 1.24.2
omegaconf 2.3.0
open-clip-torch 2.16.0
opencv-python-headless 4.5.5.64
opendatasets 0.1.22
packaging 23.0
pandas 1.5.3
parso 0.8.3
pathy 0.10.1
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.4.0
pip 21.1.1
pkgutil-resolve-name 1.3.10
platformdirs 3.1.1
plotly 5.13.1
portalocker 2.7.0
pre-commit 3.2.0
preshed 3.0.8
prompt-toolkit 3.0.38
protobuf 3.20.3
ptyprocess 0.7.0
pure-eval 0.2.2
pyarrow 11.0.0
pycocoevalcap 1.2
pycocotools 2.0.6
pydantic 1.10.6
pydeck 0.8.0
Pygments 2.14.0
Pympler 1.0.1
pyparsing 3.0.9
pyrsistent 0.19.3
python-dateutil 2.8.2
python-Levenshtein 0.20.9
python-magic 0.4.27
python-slugify 8.0.1
pytz 2022.7.1
pytz-deprecation-shim 0.1.0.post0
PyWavelets 1.4.1
PyYAML 6.0
rapidfuzz 2.13.7
regex 2022.10.31
requests 2.28.2
rich 13.3.2
salesforce-lavis 1.0.0
scikit-image 0.20.0
scipy 1.9.1
semver 2.13.0
sentencepiece 0.1.97
setuptools 56.0.0
six 1.16.0
smart-open 6.3.0
smmap 5.0.0
spacy 3.5.1
spacy-legacy 3.0.12
spacy-loggers 1.0.4
srsly 2.4.6
stack-data 0.6.2
streamlit 1.20.0
sympy 1.11.1
tenacity 8.2.2
text-unidecode 1.3
thefuzz 0.19.0
thinc 8.1.9
tifffile 2023.3.15
timm 0.4.12
tokenizers 0.13.2
toml 0.10.2
toolz 0.12.0
torch 2.0.0+cu117
torchvision 0.15.1+cu117
tornado 6.2
tqdm 4.65.0
traitlets 5.9.0
transformers 4.28.0.dev0
triton 2.0.0
typer 0.7.0
typing-extensions 4.5.0
tzdata 2022.7
tzlocal 4.3
urllib3 1.26.15
validators 0.20.0
virtualenv 20.21.0
wasabi 1.1.1
watchdog 2.3.1
wcwidth 0.2.6
webdataset 0.2.43
wheel 0.40.0
zipp 3.15.0
```
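
For anyone comparing environments, the full list is long, so here is a small sketch that prints only the packages that appear in the traceback or the command. It uses the standard-library `importlib.metadata` (available from Python 3.8). The `RELEVANT` list and the `installed_versions` helper are my own illustrative choices, not part of captionr.

```python
from importlib import metadata  # standard library since Python 3.8

# Assumed shortlist: packages named in the traceback or the command above.
RELEVANT = ["torch", "transformers", "salesforce-lavis", "open-clip-torch"]


def installed_versions(names):
    """Hypothetical helper: map each distribution name to its installed
    version string, or None when it is not installed."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(RELEVANT)
for name, version in report.items():
    print(f"{name} {version or 'not installed'}")
```

Run it inside the same virtual environment that runs `captionr.py`, otherwise it reports the versions of a different interpreter.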
