Failed to run demo/inference.py on multiple GPUs with RuntimeError: Expected all tensors to be on the same device #7

@shiqi-dai

Description

I successfully ran demo/inference.py on the CPU, but it responds slowly. Due to limited memory on a single 3090 GPU, I attempted to run it on two GPUs. However, I encounter an error in Chat.answer(): "RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!". Screenshot of the error:

[Screenshot 2024-06-15 at 00 20 40]

I also printed the device map of the model:

[image]

I am unsure why this error occurs. I have spent all day trying to fix it. Any insights or solutions would be greatly appreciated.
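A common cause of this error when a model is sharded across GPUs (e.g. via a `device_map` in Accelerate/Transformers) is that the input tensors are moved to a fixed device like `"cuda"` (i.e. `cuda:0` generally) instead of the device that actually holds the model's first layers. A minimal sketch of the idea, assuming an `hf_device_map`-style dict like the one printed above (the dict shape and `first_device` helper are hypothetical, for illustration only):

```python
# Hypothetical sketch: pick the device of the earliest-loaded module
# from an hf_device_map-style dict, and move inputs there.
def first_device(device_map):
    """Return the device holding the first module in the map.

    hf_device_map-style dicts preserve insertion order, and the first
    entry is typically the embedding layer, which receives the inputs.
    """
    first_module = next(iter(device_map))
    dev = device_map[first_module]
    # Entries are either GPU indices (ints) or strings like "cpu"/"disk".
    return f"cuda:{dev}" if isinstance(dev, int) else str(dev)


# Example map resembling a two-GPU shard (module names are illustrative):
device_map = {
    "model.embed_tokens": 0,
    "model.layers.0": 0,
    "model.layers.20": 1,
    "lm_head": 1,
}
print(first_device(device_map))  # → cuda:0
```

With a real Transformers model loaded with `device_map="auto"`, the same idea would be to send the tokenized inputs to `first_device(model.hf_device_map)` (rather than a hard-coded `.to("cuda")` or `.to("cuda:1")`) before calling generation; Accelerate's hooks then move activations between shards automatically. If `Chat.answer()` hard-codes a device internally, that line would need the analogous change.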
