
fp16 inference output is all exclamation marks #48


Description

@spiderlx

Am I missing something that makes inference go wrong when llm_dtype=fp16, or does the model simply not support fp16 inference?

The inference script is demo2.py:

import torch
from model import FunASRNano


def main():
    model_dir = "/workspace/fun-asr-nano/model_bin/Fun-ASR-Nano-2512"
    device = (
        "cuda:0"
        if torch.cuda.is_available()
        else "mps" if torch.backends.mps.is_available() else "cpu"
    )
    m, kwargs = FunASRNano.from_pretrained(model=model_dir, device=device)
    m.eval()

    wav_path = f"/workspace/fun-asr-nano/data_bin/yue.mp3"
    res = m.inference(data_in=[wav_path],llm_dtype='fp16', **kwargs)
    text = res[0][0]["text"]
    print(text)


if __name__ == "__main__":
    main()
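
For a quick side-by-side comparison, the call above can be looped over all three dtypes inside main() (a trivial variation of the script, using only the API shown above):

    for dtype in ("fp32", "bf16", "fp16"):
        res = m.inference(data_in=[wav_path], llm_dtype=dtype, **kwargs)
        print(dtype, res[0][0]["text"])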

model.py is the current latest version of the file; the only change is a print added to the inference_llm function:

def inference_llm(
        self,
        data_in,
        data_lengths=None,
        key: list = None,
        tokenizer=None,
        frontend=None,
        **kwargs,
    ):
        inputs_embeds, contents, batch, source_ids, meta_data = self.inference_prepare(
            data_in, data_lengths, key, tokenizer, frontend, **kwargs
        )
        llm_dtype = kwargs.get("llm_dtype", "fp32")
        if llm_dtype == "fp32":
            llm_dtype = "fp16" if kwargs.get("fp16", False) else llm_dtype
            llm_dtype = "bf16" if kwargs.get("bf16", False) else llm_dtype
        print(llm_dtype)  # added for debugging: report the effective dtype
        device_type = torch.device(kwargs.get("device", "cuda")).type
        with torch.autocast(
            device_type=device_type if device_type in ["cuda", "xpu", "mps"] else "cpu",
            enabled=True if llm_dtype != "fp32" else False,
            dtype=dtype_map[llm_dtype]
        ):
            ...  # rest of the function unchanged
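
One possible cause worth noting here: fp16 and bf16 differ sharply in dynamic range. fp16 tops out at about 65504, while bf16 keeps fp32's 8-bit exponent, so an intermediate activation that overflows under fp16 autocast can survive under bf16. A minimal illustration (my own sketch, unrelated to the model code):

import torch

x = torch.tensor([70000.0])
print(x.to(torch.float16))   # tensor([inf], dtype=torch.float16): 70000 exceeds the fp16 max of ~65504
print(x.to(torch.bfloat16))  # tensor([70144.], dtype=torch.bfloat16): bf16 keeps fp32's exponent range

If any layer of the LLM produces values in this range, fp16 autocast would turn them into inf and the logits into NaN, which would match fp16 failing while fp32 and bf16 work.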

Here is the fp16 inference result:

python /workspace/fun-asr-nano/tmp-workspace/Fun-ASR/demo2.py
WARNING:root:trust_remote_code: True
Loading remote code successfully: model
fp16
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:151645 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

fp32 and bf16 both produce correct results. Here is the fp32 result:

poetry run python /workspace/fun-asr-nano/tmp-workspace/Fun-ASR/demo2.py
WARNING:root:trust_remote_code: True
Loading remote code successfully: model
fp32
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:151645 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
呢几个字都表达唔到我想讲嘅意思。

Here is the bf16 result:

poetry run python /workspace/fun-asr-nano/tmp-workspace/Fun-ASR/demo2.py
WARNING:root:trust_remote_code: True
Loading remote code successfully: model
bf16
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:151645 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
呢几个字都表达唔到我想讲嘅意思。
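
An all-"!" transcript usually means the logits went non-finite: with NaN logits, greedy decoding collapses to token id 0, which in Qwen-style BPE vocabularies reportedly decodes to "!". To narrow down where fp16 overflows first, a forward-hook sketch like the following could help; add_nan_hooks is a hypothetical debugging helper of mine, not part of FunASRNano:

import torch

def add_nan_hooks(model):
    # Hypothetical helper: print the first modules whose outputs stop
    # being finite, to locate where fp16 autocast overflows.
    def make_hook(name):
        def hook(module, args, output):
            t = output[0] if isinstance(output, tuple) else output
            if torch.is_tensor(t) and t.is_floating_point() and not torch.isfinite(t).all():
                print(f"non-finite output from {name} (dtype={t.dtype})")
        return hook
    for name, module in model.named_modules():
        module.register_forward_hook(make_hook(name))

Calling add_nan_hooks(m) before the fp16 m.inference(...) call should report which layer's activations overflow first, i.e. whether the problem is in the audio encoder or in the LLM itself.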

Here is my environment:

pip list
Package                  Version     Editable project location
------------------------ ----------- -------------------------
aliyun-python-sdk-core   2.16.0
aliyun-python-sdk-kms    2.16.5
antlr4-python3-runtime   4.9.3
audioread                3.1.0
decorator                5.2.1
editdistance             0.8.1
filelock                 3.20.1
fsspec                   2025.12.0
fun-asr-nano             0.1.0       /workspace/fun-asr-nano
funasr                   1.2.9
…
huggingface-hub          0.36.0
hydra-core               1.3.2
jieba                    0.42.1
Jinja2                   3.1.6
jmespath                 0.10.0
kaldiio                  2.18.1
librosa                  0.11.0
llvmlite                 0.46.0
mdurl                    0.1.2
modelscope               1.33.0
numba                    0.63.1
numpy                    1.26.4
nvidia-cublas-cu12       12.8.4.1
nvidia-cuda-cupti-cu12   12.8.90
nvidia-cuda-nvrtc-cu12   12.8.93
nvidia-cuda-runtime-cu12 12.8.90
nvidia-cudnn-cu12        9.10.2.21
nvidia-cufft-cu12        11.3.3.83
nvidia-cufile-cu12       1.13.1.3
nvidia-curand-cu12       10.3.9.90
nvidia-cusolver-cu12     11.7.3.90
nvidia-cusparse-cu12     12.5.8.93
nvidia-cusparselt-cu12   0.7.1
nvidia-nccl-cu12         2.27.3
nvidia-nvjitlink-cu12    12.8.93
nvidia-nvtx-cu12         12.8.90
sentencepiece            0.2.1
setuptools               80.9.0
shellingham              1.5.4
soundfile                0.13.1
soxr                     1.0.0
sympy                    1.14.0
tensorboardX             2.6.4
threadpoolctl            3.6.0
tokenizers               0.22.1
torch                    2.8.0
torch-complex            0.4.4
torchaudio               2.8.0
tqdm                     4.67.1
transformers             4.57.3
triton                   3.4.0
typer                    0.20.1
typing_extensions        4.15.0
umap-learn               0.5.9.post2
urllib3                  2.6.2

So, am I missing something that causes fp16 inference to fail, or does the model simply not support fp16 inference?
