IndexError: list index out of range when slices of a long audio file saved in memory are read for speech recognition #51


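Problem 1: After slicing a long audio file and saving the slices in memory (using io.BytesIO()), the following error is raised as soon as Fun-ASR reads a slice for speech recognition. Saving the slices to local disk avoids the problem, and SenseVoice Small also performs speech recognition normally: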

Traceback (most recent call last):
  File "d:\python\Fun-ASR\.venv\Lib\site-packages\gradio\queueing.py", line 766, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\python\Fun-ASR\.venv\Lib\site-packages\gradio\route_utils.py", line 355, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\python\Fun-ASR\.venv\Lib\site-packages\gradio\blocks.py", line 2147, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\python\Fun-ASR\.venv\Lib\site-packages\gradio\blocks.py", line 1629, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\python\Fun-ASR\.venv\Lib\site-packages\anyio\to_thread.py", line 61, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\python\Fun-ASR\.venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 2525, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "d:\python\Fun-ASR\.venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 986, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\python\Fun-ASR\.venv\Lib\site-packages\gradio\utils.py", line 1034, in wrapper
    response = f(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^
  File "D:\python\Fun-ASR\webui4srt_direct copy.py", line 116, in model_inference
    res = m.inference(
          ^^^^^^^^^^^^
  File "D:\python\Fun-ASR\model.py", line 602, in inference
    return self.inference_llm(
           ^^^^^^^^^^^^^^^^^^^
  File "D:\python\Fun-ASR\model.py", line 620, in inference_llm
    inputs_embeds, contents, batch, source_ids, meta_data = self.inference_prepare(
                                                            ^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\python\Fun-ASR\model.py", line 464, in inference_prepare
    contents = self.data_template(data_in[0])
                                  ~~~~~~~^^^
IndexError: list index out of range
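
The traceback shows that `data_in` was already empty when `inference_prepare` indexed it, but the report does not establish why. One common cause with `io.BytesIO()` is that the stream position is left at the end after writing, so a later `read()` returns no data. The sketch below is a minimal, standard-library illustration of slicing a WAV payload into in-memory buffers and rewinding each one. `make_silence_wav` and `slice_wav_to_buffers` are hypothetical helper names, not part of Fun-ASR, and whether `m.inference` accepts a `BytesIO` object at all is an assumption that should be checked against `model.py`.

```python
import io
import wave


def make_silence_wav(seconds: float, framerate: int = 16000) -> bytes:
    """Build a mono 16-bit WAV of silence (hypothetical helper, test data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(framerate)
        w.writeframes(b"\x00\x00" * int(seconds * framerate))
    return buf.getvalue()


def slice_wav_to_buffers(wav_bytes: bytes, chunk_seconds: float) -> list:
    """Split a WAV payload into in-memory WAV chunks of at most chunk_seconds each."""
    buffers = []
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        # Guard against a zero-frame chunk size, which would yield no chunks.
        frames_per_chunk = max(1, int(params.framerate * chunk_seconds))
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            buf = io.BytesIO()
            # wave does not close a file object it did not open itself,
            # so buf stays usable after the with-block.
            with wave.open(buf, "wb") as dst:
                dst.setnchannels(params.nchannels)
                dst.setsampwidth(params.sampwidth)
                dst.setframerate(params.framerate)
                dst.writeframes(frames)
            # Rewind: without this, buf.read() returns b"" to the next reader.
            buf.seek(0)
            buffers.append(buf)
    return buffers


if __name__ == "__main__":
    chunks = slice_wav_to_buffers(make_silence_wav(1.0), 0.75)
    for i, chunk in enumerate(chunks):
        print(i, chunk.tell(), len(chunk.getvalue()))
```

If rewinding does not help, the loader may only handle paths, raw `bytes`, or decoded arrays. In that case, passing `buf.getvalue()` or writing each slice to a temporary file are alternatives worth trying. Checking `len(buf.getvalue())` before calling `m.inference` would at least rule out an empty slice.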

Problem 2: Both inference methods report the following warnings:

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:151645 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
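
These messages are warnings emitted by the Hugging Face transformers `generate` method rather than exceptions. They appear when `generate` is called without an explicit `attention_mask` and `pad_token_id`. The sketch below shows, in plain Python, what such a mask looks like for a padded batch. `build_attention_mask` is a hypothetical helper, and where Fun-ASR's `model.py` calls `generate` is not shown in this issue, so the call in the trailing comment is an assumption about how the mask would be passed.

```python
def build_attention_mask(lengths, padding_side="right"):
    """Return a 0/1 mask per sequence: 1 for real positions, 0 for padding."""
    if padding_side not in ("left", "right"):
        raise ValueError("padding_side must be 'left' or 'right'")
    if not lengths:
        return []
    max_len = max(lengths)
    mask = []
    for n in lengths:
        real = [1] * n
        pad = [0] * (max_len - n)
        mask.append(pad + real if padding_side == "left" else real + pad)
    return mask


if __name__ == "__main__":
    print(build_attention_mask([3, 1]))
    print(build_attention_mask([3, 1], padding_side="left"))

# Assumed usage (variable names are hypothetical, not taken from Fun-ASR):
#   model.generate(
#       inputs_embeds=inputs_embeds,
#       attention_mask=torch.tensor(mask),
#       pad_token_id=tokenizer.pad_token_id,
#   )
```

For a single unpadded input the mask is simply all ones, so the warnings by themselves do not necessarily indicate wrong output. Passing the mask and `pad_token_id` explicitly removes the ambiguity.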


