ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')") #10

@besilate

Description

I'm using the workflow from the example, but I got this error:

E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] Triton compilation failed: triton_poi_fused__to_copy_0
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] def triton_poi_fused__to_copy_0(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] xnumel = 26214400
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] xoffset = tl.program_id(0) * XBLOCK
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] xindex = xoffset + tl.arange(0, XBLOCK)[:]
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] xmask = tl.full([XBLOCK], True, tl.int1)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] x0 = xindex
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] tmp0 = tl.load(in_ptr0 + (x0), None)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] tmp1 = tmp0.to(tl.float32)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] tl.store(out_ptr0 + (x0), tmp1, None)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] metadata: {'signature': {'in_ptr0': '*fp8e4nv', 'out_ptr0': '*fp32', 'xnumel': 'i32', 'XBLOCK': 'constexpr'}, 'device': 0, 'constants': {'XBLOCK': 1024}, 'configs': [{(0,): [['tt.divisibility', 16]], (1,): [['tt.divisibility', 16]], (2,): [['tt.divisibility', 16]]}], 'device_type': 'cuda', 'num_warps': 4, 'num_stages': 1, 'debug': True, 'cc': 86}
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] Traceback (most recent call last):
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py", line 713, in _precompile_config
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] binary = triton.compile(*compile_args, **compile_kwargs)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\compiler\compiler.py", line 339, in compile
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] module = src.make_ir(options, codegen_fns, module_map, context)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\compiler\compiler.py", line 83, in make_ir
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] module_map=module_map)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] File "ast.py", line 422, in visit
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] File "ast.py", line 430, in generic_visit
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] triton.compiler.errors.CompilationError: at 1:0:
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] def triton_poi_fused__to_copy_0(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] ^
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")
Error during model prediction: CompilationError: at 1:0:
def triton_poi_fused__to_copy_0(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
^
ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

0%| | 0/4 [00:01<?, ?it/s]
Error during sampling: CompilationError: at 1:0:
def triton_poi_fused__to_copy_0(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
^
ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

!!! Exception during processing !!! CompilationError: at 1:0:
def triton_poi_fused__to_copy_0(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
^
ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

Traceback (most recent call last):
File "H:\ComfyUI_windows_portable\ComfyUI\execution.py", line 496, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\ComfyUI\execution.py", line 315, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\ComfyUI\execution.py", line 289, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "H:\ComfyUI_windows_portable\ComfyUI\execution.py", line 277, in process_inputs
result = f(**inputs)
File "H:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes.py", line 3568, in process
raise e
File "H:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes.py", line 3462, in process
noise_pred, self.cache_state = predict_with_cfg(
~~~~~~~~~~~~~~~~^
latent_model_input,
^^^^^^^^^^^^^^^^^^^
...<3 lines>...
timestep, idx, image_cond, clip_fea, control_latents, vace_data, unianim_data, audio_proj, control_camera_latents, add_cond,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
cache_state=self.cache_state, fantasy_portrait_input=fantasy_portrait_input)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes.py", line 2623, in predict_with_cfg
raise e
File "H:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes.py", line 2524, in predict_with_cfg
noise_pred_cond, cache_state_cond = transformer(
~~~~~~~~~~~^
[z_pos], context=positive_embeds, y=[image_cond_input] if image_cond_input is not None else None,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...<3 lines>...
**base_params
^^^^^^^^^^^^^
)
^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
File "H:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\wanvideo\modules\model.py", line 2080, in forward
x, x_ip = block(x, x_ip=x_ip, **kwargs) #run block
~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_dynamo\eval_frame.py", line 375, in call
return super().call(*args, **kwargs)
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_dynamo\eval_frame.py", line 749, in compile_wrapper
raise e.remove_dynamo_frames() from None # see TORCHDYNAMO_VERBOSE=1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\compile_fx.py", line 923, in _compile_fx_inner
raise InductorError(e, currentframe()).with_traceback(
e.traceback
) from None
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\compile_fx.py", line 907, in _compile_fx_inner
mb_compiled_graph = fx_codegen_and_compile(
gm, example_inputs, inputs_to_check, **graph_kwargs
)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\compile_fx.py", line 1578, in fx_codegen_and_compile
return scheme.codegen_and_compile(gm, example_inputs, inputs_to_check, graph_kwargs)
~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\compile_fx.py", line 1456, in codegen_and_compile
compiled_module = graph.compile_to_module()
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\graph.py", line 2293, in compile_to_module
return self._compile_to_module()
~~~~~~~~~~~~~~~~~~~~~~~^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\graph.py", line 2303, in _compile_to_module
mod = self._compile_to_module_lines(wrapper_code)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\graph.py", line 2371, in _compile_to_module_lines
mod = PyCodeCache.load_by_key_path(
key,
...<2 lines>...
attrs={**self.constants, **self.torchbind_constants},
)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\codecache.py", line 3296, in load_by_key_path
mod = _reload_python_module(key, path, set_sys_modules=in_toplevel)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\runtime\compile_tasks.py", line 31, in _reload_python_module
exec(code, mod.dict, mod.dict)
~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\FISHER~1\AppData\Local\Temp\torchinductor_fisherman\hy\chyvfxcqsourmtxtorr6norzxvcazusooyz4sktacivztsdlc3cr.py", line 42, in
triton_poi_fused__to_copy_0 = async_compile.triton('triton_poi_fused__to_copy_0', '''
import triton
...<23 lines>...
tl.store(out_ptr0 + (x0), tmp1, None)
''', device_str='cuda')
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\async_compile.py", line 404, in triton
kernel.precompile(
~~~~~~~~~~~~~~~~~^
warm_cache_only=False,
^^^^^^^^^^^^^^^^^^^^^^
static_triton_bundle_key=CompiledTritonKernels.key(source_code),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\runtime\triton_heuristics.py", line 408, in precompile
self._precompile_worker()
~~~~~~~~~~~~~~~~~~~~~~~^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\runtime\triton_heuristics.py", line 430, in _precompile_worker
compile_results.append(self._precompile_config(c))
~~~~~~~~~~~~~~~~~~~~~~~^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\runtime\triton_heuristics.py", line 713, in _precompile_config
binary = triton.compile(*compile_args, **compile_kwargs)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\compiler\compiler.py", line 339, in compile
module = src.make_ir(options, codegen_fns, module_map, context)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\compiler\compiler.py", line 83, in make_ir
return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
module_map=module_map)
File "ast.py", line 422, in visit
File "ast.py", line 430, in generic_visit
torch._inductor.exc.InductorError: CompilationError: at 1:0:
def triton_poi_fused__to_copy_0(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
^
ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
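
For context on the error itself: Triton's fp8e4nv corresponds to torch.float8_e4m3fn, and the kernel metadata above reports 'cc': 86, i.e. an Ampere-class GPU (sm_86). The error lists 'fp8e4b15' and 'fp8e5' as the only fp8 dtypes Triton exposes on this architecture. A minimal diagnostic sketch, assuming a CUDA build of PyTorch; the sm_89 (Ada) threshold is my inference from the error message, not something stated in this log:

import torch

# Report whether this GPU can compile Triton kernels that take
# fp8e4nv (torch.float8_e4m3fn) pointers. Assumption: native fp8e4nv
# support needs compute capability 8.9+, as the error above suggests.
major, minor = torch.cuda.get_device_capability()
cc = 10 * major + minor
print(f"compute capability: sm_{cc}")
if cc < 89:
    print("fp8e4nv kernels unavailable; the error lists fp8e5 "
          "(torch.float8_e5m2) as a supported fp8 format here.")

If this prints sm_86 (as the 'cc': 86 metadata suggests), a plausible workaround is to load the model weights in a dtype other than e4m3fn (e.g. e5m2 or bf16), or to run without torch.compile on the fp8 path.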
