Description
I'm using the workflow from the example, but I got this error:
```
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] Triton compilation failed: triton_poi_fused__to_copy_0
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] def triton_poi_fused__to_copy_0(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]     xnumel = 26214400
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]     xoffset = tl.program_id(0) * XBLOCK
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]     xindex = xoffset + tl.arange(0, XBLOCK)[:]
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]     xmask = tl.full([XBLOCK], True, tl.int1)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]     x0 = xindex
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]     tmp0 = tl.load(in_ptr0 + (x0), None)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]     tmp1 = tmp0.to(tl.float32)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]     tl.store(out_ptr0 + (x0), tmp1, None)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] metadata: {'signature': {'in_ptr0': '*fp8e4nv', 'out_ptr0': '*fp32', 'xnumel': 'i32', 'XBLOCK': 'constexpr'}, 'device': 0, 'constants': {'XBLOCK': 1024}, 'configs': [{(0,): [['tt.divisibility', 16]], (1,): [['tt.divisibility', 16]], (2,): [['tt.divisibility', 16]]}], 'device_type': 'cuda', 'num_warps': 4, 'num_stages': 1, 'debug': True, 'cc': 86}
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] Traceback (most recent call last):
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]   File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py", line 713, in _precompile_config
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]     binary = triton.compile(*compile_args, **compile_kwargs)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]   File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\compiler\compiler.py", line 339, in compile
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]     module = src.make_ir(options, codegen_fns, module_map, context)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]   File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\compiler\compiler.py", line 83, in make_ir
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]     return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]                        module_map=module_map)
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]   File "ast.py", line 422, in visit
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1]   File "ast.py", line 430, in generic_visit
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] triton.compiler.errors.CompilationError: at 1:0:
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] def triton_poi_fused__to_copy_0(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] ^
E0828 17:02:17.473000 5360 Lib\site-packages\torch\_inductor\runtime\triton_heuristics.py:715] [4/2_1] ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")
Error during model prediction: CompilationError: at 1:0:
def triton_poi_fused__to_copy_0(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
^
ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")
Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
0%| | 0/4 [00:01<?, ?it/s]
Error during sampling: CompilationError: at 1:0:
def triton_poi_fused__to_copy_0(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
^
ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")
Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
!!! Exception during processing !!! CompilationError: at 1:0:
def triton_poi_fused__to_copy_0(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
^
ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")
Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
```
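
If it helps narrow it down: the failing kernel is just an fp8 → fp32 copy, so I'd expect a minimal sketch like the one below (my assumption, not verified against the wrapper's actual code path) to hit the same error on this card:

```python
# Hypothetical minimal repro -- my assumption is that the failure is the
# fp8e4nv (torch.float8_e4m3fn) -> fp32 cast that Inductor fuses into
# triton_poi_fused__to_copy_0, independent of the WanVideo model itself.
import torch

@torch.compile
def upcast(t):
    return t.to(torch.float32)

x = torch.zeros(1024, device="cuda", dtype=torch.float8_e4m3fn)
upcast(x)  # expected: the same "fp8e4nv not supported" CompilationError on this GPU
```

The full traceback: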
```
Traceback (most recent call last):
File "H:\ComfyUI_windows_portable\ComfyUI\execution.py", line 496, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\ComfyUI\execution.py", line 315, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\ComfyUI\execution.py", line 289, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "H:\ComfyUI_windows_portable\ComfyUI\execution.py", line 277, in process_inputs
result = f(**inputs)
File "H:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes.py", line 3568, in process
raise e
File "H:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes.py", line 3462, in process
noise_pred, self.cache_state = predict_with_cfg(
~~~~~~~~~~~~~~~~^
latent_model_input,
^^^^^^^^^^^^^^^^^^^
...<3 lines>...
timestep, idx, image_cond, clip_fea, control_latents, vace_data, unianim_data, audio_proj, control_camera_latents, add_cond,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
cache_state=self.cache_state, fantasy_portrait_input=fantasy_portrait_input)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes.py", line 2623, in predict_with_cfg
raise e
File "H:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes.py", line 2524, in predict_with_cfg
noise_pred_cond, cache_state_cond = transformer(
~~~~~~~~~~~^
[z_pos], context=positive_embeds, y=[image_cond_input] if image_cond_input is not None else None,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...<3 lines>...
**base_params
^^^^^^^^^^^^^
)
^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
File "H:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\wanvideo\modules\model.py", line 2080, in forward
x, x_ip = block(x, x_ip=x_ip, **kwargs) #run block
~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_dynamo\eval_frame.py", line 375, in call
return super().call(*args, **kwargs)
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_dynamo\eval_frame.py", line 749, in compile_wrapper
raise e.remove_dynamo_frames() from None # see TORCHDYNAMO_VERBOSE=1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\compile_fx.py", line 923, in _compile_fx_inner
raise InductorError(e, currentframe()).with_traceback(
e.traceback
) from None
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\compile_fx.py", line 907, in _compile_fx_inner
mb_compiled_graph = fx_codegen_and_compile(
gm, example_inputs, inputs_to_check, **graph_kwargs
)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\compile_fx.py", line 1578, in fx_codegen_and_compile
return scheme.codegen_and_compile(gm, example_inputs, inputs_to_check, graph_kwargs)
~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\compile_fx.py", line 1456, in codegen_and_compile
compiled_module = graph.compile_to_module()
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\graph.py", line 2293, in compile_to_module
return self._compile_to_module()
~~~~~~~~~~~~~~~~~~~~~~~^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\graph.py", line 2303, in _compile_to_module
mod = self._compile_to_module_lines(wrapper_code)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\graph.py", line 2371, in _compile_to_module_lines
mod = PyCodeCache.load_by_key_path(
key,
...<2 lines>...
attrs={**self.constants, **self.torchbind_constants},
)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\codecache.py", line 3296, in load_by_key_path
mod = _reload_python_module(key, path, set_sys_modules=in_toplevel)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\runtime\compile_tasks.py", line 31, in _reload_python_module
exec(code, mod.dict, mod.dict)
~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\FISHER~1\AppData\Local\Temp\torchinductor_fisherman\hy\chyvfxcqsourmtxtorr6norzxvcazusooyz4sktacivztsdlc3cr.py", line 42, in
triton_poi_fused__to_copy_0 = async_compile.triton('triton_poi_fused__to_copy_0', '''
import triton
...<23 lines>...
tl.store(out_ptr0 + (x0), tmp1, None)
''', device_str='cuda')
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\async_compile.py", line 404, in triton
kernel.precompile(
~~~~~~~~~~~~~~~~~^
warm_cache_only=False,
^^^^^^^^^^^^^^^^^^^^^^
static_triton_bundle_key=CompiledTritonKernels.key(source_code),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\runtime\triton_heuristics.py", line 408, in precompile
self._precompile_worker()
~~~~~~~~~~~~~~~~~~~~~~~^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\runtime\triton_heuristics.py", line 430, in _precompile_worker
compile_results.append(self._precompile_config(c))
~~~~~~~~~~~~~~~~~~~~~~~^^^
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\runtime\triton_heuristics.py", line 713, in _precompile_config
binary = triton.compile(*compile_args, **compile_kwargs)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\compiler\compiler.py", line 339, in compile
module = src.make_ir(options, codegen_fns, module_map, context)
File "H:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\compiler\compiler.py", line 83, in make_ir
return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
module_map=module_map)
File "ast.py", line 422, in visit
File "ast.py", line 430, in generic_visit
torch._inductor.exc.InductorError: CompilationError: at 1:0:
def triton_poi_fused__to_copy_0(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
^
ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")
Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
```
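
From the kernel metadata above (`'cc': 86`) this is a compute capability 8.6 (Ampere / RTX 30-series) GPU. As far as I know, Triton only implements fp8e4nv (`torch.float8_e4m3fn`) on compute capability 8.9+ (Ada/Hopper); older architectures only get fp8e5 and fp8e4b15, which matches the ValueError. A quick check:

```python
import torch

# Triton's fp8e4nv (torch.float8_e4m3fn) needs compute capability 8.9+
# (Ada/Hopper); the kernel metadata above reports cc 86, i.e. 8.6 (Ampere).
major, minor = torch.cuda.get_device_capability()
print(f"compute capability: {major}.{minor}")
print("fp8e4nv expected to work:", (major, minor) >= (8, 9))
```

So I'd guess the workaround is loading the model without fp8_e4m3fn quantization (or with e5m2, since fp8e5 is listed as supported), or skipping torch.compile, but I haven't confirmed which option the wrapper exposes for this.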