Gp/output #825
Conversation
…streamline path handling and improve consistency
…pport conditional output based on input_info settings
Summary of Changes

Hello @GACLove, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request focuses on enhancing the consistency and maintainability of image and video processing workflows within the system. It primarily refactors the VAE output conversion utility into a more unified and efficient function, and standardizes the return types of several image runner pipelines. Additionally, it refines the image saving logic to be more flexible and organized, contributing to a more robust and developer-friendly codebase.
Code Review
This pull request refactors VAE output conversion functions and standardizes the return type of run_pipeline across several image-related runners. The vae_to_comfyui_image and vae_to_comfyui_image_inplace functions have been consolidated into wan_vae_to_comfy, which performs in-place normalization but returns a new tensor. The run_pipeline methods in longcat_image_runner.py, qwen_image_runner.py, and z_image_runner.py now consistently return a dictionary instead of a tuple, which is a breaking API change that should be noted. Additionally, minor f-string cleanups and improved image saving logic were implemented.
Docstring under review:

    Convert VAE decoder output to ComfyUI Image format (inplace operation)

    Args:
        vae_output: VAE decoder output tensor, typically in range [-1, 1]
            Shape: [B, C, T, H, W] or [B, C, H, W]
            WARNING: This tensor will be modified in-place!

    Returns:
        ComfyUI Image tensor in range [0, 1]
        Shape: [B, H, W, C] for single frame or [B*T, H, W, C] for video

    Note: The returned tensor is the same object as input (modified in-place)
The docstring for wan_vae_to_comfy states that it performs an "inplace operation" and that "The returned tensor is the same object as input (modified in-place)". While the normalization steps (add_, mul_, clamp_) are indeed in-place, the subsequent permute and flatten operations create new tensors (or views that are then copied to CPU). Therefore, the returned tensor is not the same object as the input vae_output. This contradiction can be misleading and potentially lead to unexpected behavior if callers rely on the returned object being the original input tensor. Please clarify the docstring to accurately reflect that the input tensor is modified in-place for normalization, but a new tensor is returned after permutation and flattening.
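To make the contradiction concrete, here is a minimal sketch of the conversion flow the review describes. The function name and body are illustrative, not the PR's actual implementation: it assumes the in-place normalization chain (`add_`, `mul_`, `clamp_`) followed by `permute`/`flatten`, which is exactly why the returned tensor cannot be the same object as the input.

```python
import torch

def wan_vae_to_comfy_sketch(vae_output: torch.Tensor) -> torch.Tensor:
    """Illustrative sketch only, not the PR's real code.

    The normalization mutates vae_output in place, but permute/flatten
    produce a different tensor object, so the return value is NOT the
    same object as the input.
    """
    # In-place normalization: [-1, 1] -> [0, 1]
    vae_output.add_(1.0).mul_(0.5).clamp_(0.0, 1.0)
    if vae_output.dim() == 5:                     # [B, C, T, H, W] video
        out = vae_output.permute(0, 2, 3, 4, 1)   # -> [B, T, H, W, C]
        out = out.flatten(0, 1)                   # -> [B*T, H, W, C]
    else:                                         # [B, C, H, W] single frame
        out = vae_output.permute(0, 2, 3, 1)      # -> [B, H, W, C]
    return out.contiguous()

x = torch.full((1, 3, 2, 4, 4), -1.0)
y = wan_vae_to_comfy_sketch(x)
print(y is x)          # False: a new tensor is returned
print(float(x.min()))  # 0.0: the input was still normalized in place
print(tuple(y.shape))  # (2, 4, 4, 3)
```

This is why the suggested docstring below separates the two claims: the input is mutated for normalization, but the caller receives a new tensor.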
Suggested change:

    - Convert VAE decoder output to ComfyUI Image format (inplace operation)
    + Convert VAE decoder output to ComfyUI Image format

      Args:
          vae_output: VAE decoder output tensor, typically in range [-1, 1]
              Shape: [B, C, T, H, W] or [B, C, H, W]
    -         WARNING: This tensor will be modified in-place!
    +         WARNING: The input tensor `vae_output` will be modified in-place for normalization.

      Returns:
    -     ComfyUI Image tensor in range [0, 1]
    +     A new ComfyUI Image tensor in range [0, 1]
          Shape: [B, H, W, C] for single frame or [B*T, H, W, C] for video

    - Note: The returned tensor is the same object as input (modified in-place)
      def process_images_after_vae_decoder(self):
    -     self.gen_video_final = vae_to_comfyui_image(self.gen_video_final)
    +     self.gen_video_final = wan_vae_to_comfy(self.gen_video_final)
The change from vae_to_comfyui_image (which was not in-place) to wan_vae_to_comfy (which performs in-place normalization) represents a behavioral change for self.gen_video_final. While the assignment immediately overwrites the reference, it's important to be aware that self.gen_video_final is now modified in-place before being reassigned. This is generally acceptable for efficiency, but it's a subtle change from the previous non-in-place function.
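The subtlety can be shown with a small, hypothetical example (the tensor shapes and the normalization chain are assumptions matching the review's description, not code from the PR): any other reference to the decoded tensor observes the mutation, even though the attribute itself is immediately reassigned.

```python
import torch

# Hypothetical illustration of the behavioral change flagged above:
# in-place normalization mutates the argument, so any alias held
# elsewhere (e.g. cached in the runner) sees the new values, even
# though self.gen_video_final is reassigned to the return value.

decoded = torch.full((1, 3, 4, 4), -1.0)
kept_alias = decoded  # another reference to the same tensor

# In-place normalization, as wan_vae_to_comfy is described to do
decoded.add_(1.0).mul_(0.5).clamp_(0.0, 1.0)
images = decoded.permute(0, 2, 3, 1)  # what gets reassigned

print(float(kept_alias.max()))  # 0.0: the alias observed the mutation
print(tuple(images.shape))      # (1, 4, 4, 3)
```

If no other reference to the decoded tensor survives the reassignment, the in-place variant is a pure efficiency win; the example only matters when aliases exist.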
    if input_info.return_result_tensor:
        return {"images": images}
    elif input_info.save_result_path is not None:
        return {"images": None}
Similar to other image runners, the return type of the run_pipeline method has changed from a tuple (images, None) to a dictionary {"images": images} or {"images": None}. This is a breaking API change that external callers might need to adapt to. Please ensure this change is documented and communicated appropriately.
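A caller-side shim can bridge the migration. This is a sketch only: `handle_result` is a hypothetical helper, not part of the PR, and it assumes only the two return shapes described in this review (the old `(images, None)` tuple and the new `{"images": ...}` dict).

```python
# Hypothetical caller update for the breaking return-type change:
# run_pipeline used to return a tuple (images, None) and now
# returns a dict such as {"images": images} or {"images": None}.

def handle_result(result):
    """Accept both shapes during a migration window (sketch only)."""
    if isinstance(result, dict):
        return result.get("images")  # new dict-based API
    images, _ = result               # old tuple-based API
    return images

print(handle_result({"images": [1, 2]}))  # [1, 2]
print(handle_result(([3, 4], None)))      # [3, 4]
print(handle_result({"images": None}))    # None
```

Once all runners return dictionaries, the tuple branch can be deleted and callers can index `result["images"]` directly.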
No description provided.