If you want the voice enhancement and advanced audio processing nodes (Seed-VC integration), you MUST complete the additional setup:
Simply run the provided batch file:
```
Portable_auto_install_nodes.bat
```

This will automatically:
- Initialize the seed-vc git submodule
- Install all required dependencies
- Set up audio processing packages
If you're using a virtual environment or conda:
Step 1: Initialize seed-vc submodule

```bash
cd /path/to/ComfyUI/custom_nodes/Boyonodes
git submodule add https://github.com/Plachtaa/seed-vc.git seed-vc
git submodule update --init --recursive
```

Step 2: Install dependencies
```bash
# Core audio dependencies
pip install "librosa>=0.10.0" "transformers>=4.30.0" "safetensors>=0.3.0"
pip install "huggingface_hub>=0.15.0" "pyloudnorm>=0.1.0" "soundfile>=0.12.0"

# Seed-VC specific requirements
pip install "hydra-core>=1.3.0" omegaconf munch descript-audio-codec

# TTS packages (no dependencies, to avoid conflicts)
pip install chatterbox-tts --no-deps
pip install resemble-perth --no-deps

# Install remaining requirements
pip install -r requirements.txt
```

Step 3: Restart ComfyUI
Essential ComfyUI nodes for semantic image editing, audio processing, LoRA management, and automated workflow generation. Streamlines complex pipelines with intelligent automation and robust error handling.
```bash
git clone https://github.com/DragonDiffusionbyBoyo/Boyonodes.git
cp -r Boyonodes /path/to/ComfyUI/custom_nodes/
```

Restart ComfyUI after installation. Most nodes work immediately with no additional dependencies.
- Semantic Image Editing nodes
- LoRA Management System
- Workflow Enhancement nodes
- Basic Utility nodes
```bash
pip install librosa transformers safetensors huggingface_hub pyloudnorm soundfile
pip install chatterbox-tts --no-deps
pip install resemble-perth --no-deps
```

See the installation instructions at the top of this README.
```bash
pip install numpy==1.26 matplotlib pillow tqdm torch
```

- Windows: Download from ffmpeg.org or `choco install ffmpeg`
- macOS: `brew install ffmpeg`
- Linux: `sudo apt-get install ffmpeg`
High-quality voice cloning and text-to-speech generation with emotion support.
Key Features:
- Voice Cloning: Clone any voice from 5+ second reference audio
- Emotion Tags: Natural expressions like `[chuckle]`, `[sigh]`, `[gasp]` (see the example below)
- GPU Accelerated: CUDA support for fast generation
- 24kHz Quality: Professional audio output
Available Emotion Tags:
[clear throat] [sigh] [shush] [cough] [groan] [sniff] [gasp] [chuckle] [laugh]
Workflow:
- Boyo Chatterbox Turbo Loader - Load TTS model
- Boyo Chatterbox Turbo Generate - Create speech with emotion
- Standard ComfyUI audio nodes for processing
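Emotion tags are placed inline in the text you send to the Generate node. The sentence below is purely illustrative; swap in any of the tags listed above:

```
Welcome back to the studio. [chuckle] I finally got the timing right on this one. [sigh] Let's dive in.
```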
Finally solved the community's lip-sync timing challenge! Precision audio padding for perfect video-audio synchronization.
Key Nodes:
- BoyoAudioDurationAnalyzer - Extract precise duration from any audio tensor
- BoyoAudioPadder - Intelligent silence padding with auto-centering
Why These Nodes Succeed:
- ✅ Handles all ComfyUI audio formats (`(batch, channels, samples)` vs `(batch, samples, channels)`)
- ✅ Automatic 12kHz output for proper playback speed
- ✅ Intelligent feedback: `Audio: 3.2s + Padding: 7.3s = Total: 10.5s ✅ Perfect match!`
- ✅ Memory-efficient processing for large files
- ✅ Auto-centering with target duration matching
Quick Workflow:
Load Audio → BoyoAudioPadder (target_duration: 10.5s, auto_center: true) → Save Audio
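For reference, here is a minimal sketch of the padding arithmetic, assuming a `(batch, channels, samples)` audio tensor. It is illustrative only, not the node's actual implementation:

```python
# Minimal sketch of target-duration padding with auto-centering.
# Illustrative only; BoyoAudioPadder's real implementation may differ.
import torch
import torch.nn.functional as F

def pad_to_duration(waveform: torch.Tensor, sample_rate: int,
                    target_duration: float, auto_center: bool = True) -> torch.Tensor:
    """waveform is assumed to be shaped (batch, channels, samples)."""
    samples = waveform.shape[-1]
    target_samples = int(round(target_duration * sample_rate))
    pad_total = max(target_samples - samples, 0)   # e.g. 10.5s target - 3.2s audio -> 7.3s of silence
    if auto_center:
        pad_left = pad_total // 2                  # split the silence evenly around the audio
        pad_right = pad_total - pad_left
    else:
        pad_left, pad_right = 0, pad_total         # append all silence at the end
    return F.pad(waveform, (pad_left, pad_right))  # zero-pad the sample dimension
```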
Perfect for Kontext, Qwen Image Edit, and HiDream E1.1 workflows with automated iteration and dataset creation.
Boyo Image Grab
- Auto-monitors directories for newest images
- Enables seamless iterative editing chains
- Perfect for progressive semantic modifications
- Real-time directory monitoring
Boyo Paired Image Saver
- Saves original/edited pairs with sequential naming
- Dataset creation for training workflows
- ControlNet format compatibility
- Organized file management
Boyo Incontext Saver
- Specialized for semantic editing outputs
- Dataset-ready organization
- Maintains editing relationships
Boyo Universal Image + Prompt Saver
- Strips metadata for clean publication
- Saves actual prompts (crucial for wildcard workflows)
- Creates organized image/text pairs
- Publication-ready outputs
Original → Semantic Edit → Paired Saver → Image Grab (auto-feeds next iteration)
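The directory-monitoring idea behind Boyo Image Grab boils down to picking the most recently written file. A rough sketch, using a hypothetical helper rather than the node's actual code:

```python
# Hypothetical sketch of "grab the newest image in a directory"; not the node's actual code.
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}

def newest_image(directory: str) -> Path | None:
    """Return the most recently modified image file in `directory`, or None if there are none."""
    candidates = [p for p in Path(directory).iterdir()
                  if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```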
Revolutionary paired LoRA management with intelligent prompt handling for complex workflows requiring multiple LoRA types.
Create and save LoRA configurations with flexible prompt management.
Key Features:
- Supports paired LoRAs (high/low noise variants)
- Multiple prompts per configuration
- Handles utility LoRAs (no prompts needed)
- Auto-saves to organized directory
Load multiple LoRA configurations simultaneously with advanced prompt strategies.
Key Features:
- 3 simultaneous config slots for layered effects
- Prompt strategies: Mute, Concatenate, Merge per config
- Prompt modes: First Only, Cycle Through, Random (seed-based); illustrated in the sketch after this list
- 6 LoRA path outputs + 4 prompt string outputs
- Direct connection to standard LoRA loaders
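The three prompt modes can be pictured as follows. This is an illustrative sketch under assumed semantics, not the node's actual code:

```python
import random

def pick_prompt(prompts: list[str], mode: str, run_index: int = 0, seed: int = 0) -> str:
    # Illustrative interpretation of the prompt modes; the node's behaviour may differ in detail.
    if mode == "First Only":
        return prompts[0]                           # always use the first saved prompt
    if mode == "Cycle Through":
        return prompts[run_index % len(prompts)]    # step through prompts on successive runs
    if mode == "Random":
        return random.Random(seed).choice(prompts)  # reproducible, seed-based choice
    raise ValueError(f"Unknown prompt mode: {mode}")
```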
Preview and analyze LoRA configurations before loading.
Sample Output:
```
📋 LoRA Configuration: Character_Cyborg

🎯 LoRA FILES:
   High Noise: ✅ cyborg_char_v2.safetensors
   Low Noise:  ✅ cyborg_char_v2_low.safetensors
   Type: PAIRED LoRA

💬 PROMPTS (3 total):
   1. cyborg woman, metallic skin, glowing eyes
   2. android female, chrome details, futuristic
   3. robotic humanoid, synthetic appearance

💡 USAGE RECOMMENDATIONS:
   • Use 'Cycle Through' for variety
   • Use 'Random' for experimentation
```
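Under the hood each configuration is a small JSON file (the troubleshooting section below mentions checking its JSON syntax). The sketch below is illustrative only, reusing the values from the sample output above; the field names are assumptions, not the Builder's actual schema:

```json
{
  "name": "Character_Cyborg",
  "type": "paired",
  "high_noise_lora": "cyborg_char_v2.safetensors",
  "low_noise_lora": "cyborg_char_v2_low.safetensors",
  "prompts": [
    "cyborg woman, metallic skin, glowing eyes",
    "android female, chrome details, futuristic",
    "robotic humanoid, synthetic appearance"
  ]
}
```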
Automated storyboard creation using local ollama models for consistent multi-scene video workflows.
Intelligent prompt generator for structured storyboard sequences.
Key Features:
- Model-agnostic trigger words - works with any LoRA/video model
- Two modes: 6-scene storyboards or traveling prompt sequences
- Consistent character/style across all scenes
- Optimized for abliterated coder models (Qwen 30B A3B Coder recommended)
System Prompt 1 (Standard): 6 image + 6 video prompts for Next Scene LoRA workflows
System Prompt 2 (Traveling): 6 images + 6 multi-line video sequences for extended content
Parses ollama JSON responses into 12 separate prompt outputs for direct workflow integration.
Workflow:
Storyboard Prompt → ollama Generate → Storyboard Output → 12 individual prompts
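Conceptually, the parser expects a JSON object with six image prompts and six video prompts, which it fans out into the 12 individual outputs. The exact keys come from the system prompts; the shape below is only an illustration with placeholder text and assumed key names:

```json
{
  "image_prompts": [
    "Scene 1: <trigger word> character standing on a rain-soaked street at night",
    "... five more image prompts, one per scene ..."
  ],
  "video_prompts": [
    "Scene 1: slow push-in as the character turns toward the camera",
    "... five more video prompts, one per scene ..."
  ]
}
```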
One-click workflow dependency installation. Drop JSON manifest files to automatically download custom nodes, models, and dependencies.
Features:
- Automatic GitHub repository cloning
- Python dependency installation via pip
- Model downloads from direct URLs
- Available in Simple (auto-detect) and Advanced (custom paths) versions
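A manifest is a plain JSON file. The hypothetical example below covers the three dependency types the installer handles; all field names and paths are illustrative assumptions, not the actual schema:

```json
{
  "custom_nodes": [
    "https://github.com/DragonDiffusionbyBoyo/Boyonodes"
  ],
  "pip_packages": [
    "librosa",
    "pyloudnorm"
  ],
  "models": [
    {
      "url": "https://example.com/models/some_model.safetensors",
      "save_to": "models/checkpoints"
    }
  ]
}
```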
- Boyo Empty Latent - Smart aspect ratio calculator
- Load Image List - Batch image processor for mass operations
- Boyo VAE Decode - Stealth NSFW filtering for controlled environments
- Boyo Tiled VAE Decode - Memory-efficient large image processing
- Mandelbrot Video Generator - Fractal art for creative projects
- BoyoVision Node - Qwen2.5VL vision with abliterated model compatibility
- Create configs for utility (lightning), character, and style LoRAs
- Load all three simultaneously in Paired Loader
- Set strategies: utility = "Mute", character/style = "Concatenate"
- Get combined prompts and all LoRA paths in one node
- Load initial image
- Apply semantic edit (Kontext/Qwen/HiDream)
- Boyo Paired Image Saver stores original + edit
- Boyo Image Grab auto-feeds edit for next iteration
- Repeat for progressive modifications
- Load Video → VideoHelperSuite Info → get duration
- Generate TTS audio → BoyoAudioPadder with auto-center
- Perfect timing match for lip-sync models
- Configure story/character in Storyboard Prompt
- Generate via ollama → parse with Storyboard Output
- Connect 6 image outputs to Next Scene LoRA
- Connect 6 video outputs to video generation
- Audio too fast: Node automatically outputs 12kHz for proper ComfyUI playback
- TTS loading errors: Ensure dependencies were installed with the `--no-deps` flags
- CUDA problems: Set device to "cpu" in the loader node
- Config not loading: Use Inspector node to verify file paths and JSON syntax
- Missing LoRA files: Check paths use forward slashes, verify file existence
- Prompt issues: Inspector shows available prompts and recommendations
- Image Grab not updating: Verify directory path exists, check auto_refresh enabled
- Paired Saver failing: Confirm output directory exists and has write permissions
- Slow performance: Organize files into smaller subdirectories
- Poor outputs: Use recommended ollama models (Qwen 30B A3B Coder Abliterated)
- JSON parse errors: Add verbosity instructions in additional_details field
- Avoid: Google models (Gemma), Meta coding variants, thinking models
- Boyo/Audio/TTS - Text-to-speech and voice cloning
- Boyo/Audio/Analysis - Audio duration and analysis tools
- Boyo/Audio/Processing - Padding and timing control
- Boyo/Image/Editing - Semantic editing workflow tools
- Boyo/LoRA - LoRA management and configuration
- Boyo/Storyboard - AI storyboard generation
- Boyo/Utility - General workflow enhancement tools
- Fork the repository
- Create feature branch (`git checkout -b feature-name`)
- Commit changes
- Push to branch
- Open pull request
Documentation for new features is appreciated.
MIT License - see LICENSE file for details.
Built by DragonDiffusionbyBoyo for the semantic editing revolution.
Note: Vision nodes currently disabled due to dependency conflicts. Resolution in progress.