ComfyUI Audio Segment Splitter

License: MIT | ComfyUI | Python 3.8+ | GitHub release

🎵 A ComfyUI custom node for intelligent audio segmentation with overlap support

Split audio at integer-second start points with decimal-duration segments, designed for context-aware audio processing tasks.


✨ Features

  • 🎯 Integer-Second Start Points: Split at 0s, 10s, 20s, 30s...
  • 📏 Decimal Segment Duration: Support for 10.44s, 20.28s, etc.
  • 🔄 Intentional Overlap: Preserve context between segments
  • ⏱️ Timestamp Information: Detailed start/end times for each segment
  • 📊 ASCII Visualization: Timeline preview of segmentation
  • 🚀 Zero Extra Dependencies: Uses only libraries that ship with ComfyUI
  • 🎛️ Flexible Output: Independent audio segments returned as a list

🎯 Use Cases

Perfect for audio processing tasks requiring contextual information:

  • 🎤 Speech Recognition: Prevent sentence truncation at split points
  • 🎵 Music Analysis: Maintain note and beat integrity
  • 🔊 Audio Transcription: Ensure sufficient context per segment
  • 🎬 Video Dubbing: Align audio segments for post-production
  • 🧪 Audio Research: Consistent windowing for ML/AI applications

📦 Installation

Method 1: ComfyUI Manager (Recommended)

  1. Open ComfyUI Manager
  2. Search for "Audio Segment Splitter" or "comfy_AudioSeg"
  3. Click Install
  4. Restart ComfyUI

Method 2: Manual Installation

cd ComfyUI/custom_nodes
git clone https://github.com/huangkun1985/comfy_AudioSeg.git
# Restart ComfyUI

Method 3: Direct Download

  1. Download the latest release
  2. Extract to ComfyUI/custom_nodes/comfy_AudioSeg/
  3. Restart ComfyUI

🚀 Usage

Quick Start

  1. Find the Node: Right-click in ComfyUI → Search "Audio Segment Splitter" (under the audio category)
  2. Connect Input: Link an audio source (e.g., LoadAudio)
  3. Set Duration: Configure segment_duration parameter (default: 10.0s)
  4. Run: Execute workflow to get segmented audio

Node Parameters

Parameter         Type   Description                Default  Range
----------------------------------------------------------------------
audio             AUDIO  Input audio data           -        -
segment_duration  FLOAT  Segment length in seconds  10.0     0.1 - 3600.0

Node Outputs

Output        Type          Description
----------------------------------------------------------------------
segments      AUDIO (List)  Independent audio segment list
segment_info  STRING        Detailed timing info & visualization
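
For orientation, the parameter and output tables map onto ComfyUI's custom-node conventions roughly as follows. This is a hypothetical skeleton, not the node's actual source: the class name `AudioSegmentSplitterSketch`, the method name `split`, and the `step` value are assumptions; only the types, default, and range come from the tables above.

```python
class AudioSegmentSplitterSketch:
    """Hypothetical skeleton; the real node's source may differ."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "audio": ("AUDIO",),
                "segment_duration": (
                    "FLOAT",
                    # default/min/max from the parameter table; step is assumed
                    {"default": 10.0, "min": 0.1, "max": 3600.0, "step": 0.01},
                ),
            }
        }

    RETURN_TYPES = ("AUDIO", "STRING")
    RETURN_NAMES = ("segments", "segment_info")
    OUTPUT_IS_LIST = (True, False)  # segments is a list, segment_info a single string
    FUNCTION = "split"
    CATEGORY = "audio"

    def split(self, audio, segment_duration):
        # Placeholder: the slicing logic is intentionally left out of this sketch.
        raise NotImplementedError


# ComfyUI discovers custom nodes through this module-level mapping.
NODE_CLASS_MAPPINGS = {"AudioSegmentSplitterSketch": AudioSegmentSplitterSketch}
```

The `OUTPUT_IS_LIST` flag is what lets downstream nodes receive each segment separately rather than as one batched tensor.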

📊 How It Works

Segmentation Logic

Example: 60-second audio with segment_duration = 10.44s

Start Point (int)  │  Segment Range      │  Duration  │  Overlap
─────────────────────────────────────────────────────────────────
0s                 →  [0.00 - 10.44s]    │  10.44s   │  -
10s                →  [10.00 - 20.44s]   │  10.44s   │  0.44s
20s                →  [20.00 - 30.44s]   │  10.44s   │  0.44s
30s                →  [30.00 - 40.44s]   │  10.44s   │  0.44s
40s                →  [40.00 - 50.44s]   │  10.44s   │  0.44s
50s                →  [50.00 - 60.00s]   │  10.00s   │  0.44s (final)

Algorithm

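# Sketch of the logic; names are illustrative, not the node's actual source.
def split_audio(waveform, sample_rate, segment_duration):
    # Integer-second start interval: 10.44 → 10 (sub-second durations use 1)
    split_interval = max(1, int(segment_duration))
    step = split_interval * sample_rate             # samples between starts
    length = round(segment_duration * sample_rate)  # samples per segment
    # Starts fall at 0s, 10s, 20s, ...; slicing past the end truncates
    # the final segment to the end of the audio.
    return [waveform[..., begin:begin + length]
            for begin in range(0, waveform.shape[-1], step)]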
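
To check the boundaries by hand, the same rule can be worked out in seconds without any audio data. The following is a minimal sketch under the assumptions stated in this README (integer-second start interval with a floor of 1s, final segment truncated to the audio end); the function name `segment_times` is hypothetical.

```python
def segment_times(total_duration, segment_duration):
    """Return (start, end, duration, overlap_with_previous) tuples in seconds."""
    interval = max(1, int(segment_duration))  # 10.44 -> 10; assumed floor of 1 s
    rows = []
    start = 0
    prev_end = None
    while start < total_duration:
        # The last segment is cut off at the end of the audio.
        end = min(start + segment_duration, total_duration)
        overlap = 0.0 if prev_end is None else max(0.0, prev_end - start)
        rows.append((float(start), round(end, 2), round(end - start, 2), round(overlap, 2)))
        prev_end = end
        start += interval
    return rows


for row in segment_times(60.0, 10.44):
    print("start=%6.2f  end=%6.2f  duration=%5.2f  overlap=%4.2f" % row)
```

For 60 seconds of audio and `segment_duration = 10.44` this yields six rows, matching the table in the Segmentation Logic section.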

📸 Example Output

Visual Preview (segment_info output)

================================================================================
                                 Audio Segmentation Preview
================================================================================
Total Duration: 60.00 seconds
Segment Duration: 10.44 seconds
Start Interval: 10 seconds (integer)
Segment Overlap: 0.44 seconds
Number of Segments: 6
--------------------------------------------------------------------------------

Segment Details:

Index  Start Time   End Time     Duration   Notes
--------------------------------------------------------------------------------
0      0.00         10.44        10.44
1      10.00        20.44        10.44      (0.44s overlap with previous)
2      20.00        30.44        10.44      (0.44s overlap with previous)
3      30.00        40.44        10.44      (0.44s overlap with previous)
4      40.00        50.44        10.44      (0.44s overlap with previous)
5      50.00        60.00        10.00      (Final segment, shorter)
================================================================================

Timeline Visualization:
Time:    0.0   6.0  12.0  18.0  24.0  30.0  36.0  42.0  48.0  54.0  60.0
      |------|------|------|------|------|------|------|------|------|------|
  # 0: [============]
  # 1:       [============]
  # 2:                [============]
  # 3:                         [============]
  # 4:                                  [============]
  # 5:                                           [==========]
================================================================================
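
A timeline like the one above can be drawn by scaling each segment's start and end to a fixed number of columns. The sketch below is an illustration only: `render_timeline` and its `width` parameter are hypothetical, and the node's real renderer may lay out its bars differently.

```python
def render_timeline(segments, total_duration, width=60):
    """Draw one bracketed bar per (start, end) pair, scaled to `width` columns."""
    scale = width / total_duration  # columns per second
    lines = []
    for index, (start, end) in enumerate(segments):
        left = int(start * scale)
        right = max(left + 2, int(end * scale))  # keep room for both brackets
        bar = "[" + "=" * (right - left - 2) + "]"
        lines.append("  #%2d: %s%s" % (index, " " * left, bar))
    return "\n".join(lines)


segments = [(0, 10.44), (10, 20.44), (20, 30.44), (30, 40.44), (40, 50.44), (50, 60)]
print(render_timeline(segments, 60.0))
```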

🎨 Workflow Examples

Example workflows are included in the workflow/ directory:

Basic Workflow

LoadAudio → AudioSegmentSplitter → PreviewAudio
              ↓
        segment_info → ShowText

Advanced Workflow

LoadAudio → AudioSegmentSplitter → [Process Each Segment] → AudioConcat
              ↓
        segment_info → SaveText

Common Configurations

Scenario        segment_duration  Effect
----------------------------------------------------------------------
With Overlap    10.44             0.44s overlap between segments
No Overlap      10.0              Exact split, no overlap
Short Segments  5.5               0.5s overlap, 5s intervals
Long Segments   30.2              0.2s overlap, 30s intervals

⚙️ Technical Details

Audio Format

{
    "waveform": torch.Tensor,  # Shape: (batch, channels, samples)
    "sample_rate": int         # Sample rate in Hz
}
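
Because the sample axis is the last dimension, durations and slice positions follow directly from `sample_rate`. Here is a small sketch of that arithmetic; the helper names are hypothetical, and the example uses a stand-in object that only exposes `.shape`, so it runs without PyTorch.

```python
from types import SimpleNamespace


def audio_duration_seconds(audio):
    """Duration of an AUDIO dict, read from the last waveform axis."""
    num_samples = audio["waveform"].shape[-1]
    return num_samples / audio["sample_rate"]


def seconds_to_samples(seconds, sample_rate):
    """Convert a time in seconds to the nearest sample index."""
    return round(seconds * sample_rate)


# Stand-in for a torch.Tensor of shape (1, 2, 60 s * 44100 Hz); only .shape is read.
example = {"waveform": SimpleNamespace(shape=(1, 2, 60 * 44100)), "sample_rate": 44100}
print(audio_duration_seconds(example))
print(seconds_to_samples(10.44, example["sample_rate"]))
```

With a real tensor, a segment would then be taken as `waveform[..., begin:end]`, which keeps the batch and channel dimensions intact.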

Dependencies

  • torch: PyTorch tensor operations
  • logging: Console output

All dependencies are included with ComfyUI - no additional installation required!

Performance

  • ⚡ Optimized with PyTorch native operations
  • 💾 Memory-efficient tensor slicing
  • 🔧 Works with any sample rate
  • 📦 Supports mono and stereo audio

Special Cases Handling

  1. Segment < 1s: Start interval automatically adjusted to 1s
  2. Final Segment: Automatically truncated to audio end
  3. No Overlap: Use integer values (e.g., 10.0, 20.0)
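
The three cases can be sanity-checked with a few lines of arithmetic. This sketch assumes the rules exactly as listed; the function name is hypothetical, and the `gap` value is an inference from those rules (a sub-second segment with a 1s start interval would leave part of the audio uncovered), not something the node reports.

```python
def start_interval_and_overlap(segment_duration):
    """Return (interval, overlap, gap) in seconds for one segment duration."""
    interval = max(1, int(segment_duration))  # case 1: interval never drops below 1 s
    overlap = max(0.0, round(segment_duration - interval, 2))
    gap = max(0.0, round(interval - segment_duration, 2))  # audio skipped between segments
    return interval, overlap, gap


for duration in (10.44, 10.0, 0.5):
    print(duration, start_interval_and_overlap(duration))
```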

🐛 Troubleshooting

Node Not Appearing in Menu

Solution:

  1. Verify installation path: ComfyUI/custom_nodes/comfy_AudioSeg/
  2. Check ComfyUI console for errors
  3. Restart ComfyUI completely

Audio Artifacts at Boundaries

Cause: Sharp cutoff in waveform
Solution: Apply a short fade in/out to each segment in post-processing (built-in fades are a planned future feature)
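
Until fades are built in, a short linear ramp at each end of a segment is one way to soften the cutoff. The sketch below is not part of the node: `apply_linear_fade` and the 10ms default are assumptions, and it works on a plain list of samples for one channel so that it runs without PyTorch.

```python
def apply_linear_fade(samples, sample_rate, fade_seconds=0.01):
    """Return a copy of `samples` with a linear fade-in and fade-out applied."""
    faded = list(samples)
    # Never let the two ramps overlap on very short segments.
    n = min(int(fade_seconds * sample_rate), len(faded) // 2)
    for i in range(n):
        gain = i / n  # ramps from 0.0 toward 1.0
        faded[i] *= gain
        faded[-1 - i] *= gain
    return faded


print(apply_linear_fade([1.0] * 10, sample_rate=8, fade_seconds=0.5))
```

On a tensor, the same effect could be had by multiplying the first and last `n` samples along the last axis by a ramp.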

Python/Import Errors

Solution: Ensure ComfyUI is up-to-date with PyTorch installed


🧪 Testing

Run the included test suite:

cd ComfyUI/custom_nodes/comfy_AudioSeg
python test_splitter.py

Tests include:

  • ✅ Basic segmentation (60s → 10.44s segments)
  • ✅ No-overlap mode (30s → 10.0s segments)
  • ✅ Short segments (10s → 0.5s segments)
  • ✅ Final segment handling

🤝 Contributing

We welcome contributions! Here's how you can help:

  1. 🐛 Report Bugs: Open an issue
  2. 💡 Suggest Features: Start a discussion
  3. 🔧 Submit PRs: Fork, code, test, and submit!
  4. 📖 Improve Docs: Help us make the documentation better

Development Setup

git clone https://github.com/huangkun1985/comfy_AudioSeg.git
cd comfy_AudioSeg
# Make changes and test
python test_splitter.py

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


📝 Changelog

See CHANGELOG.md for version history and release notes.


🙏 Acknowledgments

  • ComfyUI Team: For the amazing framework
  • Community: For feedback and suggestions
  • Contributors: See the repository's Contributors page

Made with ❤️ for the ComfyUI Community


Version: 1.0.0 | Last Updated: 2025-11-29
