Model Merger

Memory-efficient merging of multiple Stable Diffusion models using weighted accumulation. Built for merging 8+ Pony/SDXL/Illustrious models without needing 56GB of RAM.

Why This Tool?

Supermerger and similar tools make you merge models in "tournament style" - chaining pairwise merges one after another. This causes unequal blending where models merged early get diluted more than those added later. Model Merger uses a proper accumulator pattern where every model contributes exactly according to its weight.

Key Features:

  • Memory Efficient - Merge 8+ models without massive RAM requirements
  • Shape-Only Validation - Ultra-fast compatibility checks without loading full models
  • True Weighted Blending - All models contribute equally (no tournament dilution!)
  • Smart Conversion - Legacy .ckpt/.pt to safetensors with adaptive pruning
  • Deep Verification - Tensor-by-tensor comparison of conversions
  • VAE Baking - Optional VAE integration into merged models
  • GPU Acceleration - CUDA support with auto-fallback to CPU
  • Desktop Notifications - Toast notifications for long operations (Windows)
  • Customizable - JSON-based architecture detection patterns
  • Safe & Fast - Uses safetensors format (10x faster loading, can't execute malicious code)

Quick Start

# Install
pip install -r requirements.txt

# Convert old models to safetensors
python run.py convert old_model.ckpt

# Merge models in three steps
python run.py scan ./my_models                          # 1. Find models, generate manifest
# 2. (Optionally) edit the manifest to adjust weights
python run.py merge --manifest ./my_models/merge_manifest.json  # 3. Merge!

Done! Your merged model is ready to use.

Simple Example

# Scan a folder with 4 Pony models
python run.py scan ~/models/pony --vae ~/models/pony_vae.safetensors

# This generates merge_manifest.json with equal weights (0.25 each)
# Edit if you want to emphasize specific models

# Merge with GPU acceleration
python run.py merge --manifest ~/models/pony/merge_manifest.json --device cuda

# Result: Pony_Model1_Model2_Model3_Model4_merged.safetensors

The Accumulator Difference

Pairwise "tournament" style (what you were doing manually):

Step 1: (A+B)/2 → AB
Step 2: (AB+C)/2 → ABC
Step 3: (ABC+D)/2 → ABCD

Problem: A and B end up at 12.5% each, C at 25%, and D at 50%. The earlier a model enters the chain, the more it gets diluted!

Accumulator style (this tool):

result = A*0.25 + B*0.25 + C*0.25 + D*0.25

All models contribute equally. Much better! 🎯

The accumulator pattern only keeps 2 models in memory at once:

result = model1 * weight1
result += model2 * weight2  (load, add, free)
result += model3 * weight3  (load, add, free)
...

This means you can merge 8+ models without needing 56GB of RAM!
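
In code, the idea looks roughly like this (a minimal sketch using safetensors and PyTorch, not the exact merger.py implementation; it assumes all models share the same keys and shapes):

from safetensors.torch import load_file, save_file

def accumulate_merge(paths, weights, device="cpu"):
    result = None
    for path, weight in zip(paths, weights):
        state = load_file(path, device=device)   # load one model at a time
        if result is None:
            result = {key: tensor.float() * weight for key, tensor in state.items()}
        else:
            for key, tensor in state.items():
                result[key] += tensor.float() * weight
        del state                                # free this model before loading the next
    return {key: tensor.half() for key, tensor in result.items()}

merged = accumulate_merge(
    ["A.safetensors", "B.safetensors", "C.safetensors", "D.safetensors"],
    [0.25, 0.25, 0.25, 0.25],
)
save_file(merged, "merged.safetensors")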

Merge Methods

Model Merger supports two merging algorithms:

Weighted Sum (Default)

Traditional weighted linear interpolation. Fast and memory-efficient.

python run.py merge --manifest my_manifest.json --merge-method weighted

Pros:

  • Fast - uses accumulator pattern (only 2 models in RAM)
  • Memory efficient
  • Predictable results
  • Works with any number of models

Best for: General purpose merging, quick experiments

Consensus Merge (New!)

Outlier-resistant merging using inverse distance weighting. Computes adaptive weights for each parameter based on inter-model agreement.

python run.py merge --manifest my_manifest.json --merge-method consensus

Pros:

  • Suppresses outlier values automatically
  • Better coherence when merging diverse models
  • Reduces artifacts from conflicting models
  • Configurable outlier suppression strength

Best for: Merging many diverse models, reducing artifacts, producing balanced outputs

Parameters:

  • --consensus-exponent <int> - Outlier suppression strength (default: 4, range: 2-8)
    • Higher = more aggressive outlier suppression
    • Lower = more tolerance for diversity

How it works: For each weight position across all models, consensus merge computes the average distance from each value to all the others. Values close to the consensus get high weights, while outliers are suppressed exponentially. This happens per element, so different layers can have different consensus patterns.
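
A rough per-tensor sketch of that idea (an interpretation of the description above; the project's actual math may differ in details):

import torch

def consensus_blend(tensors, exponent=4, eps=1e-8):
    # tensors: one tensor per model, all with identical shape
    stack = torch.stack([t.float() for t in tensors])                   # [N, ...]
    # mean absolute distance from each model's value to every model's value
    # (a real implementation would avoid materializing the full [N, N, ...] intermediate)
    dist = (stack.unsqueeze(0) - stack.unsqueeze(1)).abs().mean(dim=0)  # [N, ...]
    # inverse-distance weights: values near the consensus dominate,
    # outliers are pushed toward zero by the exponent
    w = (dist + eps).pow(-exponent)
    w = w / w.sum(dim=0, keepdim=True)
    return (w * stack).sum(dim=0)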

Comparison:

Feature            Weighted Sum               Consensus
Speed              ⚡ Fast                     🐢 Slower
Memory             💚 Low (2 models)          💛 Moderate (1 tensor × N models)
Outlier handling   ❌ None                     ✅ Automatic suppression
User weights       ✅ Respected                ❌ Ignored (computed adaptively)
Best use case      Quick merges, few models   Diverse models, artifact reduction

Example:

# Quick weighted merge (respects your 0.5/0.5 weights)
python run.py merge --manifest manifest.json --merge-method weighted

# Consensus merge with moderate suppression
python run.py merge --manifest manifest.json --merge-method consensus --consensus-exponent 4

# Consensus with aggressive outlier removal
python run.py merge --manifest manifest.json --merge-method consensus --consensus-exponent 8

Documentation

Features Deep Dive

Smart Format Detection & Conversion

Convert legacy .ckpt, .pt, .pth, .bin files to safetensors:

python run.py convert old_model.ckpt
python run.py convert model.pt --output new_name.safetensors --notify

Why our converter is better:

  • Auto-detects file type (SD model, VAE, LoRA, embedding, upscaler)
  • Adaptive pruning strategies for each type
  • DataParallel prefix removal (module.*)
  • Shared tensor detection and cloning
  • Extension validation with --force bypass
  • Desktop notifications for long operations
  • Output verification
  • Numpy fallback for problematic saves
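
At its core, the conversion boils down to something like this bare-bones sketch (the real converter layers type detection, adaptive pruning, verification, and notifications on top of it):

import torch
from safetensors.torch import save_file

def convert_to_safetensors(src, dst):
    ckpt = torch.load(src, map_location="cpu")
    state = ckpt.get("state_dict", ckpt)           # many .ckpt files nest the weights
    cleaned = {}
    for key, value in state.items():
        if not torch.is_tensor(value):
            continue                               # skip optimizer state, step counters, etc.
        if key.startswith("module."):
            key = key[len("module."):]             # strip DataParallel prefixes
        cleaned[key] = value.contiguous().clone()  # clone breaks shared storage, which safetensors rejects
    save_file(cleaned, dst)

convert_to_safetensors("old_model.ckpt", "old_model.safetensors")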

Deep Verification

Verify that a conversion faithfully preserves the original weights:

python run.py convert old_model.ckpt
python run.py verify old_model.ckpt old_model.safetensors --verbose

Checks:

  • Key sets match
  • Tensor shapes match
  • Values match (within floating-point tolerance)
  • Module prefix handling

Uses PyTorch's own testing standards (rtol=1e-5, atol=1e-8).
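
A stripped-down version of that comparison, using torch.allclose with the tolerances above:

import torch
from safetensors.torch import load_file

def verify_conversion(original_ckpt, converted_path, rtol=1e-5, atol=1e-8):
    original = torch.load(original_ckpt, map_location="cpu")
    original = original.get("state_dict", original)
    converted = load_file(converted_path)
    if set(original) != set(converted):
        print("key sets differ")
    for key in sorted(set(original) & set(converted)):
        a, b = original[key], converted[key]
        if not torch.is_tensor(a):
            continue
        if a.shape != b.shape:
            print(f"shape mismatch: {key}")
        elif not torch.allclose(a.float(), b.float(), rtol=rtol, atol=atol):
            print(f"value mismatch: {key}")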

Flexible Merging

Equal weights:

{"models": [
  {"path": "model1.safetensors", "weight": 0.25},
  {"path": "model2.safetensors", "weight": 0.25},
  {"path": "model3.safetensors", "weight": 0.25},
  {"path": "model4.safetensors", "weight": 0.25}
]}

Emphasize specific models:

{"models": [
  {"path": "base.safetensors", "weight": 0.5},
  {"path": "style_a.safetensors", "weight": 0.3},
  {"path": "detail.safetensors", "weight": 0.2}
]}

Experimental "spicy" weights:

{"models": [
  {"path": "model1.safetensors", "weight": 1.5},
  {"path": "model2.safetensors", "weight": -0.2}
]}

Weights don't need to sum to 1.0! Values above 1.0 amplify a model's contribution, and negative values subtract it.
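
A tiny PyTorch illustration of what those "spicy" weights do at a single tensor position:

import torch

a = torch.tensor([1.0, 1.0, 1.0])    # stand-in values from model1
b = torch.tensor([0.0, 2.0, 4.0])    # the same positions from model2

print(1.5 * a + (-0.2) * b)          # tensor([1.5000, 1.1000, 0.7000])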

GPU Acceleration

50x faster merging with CUDA:

python run.py merge --manifest config.json --device cuda

Auto-detects CUDA availability and falls back to CPU with helpful installation instructions.
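
The fallback is essentially a standard PyTorch device check, roughly like this sketch:

import torch

def pick_device(requested="cuda"):
    if requested == "cuda" and not torch.cuda.is_available():
        print("CUDA not available; falling back to CPU. "
              "Install a CUDA-enabled PyTorch build to use the GPU.")
        return torch.device("cpu")
    return torch.device(requested)

device = pick_device("cuda")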

Requirements:

  • NVIDIA GPU with CUDA support
  • CUDA-enabled PyTorch (see Installation Guide)
  • ~14GB VRAM for merging 8 SDXL models

Customizable Architecture Detection

Create ~/.model_merger/architecture_patterns.json to add custom architectures:

{
  "patterns": {
    "Pony": ["pony", "ponyxl", "ponydiffusion"],
    "Flux": ["flux", "flux-dev", "flux-schnell"],
    "MyCustomArch": ["mycustom", "custom-model"]
  },
  "default": "SDXL"
}

No code changes needed! See Customization Guide.
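
As a rough illustration, filename-based detection against that JSON could look like this (a hypothetical sketch; the tool's actual matching logic may differ):

import json
from pathlib import Path

def detect_architecture(model_path, patterns_path="~/.model_merger/architecture_patterns.json"):
    config = json.loads(Path(patterns_path).expanduser().read_text())
    name = Path(model_path).stem.lower()
    for arch, keywords in config["patterns"].items():
        if any(keyword in name for keyword in keywords):
            return arch
    return config.get("default", "SDXL")

print(detect_architecture("ponyDiffusionV6XL.safetensors"))   # -> "Pony"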

Project Structure

model_merger/
├── __init__.py                    # Package exports
├── config.py                      # Constants, patterns, defaults
├── architecture_patterns.json     # Default architecture detection patterns
├── loader.py                      # Load models/VAEs, compute hashes
├── manifest.py                    # Scan folders, generate/validate manifests
├── merger.py                      # Core accumulator merge logic
├── vae.py                         # VAE baking
├── saver.py                       # Save merged models
├── converter.py                   # Convert legacy formats to safetensors
├── verifier.py                    # Deep verification of conversions
├── pruner.py                      # Smart format detection and pruning
├── notifier.py                    # Desktop notifications for long operations
├── console.py                     # Rich UI formatting and progress bars
└── cli.py                         # Command-line interface

run.py                              # Entry point
docs/                               # Documentation

Each module has a single, clear responsibility. Easy to test, easy to extend!

Requirements

  • Python 3.8+
  • PyTorch 2.0+ (CPU or CUDA)
  • safetensors 0.4.0+
  • rich 13.0+ (beautiful terminal output)
  • numpy 1.24+
  • packaging 21.0+
  • win10toast 0.9+ (optional, Windows only)

See Installation Guide for detailed setup.

Contributing

This is a clean, focused tool. If you want to add features:

  1. Keep separation of concerns (one module = one job)
  2. Follow existing code style
  3. Add docstrings!
  4. Test with real models before submitting

License

Do whatever you want with this. No warranty, use at your own risk, etc.

Made with ❤️ and too much coffee by someone who was tired of clicking through Supermerger's UI 8 times per merge session.
