Memory-efficient merging of multiple Stable Diffusion models using weighted accumulation. Built for merging 8+ Pony/SDXL/Illustrious models without needing 56GB of RAM.
Supermerger and similar tools make you merge models "tournament style", pairing them up round by round. This causes unequal blending: models merged early get diluted more than those added later. Model Merger instead uses an accumulator pattern where every model contributes exactly in proportion to its weight.
Key Features:
- ✅ Memory Efficient - Merge 8+ models without massive RAM requirements
- ✅ Shape-Only Validation - Ultra-fast compatibility checks without loading full models
- ✅ True Weighted Blending - Every model contributes in proportion to its weight (no tournament dilution!)
- ✅ Smart Conversion - Legacy `.ckpt`/`.pt` to safetensors with adaptive pruning
- ✅ Deep Verification - Tensor-by-tensor comparison of conversions
- ✅ VAE Baking - Optional VAE integration into merged models
- ✅ GPU Acceleration - CUDA support with auto-fallback to CPU
- ✅ Desktop Notifications - Toast notifications for long operations (Windows)
- ✅ Customizable - JSON-based architecture detection patterns
- ✅ Safe & Fast - Uses safetensors format (10x faster loading, can't execute malicious code)
```bash
# Install
pip install -r requirements.txt

# Convert old models to safetensors
python run.py convert old_model.ckpt

# Merge models in three steps
python run.py scan ./my_models                                  # 1. Find models, generate manifest
# (optionally edit the manifest to adjust weights)
python run.py merge --manifest ./my_models/merge_manifest.json  # 2. Merge!
```

Done! Your merged model is ready to use.
```bash
# Scan a folder with 4 Pony models
python run.py scan ~/models/pony --vae ~/models/pony_vae.safetensors

# This generates merge_manifest.json with equal weights (0.25 each)
# Edit if you want to emphasize specific models

# Merge with GPU acceleration
python run.py merge --manifest ~/models/pony/merge_manifest.json --device cuda

# Result: Pony_Model1_Model2_Model3_Model4_merged.safetensors
```

Tournament style (what you were doing manually):
```
Round 1: (A+B)/2  → AB
Round 2: (AB+C)/2 → ABC
Round 3: (ABC+D)/2 → ABCD
```

Problem: A and B end up contributing only 12.5% each, C 25%, and D 50% - the models merged first get diluted!
Accumulator style (this tool):
```
result = A*0.25 + B*0.25 + C*0.25 + D*0.25
```
All models contribute equally. Much better! 🎯
The accumulator pattern only keeps 2 models in memory at once:
```
result  = model1 * weight1
result += model2 * weight2   # load, add, free
result += model3 * weight3   # load, add, free
...
```
This means you can merge 8+ models without needing 56GB of RAM!
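For illustration, here is a minimal sketch of that accumulator loop using `safetensors` and PyTorch. The names are illustrative rather than the tool's actual API in `merger.py`, and real merges handle key mismatches between models more carefully; the point is that only the accumulator plus one source model are ever resident at once.

```python
import torch
from safetensors.torch import load_file, save_file

def accumulate_merge(entries, output_path, device="cpu"):
    """Weighted accumulator merge: only one source model is loaded at a time.

    entries: list of (path, weight) pairs, e.g. [("a.safetensors", 0.5), ...].
    """
    result = None
    for path, weight in entries:
        state = load_file(path, device=device)       # load one model's tensors
        if result is None:
            result = {key: value.float() * weight for key, value in state.items()}
        else:
            for key, tensor in state.items():
                if key in result:                     # keys missing from the first model are skipped here
                    result[key] += tensor.float() * weight
        del state                                     # free this model before loading the next

    # Save in half precision, as is typical for SD checkpoints.
    save_file({k: v.half().contiguous() for k, v in result.items()}, output_path)
```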
Model Merger supports two merging algorithms:
Weighted Sum: traditional weighted linear interpolation. Fast and memory-efficient.
```bash
python run.py merge --manifest my_manifest.json --merge-method weighted
```

Pros:
- Fast - uses accumulator pattern (only 2 models in RAM)
- Memory efficient
- Predictable results
- Works with any number of models
Best for: General purpose merging, quick experiments
Consensus: outlier-resistant merging using inverse distance weighting, which computes adaptive weights for each parameter based on inter-model agreement.
```bash
python run.py merge --manifest my_manifest.json --merge-method consensus
```

Pros:
- Suppresses outlier values automatically
- Better coherence when merging diverse models
- Reduces artifacts from conflicting models
- Configurable outlier suppression strength
Best for: Merging many diverse models, reducing artifacts, producing balanced outputs
Parameters:
- `--consensus-exponent <int>` - Outlier suppression strength (default: 4, range: 2-8)
  - Higher = more aggressive outlier suppression
  - Lower = more tolerance for diversity
How it works: For each weight position across all models, consensus merge computes the average distance from each value to all others. Values that are close to the consensus get high weights, outliers get suppressed exponentially. This happens per-element, so different layers can have different consensus patterns.
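As a rough sketch of that idea (not necessarily the tool's exact implementation in `merger.py`), per-element inverse-distance weighting over a stack of aligned tensors could look like this:

```python
import torch

def consensus_merge_tensor(tensors, exponent=4, eps=1e-8):
    """Blend N same-shaped tensors, down-weighting per-element outliers.

    tensors: one tensor per model, all with identical shape.
    """
    stack = torch.stack(tensors, dim=0)                                 # [N, ...]
    # Mean absolute distance from each model's value to every other model's value.
    dist = (stack.unsqueeze(0) - stack.unsqueeze(1)).abs().mean(dim=0)  # [N, ...]
    # Inverse distance weighting: outliers get exponentially smaller weights.
    weights = 1.0 / (dist + eps) ** exponent
    weights = weights / weights.sum(dim=0, keepdim=True)                # normalize per element
    return (weights * stack).sum(dim=0)
```

Note that this needs the same tensor from every model in memory at once, which is why the comparison table below lists consensus as more memory-hungry than the accumulator-based weighted sum.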
Comparison:
| Feature | Weighted Sum | Consensus |
|---|---|---|
| Speed | ⚡ Fast | 🐢 Slower |
| Memory | 💚 Low (2 models) | 💛 Moderate (1 tensor × N models) |
| Outlier handling | ❌ None | ✅ Automatic suppression |
| User weights | ✅ Respected | ❌ Ignored (computed adaptively) |
| Best use case | Quick merges, few models | Diverse models, artifact reduction |
Example:
```bash
# Quick weighted merge (respects your 0.5/0.5 weights)
python run.py merge --manifest manifest.json --merge-method weighted

# Consensus merge with moderate suppression
python run.py merge --manifest manifest.json --merge-method consensus --consensus-exponent 4

# Consensus with aggressive outlier removal
python run.py merge --manifest manifest.json --merge-method consensus --consensus-exponent 8
```

Documentation:
- Installation Guide - Setup, requirements, GPU acceleration
- Usage Guide - Converting, verifying, scanning, merging
- Customization Guide - Architecture patterns, manifest editing
- Troubleshooting Guide - Common issues and solutions
- FAQ - Frequently asked questions
- CHANGELOG - Version history and release notes
- ROADMAP - Future plans and development ideas
Convert legacy `.ckpt`, `.pt`, `.pth`, and `.bin` files to safetensors:

```bash
python run.py convert old_model.ckpt
python run.py convert model.pt --output new_name.safetensors --notify
```

Why our converter is better:
- Auto-detects file type (SD model, VAE, LoRA, embedding, upscaler)
- Adaptive pruning strategies for each type
- DataParallel prefix removal (`module.*`)
- Shared tensor detection and cloning
- Extension validation with `--force` bypass
- Desktop notifications for long operations
- Output verification
- Numpy fallback for problematic saves
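At its core, a conversion like the one above boils down to loading the legacy checkpoint, cleaning up the state dict, and re-saving it. A minimal sketch with assumed names (the real `converter.py` adds file-type detection, adaptive pruning, notifications, and verification):

```python
import torch
from safetensors.torch import save_file

def convert_checkpoint(src_path, dst_path):
    """Sketch: load a legacy .ckpt/.pt checkpoint and re-save it as safetensors."""
    checkpoint = torch.load(src_path, map_location="cpu")
    state = checkpoint.get("state_dict", checkpoint)       # some checkpoints nest the weights
    cleaned = {}
    for key, value in state.items():
        if not torch.is_tensor(value):                      # drop optimizer state, metadata, etc.
            continue
        if key.startswith("module."):                       # strip DataParallel prefix
            key = key[len("module."):]
        cleaned[key] = value.clone().contiguous()           # clone breaks shared-storage ties
    save_file(cleaned, dst_path)
```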
Verify conversions are pixel-perfect:
```bash
python run.py convert old_model.ckpt
python run.py verify old_model.ckpt old_model.safetensors --verbose
```

Checks:
- Key sets match
- Tensor shapes match
- Values match (within floating-point tolerance)
- Module prefix handling
Uses PyTorch's own testing standards (rtol=1e-5, atol=1e-8).
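A simplified version of that check (illustrative only; the actual `verifier.py` also handles `module.` prefixes and reports differences rather than raising):

```python
import torch
from safetensors.torch import load_file

def verify_conversion(original_state, converted_path, rtol=1e-5, atol=1e-8):
    """Compare an already-loaded original state dict against the converted file."""
    converted = load_file(converted_path)
    assert set(original_state) == set(converted), "key sets differ"
    for key, tensor in original_state.items():
        other = converted[key]
        assert tensor.shape == other.shape, f"shape mismatch for {key}"
        assert torch.allclose(tensor.float(), other.float(), rtol=rtol, atol=atol), (
            f"values differ for {key}"
        )
```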
The merge manifest controls how much each model contributes. Equal weights:
{"models": [
{"path": "model1.safetensors", "weight": 0.25},
{"path": "model2.safetensors", "weight": 0.25},
{"path": "model3.safetensors", "weight": 0.25},
{"path": "model4.safetensors", "weight": 0.25}
]}Emphasize specific models:
{"models": [
{"path": "base.safetensors", "weight": 0.5},
{"path": "style_a.safetensors", "weight": 0.3},
{"path": "detail.safetensors", "weight": 0.2}
]}Experimental "spicy" weights:
{"models": [
{"path": "model1.safetensors", "weight": 1.5},
{"path": "model2.safetensors", "weight": -0.2}
]}Weights don't need to sum to 1.0!
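For reference, a manifest with this shape can be loaded and sanity-checked in a few lines of Python. This is a sketch under the structure shown above, not the tool's own validation in `manifest.py`:

```python
import json
from pathlib import Path

def load_manifest(manifest_path):
    """Read a merge manifest and make sure every entry is usable."""
    manifest = json.loads(Path(manifest_path).read_text())
    entries = []
    for entry in manifest["models"]:
        path, weight = Path(entry["path"]), float(entry["weight"])
        if not path.exists():
            raise FileNotFoundError(f"model not found: {path}")
        entries.append((path, weight))   # weights may be any real number; no sum-to-1 check
    return entries
```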
50x faster merging with CUDA:
```bash
python run.py merge --manifest config.json --device cuda
```

Auto-detects CUDA availability and falls back to CPU with helpful installation instructions.
Requirements:
- NVIDIA GPU with CUDA support
- CUDA-enabled PyTorch (see Installation Guide)
- ~14GB VRAM for merging 8 SDXL models
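The fallback behaviour is essentially this (a sketch; the tool's actual wording and plumbing differ):

```python
import torch

def pick_device(requested: str = "cuda") -> torch.device:
    """Use CUDA when available, otherwise fall back to CPU with a hint."""
    if requested == "cuda" and not torch.cuda.is_available():
        print("CUDA not available - falling back to CPU. "
              "Install a CUDA-enabled PyTorch build to enable GPU merging.")
        return torch.device("cpu")
    return torch.device(requested)
```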
Create `~/.model_merger/architecture_patterns.json` to add custom architectures:

```json
{
  "patterns": {
    "Pony": ["pony", "ponyxl", "ponydiffusion"],
    "Flux": ["flux", "flux-dev", "flux-schnell"],
    "MyCustomArch": ["mycustom", "custom-model"]
  },
  "default": "SDXL"
}
```

No code changes needed! See the Customization Guide.
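Pattern matching of this kind is just substring checks against the model filename; a minimal sketch (assumed names, not the exact logic in `config.py`):

```python
import json
from pathlib import Path

def detect_architecture(model_path, patterns_path):
    """Return the first architecture whose keywords appear in the model filename."""
    config = json.loads(Path(patterns_path).read_text())
    name = Path(model_path).name.lower()
    for architecture, keywords in config["patterns"].items():
        if any(keyword in name for keyword in keywords):
            return architecture
    return config.get("default", "Unknown")
```

With the patterns above, a file named `ponyDiffusionV6XL.safetensors` would be tagged `Pony`, and anything unmatched falls back to `SDXL`.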
```
model_merger/
├── __init__.py                   # Package exports
├── config.py                     # Constants, patterns, defaults
├── architecture_patterns.json    # Default architecture detection patterns
├── loader.py                     # Load models/VAEs, compute hashes
├── manifest.py                   # Scan folders, generate/validate manifests
├── merger.py                     # Core accumulator merge logic
├── vae.py                        # VAE baking
├── saver.py                      # Save merged models
├── converter.py                  # Convert legacy formats to safetensors
├── verifier.py                   # Deep verification of conversions
├── pruner.py                     # Smart format detection and pruning
├── notifier.py                   # Desktop notifications for long operations
├── console.py                    # Rich UI formatting and progress bars
└── cli.py                        # Command-line interface
run.py                            # Entry point
docs/                             # Documentation
```
Each module has a single, clear responsibility. Easy to test, easy to extend!
- Python 3.8+
- PyTorch 2.0+ (CPU or CUDA)
- safetensors 0.4.0+
- rich 13.0+ (beautiful terminal output)
- numpy 1.24+
- packaging 21.0+
- win10toast 0.9+ (optional, Windows only)
See Installation Guide for detailed setup.
This is a clean, focused tool. If you want to add features:
- Keep separation of concerns (one module = one job)
- Follow existing code style
- Add docstrings!
- Test with real models before submitting
Do whatever you want with this. No warranty, use at your own risk, etc.
Made with ❤️ and too much coffee by someone who was tired of clicking through Supermerger's UI 8 times per merge session.