A professional, Suno-like music generation studio for HeartLib
Features • Demo • Installation • Usage • Configuration • Credits
| Feature | Description |
|---|---|
| Full Song Generation | Create complete songs with vocals and lyrics, 4+ minutes long |
| Instrumental Mode | Generate instrumental tracks without vocals |
| Style Tags | Define genre, mood, tempo, and instrumentation |
| Seed Control | Reproduce exact generations for consistency |
| Queue System | Queue multiple generations and process them sequentially |
| Feature | Description |
|---|---|
| Audio Upload | Use any audio file as a style reference |
| Waveform Visualization | Professional waveform display powered by WaveSurfer.js |
| Region Selection | Draggable 10-second region selector for precise style sampling |
| Style Influence | Adjustable slider to control reference audio influence (1-100%) |
| Synced Playback | Modal waveform syncs with bottom player in real-time |
Coming Soon: LoRA Voice Training - We're actively developing LoRA-based voice training; our early tests show voice consistency that surpasses Suno. Stay tuned for updates!
| Feature | Description |
|---|---|
| Lyrics Generation | Generate lyrics from a topic using LLMs |
| Multiple Providers | Support for Ollama (local) and OpenRouter (cloud) |
| Style Suggestions | AI-suggested style tags based on your concept |
| Prompt Enhancement | Improve your prompts with AI assistance |
| Feature | Description |
|---|---|
| Spotify-Inspired UI | Clean, modern design with dark/light mode |
| Bottom Player | Full-featured player with waveform, volume, and progress |
| History Feed | Browse, search, and manage all generated tracks |
| Likes & Playlists | Organize favorites into custom playlists |
| Real-time Progress | Live generation progress with step indicators |
| Responsive Design | Works on desktop and mobile devices |
| Layer | Technologies |
|---|---|
| Frontend | React 18, TypeScript, TailwindCSS, Framer Motion, WaveSurfer.js |
| Backend | FastAPI, SQLModel, SSE (Server-Sent Events) |
| AI Engine | HeartLib - MuQ, MuLan, HeartCodec |
| LLM Integration | Ollama, OpenRouter |
HeartMuLa Studio includes several optimizations for faster generation and lower VRAM usage:
Reduces VRAM usage from ~11GB to ~3GB using BitsAndBytes NF4 quantization:

```bash
HEARTMULA_4BIT=true python -m uvicorn backend.app.main:app --host 0.0.0.0 --port 8000
```
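For context, here is a minimal sketch of what an NF4 configuration looks like with BitsAndBytes via Hugging Face `transformers`; it assumes a transformers-style loading path and is not the project's actual loading code:

```python
import torch
from transformers import BitsAndBytesConfig

# Illustrative NF4 setup (assumption: the model is loaded through a
# transformers-compatible interface; the backend's real code may differ).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for speed
)
# The config would then be passed to a loader, e.g. (hypothetical):
# SomeModel.from_pretrained(model_dir, quantization_config=bnb_config)
```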
Flash Attention is configured automatically based on your GPU:

| GPU | Flash Attention |
|---|---|
| NVIDIA SM 7.0+ (Volta, Turing, Ampere, Ada, Hopper) | ✅ Enabled |
| NVIDIA SM 6.x and older (Pascal, Maxwell) | ❌ Disabled (uses math backend) |
| AMD GPUs | ❌ Disabled (compatibility varies) |
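As an illustration of the capability check this implies (not the project's detection code), PyTorch exposes the compute capability and lets you fall back to the math SDPA backend on older GPUs:

```python
import torch

# Sketch only: disable Flash Attention kernels on pre-Volta GPUs
# (compute capability < 7.0), mirroring the table above.
if torch.cuda.is_available():
    major, _minor = torch.cuda.get_device_capability(0)
    if major < 7:
        torch.backends.cuda.enable_flash_sdp(False)
        torch.backends.cuda.enable_mem_efficient_sdp(False)
        torch.backends.cuda.enable_math_sdp(True)
```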
Enable PyTorch 2.0+ compilation for ~2x faster inference on supported GPUs:
```bash
# Enable torch.compile
HEARTMULA_COMPILE=true python -m uvicorn backend.app.main:app --host 0.0.0.0 --port 8000

# With max performance (slower first run, faster subsequent runs)
HEARTMULA_COMPILE=true HEARTMULA_COMPILE_MODE=max-autotune python -m uvicorn backend.app.main:app --host 0.0.0.0 --port 8000
```

| Mode | Description |
|---|---|
| `default` | Good balance of compile time and performance |
| `reduce-overhead` | Faster compilation, slightly less optimal code |
| `max-autotune` | Best performance, but slowest compilation (recommended for production) |
Requirements:
- PyTorch 2.0+
- Linux/WSL2: Install Triton (`pip install triton`)
- Windows: Install Triton-Windows (`pip install -U 'triton-windows>=3.2,<3.3'`)
Note: First generation will be slower due to compilation. Subsequent generations benefit from the compiled kernels.
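To illustrate what the flags map to, here is a sketch under the assumption that the backend simply wraps the model with `torch.compile`; the names are illustrative, not the project's actual integration:

```python
import os
import torch

# Minimal sketch of the HEARTMULA_COMPILE toggle; the real integration
# lives in the backend's model-loading code and may differ.
def maybe_compile(model: torch.nn.Module) -> torch.nn.Module:
    if os.getenv("HEARTMULA_COMPILE", "false").lower() == "true":
        mode = os.getenv("HEARTMULA_COMPILE_MODE", "default")
        # The first forward pass after this triggers kernel compilation;
        # later passes reuse the compiled kernels.
        model = torch.compile(model, mode=mode)
    return model
```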
Automatically selects the best GPU configuration (a sketch of the selection rule follows this list):
- With 4-bit quantization: Prioritizes fastest GPU (highest compute capability)
- Without quantization: Prioritizes GPU with most VRAM
- HeartMuLa → Primary GPU, HeartCodec → Secondary GPU
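A rough sketch of that selection rule using PyTorch's device queries (illustrative only; the actual logic is in the backend startup code):

```python
import torch

def pick_primary_gpu(use_4bit: bool) -> int:
    """Pick the primary GPU: fastest when quantized, largest VRAM otherwise."""
    devices = range(torch.cuda.device_count())
    if use_4bit:
        # Highest compute capability wins, e.g. (8, 9) for Ada beats (7, 5) for Turing
        return max(devices, key=lambda i: torch.cuda.get_device_capability(i))
    return max(devices, key=lambda i: torch.cuda.get_device_properties(i).total_memory)
```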
Models are automatically downloaded from HuggingFace Hub on first run (~5GB); a manual pre-download sketch follows the list:
- HeartMuLa (main model)
- HeartCodec (audio decoder)
- Tokenizer and generation config
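If you want to pre-fetch the models yourself, something like the following works with `huggingface_hub`; the repo IDs below are placeholders, so check the backend's startup logs for the ones it actually resolves:

```python
from huggingface_hub import snapshot_download

# Hypothetical repo IDs - substitute the ones the backend reports on startup.
for repo_id, target in [
    ("HeartMuLa/HeartMuLa-oss-RL-3B-20260123", "backend/models/HeartMuLa-oss-RL-3B-20260123"),
    ("HeartMuLa/HeartCodec-oss", "backend/models/HeartCodec-oss"),
]:
    snapshot_download(repo_id=repo_id, local_dir=target)
```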
```bash
./start.sh
```

That's it! The system auto-detects your GPU and downloads models on first run.
HeartMuLa Studio is available as a standalone macOS application with a native app window, optimized for Apple Metal GPUs.
Download the latest macOS release from the Releases page:
- `HeartMuLa-macOS.dmg` - Drag and drop installer
- `HeartMuLa-macOS.zip` - Portable app bundle
- Download `HeartMuLa-macOS.dmg`
- Open the DMG and drag `HeartMuLa.app` to your Applications folder
- Double-click to launch (macOS may show a security warning on first run)
- If prompted, go to System Preferences → Security & Privacy → Click "Open Anyway"
All data is stored in your user Library folder, not in the app bundle:
```
~/Library/Application Support/HeartMuLa/
├── models/             # AI models (~5GB, auto-downloaded)
├── generated_audio/    # Your generated music files
├── ref_audio/          # Uploaded reference audio
└── jobs.db             # Song history database

~/Library/Logs/HeartMuLa/
└── (application logs)
```
This ensures:
- ✅ App bundle remains read-only (code signing compatible)
- ✅ Your data persists across app updates
- ✅ Easy to find and manage your generated music
- ✅ Standard macOS app behavior
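For scripts that work against these locations (for example, backing up your tracks), a small helper using the paths listed above:

```python
from pathlib import Path

# Paths as documented above for the macOS app bundle's data.
APP_SUPPORT = Path.home() / "Library" / "Application Support" / "HeartMuLa"
GENERATED = APP_SUPPORT / "generated_audio"
LOGS = Path.home() / "Library" / "Logs" / "HeartMuLa"

# List generated tracks (the app saves MP3 files, per the table above).
for mp3 in sorted(GENERATED.glob("*.mp3")):
    print(mp3.name)
```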
- Standalone App: No Python or Node.js installation required
- Native Window: Uses pywebview for a native macOS app experience (single instance only)
- Apple Metal GPU: Optimized for M1/M2/M3 and Intel Macs with Metal support
- Auto-Download: Models are automatically downloaded on first launch (~5GB)
- Code-Signed: Packaged with PyInstaller and ad-hoc code signing
- macOS 10.13 (High Sierra) or later
- Apple Silicon (M1/M2/M3) or Intel Mac with Metal support
- 10GB+ RAM
- 15GB+ free disk space
For more details, see `build/macos/README.md`.
The easiest way to run HeartMuLa Studio - no Python/Node setup required.
- Docker with NVIDIA Container Toolkit
- NVIDIA GPU with 10GB+ VRAM
```bash
# Clone and start (uses pre-built image from GitHub Container Registry)
git clone https://github.com/fspecii/HeartMuLa-Studio.git
cd HeartMuLa-Studio
docker compose up -d

# View logs (watch model download progress on first run)
docker compose logs -f
```
```bash
# Create directories for persistent data
mkdir -p backend/models backend/generated_audio backend/ref_audio

# Run the pre-built image (Docker Hub)
docker run -d \
  --gpus all \
  -p 8000:8000 \
  -v ./backend/models:/app/backend/models \
  -v ./backend/generated_audio:/app/backend/generated_audio \
  -v ./backend/ref_audio:/app/backend/ref_audio \
  --name heartmula-studio \
  ambsd/heartmula-studio:latest
```

Available registries:
- Docker Hub: `ambsd/heartmula-studio:latest`
- GitHub: `ghcr.io/fspecii/heartmula-studio:latest`
- Docker pulls or builds the image (~10GB, includes CUDA + PyTorch)
- Models are automatically downloaded from HuggingFace (~5GB)
- Container starts with GPU auto-detection
- Frontend + API served on port 8000
All your data is preserved across container restarts:
| Data | Location | Description |
|---|---|---|
| Generated Music | `./backend/generated_audio/` | Your MP3 files (accessible from host) |
| Models | `./backend/models/` | Downloaded AI models (~5GB) |
| Reference Audio | `./backend/ref_audio/` | Uploaded style references |
| Song History | Docker volume `heartmula-db` | Database with all your generations |
```bash
# Start
docker compose up -d

# Stop
docker compose down

# View logs
docker compose logs -f

# Rebuild after updates
docker compose build --no-cache
docker compose up -d

# Reset database (fresh start)
docker compose down -v
docker compose up -d
```

Override settings in `docker-compose.yml`:
```yaml
environment:
  - HEARTMULA_4BIT=true                # Force 4-bit quantization
  - HEARTMULA_SEQUENTIAL_OFFLOAD=true  # Force model swapping (low VRAM)
volumes:
  # Use existing models from another location (e.g., ComfyUI)
  - /path/to/comfyui/models/heartmula:/app/backend/models
```

To use Ollama (running on host) for AI lyrics generation:
- Ollama is auto-configured - The container uses `host.docker.internal` to reach Ollama on your host machine
- Just run Ollama normally on your host (not in Docker)
- The container will automatically connect to `http://host.docker.internal:11434`
Custom Ollama URL:
```yaml
environment:
  - OLLAMA_HOST=http://your-ollama-server:11434
```

- Python 3.10 or higher
- Node.js 18 or higher
- CUDA GPU with 10GB+ VRAM
- Git for cloning the repository
```bash
git clone https://github.com/fspecii/HeartMuLa-Studio.git
cd HeartMuLa-Studio

# Create virtual environment in root folder
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install backend dependencies
pip install -r backend/requirements.txt
```

Note: HeartLib models (~5GB) will be downloaded automatically from HuggingFace on first run.
```bash
cd frontend

# Install dependencies
npm install

# Build for production
npm run build
```
```bash
source venv/bin/activate  # Windows: venv\Scripts\activate

# Single GPU
python -m uvicorn backend.app.main:app --host 0.0.0.0 --port 8000

# Multi-GPU (recommended for 2+ GPUs)
CUDA_VISIBLE_DEVICES=0,1 python -m uvicorn backend.app.main:app --host 0.0.0.0 --port 8000
```

Development mode:
```bash
cd frontend
npm run dev
```

Production mode:
```bash
# Serve the dist folder with any static server
npx serve dist -l 5173
```

| Mode | URL |
|---|---|
| Development | http://localhost:5173 |
| Production | http://localhost:8000 |
Create a `.env` file in the `backend` directory:

```bash
# OpenRouter API (for cloud LLM)
OPENROUTER_API_KEY=your_api_key_here

# Ollama (for local LLM)
OLLAMA_HOST=http://localhost:11434
```

HeartMuLa Configuration (set when running):
| Variable | Default | Description |
|---|---|---|
| `HEARTMULA_MODEL_DIR` | `backend/models` | Custom model directory (share with ComfyUI, etc.) |
| `HEARTMULA_4BIT` | `auto` | 4-bit quantization: `auto`, `true`, or `false` |
| `HEARTMULA_SEQUENTIAL_OFFLOAD` | `auto` | Model swapping for low VRAM: `auto`, `true`, or `false` |
| `HEARTMULA_COMPILE` | `false` | `torch.compile` for ~2x faster inference: `true` or `false` |
| `HEARTMULA_COMPILE_MODE` | `default` | Compile mode: `default`, `reduce-overhead`, or `max-autotune` |
| `HEARTMULA_VERSION` | `RL-3B-20260123` | Model version (latest RL-tuned model) |
| `CUDA_VISIBLE_DEVICES` | all GPUs | Specify which GPUs to use (e.g., `0,1`) |
Example: Use existing models from ComfyUI:

```bash
HEARTMULA_MODEL_DIR=/path/to/comfyui/models/heartmula ./start.sh
```

HeartMuLa Studio automatically detects your GPU VRAM and selects the optimal configuration:
| Your VRAM | Auto-Selected Mode | Speed | Example GPUs |
|---|---|---|---|
| 20GB+ | Full Precision | ~7 fps | RTX 4090, RTX 3090 Ti, A6000 |
| 14-20GB | 4-bit Quantized | ~7 fps | RTX 4060 Ti 16GB, RTX 3090 |
| 10-14GB | 4-bit + Model Swap | ~4 fps (+70s/song) | RTX 3060 12GB, RTX 4060 8GB |
| <10GB | Not supported | - | Insufficient VRAM |
Multi-GPU: Automatically detected and used. HeartMuLa goes to fastest GPU (Flash Attention), HeartCodec to largest VRAM GPU.
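A sketch of the VRAM check behind that table (the thresholds mirror the table; the real detection lives in `start.sh` and the backend):

```python
import torch

def pick_mode(device: int = 0) -> str:
    """Illustrative mode selection mirroring the VRAM table above."""
    vram_gb = torch.cuda.get_device_properties(device).total_memory / 1024**3
    if vram_gb >= 20:
        return "full-precision"
    if vram_gb >= 14:
        return "4bit"
    if vram_gb >= 10:
        return "4bit + sequential offload"
    raise RuntimeError("Less than 10GB VRAM is not supported")
```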
```bash
./start.sh                # Auto-detect (recommended)
./start.sh --force-4bit   # Force 4-bit quantization
./start.sh --force-swap   # Force model swapping (low VRAM mode)
./start.sh --help         # Show all options
```

Override auto-detection with environment variables:
```bash
# Force specific settings
HEARTMULA_4BIT=true HEARTMULA_SEQUENTIAL_OFFLOAD=false ./start.sh

# Or run directly
HEARTMULA_4BIT=true python -m uvicorn backend.app.main:app --host 0.0.0.0 --port 8000
```

| Variable | Values | Description |
|---|---|---|
| `HEARTMULA_4BIT` | `auto`, `true`, `false` | 4-bit quantization (default: `auto`) |
| `HEARTMULA_SEQUENTIAL_OFFLOAD` | `auto`, `true`, `false` | Model swapping for low VRAM (default: `auto`) |
| `CUDA_VISIBLE_DEVICES` | `0`, `0,1`, etc. | Select specific GPUs |
Memory Optimization:

```bash
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
```

For AI-powered lyrics generation:
Option A: Ollama (Local)
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama3.2
```

Option B: OpenRouter (Cloud)
- Get an API key from OpenRouter
- Add it to your `.env` file
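Once Ollama is running (Option A), you can verify it is reachable for lyrics generation with a standalone request against its HTTP API; the model name and prompt below are just examples:

```python
import requests

# Query a local Ollama server directly (independent of the Studio backend).
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2",
        "prompt": "Write two verses and a chorus about a midnight train.",
        "stream": False,
    },
    timeout=300,
)
print(resp.json()["response"])
```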
```
HeartMuLa-Studio/
├── backend/
│   ├── app/
│   │   ├── main.py                      # FastAPI application & routes
│   │   ├── models.py                    # Pydantic/SQLModel schemas
│   │   └── services/
│   │       ├── music_service.py         # HeartLib integration
│   │       └── llm_service.py           # LLM providers
│   ├── generated_audio/                 # Output MP3 files
│   ├── ref_audio/                       # Uploaded reference audio
│   ├── jobs.db                          # SQLite database
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── ComposerSidebar.tsx      # Main generation form
│   │   │   ├── BottomPlayer.tsx         # Audio player
│   │   │   ├── RefAudioRegionModal.tsx  # Waveform selector
│   │   │   ├── HistoryFeed.tsx          # Track history
│   │   │   └── ...
│   │   ├── App.tsx                      # Main application
│   │   └── api.ts                       # Backend API client
│   ├── public/
│   └── package.json
├── preview.gif
└── README.md
```
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/generate/music` | Start music generation |
| `POST` | `/generate/lyrics` | Generate lyrics with LLM |
| `POST` | `/upload/ref_audio` | Upload reference audio |
| `GET` | `/history` | Get generation history |
| `GET` | `/jobs/{id}` | Get job status |
| `GET` | `/events` | SSE stream for real-time updates |
| `GET` | `/audio/{path}` | Stream generated audio |
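For example, a generation can be driven from a script; the request and response field names below are assumptions, so check `backend/app/models.py` and `frontend/src/api.ts` for the actual schema:

```python
import time
import requests

BASE = "http://localhost:8000"

# Start a generation job. The JSON field names here are guesses at the schema.
job = requests.post(f"{BASE}/generate/music", json={
    "lyrics": "[verse]\nNeon rain on empty streets...",
    "tags": "synthwave, dreamy, female vocals",  # hypothetical field
}).json()

# Poll the job status until it finishes (the /events SSE stream is the
# push-based alternative to polling).
while True:
    status = requests.get(f"{BASE}/jobs/{job['id']}").json()  # 'id' assumed
    if status.get("status") in ("completed", "failed"):
        print(status)
        break
    time.sleep(5)
```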
| Issue | Solution |
|---|---|
| CUDA out of memory | System should auto-detect. Try `./start.sh --force-swap` or reduce duration |
| Models not downloading | Check internet connection and disk space (~5GB needed in `backend/models/`) |
| Frontend can't connect | Ensure the backend is running on port 8000 |
| LLM not working | Check that Ollama is running or that the OpenRouter API key is set in `backend/.env` |
| Only one GPU detected | Set `CUDA_VISIBLE_DEVICES=0,1` explicitly when starting the backend |
| Slow generation | Check the GPU configuration in the logs: `tail -f /tmp/heartmula_backend.log` |
Models are auto-downloaded to `backend/models/` (~5GB total):

```
backend/models/
├── HeartMuLa-oss-RL-3B-20260123/   # Main model
├── HeartCodec-oss/                 # Audio codec
├── tokenizer.json
└── gen_config.json
```
- HeartMuLa/heartlib - The open-source AI music generation engine
- mainza-ai/milimomusic - Inspiration for the backend architecture
- WaveSurfer.js - Audio waveform visualization
This project is open source under the MIT License.
Contributions are welcome! Please feel free to:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
To build the macOS app locally:
```bash
# Install dependencies
brew install librsvg imagemagick
pip install -r requirements_macos.txt

# Build frontend
cd frontend && npm install && npm run build && cd ..

# Generate icon and build app
./build/macos/generate_icon.sh
pyinstaller HeartMuLa.spec --clean --noconfirm

# Sign and package
cp dist/HeartMuLa.app/Contents/MacOS/HeartMuLa_bin dist/HeartMuLa.app/Contents/MacOS/HeartMuLa
chmod +x dist/HeartMuLa.app/Contents/MacOS/HeartMuLa
./build/macos/codesign.sh dist/HeartMuLa.app
```

The macOS build is automatically created via GitHub Actions when a release is published. See `.github/workflows/build-macos-release.yml` for details.
Made with ❤️ for the open-source AI music community
