balch/orphic-fm-app

Orpheus (from Ancient Greek: Ὀρφεύς) was a legendary musician in Greek mythology who used his music to charm Hades and Persephone so he could rescue his wife, Eurydice, from the Underworld. He was a master of the lyre.

FM stands for frequency modulation, a synthesis technique that creates rich harmonics by using one oscillator's waveform to modulate the pitch of another.

Orphic-FM

An 8-oscillator synthesizer built with Kotlin Multiplatform

Overview

Orphic-FM is an 8-oscillator synthesizer emulator that combines sounds and harmonics with semi-random math and AI. Oscillators are hierarchically grouped and cross-modulated until the sound takes on a life of its own: eight voices pair into four duos, which group into two quads, with modulation layered at every level. Add a dual delay system that can self-oscillate, a plate reverb, stereo distortion, and AI agent control, and things get interesting fast.

This instrument is inspired by the Lyra-8 Orgasmic Synthesizer and adds synthesis engines ported from the excellent open-source Mutable Instruments Eurorack firmware repository – FM, virtual analog, granular, physical modeling strings, modal resonators, additive, waveshaping, speech synthesis, and four drum voices.

Under the hood, a shared C++ DSP engine (liborpheus_dsp/) ports the Eurorack firmware and provides a graph-based audio routing system. On Android, it runs natively via Oboe (Google's C++ low-latency audio library) at 48kHz. On iOS, the C++ engine is linked as a static library via cinterop and renders through AVAudioEngine with an AVAudioSourceNode callback. On WASM, the same C++ code is compiled to WebAssembly via Emscripten and runs in a Web Worker. On Desktop (JVM), it loads liborpheus_desktop.dylib via JNI using miniaudio for low-latency playback.

I had multiple motivations for building this project, but I mainly did it because I've always wanted to build some kind of instrument, and now AI agents make that possible. AI played a big part in the development, and it will be interesting to see what happens as Orpheus learns to master the synth.

Check out orphic.fm for music generated by Orpheus.

Screenshots

| Desktop | Android |
| --- | --- |
| Orpheus Desktop Delay | Orpheus Android Strings |
| Orpheus Desktop Drums | Orpheus Android Keys |

Synthesis

  • 8 Synthesized Voices with non-linear envelopes, cross-FM, per-voice stereo panning, and hierarchical duo/quad grouping
  • 17 Plaits Engines ported from Mutable Instruments, including FM, Virtual Analog, Additive, Waveshaping, Noise, Granular, String (Karplus-Strong), Modal Resonator, Speech (formant/LPC/SAM), Particle, Swarm, Chord, and Wavetable
  • 4 Drum Engines – Analog Bass Drum, Analog Snare, Metallic Hi-Hat, FM Drum – each assignable to independent slots with a beat sequencer
  • FM Self-Feedback on the default oscillator, harmonics control across all engine types
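
The duo grouping can be sketched in C++ (the DSP engine's language). Everything here is illustrative, not Orphic-FM's actual API: `Voice`, `tick`, `renderHierarchicalMix`, and the modulation depths are all invented for this sketch.

```cpp
#include <array>
#include <cmath>

// Sketch of hierarchical cross-FM: 8 voices pair into 4 duos, and the two
// partners in each duo frequency-modulate each other. All names and
// constants are illustrative placeholders.
struct Voice {
    double phase = 0.0;
    double freq  = 110.0;
    double tick(double fmIn, double sampleRate) {
        // Cross-FM: the partner's previous output bends this voice's pitch.
        phase += freq * (1.0 + 0.5 * fmIn) / sampleRate;
        phase -= std::floor(phase);
        return std::sin(2.0 * 3.14159265358979323846 * phase);
    }
};

// Render `frames` samples and return the final mixed sample.
double renderHierarchicalMix(int frames, double sampleRate = 48000.0) {
    std::array<Voice, 8> voices;
    for (int i = 0; i < 8; ++i) voices[i].freq = 110.0 * (i + 1);

    std::array<double, 8> last{};  // previous output per voice
    double mix = 0.0;
    for (int n = 0; n < frames; ++n) {
        for (int d = 0; d < 4; ++d) {  // duo level: partners cross-modulate
            double a = last[2 * d], b = last[2 * d + 1];
            last[2 * d]     = voices[2 * d].tick(b, sampleRate);
            last[2 * d + 1] = voices[2 * d + 1].tick(a, sampleRate);
        }
        mix = 0.0;  // master sum (quads would layer another modulation stage here)
        for (double s : last) mix += s / 8.0;
    }
    return mix;
}
```

Because each voice's output is a bounded sine, the mix stays within [-1, 1] no matter how far the cross-FM drifts.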

Effects

  • Duo LFO – two oscillators with AND/OR/FM combining for complex modulation shapes
  • Dual Modulating Delays with self-modulation, LFO routing, and feedback loops capable of self-oscillation
  • Dattorro Plate Reverb ported from Mutable Instruments Rings
  • Stereo Distortion – parallel clean/drive paths with TanhLimiter soft-clipping
  • Resonator filter bank and Warps phase modulation

Performance & Control

  • Full MIDI with learn mode and arbitrary controller mapping
  • Tidal Cycles live-coding integration
  • Preset System for saving and recalling patches
  • Evolutionary Parameter Search – algorithmic exploration of the parameter space
  • Platform TTS – macOS say and Android native speech routed through the effects chain
  • Hand Tracking – camera-based ASL gesture control using MediaPipe hand landmarks with a hybrid ML + rule-based classifier, three interaction modes (ASL sign selection, conductor orchestration, and AR keyboard), and real-time camera overlay with hand skeleton visualization

AI Agent

An in-app chat agent (built on Koog with Gemini) can control the synth through natural language. It has tool access to set any parameter, trigger voices, switch engines, and speak words through the vocoder. The agent observes synth state changes in real time and can reason about the current sound.

How It's Built

Module Layout

core/audio/          DSP engine interfaces, plugin system, type-safe port DSL
core/dsp-engine/     Shared DSP graph: voice manager, wiring, automation
core/foundation/     MIDI, presets, SynthController event bus, speech
core/gestures/       ASL sign classifier, gesture interpretation engines
core/mediapipe/      MediaPipe hand tracking abstraction (Android + Desktop)
core/plugin-api/     Shared symbol definitions across all plugins
core/plugins/        14 self-contained DSP plugin modules
features/            20+ UI feature modules (Compose + ViewModel, MVI)
ui/theme, ui/widgets Dark synth theme, knobs, sliders, collapsible panels
apps/composeApp/     App wiring: signal routing, voice management, DI
liborpheus_dsp/      C++ DSP engine (Plaits, effects, graph routing)
build-logic/         Convention plugins for consistent KMP module config

Signal Path

8 Voices -> Per-Voice Stereo Pan -> Dry Bus
  -> Parallel Clean / Distortion
    -> Dual Modulating Delays (LFO + Feedback)  \
    -> Dattorro Plate Reverb (parallel send)      |-> Stereo Sum -> Master Out
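
A per-sample sketch of this routing in C++: `softClip`, `delayTick`, and `reverbTick` are trivial placeholders for the real distortion, delay, and reverb blocks, and `processSample` is an invented name.

```cpp
#include <cmath>
#include <utility>

// Placeholder effect stages; the real blocks are far more elaborate.
inline double softClip(double x)   { return std::tanh(x); }  // drive path
inline double delayTick(double x)  { return 0.5 * x; }       // delay stand-in
inline double reverbTick(double x) { return 0.25 * x; }      // reverb stand-in

// One sample through the chain: pan -> dry bus -> parallel clean/drive ->
// delays plus a parallel reverb send -> stereo sum.
// voiceSum is the summed voice output; pan is in [0, 1] (0 = hard left).
std::pair<double, double> processSample(double voiceSum, double pan) {
    constexpr double kHalfPi = 1.57079632679489661923;
    double left  = voiceSum * std::cos(pan * kHalfPi);  // equal-power pan
    double right = voiceSum * std::sin(pan * kHalfPi);

    auto channel = [](double dry) {
        double clean  = dry;                  // parallel clean path
        double driven = softClip(2.0 * dry);  // parallel distortion path
        double bus    = 0.5 * (clean + driven);
        double wet    = delayTick(bus);       // dual modulating delays
        double verb   = reverbTick(bus);      // plate reverb, parallel send
        return bus + wet + verb;              // stereo sum
    };
    return {channel(left), channel(right)};
}
```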

Plugin Architecture

Every DSP module implements DspPlugin and declares its ports through a type-safe Kotlin DSL. Plugins register via Metro DI with @ContributesIntoSet and are discovered at compile time -- no runtime reflection, no service loaders.

Event Routing

SynthController is the central bus. Every control event carries an origin (MIDI, UI, SEQUENCER, TIDAL, AI, EVO) so the system knows who is driving a parameter and can avoid conflicts. ViewModels observe StateFlow and update UI state in response to events from anywhere in the system.
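
The origin-tagging idea, rendered here as a C++ sketch for brevity (the real SynthController is Kotlin, and the specific conflict rule below is invented for illustration):

```cpp
#include <map>
#include <string>

// Illustration of origin-tagged control events. Names and the conflict
// rule are hypothetical, not the real SynthController behavior.
enum class Origin { MIDI, UI, SEQUENCER, TIDAL, AI, EVO };

struct ParamState {
    double value = 0.0;
    Origin lastWriter = Origin::SEQUENCER;
};

class ControlBus {
    std::map<std::string, ParamState> params_;
public:
    // Apply an event and remember who drove the parameter. Example rule:
    // a sequencer sweep yields to a parameter the user just touched.
    bool set(const std::string& id, double value, Origin who) {
        ParamState& p = params_[id];
        if (who == Origin::SEQUENCER && p.lastWriter == Origin::UI)
            return false;  // UI keeps ownership; sequencer write dropped
        p.value = value;
        p.lastWriter = who;
        return true;
    }
    const ParamState& get(const std::string& id) { return params_[id]; }
};
```

Tagging every event at the source means conflict policy lives in one place instead of being scattered across producers.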

Hand Tracking & Gesture Control

Camera-based gesture control lets you play the synth with ASL (American Sign Language) hand signs. The system uses MediaPipe Hand Landmarker to detect 21 hand landmarks per hand at 30fps. A hybrid classifier fuses a custom ML model (trained via MediaPipe Model Maker on all 26 ASL letters + numbers 1-8) with a pure-Kotlin rule-based geometric classifier. The ML model runs on both desktop (JNI, VIDEO mode) and Android (Java API, LIVE_STREAM mode).

Multiple interaction modes are available:

  • ASL Mode -- Sign a number to select a voice, sign a letter to select a parameter, then pinch to gate the voice and drag to adjust the value. A breadcrumb bar shows selection progress.
  • Maestro Mode -- Sign ILY to enter. Each finger has a role: index/middle touch thumb to gate strings, ring modifier controls mod source level via hand roll, pinky modifier steps through hold detents. Hand height drives dynamics, hand openness drives timbre.
  • Keyboard Mode -- Sign E to enter. AR Keyboard Mode projects a chromatic 12-key piano keyboard onto the live camera view, turning any flat surface into a playable instrument.

The fusion algorithm boosts confidence when both classifiers agree and penalizes it when they disagree; for geometric signs (R, H, D, Q, K, F), the rule-based classifier is trusted. The ML model was trained using MediaPipe Model Maker on the Synthetic ASL Alphabet and Synthetic ASL Numbers datasets. See GESTURES.md for the full gesture reference.
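
The agree/boost, disagree/penalize logic might look roughly like this sketch; the 0.2 boost, 0.7 penalty, and function names are invented placeholders, not the project's actual values.

```cpp
#include <algorithm>
#include <set>
#include <string>

struct Fused {
    std::string sign;
    double confidence;
};

// Fuse ML and rule-based classifier outputs (illustrative weights).
Fused fuse(const std::string& mlSign, double mlConf,
           const std::string& ruleSign, double ruleConf) {
    // Signs whose geometry makes the rule-based classifier more reliable.
    static const std::set<std::string> geometric = {"R", "H", "D", "Q", "K", "F"};

    if (mlSign == ruleSign) {
        // Agreement: boost confidence, capped at 1.0.
        return {mlSign, std::min(1.0, std::max(mlConf, ruleConf) + 0.2)};
    }
    // Disagreement: penalize, trusting the rule-based result for geometric
    // signs and the ML result otherwise.
    if (geometric.count(ruleSign) != 0) return {ruleSign, ruleConf * 0.7};
    return {mlSign, mlConf * 0.7};
}
```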

Platforms

| Platform | Audio | Status |
| --- | --- | --- |
| Desktop (JVM) | C++ via JNI + miniaudio | Primary target |
| Android | Oboe (C++ / JNI) at 48kHz | Full support |
| wasmJs | C++ DSP → Emscripten WASM → AudioWorklet | Functional (orphic.fm) |
| iOS | C++ static lib via cinterop → AVAudioEngine | In development |

The WASM target compiles the C++ DSP engine to WebAssembly via Emscripten. Audio runs in a Web Worker that renders 128-frame buffers and posts them to an AudioWorkletNode for gapless playback. The main thread keeps a local shadow of engine state for UI reads while forwarding parameter changes to the Worker via postMessage.

The iOS target links the C++ DSP engine as a static library via Kotlin/Native cinterop. Audio renders through AVAudioEngine with an AVAudioSourceNode callback. The C++ engine outputs interleaved stereo, which is deinterleaved in C++ (orpheus_engine_process_deinterleaved) to match CoreAudio's non-interleaved buffer format, keeping the entire audio path in native code with zero GC pressure.
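
The deinterleave step itself is conceptually simple; a simplified stand-in for `orpheus_engine_process_deinterleaved` (not the actual implementation) looks like:

```cpp
#include <cstddef>

// Split interleaved stereo (LRLRLR...) into CoreAudio-style separate
// channel buffers. Simplified stand-in, not the engine's real function.
void deinterleave(const float* interleaved, std::size_t frames,
                  float* left, float* right) {
    for (std::size_t i = 0; i < frames; ++i) {
        left[i]  = interleaved[2 * i];      // even samples -> left
        right[i] = interleaved[2 * i + 1];  // odd samples  -> right
    }
}
```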

Build & Run

# Desktop (C++ DSP engine via JNI + miniaudio)
./gradlew buildDesktopNative && ./gradlew :apps:composeApp:run

# Android
./gradlew :apps:androidApp:installDebugRelease

# iOS (build framework, then open Xcode project)
./gradlew :apps:composeApp:linkDebugFrameworkIosSimulatorArm64
cd apps/iosApp && xcodegen generate
open OrpheusApp.xcodeproj

# WASM dev server (opens browser at localhost:8080)
./gradlew :apps:composeApp:wasmJsBrowserDevelopmentRun

# WASM in orphic.fm site (serves at localhost:4001/synth/)
./scripts/dev-site.sh

# Deploy WASM to GitHub Pages
./scripts/deploy-gh-pages.sh

# Desktop release (dmg/msi/deb depending on OS)
./gradlew :apps:composeApp:packageReleaseDistributionForCurrentOS

See BUILD.md for prerequisites, platform details, C++ DSP builds, Emscripten setup, and configuration. See TESTS.md for testing strategies, C++ test suites, and cross-platform verification.

Hand Tracking (Desktop)

Hand tracking on Desktop uses a pre-built native library (libmediapipe_hand_jni.dylib) and model file that are checked into the repository. No additional build steps are required -- hand tracking works out of the box on macOS arm64.

On Android, hand tracking uses the MediaPipe Tasks SDK (pulled via Gradle). No native setup needed.

Rebuilding the native library from source (optional)

Only needed if updating the MediaPipe version or modifying the JNI layer.

Prerequisites: Bazelisk (brew install bazelisk), MediaPipe source, Homebrew OpenCV and TBB (brew install opencv tbb).

The project includes custom build files in the MediaPipe tree that produce a single self-contained dylib with MediaPipe, OpenCV, protobuf, and TBB all statically linked (no Homebrew runtime dependencies):

  • mediapipe/tasks/c/vision/hand_landmarker/mediapipe_jni.cc -- JNI shim
  • mediapipe/tasks/c/vision/hand_landmarker/BUILD -- combined dylib target
  • third_party/opencv_macos.BUILD -- static OpenCV linkage

cd ~/Source/mediapipe
bazelisk build --config darwin_arm64 -c opt --strip always \
  --define MEDIAPIPE_DISABLE_GPU=1 \
  --repo_env=HERMETIC_PYTHON_VERSION=3.12 \
  "--per_file_copt=external/zlib/.*@-UTARGET_OS_MAC" \
  "--host_per_file_copt=external/zlib/.*@-UTARGET_OS_MAC" \
  //mediapipe/tasks/c/vision/hand_landmarker:libmediapipe_hand_jni.dylib

cp bazel-bin/mediapipe/tasks/c/vision/hand_landmarker/libmediapipe_hand_jni.dylib \
   ~/Source/orphic-fm-app/core/mediapipe/src/jvmMain/resources/native/darwin-aarch64/

To update the model file:

curl -L -o core/mediapipe/src/jvmMain/resources/models/hand_landmarker.task \
  https://storage.googleapis.com/mediapipe-models/hand_landmarker/hand_landmarker/float16/latest/hand_landmarker.task

Dependencies

| Dependency | Description |
| --- | --- |
| Kotlin | Language and multiplatform framework (2.3.0) |
| Compose Multiplatform | Cross-platform UI for Desktop, Android, and Web |
| Material3 | Material Design 3 components and adaptive layouts |
| Liquid | Glassmorphism blur effects for Compose |
| Metro | Compile-time dependency injection for Kotlin by Zac Sweers |
| miniaudio | C audio I/O library for the Desktop JNI engine |
| Oboe | Google's C++ low-latency audio library for Android |
| Tidal Cycles | Live-coding REPL language for musical patterns |
| Strudel | JavaScript live-coding music editor, used for inspiration |
| ktmidi | Kotlin Multiplatform MIDI I/O |
| CoreMIDI4J | macOS CoreMIDI access for the JVM |
| Mutable Instruments Eurorack | Emilie Gillet's open-source module firmware -- Plaits engines, Rings reverb, and drum synthesis ported to C++ |
| Koog | AI agent framework with Gemini integration |
| Ktor | Kotlin async HTTP client |
| KmLogging | Kotlin Multiplatform structured logging |
| Markdown Renderer | Multiplatform Markdown rendering for Compose by Mike Penz |
| BuildKonfig | Cross-platform BuildConfig for KMP |
| Logback | JVM logging framework |
| MediaPipe | Hand landmark detection and gesture recognition (Tasks SDK on Android, C API via JNI on Desktop) |
| AndroidX CameraX | Camera capture and lifecycle management on Android |
| JavaCV | Camera capture on Desktop (FFmpeg/avfoundation) |
| Emscripten | C++ to WebAssembly compiler for the WASM DSP engine |

License: GNU GPLv3
