Skip to content

Releases: aritropaul/voom

v3.1.0 — Bring Your Own AI

06 Mar 17:44

Choose a tag to compare

Bring Your Own AI

Connect your own API key from OpenAI, Anthropic, Google, or xAI to power AI-generated titles, summaries, and chapters. No middleman — Voom calls provider APIs directly.

What's new

  • BYOK AI integration — paste your API key in Settings and Voom auto-detects the provider (OpenAI, Anthropic, Google, xAI) from the key prefix
  • Direct provider APIs — no OpenRouter or proxy services; Voom calls each provider's native API (OpenAI chat completions, Anthropic Messages, Google Gemini, xAI)
  • Current models — GPT-5.4, Claude Opus/Sonnet 4.6, Gemini 3.1 Pro, Grok 4.1 Fast, and more
  • Seamless fallback — without an API key, everything works exactly as before using Apple's on-device Foundation Models
  • New VoomAI package — clean separation of AI provider logic into its own Swift package

Providers & models

Provider Models
OpenAI GPT-5.4, GPT-5 Mini, GPT-5 Nano
Anthropic Claude Opus 4.6, Claude Sonnet 4.6, Claude Haiku 4.5
Google Gemini 3.1 Pro, Gemini 3 Flash, Gemini 3.1 Flash Lite
xAI Grok 4.1 Fast, Grok 4 Fast

Zero regression

If you don't configure an API key, Voom behaves identically to v3.0.2 — on-device Apple Foundation Models handle all AI features automatically.

Full Changelog: v3.0.2...v3.1.0

v3.0.2 — Meeting Recorder Pipeline & AI Fixes

06 Mar 16:41

Choose a tag to compare

What's New

Meeting Recording Pipeline

  • Meeting recordings now use the dedicated MeetingRecorder — HD/2K resolution (capped at 2560), 30fps, with split-track audio for speaker diarization
  • Previously, meetings incorrectly used ScreenRecorder at native retina resolution with no diarization

AI Generation Fixes

  • Title and summary generation now subsamples transcripts (60-80 segments) to fit Apple Foundation Models' context window — fixes failures on long recordings
  • Speaker labels from diarization are included in AI prompts for meeting-aware titles and summaries
  • Lowered word count thresholds (title: 5→2, summary: 10→3) so short recordings still get AI-generated metadata

UI Improvements

  • Summary card in player view fits content height without internal scrolling
  • Edit summary via pencil icon in the section header; commits on focus loss

Full Changelog: v3.0.1...v3.0.2

v3.0.1 — Web-Optimized Sharing

06 Mar 07:52

Choose a tag to compare

What's New

  • Web-Optimized Sharing — Shared videos are re-encoded to H.264 for universal browser playback with smaller file sizes and faster loading.
  • Captions & Speaker Labels — Share pages now serve WebVTT captions with speaker names, visible in the player and in fullscreen.
  • Chapters on Share Pages — Auto-generated chapters appear on share pages with clickable timestamps and seekbar markers.
  • Share Pipeline Progress — Optimization progress is shown in the share sheet before upload begins.
  • What's New Sheet — First launch after update highlights v3.0.1 features via WhatsNewKit.

Worker Changes

  • New /vtt/:code route serves WebVTT captions with speaker labels.
  • Chapters and speaker metadata included in share uploads.
  • D1 migration 0003_chapters_speakers.sql adds chapters table, speaker column, and is_meeting flag.

Full Changelog

https://github.com/aritropaul/voom/blob/main/CHANGELOG.md

Full Changelog: v3.0.0...v3.0.1

v3.0.0

06 Mar 00:54

Choose a tag to compare

Voom 3.0.0

Major Changes

  • FluidAudio migration — replaced WhisperKit with FluidAudio for on-device ASR and speaker diarization
  • Architecture split — refactored into VoomCore, VoomApp, and VoomMeetings packages

Speaker Diarization

  • Split-track diarization: saves separate mic and system audio during meeting recordings for accurate speaker separation
  • "You" identification: mic audio diarized separately so the local user's segments are labeled "You"
  • Remote speaker separation: system audio diarized independently for cleaner multi-speaker identification

Auto Chapters

  • Chapters now auto-generate after transcription for both regular and meeting recordings
  • Fixed chapter generation only covering first ~2 minutes by subsampling transcript evenly across full duration

Bug Fixes

  • Fixed garbled mic audio caused by AVAudioEngine voice processing conflicting with ScreenCaptureKit audio capture
  • Speaker labels displayed in transcript view

Debug

  • Added debug mode in settings (debug builds only)

Full Changelog: v2.8.1...v3.0.0

v2.8.1

05 Mar 20:02

Choose a tag to compare

What's Changed

Performance

  • ~40% smaller video files — Optimized HEVC compression with 8 Mbps bitrate, B-frames enabled, and 4-second GOP intervals. Screen capture frame rate synced to 30fps to match encoding.

Audio

  • Hardware echo cancellation & noise suppression — Replaced raw AVCaptureSession mic input with AVAudioEngine voice processing unit for cleaner audio in all recordings.

AI Improvements

  • Smarter titles — AI title generation now uses the full transcript instead of only the first 2 minutes. Prompts tuned to focus on primary work topics, ignoring small talk and filler.
  • Better summaries — AI summaries now produce 4–8 sentences covering key decisions, outcomes, and action items.

Meeting Detection

  • Instant auto-stop — Meeting recordings now stop immediately when camera turns off, instead of waiting for 10 seconds of audio silence.

Full Changelog: v2.8.0...v2.8.1

v2.8.0 — Meeting Detection & Theme Polish

05 Mar 18:46

Choose a tag to compare

What's New

Meeting Detection

  • Voom can now detect when you're in a meeting by checking your calendar events and prompting you to record
  • Upcoming meetings appear in the menu bar tray with a quick-join link
  • Auto-start recording when a meeting begins (skips countdown) and auto-stop when it ends
  • System audio activity tracking ensures recordings don't stop while participants are still talking
  • New setting toggle in preferences with calendar permission request

Theme & UI

  • ShareSettingsSheet fully redesigned with VoomTheme — custom styled buttons, consistent typography, proper dividers
  • FillerWordSheet upgraded with themed progress indicator, ScrollView-based list, and improved layout
  • Removed camera/mic/system audio badges from player detail view for a cleaner look

AI Improvements

  • Title and summary generation now handles short or trivial transcripts gracefully (minimum word count guards)
  • Hardened prompts prevent the AI from misinterpreting raw transcript text as instructions
  • Recordings with very short transcripts get their filename restored instead of a confused AI title

Full Changelog: v2.7.3...v2.8.0

v2.7.3

04 Mar 22:55

Choose a tag to compare

Full Changelog: v2.7.2...v2.7.3

v2.7.2

04 Mar 06:13

Choose a tag to compare

Full Changelog: v2.7.1...v2.7.2

v2.7.1

04 Mar 06:02

Choose a tag to compare

Full Changelog: v2.7.0...v2.7.1

v2.7.0

04 Mar 05:39

Choose a tag to compare

Full Changelog: v2.6.2...v2.7.0