@aguvener aguvener commented Oct 8, 2025

Add macOS Support (Apple Silicon and Intel)

Draft status: NEON (ARM64) path in progress. ARM64 currently ships a scalar (optionally vDSP-accelerated) fallback, so performance trails x86_64 AVX2 for now.

Summary

Adds macOS build support on Apple Silicon (arm64) and Intel (x86_64) with automatic Homebrew dependency detection and architecture-aware flags.

Highlights

  • Detects host architecture and applies AVX2/FMA only on x86_64, keeping Docker builds unchanged.
  • Introduces phase/src/models/simd_compat.h to provide AVX2 vectors with a scalar fallback that preserves the existing HMM API.
  • Fixes macOS linker order by placing -L search paths before -l libraries.

Testing

  • Done: macOS 14 (ARM64)
  • Pending: macOS Intel (x86_64)
  • Pending: Linux x86_64 regression check

Technical notes

  • make now gates AVX2/FMA flags behind an x86_64 host check; other architectures emit a skip message instead of failing.
  • Linker flags are regrouped so search paths precede dependent libraries, resolving ld failures on macOS.
  • The SIMD layer falls back to scalar math on arm64 while keeping the vector API identical to the AVX2 implementation.
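The idea of one vector API with two backends can be sketched as follows. This is a minimal illustration only, not the contents of phase/src/models/simd_compat.h: the namespace `simd_sketch`, the type `Vec8f`, and the functions `load`, `store`, `fmadd`, and `fma_arrays` are hypothetical names chosen for this example. It assumes the compiler defines `__AVX2__` and `__FMA__` when the AVX2/FMA flags are passed, as GCC and Clang do.

```cpp
// Hypothetical sketch: names are illustrative and not taken from simd_compat.h.
#include <cstddef>

#if defined(__AVX2__) && defined(__FMA__)
  #include <immintrin.h>
  #define SIMD_SKETCH_HAVE_AVX2 1
#else
  #define SIMD_SKETCH_HAVE_AVX2 0
#endif

namespace simd_sketch {

// Eight floats per vector, matching one 256-bit AVX2 register.
constexpr std::size_t kLanes = 8;

#if SIMD_SKETCH_HAVE_AVX2
// AVX2 backend: thin wrappers over the intrinsics.
struct Vec8f { __m256 v; };
inline Vec8f load(const float* p) { return { _mm256_loadu_ps(p) }; }
inline void store(float* p, Vec8f a) { _mm256_storeu_ps(p, a.v); }
inline Vec8f fmadd(Vec8f a, Vec8f b, Vec8f c) {
    return { _mm256_fmadd_ps(a.v, b.v, c.v) };  // a * b + c per lane
}
#else
// Scalar backend: same type and function names, plain loops underneath.
struct Vec8f { float v[kLanes]; };
inline Vec8f load(const float* p) {
    Vec8f r;
    for (std::size_t i = 0; i < kLanes; ++i) r.v[i] = p[i];
    return r;
}
inline void store(float* p, Vec8f a) {
    for (std::size_t i = 0; i < kLanes; ++i) p[i] = a.v[i];
}
inline Vec8f fmadd(Vec8f a, Vec8f b, Vec8f c) {
    Vec8f r;
    for (std::size_t i = 0; i < kLanes; ++i) r.v[i] = a.v[i] * b.v[i] + c.v[i];
    return r;
}
#endif

// Caller code is written once against the shared API:
// out[i] = a[i] * b[i] + c[i] for i in [0, n).
inline void fma_arrays(const float* a, const float* b, const float* c,
                       float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + kLanes <= n; i += kLanes)
        store(out + i, fmadd(load(a + i), load(b + i), load(c + i)));
    for (; i < n; ++i)  // leftover elements that do not fill a vector
        out[i] = a[i] * b[i] + c[i];
}

}  // namespace simd_sketch
```

Because callers such as `fma_arrays` touch only the wrapper names, a build without AVX2/FMA flags compiles the same call sites against the scalar backend.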

Compatibility

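| Platform | Architecture | SIMD path       | Status                      |
|----------|--------------|-----------------|-----------------------------|
| Linux    | x86_64       | AVX2            | Pending verification        |
| macOS    | x86_64       | AVX2            | Pending verification        |
| macOS    | ARM64        | Scalar fallback | Verified (slower than AVX2) |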

Next steps

  • Port the SIMD layer to ARM NEON before leaving draft.
  • Confirm builds on macOS Intel and Linux x86_64.
  • Continue enabling Linux ARM64 builds with matching SIMD support.
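As a rough idea of what a NEON backend might look like, the sketch below models an 8-lane vector as two 128-bit NEON registers and uses it for a dot product. None of this is code from the PR: the namespace `neon_sketch` and the names `Vec8f`, `zero`, `load`, `fmadd`, `hsum`, and `dot` are hypothetical, and pairing two `float32x4_t` values is only one possible design. The intrinsics used (`vld1q_f32`, `vdupq_n_f32`, `vfmaq_f32`, `vaddq_f32`, `vaddvq_f32`) are standard AArch64 NEON intrinsics; on other targets the block compiles a scalar fallback.

```cpp
// Hypothetical sketch of a NEON backend: not the implementation planned in this PR.
#include <cstddef>

#if defined(__aarch64__) && defined(__ARM_NEON)
  #include <arm_neon.h>
  #define NEON_SKETCH_HAVE_NEON 1
#else
  #define NEON_SKETCH_HAVE_NEON 0
#endif

namespace neon_sketch {

constexpr std::size_t kLanes = 8;

#if NEON_SKETCH_HAVE_NEON
// Two 4-lane NEON registers stand in for one 8-lane AVX2 register.
struct Vec8f { float32x4_t lo, hi; };
inline Vec8f zero() { return { vdupq_n_f32(0.0f), vdupq_n_f32(0.0f) }; }
inline Vec8f load(const float* p) { return { vld1q_f32(p), vld1q_f32(p + 4) }; }
inline Vec8f fmadd(Vec8f a, Vec8f b, Vec8f c) {
    // vfmaq_f32(acc, x, y) computes acc + x * y per lane.
    return { vfmaq_f32(c.lo, a.lo, b.lo), vfmaq_f32(c.hi, a.hi, b.hi) };
}
inline float hsum(Vec8f a) {
    // Add the two halves, then reduce the four remaining lanes.
    return vaddvq_f32(vaddq_f32(a.lo, a.hi));
}
#else
// Scalar fallback with the same names, used on non-ARM64 targets.
struct Vec8f { float v[kLanes]; };
inline Vec8f zero() { return Vec8f{}; }  // value-initialised, all lanes 0
inline Vec8f load(const float* p) {
    Vec8f r;
    for (std::size_t i = 0; i < kLanes; ++i) r.v[i] = p[i];
    return r;
}
inline Vec8f fmadd(Vec8f a, Vec8f b, Vec8f c) {
    Vec8f r;
    for (std::size_t i = 0; i < kLanes; ++i) r.v[i] = a.v[i] * b.v[i] + c.v[i];
    return r;
}
inline float hsum(Vec8f a) {
    float s = 0.0f;
    for (std::size_t i = 0; i < kLanes; ++i) s += a.v[i];
    return s;
}
#endif

// Dot product written once against the shared API.
inline float dot(const float* a, const float* b, std::size_t n) {
    Vec8f acc = zero();
    std::size_t i = 0;
    for (; i + kLanes <= n; i += kLanes)
        acc = fmadd(load(a + i), load(b + i), acc);
    float total = hsum(acc);
    for (; i < n; ++i) total += a[i] * b[i];  // leftover elements
    return total;
}

}  // namespace neon_sketch
```

Note that vectorised accumulation sums in a different order than a plain loop, so results on arbitrary floating-point data may differ from the scalar path in the last bits; a port would likely need a tolerance-based comparison rather than exact equality.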

Reviewer checklist

  • macOS Intel: build and exercise the AVX2 path.
  • Linux: ensure legacy builds still succeed.

macOS setup

brew install boost htslib openssl@3 libdeflate
make

The scalar fallback preserves correctness while the NEON optimisations are being prepared.
