-
Notifications
You must be signed in to change notification settings - Fork 6
test: add Whisper transcription tests with real MP3 file #51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add test using real 1.7MB MP3 file for transcription validation - Add TTS → Whisper chain test to verify round-trip functionality - Skip large MP3 test by default (takes ~7 seconds) with manual run instructions - Include test-transcript.mp3 file for testing real-world audio - Add 20-second timeout for the large file test 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
WalkthroughAdds a new skipped integration test for Whisper transcription using a real MP3 file in src/lib/test/integration/ai.test.ts, including file loading via Bun.file, transcribeAudio invocation with model and language, assertions on output length, a 20-second timeout, and a duplicated insertion of the same test. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
Comment |
Deploying opensecret-sdk with
|
| Latest commit: |
970e640
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://0dd41ff8.opensecret-sdk.pages.dev |
| Branch Preview URL: | https://feat-whisper-api-final.opensecret-sdk.pages.dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Summary
This PR enhances the AI integration test suite by adding comprehensive Whisper transcription testing with a real-world MP3 file. The changes add a new skipped test case that validates the transcription functionality using a 1.7MB MP3 file (test-transcript.mp3) rather than relying solely on synthetic audio generated from text-to-speech operations.
The new test integrates with the existing authentication flow by calling setupTestUser() and uses Bun's file system APIs to read the MP3 file from the local filesystem. It creates a proper File object with the correct MIME type and passes it to the existing transcribeAudio function, which interfaces with the Whisper API. The test includes proper assertions to verify that meaningful transcribed text is returned and includes console logging for debugging purposes.
By default, the test is skipped using test.skip() to avoid impacting regular test execution performance, since it takes approximately 7 seconds to complete. The developer has included detailed comments with manual execution instructions using Bun's test name pattern matching, allowing developers to run this specific test when needed. The test also includes a 20-second timeout to accommodate the processing time for the large audio file.
This addition complements the existing TTS → Whisper chain test, providing more comprehensive coverage of the audio transcription pipeline by testing with real-world audio data rather than only round-trip synthetic audio.
Confidence score: 4/5
- This PR is safe to merge with minimal risk as it only adds test coverage without modifying production code
- Score reflects good testing practices with appropriate timeouts and skip logic, though the MP3 file dependency introduces a minor external dependency concern
- Pay close attention to ensure the
test-transcript.mp3file is properly included in the repository
Context used:
Context - Use assertions in tests instead of returning booleans and calling exit() to allow test frameworks to handle reporting. (link)
1 file reviewed, 1 comment
| await setupTestUser(); | ||
|
|
||
| // Read the test MP3 file | ||
| const mp3Path = new URL('./test-transcript.mp3', import.meta.url).pathname; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Using import.meta.url with .pathname may cause issues on Windows due to file URL format differences. Consider using fileURLToPath from Node.js url module for cross-platform compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (6)
src/lib/test/integration/ai.test.ts (6)
174-176: Make file path resolution cross‑platform (Windows-safe).Using
.pathnameon afile://URL can break on Windows. Convert the URL to a real path first.Apply:
- const mp3Path = new URL('./test-transcript.mp3', import.meta.url).pathname; - const mp3Buffer = await Bun.file(mp3Path).arrayBuffer(); + const mp3Url = new URL('./test-transcript.mp3', import.meta.url); + const mp3Path = fileURLToPath(mp3Url); + const mp3Buffer = await Bun.file(mp3Path).arrayBuffer();Add at the top of the file:
import { fileURLToPath } from "node:url";
181-185: Stabilize the test by forcing deterministic decoding.Set
temperatureto 0 for more repeatable assertions.const transcriptionResult = await transcribeAudio( audioFile, "whisper-large-v3", // model - "en" // language + "en", // language + undefined, // prompt + 0.0 // temperature );
187-191: Avoid logging the full transcript in test output.Limit logs to a short preview and make it opt‑in via env var.
- console.log("Transcribed text from test MP3:", transcriptionResult.text); + if (process.env.DEBUG_TRANSCRIPTS) { + const preview = transcriptionResult.text.slice(0, 120).replace(/\s+/g, " ").trim(); + console.log("Transcribed preview:", preview, `(${transcriptionResult.text.length} chars)`); + }
189-192: Strengthen assertions to catch regressions.If the fixture content is stable, assert on a known substring or compare to a normalized “golden” text file rather than only length.
I can add a sidecar
test-transcript.txtand wire the test to compare a normalized subset (case/whitespace-insensitive). Want me to push that?
168-170: Gate the heavy test with an env flag instead of hardskip.Keeps it opt‑in locally while allowing CI/nightly runs when desired.
-// bun test --test-name-pattern="Whisper transcription with real MP3 file" src/lib/test/integration/ai.test.ts --env-file .env.local - -test.skip("Whisper transcription with real MP3 file", async () => { +// bun test --test-name-pattern="Whisper transcription with real MP3 file" src/lib/test/integration/ai.test.ts --env-file .env.local +// or set RUN_LARGE_MP3=1 to include it in the run +const runLargeAudio = process.env.RUN_LARGE_MP3 === "1"; +(runLargeAudio ? test : test.skip)("Whisper transcription with real MP3 file", async () => {
174-176: Store large audio fixtures via Git LFS to avoid repo bloat.A 1.7 MB MP3 will clone to every user; LFS keeps the repo lean and still makes local runs easy.
Add to
.gitattributes:*.mp3 filter=lfs diff=lfs merge=lfs -textThen:
git lfs track "*.mp3"and commit the pointer.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
src/lib/test/integration/test-transcript.mp3is excluded by!**/*.mp3
📒 Files selected for processing (1)
src/lib/test/integration/ai.test.ts(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/lib/test/integration/ai.test.ts (1)
src/lib/api.ts (1)
transcribeAudio(1513-1548)
🤖 Generated with Claude Code
Summary by CodeRabbit