feat:Add language_detection_options to TranscriptOptionalParams schema #122
Walkthrough
Adds a new language_detection_options object to TranscriptOptionalParams in src/libs/AssemblyAI/openapi.yaml, introducing expected_languages (string array) and fallback_language (string with default "auto") to configure Automatic Language Detection behavior. No other schema or behavioral changes are introduced.
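As a rough illustration of the object the walkthrough describes, the sketch below builds a `language_detection_options` payload in Python. Only the object name and its two fields come from this PR; the helper name `build_language_detection_options` and the language codes `en` and `es` are assumptions made for the example.

```python
import json


def build_language_detection_options(expected_languages, fallback_language="auto"):
    """Return a dict shaped like the options object added in this PR.

    The helper itself is hypothetical; only the field names and the
    "auto" default are taken from the schema under review.
    """
    return {
        "language_detection_options": {
            "expected_languages": list(expected_languages),
            "fallback_language": fallback_language,
        }
    }


# Placeholder language codes, chosen only for illustration.
payload = build_language_detection_options(["en", "es"])
print(json.dumps(payload, indent=2))
```

Note that `fallback_language` is passed as a plain string, which matches the `type: string` declaration in the schema.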
Sequence Diagram(s)
sequenceDiagram
autonumber
participant C as Client
participant API as Transcript API
participant LD as Language Detection
participant ASR as Transcription
C->>API: Create Transcript { language_detection_options { expected_languages, fallback_language } }
API->>LD: Detect language (with expected_languages)
alt Detected ∈ expected_languages
LD-->>API: detected_language
else Not in expected_languages
LD-->>API: use fallback_language (default "auto")
end
API->>ASR: Transcribe with selected language
ASR-->>API: Transcript result
API-->>C: Response
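The branching in the diagram above can be sketched in Python as follows. This is a minimal model under stated assumptions, not the service's implementation: the function name `select_language`, the `confidences` mapping, and the behavior for an empty `expected_languages` list are all hypothetical.

```python
def select_language(detected, expected_languages, fallback_language="auto", confidences=None):
    """Pick the transcription language following the diagram's two branches.

    `confidences` is a hypothetical mapping of language code to confidence
    score; the real service computes these internally.
    """
    # Assumption: with no expected languages there is nothing to restrict,
    # so the detected language is used as-is.
    if not expected_languages or detected in expected_languages:
        return detected
    # An explicit fallback language wins over any confidence-based choice.
    if fallback_language != "auto":
        return fallback_language
    # "auto": choose the expected language with the highest confidence score.
    scores = confidences or {}
    return max(expected_languages, key=lambda lang: scores.get(lang, 0.0))


print(select_language("fr", ["en", "es"], "auto", {"en": 0.2, "es": 0.7}))
```

With `fallback_language` left at `"auto"`, the choice is restricted to `expected_languages`, which is how the schema description words the behavior.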
Estimated code review effort
🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
- src/libs/AssemblyAI/Generated/AssemblyAI.JsonSerializerContextTypes.g.cs is excluded by !**/generated/**
- src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParams.g.cs is excluded by !**/generated/**
- src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParamsLanguageDetectionOptions.Json.g.cs is excluded by !**/generated/**
- src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParamsLanguageDetectionOptions.g.cs is excluded by !**/generated/**
📒 Files selected for processing (1)
- src/libs/AssemblyAI/openapi.yaml (1 hunks)
language_detection_options:
  x-label: Specify options for Automatic Language Detection.
  description: Specify options for Automatic Language Detection.
  type: object
  additionalProperties: false
  properties:
    expected_languages:
      x-label: Minimum speakers expected
      description: List of languages expected in the audio file.
      type: array
      objects:
        x-label: language
        type: string
    fallback_language:
      x-label: Fallback language
      description: |
        If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
      type: string
      default: "auto"
💡 Verification agent
🧩 Analysis chain
Fix invalid OpenAPI array schema and misleading docs ("objects" → "items"; '["auto"]' → 'auto').
- Array items must be under "items", not "objects" (this breaks schema validation).
- The fallback_language description suggests an array value '["auto"]' but the schema is string; use "auto".
- Minor copy/paste: x-label for expected_languages says "Minimum speakers expected".
Apply this diff:
 language_detection_options:
   x-label: Specify options for Automatic Language Detection.
   description: Specify options for Automatic Language Detection.
   type: object
   additionalProperties: false
   properties:
     expected_languages:
-      x-label: Minimum speakers expected
+      x-label: Expected languages
       description: List of languages expected in the audio file.
       type: array
-      objects:
-        x-label: language
-        type: string
+      items:
+        x-label: Language code
+        anyOf:
+          - $ref: "#/components/schemas/TranscriptLanguageCode"
+          - type: string
+        x-ts-type: LiteralUnion<TranscriptLanguageCode, string>
     fallback_language:
       x-label: Fallback language
       description: |
-        If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
+        If the detected language of the audio file is not in `expected_languages`, the `fallback_language` is used. Specify `auto` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
       type: string
       default: "auto"

Run to surface any other misuses of "objects" under arrays and preview the changed block:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="src/libs/AssemblyAI/openapi.yaml"
echo "Searching for invalid 'objects:' under array types..."
rg -n -C2 -e 'type:\s*array' -e '^\s*objects:' "$FILE"
echo -e "\nShow the language_detection_options snippet for review:"
rg -n -C5 'language_detection_options:' "$FILE"
Length of output: 7921
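The search above is purely textual. As a possible complement, the sketch below walks an already-parsed schema (represented here as plain Python dicts, so no YAML library is needed) and reports every `type: array` node that has no `items` key. The function name `find_arrays_without_items` and the inline `broken` and `fixed` dicts are illustrative assumptions, and the check assumes OpenAPI 3.0 rules, where `items` is required for arrays.

```python
def find_arrays_without_items(node, path="#"):
    """Yield pointer-like paths of array schemas that lack an `items` key."""
    if isinstance(node, dict):
        if node.get("type") == "array" and "items" not in node:
            yield path
        for key, child in node.items():
            yield from find_arrays_without_items(child, f"{path}/{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_arrays_without_items(child, f"{path}/{index}")


# Hand-written stand-ins for the schema before and after the suggested fix.
broken = {
    "language_detection_options": {
        "type": "object",
        "properties": {
            "expected_languages": {"type": "array", "objects": {"type": "string"}},
            "fallback_language": {"type": "string", "default": "auto"},
        },
    }
}
fixed = {
    "language_detection_options": {
        "type": "object",
        "properties": {
            "expected_languages": {"type": "array", "items": {"type": "string"}},
            "fallback_language": {"type": "string", "default": "auto"},
        },
    }
}

print(list(find_arrays_without_items(broken)))
# prints ['#/language_detection_options/properties/expected_languages']
print(list(find_arrays_without_items(fixed)))
# prints []
```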
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="src/libs/AssemblyAI/openapi.yaml"
echo "Showing language_detection_options block (lines 1256-1275):"
sed -n '1256,1275p' "$FILE"
echo
echo "Searching for TranscriptLanguageCode schema definition:"
rg -n '#/components/schemas/TranscriptLanguageCode' "$FILE" || echo "Not found"
Length of output: 1549
Fix OpenAPI schema for language_detection_options expected_languages and fallback_language
- File: `src/libs/AssemblyAI/openapi.yaml` (around lines 1261–1270)
- Update the `x-label` for `expected_languages` from “Minimum speakers expected” to “Expected languages”
- Replace the invalid `objects:` key under the `expected_languages` array with `items:`
- Define each item using `anyOf` to reference the `TranscriptLanguageCode` enum or a raw string, and add the corresponding `x-ts-type`
- Clarify the `fallback_language` description to instruct passing `auto` (string) instead of `["auto"]`
Apply this diff:
--- a/src/libs/AssemblyAI/openapi.yaml
+++ b/src/libs/AssemblyAI/openapi.yaml
@@ -1261,11 +1261,18 @@
     expected_languages:
-      x-label: Minimum speakers expected
+      x-label: Expected languages
       description: List of languages expected in the audio file.
       type: array
-      objects:
-        x-label: language
-        type: string
+      items:
+        x-label: Language code
+        anyOf:
+          - $ref: "#/components/schemas/TranscriptLanguageCode"
+          - type: string
+        x-ts-type: LiteralUnion<TranscriptLanguageCode, string>
     fallback_language:
       x-label: Fallback language
       description: |
-        If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
+        If the detected language of the audio file is not in `expected_languages`, the `fallback_language` is used. Specify `auto` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
       type: string
       default: "auto"

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
language_detection_options:
  x-label: Specify options for Automatic Language Detection.
  description: Specify options for Automatic Language Detection.
  type: object
  additionalProperties: false
  properties:
    expected_languages:
      x-label: Expected languages
      description: List of languages expected in the audio file.
      type: array
      items:
        x-label: Language code
        anyOf:
          - $ref: "#/components/schemas/TranscriptLanguageCode"
          - type: string
        x-ts-type: LiteralUnion<TranscriptLanguageCode, string>
    fallback_language:
      x-label: Fallback language
      description: |
        If the detected language of the audio file is not in `expected_languages`, the `fallback_language` is used. Specify `auto` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
      type: string
      default: "auto"
🤖 Prompt for AI Agents
In src/libs/AssemblyAI/openapi.yaml around lines 1256-1275, fix the
language_detection_options schema: change the x-label for expected_languages
from "Minimum speakers expected" to "Expected languages"; replace the invalid
objects: key under expected_languages with items: and define each item as anyOf
referencing the TranscriptLanguageCode enum or a plain string, adding the
appropriate x-ts-type to reflect the union; and update the fallback_language
description to instruct passing the string "auto" (not ["auto"]) as the fallback
indicator.
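To make the `auto` versus `["auto"]` point concrete, here is a small hand-rolled check of a request value against the corrected schema. It is a sketch under stated assumptions: the function name `validate_language_detection_options` and the error strings are hypothetical, and it mirrors only the constraints visible in the schema above (`additionalProperties: false`, an array of strings, and a string fallback).

```python
def validate_language_detection_options(options):
    """Return a list of error messages; an empty list means the value is valid."""
    if not isinstance(options, dict):
        return ["language_detection_options must be an object"]

    errors = []
    allowed = {"expected_languages", "fallback_language"}

    # additionalProperties: false, so unknown keys are rejected.
    for key in sorted(set(options) - allowed):
        errors.append(f"unexpected property: {key}")

    # expected_languages: array whose items are strings.
    expected = options.get("expected_languages", [])
    if not isinstance(expected, list) or not all(isinstance(x, str) for x in expected):
        errors.append("expected_languages must be an array of strings")

    # fallback_language: a single string, defaulting to "auto".
    fallback = options.get("fallback_language", "auto")
    if not isinstance(fallback, str):
        errors.append("fallback_language must be a string")

    return errors


print(validate_language_detection_options({"fallback_language": ["auto"]}))
# prints ['fallback_language must be a string']
```

The array form `["auto"]`, which the original description suggested, fails this check, while the plain string `"auto"` passes.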
Pull request was closed