@HavenDV (Contributor) commented Aug 27, 2025

Summary by CodeRabbit

  • New Features

    • Added configurable Language Detection Options for transcription requests, including:
      • Expected languages: specify a list of languages anticipated in the audio.
      • Fallback language: define a fallback when detected language isn’t in the expected list (defaults to auto).
    • Provides finer control over automatic language detection for improved accuracy.
  • Documentation

    • Added descriptive labels and guidance for configuring expected and fallback languages.

coderabbitai bot commented Aug 27, 2025

Walkthrough

Adds a new language_detection_options object to TranscriptOptionalParams in src/libs/AssemblyAI/openapi.yaml, introducing expected_languages (string array) and fallback_language (string with default "auto") to configure Automatic Language Detection behavior. No other schema or behavioral changes are introduced.
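
As an illustration of the new object, here is a minimal sketch of a request body that carries language_detection_options. Only expected_languages, fallback_language, and the "auto" default come from this PR; the helper name build_transcript_request, the audio_url and language_detection fields, and the example URL are assumptions made for the example, and nothing is sent over the network.

```python
import json


def build_transcript_request(audio_url, expected_languages, fallback_language="auto"):
    """Build a JSON-serializable request body (hypothetical helper, not an SDK call)."""
    if not all(isinstance(code, str) for code in expected_languages):
        raise TypeError("expected_languages must be a list of strings")
    if not isinstance(fallback_language, str):
        raise TypeError("fallback_language must be a string, e.g. 'auto'")
    return {
        "audio_url": audio_url,      # assumed field name, not part of this PR
        "language_detection": True,  # assumed switch that enables detection
        "language_detection_options": {
            "expected_languages": list(expected_languages),
            "fallback_language": fallback_language,
        },
    }


body = build_transcript_request("https://example.com/audio.mp3", ["en", "es"])
print(json.dumps(body, indent=2))
```

Omitting fallback_language in the call leaves it at "auto", which mirrors the schema default.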

Changes

Cohort / File(s) | Summary of Changes
AssemblyAI OpenAPI schema (src/libs/AssemblyAI/openapi.yaml) | Added language_detection_options to TranscriptOptionalParams with fields: expected_languages (array of strings) and fallback_language (string, default "auto"), including x-labels and descriptions documenting Automatic Language Detection configuration.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant API as Transcript API
  participant LD as Language Detection
  participant ASR as Transcription

  C->>API: Create Transcript { language_detection_options { expected_languages, fallback_language } }
  API->>LD: Detect language (with expected_languages)
  alt Detected ∈ expected_languages
    LD-->>API: detected_language
  else Not in expected_languages
    LD-->>API: use fallback_language (default "auto")
  end
  API->>ASR: Transcribe with selected language
  ASR-->>API: Transcript result
  API-->>C: Response
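
The branch in the diagram can be sketched as a small selection function. This is only a reading of the diagram and of the fallback_language description, not the service's actual implementation: the function name select_language, the confidences mapping, and the handling of an empty expected_languages list are assumptions.

```python
def select_language(detected, confidences, expected_languages, fallback_language="auto"):
    """Pick the transcription language as the diagram describes (illustrative only).

    detected: language code reported by detection.
    confidences: mapping of language code to confidence score (assumed input).
    """
    # Assumption: with no expected languages there is nothing to restrict.
    if not expected_languages:
        return detected
    # Detected language is in the expected list: use it as-is.
    if detected in expected_languages:
        return detected
    # An explicit fallback wins over the automatic choice.
    if fallback_language != "auto":
        return fallback_language
    # "auto": choose the expected language with the highest confidence score.
    return max(expected_languages, key=lambda code: confidences.get(code, 0.0))


scores = {"fr": 0.6, "en": 0.3, "es": 0.1}
print(select_language("fr", scores, ["en", "es"]))
print(select_language("fr", scores, ["en", "es"], fallback_language="es"))
```

In the first call the detected language is outside the expected list and the fallback is "auto", so the choice falls to the best-scoring expected language; in the second the explicit fallback is returned.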

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I twitch my ears at YAML’s tune,
New options hop in, right on cue—
Expected tongues, a fallback too,
If words get lost, we’ll know what to do.
Carrots compiled, transcripts bloom—
Language detected, thump-thump boom! 🥕🐇

✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch bot/update-openapi_202508270627

@HavenDV HavenDV enabled auto-merge (squash) August 27, 2025 06:28
@coderabbitai coderabbitai bot changed the title from "feat:@coderabbitai" to "feat:Add language_detection_options to TranscriptOptionalParams schema" Aug 27, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c531b33 and 4affd9f.

⛔ Files ignored due to path filters (4)
  • src/libs/AssemblyAI/Generated/AssemblyAI.JsonSerializerContextTypes.g.cs is excluded by !**/generated/**
  • src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParams.g.cs is excluded by !**/generated/**
  • src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParamsLanguageDetectionOptions.Json.g.cs is excluded by !**/generated/**
  • src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParamsLanguageDetectionOptions.g.cs is excluded by !**/generated/**
📒 Files selected for processing (1)
  • src/libs/AssemblyAI/openapi.yaml (1 hunks)

Comment on lines +1256 to +1275
language_detection_options:
  x-label: Specify options for Automatic Language Detection.
  description: Specify options for Automatic Language Detection.
  type: object
  additionalProperties: false
  properties:
    expected_languages:
      x-label: Minimum speakers expected
      description: List of languages expected in the audio file.
      type: array
      objects:
        x-label: language
        type: string
    fallback_language:
      x-label: Fallback language
      description: |
        If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
      type: string
      default: "auto"

💡 Verification agent

🧩 Analysis chain

Fix invalid OpenAPI array schema and misleading docs ("objects" → "items"; '["auto"]' → 'auto').

  • Array items must be under "items", not "objects" (this breaks schema validation).
  • The fallback_language description suggests an array value '["auto"]' but the schema is string; use "auto".
  • Minor copy/paste: x-label for expected_languages says "Minimum speakers expected".

Apply this diff:

 language_detection_options:
   x-label: Specify options for Automatic Language Detection.
   description: Specify options for Automatic Language Detection.
   type: object
   additionalProperties: false
   properties:
     expected_languages:
-      x-label: Minimum speakers expected
+      x-label: Expected languages
       description: List of languages expected in the audio file.
       type: array
-      objects:
-        x-label: language
-        type: string
+      items:
+        x-label: Language code
+        anyOf:
+          - $ref: "#/components/schemas/TranscriptLanguageCode"
+          - type: string
+        x-ts-type: LiteralUnion<TranscriptLanguageCode, string>
     fallback_language:
       x-label: Fallback language
       description: |
-        If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
+        If the detected language of the audio file is not in `expected_languages`, the `fallback_language` is used. Specify `auto` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
       type: string
       default: "auto"

Run to surface any other misuses of "objects" under arrays and preview the changed block:


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="src/libs/AssemblyAI/openapi.yaml"

echo "Searching for invalid 'objects:' under array types..."
rg -n -C2 -e 'type:\s*array' -e '^\s*objects:' "$FILE"

echo -e "\nShow the language_detection_options snippet for review:"
rg -n -C5 'language_detection_options:' "$FILE"

Length of output: 7921
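
The search above matches text patterns. A structural check for the same problem could be sketched as follows: it walks an already-parsed schema (plain dicts and lists here, so no YAML library is required) and reports every node with type: array that has no items key. The function name find_arrays_without_items and the inline sample schema are hypothetical.

```python
def find_arrays_without_items(node, path="#"):
    """Return JSON-pointer-like paths of array schemas that lack an `items` key."""
    problems = []
    if isinstance(node, dict):
        if node.get("type") == "array" and "items" not in node:
            problems.append(path)
        # Recurse into every child so nested properties are covered too.
        for key, value in node.items():
            problems.extend(find_arrays_without_items(value, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            problems.extend(find_arrays_without_items(value, f"{path}/{index}"))
    return problems


# Sample mirroring the block under review, with `objects` in place of `items`.
schema = {
    "language_detection_options": {
        "type": "object",
        "properties": {
            "expected_languages": {"type": "array", "objects": {"type": "string"}},
            "fallback_language": {"type": "string", "default": "auto"},
        },
    }
}
print(find_arrays_without_items(schema))
```

Run against the sample, the check flags only the expected_languages node; renaming objects to items clears the report.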


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="src/libs/AssemblyAI/openapi.yaml"

echo "Showing language_detection_options block (lines 1256-1275):"
sed -n '1256,1275p' "$FILE"

echo
echo "Searching for TranscriptLanguageCode schema definition:"
rg -n '#/components/schemas/TranscriptLanguageCode' "$FILE" || echo "Not found"

Length of output: 1549


Fix OpenAPI schema for language_detection_options expected_languages and fallback_language

  • File: src/libs/AssemblyAI/openapi.yaml (around lines 1261–1270)
    • Update the x-label for expected_languages from “Minimum speakers expected” to “Expected languages”
    • Replace the invalid objects: key under the expected_languages array with items:
    • Define each item using anyOf to reference the TranscriptLanguageCode enum or a raw string, and add the corresponding x-ts-type
    • Clarify the fallback_language description to instruct passing auto (string) instead of ["auto"]

Apply this diff:

--- a/src/libs/AssemblyAI/openapi.yaml
+++ b/src/libs/AssemblyAI/openapi.yaml
@@ -1261,11 +1261,18 @@
             expected_languages:
-              x-label: Minimum speakers expected
+              x-label: Expected languages
               description: List of languages expected in the audio file.
               type: array
-              objects:
-                x-label: language
-                type: string
+              items:
+                x-label: Language code
+                anyOf:
+                  - $ref: "#/components/schemas/TranscriptLanguageCode"
+                  - type: string
+                x-ts-type: LiteralUnion<TranscriptLanguageCode, string>
             fallback_language:
               x-label: Fallback language
               description: |
-                If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
+                If the detected language of the audio file is not in `expected_languages`, the `fallback_language` is used. Specify `auto` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
               type: string
               default: "auto"
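
To make the effect of the corrected schema concrete, here is a hand-written sketch of the checks it implies for a language_detection_options value: no properties beyond the two defined (additionalProperties: false), expected_languages as an array of strings, and fallback_language as a string rather than a list. The function name and error messages are hypothetical, and a real project would more likely rely on a schema validator.

```python
def validate_language_detection_options(options):
    """Return a list of error strings; an empty list means the value looks valid."""
    if not isinstance(options, dict):
        return ["language_detection_options must be an object"]
    errors = []
    allowed = {"expected_languages", "fallback_language"}
    # additionalProperties: false, so unknown keys are rejected.
    for key in sorted(set(options) - allowed):
        errors.append(f"unexpected property: {key}")
    expected = options.get("expected_languages")
    if expected is not None and not (
        isinstance(expected, list) and all(isinstance(code, str) for code in expected)
    ):
        errors.append("expected_languages must be an array of strings")
    # A missing fallback_language takes the schema default "auto".
    fallback = options.get("fallback_language", "auto")
    if not isinstance(fallback, str):
        errors.append("fallback_language must be a string")
    return errors


print(validate_language_detection_options({"expected_languages": ["en"], "fallback_language": "auto"}))
print(validate_language_detection_options({"fallback_language": ["auto"]}))
```

The second call shows why the description change matters: passing ["auto"], as the original wording suggested, does not satisfy type: string.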
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-language_detection_options:
-  x-label: Specify options for Automatic Language Detection.
-  description: Specify options for Automatic Language Detection.
-  type: object
-  additionalProperties: false
-  properties:
-    expected_languages:
-      x-label: Minimum speakers expected
-      description: List of languages expected in the audio file.
-      type: array
-      objects:
-        x-label: language
-        type: string
-    fallback_language:
-      x-label: Fallback language
-      description: |
-        If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
-      type: string
-      default: "auto"
+language_detection_options:
+  x-label: Specify options for Automatic Language Detection.
+  description: Specify options for Automatic Language Detection.
+  type: object
+  additionalProperties: false
+  properties:
+    expected_languages:
+      x-label: Expected languages
+      description: List of languages expected in the audio file.
+      type: array
+      items:
+        x-label: Language code
+        anyOf:
+          - $ref: "#/components/schemas/TranscriptLanguageCode"
+          - type: string
+        x-ts-type: LiteralUnion<TranscriptLanguageCode, string>
+    fallback_language:
+      x-label: Fallback language
+      description: |
+        If the detected language of the audio file is not in `expected_languages`, the `fallback_language` is used. Specify `auto` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
+      type: string
+      default: "auto"
🤖 Prompt for AI Agents
In src/libs/AssemblyAI/openapi.yaml around lines 1256-1275, fix the
language_detection_options schema: change the x-label for expected_languages
from "Minimum speakers expected" to "Expected languages"; replace the invalid
objects: key under expected_languages with items: and define each item as anyOf
referencing the TranscriptLanguageCode enum or a plain string, adding the
appropriate x-ts-type to reflect the union; and update the fallback_language
description to instruct passing the string "auto" (not ["auto"]) as the fallback
indicator.

@HavenDV HavenDV closed this Aug 27, 2025
auto-merge was automatically disabled August 27, 2025 11:18

Pull request was closed

2 participants