Conversation

@HavenDV
Contributor

@HavenDV HavenDV commented Aug 27, 2025

Summary by CodeRabbit

  • New Features
    • Introduced language_detection_options for transcript requests, enabling configuration of expected_languages (list of language codes) and a fallback_language (default “auto”) to control Automatic Language Detection behavior.
    • Enforced strict validation for these options to prevent unsupported fields; existing parameters remain unchanged.

@coderabbitai

coderabbitai bot commented Aug 27, 2025

Walkthrough

Adds a new strict language_detection_options object to TranscriptOptionalParams in the OpenAPI spec, introducing expected_languages (array of strings) and fallback_language (string, default "auto"), without modifying other existing fields.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| OpenAPI spec: transcript params (src/libs/AssemblyAI/openapi.yaml) | Introduced language_detection_options under TranscriptOptionalParams with additionalProperties: false, containing expected_languages: string[] and fallback_language: string (default "auto"); added labels and descriptions for both fields. |
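
To illustrate what additionalProperties: false means for callers, here is a minimal Python sketch of a strictness check. The helper name `unknown_option_keys` and the constant `ALLOWED_OPTION_KEYS` are hypothetical and not part of the spec or the generated SDK; only the two field names come from the summary above.

```python
# Field names taken from the PR summary; everything else is illustrative.
ALLOWED_OPTION_KEYS = frozenset({"expected_languages", "fallback_language"})


def unknown_option_keys(options: dict) -> list:
    """Return keys that a strict (additionalProperties: false) schema would reject."""
    return sorted(set(options) - ALLOWED_OPTION_KEYS)


# A misspelled key is reported instead of being silently ignored.
print(unknown_option_keys({"expected_languages": ["en"], "fallback_lang": "auto"}))
```

A real validator would report these keys as schema errors; the sketch only shows which keys would trigger them.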

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant API as Transcription API
  participant LD as Language Detector
  participant Model as Transcription Model

  rect rgb(235, 245, 255)
    note right of Client: Request includes language_detection_options
    Client->>API: Create transcript (audio, language_detection_options)
    API->>LD: Detect language(s)
    alt Detected in expected_languages
      LD-->>API: detected_language
    else Not in expected_languages
      note right of LD: Use fallback_language (or "auto")
      LD-->>API: fallback_language
    end
    API->>Model: Transcribe with chosen language
    Model-->>API: Transcript result
    API-->>Client: Transcript response
  end
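
The branch in the diagram can be sketched in Python as follows. This is an assumption-laden illustration, not the service's implementation: the function name `choose_language`, the `confidences` mapping, and the behavior for an empty `expected_languages` list are all hypothetical. The "auto" branch follows the field description, which says the expected language with the highest confidence score is chosen.

```python
def choose_language(detected, expected_languages, fallback_language="auto", confidences=None):
    """Hypothetical helper mirroring the diagram's alt/else branch."""
    # Assumption: with no expected languages configured, the detected language is used.
    if not expected_languages or detected in expected_languages:
        return detected
    # An explicit fallback language wins when detection falls outside the list.
    if fallback_language != "auto":
        return fallback_language
    # "auto": pick the expected language with the highest confidence score.
    scores = confidences or {}
    return max(expected_languages, key=lambda code: scores.get(code, 0.0))


print(choose_language("fr", ["en", "es"], "auto", {"en": 0.2, "es": 0.7}))
```

How the service obtains per-language confidence scores is not stated in this PR, so the `confidences` argument is only a stand-in.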

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

Hoppity-hop through YAML fields I go,
New options sprout where languages flow.
Expected tongues in a tidy row,
A fallback whisper: “auto” knows.
Ears perked high, I parse with cheer—
Spec is crisp, the path is clear. 🐇✨

@HavenDV HavenDV enabled auto-merge (squash) August 27, 2025 09:20
@coderabbitai coderabbitai bot changed the title from "feat:@coderabbitai" to "feat:Add strict language_detection_options to TranscriptOptionalParams" Aug 27, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/libs/AssemblyAI/openapi.yaml (2)

1256-1261: Document interplay with language_detection and precedence vs language_code.

Clarify whether language_detection_options is ignored unless language_detection=true, and what happens if language_code is also set. Consider adding an if/then constraint (available in OpenAPI 3.1 via JSON Schema) to enforce valid combinations.

Do you want me to add an if/then schema so options are only allowed when language_detection is true?
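
The effect of such an if/then constraint can be sketched as a plain Python check. This assumes the rule "options are only allowed when language_detection is true", which is exactly the open question raised above and is not confirmed by the spec; the function name is hypothetical, and the interaction with language_code is deliberately left out because the PR does not settle it.

```python
def check_language_detection_combination(params: dict) -> list:
    """Hypothetical check: options are only valid alongside language_detection: true."""
    errors = []
    has_options = "language_detection_options" in params
    if has_options and params.get("language_detection") is not True:
        errors.append("language_detection_options requires language_detection: true")
    return errors


print(check_language_detection_combination(
    {"language_detection_options": {"expected_languages": ["en"]}}
))
```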


1256-1275: Optional: add an example under TranscriptOptionalParams showing language_detection_options.

Helps SDKs and users.

I can add a minimal example using expected_languages: ["en_us","es"] and fallback_language: "auto".
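
For reference, a request body carrying those values might look like the Python sketch below. The helper name `build_transcript_request`, the `audio_url` field, the example URL, and the decision to send `language_detection: true` alongside the options are assumptions for illustration; only the two option fields and their example values come from this review.

```python
import json


def build_transcript_request(audio_url, expected_languages, fallback_language="auto"):
    """Hypothetical builder for a transcript request body with detection options."""
    return {
        "audio_url": audio_url,  # assumed field name for the audio source
        "language_detection": True,  # assumed to be required for the options to apply
        "language_detection_options": {
            "expected_languages": list(expected_languages),
            "fallback_language": fallback_language,
        },
    }


payload = build_transcript_request("https://example.com/audio.mp3", ["en_us", "es"])
print(json.dumps(payload, indent=2))
```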

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c531b33 and 859e03d.

⛔ Files ignored due to path filters (4)
  • src/libs/AssemblyAI/Generated/AssemblyAI.JsonSerializerContextTypes.g.cs is excluded by !**/generated/**
  • src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParams.g.cs is excluded by !**/generated/**
  • src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParamsLanguageDetectionOptions.Json.g.cs is excluded by !**/generated/**
  • src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParamsLanguageDetectionOptions.g.cs is excluded by !**/generated/**
📒 Files selected for processing (1)
  • src/libs/AssemblyAI/openapi.yaml (1 hunks)

Comment on lines +1262 to +1269
            expected_languages:
              x-label: Minimum speakers expected
              description: List of languages expected in the audio file.
              type: array
              objects:
                x-label: language
                type: string
            fallback_language:
🛠️ Refactor suggestion

⚠️ Potential issue

Fix invalid OpenAPI: use items (not objects) and correct label/schema for expected_languages.

Array schemas must use items. Current use of objects will break validation/SDK generation. Also the x-label is incorrect.

Apply:

-            expected_languages:
-              x-label: Minimum speakers expected
-              description: List of languages expected in the audio file.
-              type: array
-              objects:
-                x-label: language
-                type: string
+            expected_languages:
+              x-label: Expected languages
+              description: List of languages expected in the audio file.
+              type: array
+              items:
+                anyOf:
+                  - $ref: "#/components/schemas/TranscriptLanguageCode"
+                  - type: string
+                x-ts-type: LiteralUnion<TranscriptLanguageCode, string>
+              minItems: 1
+              uniqueItems: true
🤖 Prompt for AI Agents
In src/libs/AssemblyAI/openapi.yaml around lines 1262 to 1269, the
expected_languages array schema is invalid: it uses "objects" and misplaces
x-label. Replace "objects" with "items", move the per-item x-label (if needed)
under items, and ensure items:type is string; keep the array-level
x-label/description/type as-is. Concretely, define expected_languages with type:
array, an optional x-label for the array, description, and an items block with
type: string and any per-item x-label/schema.
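
What the corrected array schema would enforce can be approximated in Python as below. The function name is hypothetical, and the minItems: 1 and uniqueItems: true rules come from the suggested diff in this comment, not from the merged spec.

```python
def validate_expected_languages(value) -> list:
    """Hypothetical check mirroring type: array, items: string, minItems, uniqueItems."""
    if not isinstance(value, list):
        return ["expected_languages must be an array"]
    errors = []
    if len(value) < 1:
        errors.append("expected_languages must contain at least one item")
    if any(not isinstance(item, str) for item in value):
        errors.append("every item in expected_languages must be a string")
    elif len(set(value)) != len(value):
        # Only checked once all items are known to be strings (and thus hashable).
        errors.append("expected_languages must not contain duplicates")
    return errors


print(validate_expected_languages(["en_us", "es", "es"]))
```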

Comment on lines +1270 to +1275
              x-label: Fallback language
              description: |
                If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
              type: string
              default: "auto"

🛠️ Refactor suggestion

⚠️ Potential issue

Align fallback_language description with type and constrain values.

Doc says specify ["auto"] (array) but the field is a string. Use "auto" (no brackets) and constrain to either a language code or the literal "auto".

-            fallback_language:
-              x-label: Fallback language
-              description: |
-                If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
-              type: string
-              default: "auto"
+            fallback_language:
+              x-label: Fallback language
+              description: |
+                If the detected language is not in `expected_languages`, this value is used. Set to "auto" to choose the highest-confidence language from `expected_languages`. Requires `expected_languages` to be non-empty when "auto" is used.
+              anyOf:
+                - $ref: "#/components/schemas/TranscriptLanguageCode"
+                - type: string
+                  enum: ["auto"]
+              default: "auto"
🤖 Prompt for AI Agents
In src/libs/AssemblyAI/openapi.yaml around lines 1270-1275, the description
incorrectly refers to ["auto"] (an array) while the field is defined as a
string; update the description to say use "auto" (a string) and then add a
constraint so the value must be either the literal "auto" or a valid language
code (e.g., BCP-47); implement that constraint via an enum of allowed literals
or a regex/pattern that permits "auto" or language codes and adjust the
description to reflect the allowed values.
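
The enum-style constraint described here could behave like the following Python sketch. The function name is hypothetical, and `KNOWN_LANGUAGE_CODES` is a small illustrative subset rather than the real TranscriptLanguageCode list, which this PR does not reproduce.

```python
# Illustrative subset only; the real list lives in the TranscriptLanguageCode schema.
KNOWN_LANGUAGE_CODES = frozenset({"en", "en_us", "es", "fr", "de"})


def validate_fallback_language(value, known_codes=KNOWN_LANGUAGE_CODES) -> list:
    """Hypothetical check: value must be the string "auto" or a known language code."""
    if not isinstance(value, str):
        return ['fallback_language must be a string such as "auto", not an array']
    if value != "auto" and value not in known_codes:
        return [f"unsupported fallback_language: {value!r}"]
    return []


# The array form from the original description is rejected; the string form passes.
print(validate_fallback_language(["auto"]))
print(validate_fallback_language("auto"))
```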

@HavenDV HavenDV disabled auto-merge August 27, 2025 11:18
@HavenDV HavenDV merged commit f679407 into main Aug 27, 2025
3 of 4 checks passed
@HavenDV HavenDV deleted the bot/update-openapi_202508270919 branch August 27, 2025 11:18