Skip to content

Conversation

@naincy128
Copy link
Contributor

@naincy128 naincy128 commented Sep 6, 2025

Description:

This PR addresses a bug where excessively large system messages, typically caused by unit content with extensive video transcripts, result in Xpert message failures. In these scenarios, the system prompt consumes the entire token limit, leaving no room for learner or LA messages, thus breaking the interaction flow.

Resolution:

  1. Unit content is now handled using proportional trimming before sending to Xpert.
  2. This ensures the prompt includes a balanced amount of all content types, leaving enough room for both learner and LA messages.

Test Coverage:

  1. Confirmed that learner messages now reach Xpert successfully.
  2. Added unit tests to validate trimming behaviour for extremely large unit contents.

All tests and lint checks pass.
Results (1.32s):
55 passed

2U Private Jira Link

Jira ticket: https://2u-internal.atlassian.net/browse/COSMO2-14

Copilot AI review requested due to automatic review settings September 23, 2025 14:05
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a bug where extremely large system messages (typically from extensive video transcripts) were causing Xpert Assistant failures by consuming the entire token limit and leaving no room for learner or LA messages.

  • Implements proportional trimming for unit content to ensure balanced inclusion of all content types
  • Calculates dynamic limits based on static content size to maintain room for chat messages
  • Adds comprehensive test coverage for various content scenarios including large video transcripts

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
learning_assistant/api.py Implements proportional trimming logic for unit content handling
tests/test_api.py Adds extensive test cases for content trimming scenarios

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copilot AI review requested due to automatic review settings September 23, 2025 14:15
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copilot AI review requested due to automatic review settings September 23, 2025 15:02
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copilot AI review requested due to automatic review settings September 23, 2025 16:08
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copilot AI review requested due to automatic review settings September 24, 2025 05:15
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Member

@mraman-2U mraman-2U left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the release version here
Add the new version to the change log

Copilot AI review requested due to automatic review settings October 8, 2025 04:51
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

trimmed_unit_content.append({"content_type": ctype, "content_text": ""})
continue

allowed_chars = max(1, int((len(text) / total_chars) * adjusted_unit_limit))
Copy link

Copilot AI Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Division by zero is possible if total_chars is 0. Although there's a check at line 159, this code could still execute if total_chars becomes 0 through other paths.

Suggested change
allowed_chars = max(1, int((len(text) / total_chars) * adjusted_unit_limit))
if total_chars == 0:
allowed_chars = 1 # fallback to minimum allowed
else:
allowed_chars = max(1, int((len(text) / total_chars) * adjusted_unit_limit))

Copilot uses AI. Check for mistakes.
Comment on lines +202 to +211
'A' * 200, # Long string case to test trimming
[ # VIDEO content case
{'content_type': 'VIDEO', 'content_text': f"Video transcript {i} " + ("A" * 200)} for i in range(10)
],
[ # TEXT content case
{'content_type': 'TEXT', 'content_text': f"Paragraph {i} " + ("B" * 100)} for i in range(20)
],
[ # Mixed VIDEO + TEXT case
{'content_type': 'VIDEO', 'content_text': "Video intro " + ("C" * 100)},
{'content_type': 'TEXT', 'content_text': "Some explanation " + ("D" * 100)},
Copy link

Copilot AI Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Creating large test strings with repeated characters could be memory inefficient. Consider using smaller test data or generating content on-demand within the test method.

Suggested change
'A' * 200, # Long string case to test trimming
[ # VIDEO content case
{'content_type': 'VIDEO', 'content_text': f"Video transcript {i} " + ("A" * 200)} for i in range(10)
],
[ # TEXT content case
{'content_type': 'TEXT', 'content_text': f"Paragraph {i} " + ("B" * 100)} for i in range(20)
],
[ # Mixed VIDEO + TEXT case
{'content_type': 'VIDEO', 'content_text': "Video intro " + ("C" * 100)},
{'content_type': 'TEXT', 'content_text': "Some explanation " + ("D" * 100)},
'A' * 20, # Long string case to test trimming (reduced size)
[ # VIDEO content case
{'content_type': 'VIDEO', 'content_text': f"Video transcript {i} " + ("A" * 20)} for i in range(10)
],
[ # TEXT content case
{'content_type': 'TEXT', 'content_text': f"Paragraph {i} " + ("B" * 10)} for i in range(20)
],
[ # Mixed VIDEO + TEXT case
{'content_type': 'VIDEO', 'content_text': "Video intro " + ("C" * 10)},
{'content_type': 'TEXT', 'content_text': "Some explanation " + ("D" * 10)},

Copilot uses AI. Check for mistakes.
@naincy128 naincy128 force-pushed the Cosmos2-14/Naincy128 branch from aa3daa4 to 3caaa4e Compare October 9, 2025 05:03
@naincy128 naincy128 merged commit c37d063 into edx:main Oct 9, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants