Skip to content

feat: Add conversation history with GenAI Chat API & optimize client …#20

Merged
ndayiemile merged 6 commits intomainfrom
baseKnowledgeIntegration
Sep 23, 2025
Merged

feat: Add conversation history with GenAI Chat API & optimize client …#20
ndayiemile merged 6 commits intomainfrom
baseKnowledgeIntegration

Conversation

@ndayiemile
Copy link
Collaborator

@ndayiemile ndayiemile commented Sep 23, 2025

…usage

• Implement thread-based conversation memory with file attachments • Migrate to Google GenAI Chat API for proper history support • Add comprehensive test suite
• Fix conversation history edge cases and error handling

Description by Korbit AI

What change is being made?

Add conversation history support with GenAI Chat API, optimize GenAI client usage and chat session handling, and extend backend workflow, tests, and models to support threaded conversations, file attachments, and related endpoints.

Why are these changes being made?

Introduce persistent conversation history to enable context-aware GenAI interactions, improve performance by reusing a single GenAI client, and expand test coverage and CI setup to ensure reliability across new features.

Is this description stale? Ask me to generate a new description by commenting /korbit-generate-pr-description

…usage

• Implement thread-based conversation memory with file attachments
• Migrate to Google GenAI Chat API for proper history support
• Add comprehensive test suite
• Fix conversation history edge cases and error handling
Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Status
Documentation Incomplete function docstring ▹ view
Security Insecure API Key Storage ▹ view
Error Handling Generic exception handling masks database errors ▹ view
Performance Unbounded conversation history loading ▹ view
Readability Missing type hint for constant ▹ view ✅ Fix detected
Functionality Malformed XML closing tag ▹ view ✅ Fix detected
Readability Consolidate imports from same module ▹ view
Readability Typo in career center reference ▹ view ✅ Fix detected
Functionality Inconsistent function signature for generate_response calls ▹ view
Performance Redundant file uploads in conversation history ▹ view
Files scanned
File Path Reviewed
resumax_backend/resumax_algo/system_instructions.py
resumax_backend/resumax_api/views.py
resumax_backend/resumax_algo/gemini_model.py

Explore our documentation to understand the languages and file types we support and the files we ignore.

Check out our docs on how you can make Korbit work best for you and your team.

Loving Korbit!? Share us on LinkedIn Reddit and X

SYSTEM_PROMPT = """
<identity>
<role> resume, cover letter, and cv writing assistant</role>
<parent_organization>Loeb Center, at Amherst College, Amherst,MA, USA<parent_organization/>

This comment was marked as resolved.

<expertise>
<area>Career services best practices</area>
<area>Resume, Cover letter and CV standards (US and international)</area>
<area>Amherst College's recommended carrier center recommended practices</area>

This comment was marked as resolved.

@@ -0,0 +1,58 @@
SYSTEM_PROMPT = """

This comment was marked as resolved.

Comment on lines +9 to +65
from .models import ConversationsThread, Conversation
from pathlib import Path

async def generate_response(promptText, fileUrls=None):
'''
This function generates content using the Gemini API
'''
client = genai.Client(api_key=settings.GEMINI_API_KEY)
if not fileUrls:
response = client.models.generate_content(
model="gemini-1.5-flash",
contents=promptText
)
return response.text

# Module-level client instance for reuse across functions
_client = None


def _get_genai_client():
"""Get or create a shared GenAI client instance"""
global _client
if _client is None:
if not settings.GEMINI_API_KEY:
raise Exception("GEMINI_API_KEY not configured")
_client = genai.Client(api_key=settings.GEMINI_API_KEY)
return _client


async def _get_or_create_chat_session(thread_id=None):
"""Get chat session with conversation history from database"""
client = _get_genai_client()

try:
# Create chat session with system instruction (text only)
system_content = system_instructions.SYSTEM_PROMPT

# Convert URLs to actual file paths using MEDIA_ROOT
file_data = [] # Store both path and MIME type
for fileUrl in fileUrls:
# Remove the MEDIA_URL prefix and construct the full file path
relative_path = fileUrl.replace(settings.MEDIA_URL, '')
full_path = pathlib.Path(settings.MEDIA_ROOT) / relative_path
# Get conversation history if thread_id provided
history = await _get_conversation_history(thread_id) if thread_id else None

# Check if file exists before adding to processing list
if full_path.exists():
# Determine MIME type from file extension
mime_type, _ = mimetypes.guess_type(str(full_path))
if not mime_type:
# Fallback to application/pdf for unknown types
mime_type = 'application/pdf'
file_data.append((full_path, mime_type))
# Debug: Print history information
if history:
print(f"🔄 Loading {len(history)} history messages for thread {thread_id}")
for i, msg in enumerate(history):
role = msg.get('role', 'unknown')
text_preview = msg['parts'][0].get('text', 'no content')[:50] if msg['parts'] else 'no content'
print(f" {i+1}. {role}: {text_preview}...")
print(f"📋 Full history structure: {history}")
else:
print(f"Warning: File not found: {full_path}")

# If no valid files found, process text only
if not file_data:
response = client.models.generate_content(
model="gemini-1.5-flash",
contents=promptText
print(f"📝 No history found for thread {thread_id}" if thread_id else "🆕 New conversation (no thread_id)")

chat = client.chats.create(
model='gemini-1.5-flash',
config={"system_instruction": system_content},
history=history
)
return response.text

return chat

except Exception as e:
raise Exception(f"Failed to create chat session: {e}")


@sync_to_async
def _get_conversation_history(thread_id):
"""Get conversation history from database for chat session"""
try:
from .models import AttachedFile
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consolidate imports from same module category Readability

Tell me more
What is the issue?

Split imports from the same module make the code's dependencies harder to track and maintain.

Why this matters

Scattered imports increase cognitive load when reading the code and make dependency management more difficult.

Suggested change ∙ Feature Preview
from .models import ConversationsThread, Conversation, AttachedFile
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

if not thread:
return []

conversations = Conversation.objects.filter(thread=thread).order_by('created_at')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unbounded conversation history loading category Performance

Tell me more
What is the issue?

Database query loads all conversations without pagination or limiting, potentially fetching excessive conversation history.

Why this matters

Large conversation threads will cause memory bloat and slow response times as the entire history is loaded into memory and sent to the GenAI API, which may have context limits.

Suggested change ∙ Feature Preview

Implement conversation history limits and use select_related for attached files:

# Limit to recent conversations (e.g., last 50)
conversations = Conversation.objects.filter(thread=thread)\
    .select_related()\
    .prefetch_related('attachedfile_set')\
    .order_by('-created_at')[:50]\
    .reverse()  # Maintain chronological order
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

# Generate response using Gemini
try:
response = asyncio.run(generate_response(promptText))
response = asyncio.run(generate_response(promptText, thread_id=thread_id))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent function signature for generate_response calls category Functionality

Tell me more
What is the issue?

The generate_response function is called with thread_id as a keyword argument, but when files are attached, it's called with positional arguments (promptText, file_urls) followed by thread_id as keyword argument, creating an inconsistent function signature.

Why this matters

This inconsistency could lead to runtime errors if the generate_response function expects a specific parameter order, and makes the code harder to maintain and debug.

Suggested change ∙ Feature Preview

Make the function calls consistent by using the same parameter pattern. Either use all keyword arguments or ensure the positional arguments match the function definition:

# Option 1: Use keyword arguments consistently
response = asyncio.run(generate_response(prompt_text=promptText, thread_id=thread_id))
# and
response = asyncio.run(generate_response(prompt_text=promptText, file_urls=file_urls, thread_id=thread_id))

# Option 2: Use positional arguments consistently (if function supports it)
response = asyncio.run(generate_response(promptText, None, thread_id))
# and  
response = asyncio.run(generate_response(promptText, file_urls, thread_id))
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +27 to +28
async def _get_or_create_chat_session(thread_id=None):
"""Get chat session with conversation history from database"""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete function docstring category Documentation

Tell me more
What is the issue?

The docstring doesn't explain the purpose of the thread_id parameter or what the function returns.

Why this matters

Missing parameter and return information makes it harder for other developers to correctly use this function.

Suggested change ∙ Feature Preview

"""Get or create a chat session with optional conversation history.

Args:
thread_id: Optional ID to load existing conversation history.

Returns:
A chat session instance with loaded history if thread_id provided."""

Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +110 to +112
except Exception as e:
print(f"Error getting conversation history: {e}")
return []
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generic exception handling masks database errors category Error Handling

Tell me more
What is the issue?

Generic exception handling that silently returns empty history without distinguishing between different types of failures or providing adequate error context.

Why this matters

Database connection issues, permission problems, or data corruption will all result in the same silent fallback to empty history, making debugging difficult and potentially masking serious system issues.

Suggested change ∙ Feature Preview

Use more specific exception handling and proper logging:

except Exception as e:
    import logging
    logger = logging.getLogger(__name__)
    logger.error(f"Failed to retrieve conversation history for thread {thread_id}: {e}", exc_info=True)
    # Consider re-raising for critical database errors or return empty with warning
    return []
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +81 to +91
for file in attached_files:
if file.file_path and file.file_type:
# Create file part for Google GenAI
client = _get_genai_client()
try:
# Get the full file path
full_path = pathlib.Path(settings.MEDIA_ROOT) / file.stored_filename
if full_path.exists():
uploaded = client.files.upload(file=full_path)
file_part = {'file_data': {'file_uri': uploaded.uri, 'mime_type': file.file_type}}
user_parts.append(file_part)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant file uploads in conversation history category Performance

Tell me more
What is the issue?

Files are uploaded to GenAI API individually within a loop for each conversation history retrieval, causing redundant network calls.

Why this matters

This creates O(n*m) API calls where n is the number of conversations and m is the average files per conversation, leading to significant latency and potential rate limiting when loading conversation history.

Suggested change ∙ Feature Preview

Cache uploaded file URIs in the database or implement a file upload cache with TTL. Store the GenAI file URI after first upload and reuse it:

# Add a field to AttachedFile model: genai_file_uri
if file.genai_file_uri:
    file_part = {'file_data': {'file_uri': file.genai_file_uri, 'mime_type': file.file_type}}
else:
    uploaded = client.files.upload(file=full_path)
    file.genai_file_uri = uploaded.uri
    file.save()
    file_part = {'file_data': {'file_uri': uploaded.uri, 'mime_type': file.file_type}}
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

ndayiemile and others added 5 commits September 22, 2025 21:47
Added environment variables for GITHUB_TOKEN and GEMINI_API_KEY in the build job.
…est reliability

- Fix database transaction handling by using TransactionTestCase
- Add missing 'conversations' URL pattern to fix NoReverseMatch errors
- Update assert statements to Django TestCase assertions for CI/CD compatibility
- Add comprehensive AttachedFile model test with correct field names
@coveralls
Copy link

Pull Request Test Coverage Report for Build 17934443974

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 69 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+5.9%) to 85.0%

Files with Coverage Reduction New Missed Lines %
resumax_algo/gemini_model.py 69 21.9%
Totals Coverage Status
Change from base Build 17933647053: 5.9%
Covered Lines: 646
Relevant Lines: 760

💛 - Coveralls

@ndayiemile ndayiemile merged commit dcbb5b5 into main Sep 23, 2025
1 of 2 checks passed
@liamjdavis liamjdavis deleted the baseKnowledgeIntegration branch September 27, 2025 23:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants