Conversation

@chelseacarter29 (Collaborator)

No description provided.

Copilot AI review requested due to automatic review settings December 18, 2025 00:56
chelseacarter29 merged commit d5c0ba8 into main on Dec 18, 2025
7 checks passed
chelseacarter29 deleted the pr-364-updated branch on December 18, 2025 at 01:01
Copilot AI (Contributor) left a comment

Pull request overview

This PR updates the WHO (Agent Finder) functionality with several changes, including async message handling improvements, configurable ranking parameters, and enhanced query context passing.

Key Changes:

  • Made create_assistant_result async and updated callers to await it properly (a minimal sketch follows this list)
  • Added support for passing previous query history as context in the WHO interface
  • Made WHO ranking thresholds and limits configurable via handler attributes
  • Increased LLM ranking timeout and adjusted Gemini batch embedding timeout scaling
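
As a rough illustration of the first bullet, here is a minimal sketch of the await-at-call-site pattern; only the function name create_assistant_result comes from the PR, and the signature and surrounding logic are illustrative:

import asyncio

# Illustrative only: the real create_assistant_result lives in
# code/python/core/schemas.py and takes different arguments.
async def create_assistant_result(result: dict) -> dict:
    # Build the assistant-facing payload; awaited directly instead of being
    # scheduled with asyncio.create_task, so exceptions surface at the caller.
    return {"assistant_result": result}

async def rank_and_build(results: list[dict]) -> list[dict]:
    # Callers such as ranking.py now await the coroutine directly.
    return [await create_assistant_result(r) for r in results]

if __name__ == "__main__":
    print(asyncio.run(rank_and_build([{"name": "example"}])))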

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

Summary per file:

who_validation_results.txt - Removed the validation test results file (803 lines)
static/who.html - Added code to pass the previous 3 queries as context; includes unused variables and an undefined queryHistory reference
code/python/embedding_providers/gemini_embedding.py - Added logging configuration to suppress gRPC ALTS warnings
code/python/core/whoRanking.py - Major refactoring: increased the timeout from 8s to 90s, made min_score/max_results configurable with new defaults, removed the conditional guard around debug output, and introduced duplicate query assignment logic
code/python/core/schemas.py - Changed create_assistant_result to an async function that is awaited directly instead of scheduled with create_task
code/python/core/ranking.py - Updated to await the now-async create_assistant_result function
code/python/core/embedding.py - Changed the Gemini embedding timeout to scale linearly with batch size instead of using a fixed value


Comment on lines +226 to +233
print(f"\n=== WHO RANKING: Filtered to {len(filtered)} results with score > {min_score_threshold} ===")

# Print the ranked sites with scores
print("\nRanked sites (top 10):")
for i, r in enumerate(ranked[:max_results], 1):
    score = r.get('ranking', {}).get('score', 0)
    print(f" {i}. {r['name']} - Score: {score}")
print("=" * 60)
Copilot AI Dec 18, 2025

The debug print statements at lines 226-233 are now always executed, ignoring the DEBUG_PRINT flag defined at line 19. This means debug output will always be printed to console even when not desired. These print statements should be wrapped in if DEBUG_PRINT: or converted to use the logger instead (e.g., logger.debug(...)).

Suggested change
print(f"\n=== WHO RANKING: Filtered to {len(filtered)} results with score > {min_score_threshold} ===")
# Print the ranked sites with scores
print("\nRanked sites (top 10):")
for i, r in enumerate(ranked[:max_results], 1):
score = r.get('ranking', {}).get('score', 0)
print(f" {i}. {r['name']} - Score: {score}")
print("=" * 60)
if DEBUG_PRINT:
print(f"\n=== WHO RANKING: Filtered to {len(filtered)} results with score > {min_score_threshold} ===")
# Print the ranked sites with scores
print("\nRanked sites (top 10):")
for i, r in enumerate(ranked[:max_results], 1):
score = r.get('ranking', {}).get('score', 0)
print(f" {i}. {r['name']} - Score: {score}")
print("=" * 60)

Comment on lines +218 to +219
# Use min_score from handler if available, otherwise default to 51
min_score_threshold = getattr(self.handler, 'min_score', 51)
Copilot AI Dec 18, 2025

The minimum score threshold was changed from a hardcoded value of 70 to a configurable value with a default of 51. This is a significant behavioral change that will result in more lower-quality results being included. If this change is intentional, it should be documented. Otherwise, consider using 70 as the default to maintain backward compatibility: getattr(self.handler, 'min_score', 70).

Suggested change
# Use min_score from handler if available, otherwise default to 51
min_score_threshold = getattr(self.handler, 'min_score', 51)
# Use min_score from handler if available, otherwise default to 70 (backward compatible)
min_score_threshold = getattr(self.handler, 'min_score', 70)
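
For context, the getattr pattern means a deployment can pin the old threshold simply by setting an attribute on the handler. A minimal sketch follows; the WhoHandler class and its defaults are illustrative rather than the PR's actual handler:

class WhoHandler:
    # Illustrative handler: only the attribute names min_score and max_results
    # match what whoRanking.py reads via getattr.
    def __init__(self, min_score=70, max_results=10):
        self.min_score = min_score
        self.max_results = max_results

handler = WhoHandler(min_score=70)              # keep the pre-PR threshold
threshold = getattr(handler, "min_score", 70)   # the default only applies if the attribute is missing
limit = getattr(handler, "max_results", 10)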

Comment on lines +385 to +387
const debugParams = new URLSearchParams(window.location.search);
const queryObj = Object.fromEntries(debugParams.entries());

Copilot AI Dec 18, 2025

The variables debugParams and queryObj are created but never used. These lines should either be removed or the functionality that uses them should be implemented.

Suggested change
const debugParams = new URLSearchParams(window.location.search);
const queryObj = Object.fromEntries(debugParams.entries());

streaming: 'true' // Use streaming mode
});

// Add previous queries for context (limit to last 3 queries)
Copilot AI Dec 18, 2025

The code references this.queryHistory, which is never initialized in the constructor. This will cause a runtime error when trying to access the .length property of undefined. You need to add this.queryHistory = []; in the constructor, around lines 269-277.

Suggested change
// Add previous queries for context (limit to last 3 queries)
// Add previous queries for context (limit to last 3 queries)
if (!Array.isArray(this.queryHistory)) {
    this.queryHistory = [];
}

Comment on lines +183 to +190
# Always include query field (required for WHO ranking)
if "query" in result["ranking"]:
    result_item["query"] = result["ranking"]["query"]
else:
    # Fallback to original query if no custom query provided
    result_item["query"] = self.handler.query


Copilot AI Dec 18, 2025

The query field is being set twice with potentially different values. Lines 184-188 set it conditionally based on whether it exists in the ranking, then line 192 unconditionally overwrites it with site_query. This makes lines 183-188 redundant. Either remove lines 183-188 or remove line 192, depending on which behavior is desired.

Suggested change
# Always include query field (required for WHO ranking)
if "query" in result["ranking"]:
    result_item["query"] = result["ranking"]["query"]
else:
    # Fallback to original query if no custom query provided
    result_item["query"] = self.handler.query

result = await asyncio.wait_for(
    get_gemini_batch_embeddings(texts, model=model_id),
    timeout=30 # Individual timeout per text
    timeout=timeout * len(texts) # Scale timeout with batch size
Copilot AI Dec 18, 2025

Scaling the timeout by multiplying by the number of texts (timeout * len(texts)) could result in excessively long timeouts for large batches. For example, with a default timeout of 30 seconds and 100 texts, this would be 3000 seconds (50 minutes). Consider using a more reasonable scaling factor (e.g., timeout + (len(texts) * 2)) or capping the maximum timeout to prevent unreasonably long waits.
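
A minimal sketch of the additive, capped scaling the comment suggests; the wrapper name and the 300-second ceiling are illustrative, and get_gemini_batch_embeddings is the existing helper shown in the diff above:

import asyncio

MAX_BATCH_TIMEOUT = 300  # illustrative ceiling, not a value from the PR

async def embed_with_capped_timeout(texts, model_id, timeout=30):
    # Grow the timeout a little per text, but never wait longer than the cap.
    batch_timeout = min(timeout + 2 * len(texts), MAX_BATCH_TIMEOUT)
    return await asyncio.wait_for(
        get_gemini_batch_embeddings(texts, model=model_id),
        timeout=batch_timeout,
    )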

ans_struc,
level=self.level,
query_params=self.handler.query_params,
timeout=90,
Copilot AI Dec 18, 2025

The timeout for ranking LLM calls was increased from 8 seconds to 90 seconds, which is more than 11x longer. This could significantly slow down the WHO ranking process, especially when multiple sites are being ranked concurrently. Consider if this large increase is necessary, or if a more moderate timeout (e.g., 15-30 seconds) would be sufficient.

Suggested change
timeout=90,
timeout=30,
