@filiperochalopes
Contributor

Summary

This PR speeds up GraphQL concept search by adding a safe path that queries Elasticsearch directly for simple requests. The explicit goal is to make GraphQL search behavior and results match the REST search as closely as possible. It also fixes lookup by ID list and improves test coverage and documentation.

Key Changes

🚀 Performance & Optimization

  • Elasticsearch Safe Path: Added logic (is_optimization_safe) to detect when a GraphQL query requests only fields available in the Elasticsearch index (e.g., id, display, conceptClass, datatype, metadata).
  • Direct ES Serialization: When the safe path is enabled, results are serialized directly from Elasticsearch hits (serialize_es_hit), avoiding expensive PostgreSQL queries and preventing N+1 query issues.
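
To illustrate the safe-path idea, here is a minimal sketch, not the code from this PR. It assumes the check is a subset test over the requested field names and that the ES document uses the same field names as the GraphQL schema. The `ES_SAFE_FIELDS` set, both function signatures, and the hit shape are assumptions made for the example.

```python
# Hypothetical set of GraphQL fields assumed to be available in the ES index.
ES_SAFE_FIELDS = {"id", "display", "conceptClass", "datatype", "metadata"}


def is_optimization_safe(requested_fields):
    """Return True when every requested field can be served from ES alone."""
    return set(requested_fields) <= ES_SAFE_FIELDS


def serialize_es_hit(hit, requested_fields):
    """Build a response dict from one ES hit without touching the database."""
    # Assumption: the indexed document lives under "_source", as in a raw ES response.
    source = hit.get("_source", {})
    return {field: source.get(field) for field in requested_fields}


hit = {"_source": {"id": "A01", "display": "Cholera", "extra": 1}}
fields = ["id", "display"]
if is_optimization_safe(fields):
    result = serialize_es_hit(hit, fields)
```

A query that selects any field outside the set (for example a relation that only exists in PostgreSQL) would fail the check and take the regular resolver path.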

🐛 Bug Fixes

  • ID-based search (concepts_for_ids): Implemented the missing helper to resolve queries based on a list of conceptIds.
  • Result Ordering: Ensured that concepts returned from ID-based search preserve the exact input order of the IDs provided in the query.
  • Authentication: Adjusted the auth_status check in the main resolver to avoid errors when the attribute is absent from the request context.
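
The ordering and authentication fixes can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: concepts are modeled as plain dicts with an `id` key, `order_by_input_ids` and `get_auth_status` are hypothetical helper names, and the `"none"` default is a placeholder value.

```python
from types import SimpleNamespace


def order_by_input_ids(concepts, concept_ids):
    """Return concepts in the order of concept_ids, skipping IDs with no match."""
    by_id = {}
    for concept in concepts:
        # Keep the first concept seen per ID so duplicates do not reorder results.
        by_id.setdefault(concept["id"], concept)
    return [by_id[cid] for cid in concept_ids if cid in by_id]


def get_auth_status(context, default="none"):
    """Read auth_status defensively; the default value here is an assumption."""
    return getattr(context, "auth_status", default)


# Rows as a database might return them, in arbitrary order.
rows = [{"id": "C3"}, {"id": "C1"}, {"id": "C2"}]
ordered = order_by_input_ids(rows, ["C1", "C2", "C3", "C9"])
anonymous = get_auth_status(SimpleNamespace())
```

Using `getattr` with a default avoids an `AttributeError` when the request context carries no `auth_status` attribute.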

🛠️ Refactoring & Code Quality

  • Documentation: Added explanatory docstrings and comments across all major functions in core/graphql/queries.py.
  • Robust Fallback: Improved the logic that falls back to database search when Elasticsearch is unavailable or returns an error.
  • Tests: Added and updated unit tests covering Elasticsearch optimization scenarios and the new queries.
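
The fallback pattern named above can be sketched as below. This is a minimal illustration: `SearchBackendError`, `search_with_fallback`, and the two callables are hypothetical stand-ins, and the real resolver may detect Elasticsearch failures differently.

```python
import logging

logger = logging.getLogger(__name__)


class SearchBackendError(Exception):
    """Hypothetical error type standing in for any Elasticsearch failure."""


def search_with_fallback(query, search_es, search_db):
    """Try the ES search first; use the database search if ES fails."""
    try:
        return search_es(query)
    except SearchBackendError:
        logger.warning("Elasticsearch search failed; falling back to the database")
        return search_db(query)
```

Catching a narrow error type keeps genuine bugs in the ES path visible instead of silently masking them with a database query.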

Impacted Files

  • core/graphql/queries.py: Main resolver logic, optimization, and serialization.
  • core/graphql/tests/*: New test cases and updates to existing tests.
  • core/settings.py: Minor configuration adjustments.

Tests

docker compose exec api python manage.py test core.graphql.tests  ✅  Passed!

* Adopt a permissive base QuerySet for textual GraphQL searches to rely on Elasticsearch's version filtering logic.

* Apply the same Elasticsearch filters as REST (source_version=HEAD and is_latest_version=True).

* Use strict 'id = versioned_object_id' filtering only when listing without search terms.

* Remove SQL fallback logic to ensure consistent ES-only behavior across REST and GraphQL search flows.

* Implement the concepts_for_ids function to resolve NameError when querying by a list of IDs.

* Ensure direct database lookups are used for ID-based queries to improve performance.

* Maintain result ordering and pagination for concepts retrieved by ID.

* Add clear and concise documentation for all functions and variables in queries.py to improve maintainability.

* Add search_concepts_in_es with ES hit serialization and GraphQL resolver guards to return cached hits without DB round-trips when selected fields are safe.

* Improve fallback logic and ordering in concepts_for_ids and concepts_for_query, including auth handling tweaks to preserve non-ES query paths and prevent duplicate mnemonics.

* Adjust superuser bootstrap and test suite to align with the new search helper logic, expanded coverage, and updated resolver behavior.
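
As an illustration of the REST-aligned filters mentioned in the notes above (source_version=HEAD and is_latest_version=True), a request body in the Elasticsearch query DSL might look like the sketch below. The function name, the choice of `simple_query_string` for the text match, and the pagination defaults are assumptions; only the two filter fields come from the notes.

```python
def build_concept_search_body(search_text, page=1, limit=25):
    """Build an ES request body for a paginated text search over latest HEAD concepts."""
    offset = (max(page, 1) - 1) * limit
    return {
        "from": offset,
        "size": limit,
        "query": {
            "bool": {
                # Text match; the actual query type used by the project is not shown here.
                "must": [{"simple_query_string": {"query": search_text}}],
                # Filters do not affect scoring and mirror the REST search constraints.
                "filter": [
                    {"term": {"source_version": "HEAD"}},
                    {"term": {"is_latest_version": True}},
                ],
            }
        },
    }


body = build_concept_search_body("malaria", page=2, limit=10)
```

Putting the version constraints in the `filter` clause rather than `must` keeps them out of relevance scoring.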