Add trapi set interpretation support #77

ctrl-schaff · 2025-12-08T20:44:50Z

Adds support for the set_interpretation field within the Query instance. More information can be found within the TRAPI specification here. Modifies our lookup behavior, with the core implementation found within src/retriever/utils/trapi.py. Tests can be found under tests/set_interpretation

…etation-support

sentry · 2025-12-08T20:48:25Z

Codecov Report

❌ Patch coverage is 93.63057% with 10 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/retriever/utils/trapi.py	94.83%	5 Missing and 3 partials ⚠️
src/retriever/lookup/lookup.py	0.00%	2 Missing ⚠️

Files with missing lines	Coverage Δ
src/retriever/lookup/lookup.py	`0.00% <0.00%> (ø)`
src/retriever/utils/trapi.py	`94.09% <94.83%> (+0.68%)`	⬆️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

tokebe

This seems to be a misunderstanding of set interpretation. Set interpretation is on a per-node basis, where different nodes can have different set interpretations. ALL and MANY essentially mark a query node as not defining result uniqueness, where ALL has the added caveat that every member ID on the query node must be fulfilled, or else no result can be made.

Right now, it appears that the code requires all nodes "agree" on both their set interpretation, and on expected member IDs. Neither of these are a valid requirement; nodes may arbitrarily set their individual set interpretations, and different nodes may have different member IDs.

For a basic example, consider the following query, where letters represent QNodeIDs and numbers represent unique IDs on that node:

A(1,2,3; BATCH) --> B(4,5,6; ALL)

A is BATCH, so any ID 1, 2, or 3 will fulfill it. B is ALL, so a valid result MUST have every member id 4, 5, 6. Every ID on A must find an edge relating it to all IDs on B in order to be included in the final output. Imagine Retriever gets the following knowledge, which will show up as results passed to evaluate_set_interpretation() (letters and numbers together represent a binding of the knowledge node to the query node):

A1 -> B4; A1 -> B5; A1 -> B6
A2 -> B4; A2 -> B5
A3 -> B4; A3 -> B6; A3 -> B6

evaluate_set_interpretation() should prune the two results depending on A2, as we don't have a sufficient set to match all of B. We end up with two results, formatted using the placeholder set nodes with nodenorm UUIDs defined in the TRAPI spec.

A1 -> B(4,5,6) 
A3 -> B(4,5,6)

In the general case of a multi-interpretation n-node query, I believe it's a safe approach to attempt to evaluate nodes marked ALL for set assembly and pruning, and then evaluate nodes marked MANY to further group results as applicable.

ctrl-schaff · 2025-12-09T17:25:03Z

This seems to be a misunderstanding of set interpretation. Set interpretation is on a per-node basis, where different nodes can have different set interpretations. ALL and MANY essentially mark a query node as not defining result uniqueness, where ALL has the added caveat that every member ID on the query node must be fulfilled, or else no result can be made.

Right now, it appears that the code requires all nodes "agree" on both their set interpretation, and on expected member IDs. Neither of these are a valid requirement; nodes may arbitrarily set their individual set interpretations, and different nodes may have different member IDs.

For a basic example, consider the following query, where letters represent QNodeIDs and numbers represent unique IDs on that node:
A(1,2,3; BATCH) --> B(4,5,6; ALL)
A is BATCH, so any ID 1, 2, or 3 will fulfill it. B is ALL, so a valid result MUST have every member id 4, 5, 6. Every ID on A must find an edge relating it to all IDs on B in order to be included in the final output. Imagine Retriever gets the following knowledge, which will show up as results passed to evaluate_set_interpretation() (letters and numbers together represent a binding of the knowledge node to the query node):
A1 -> B4; A1 -> B5; A1 -> B6
A2 -> B4; A2 -> B5
A3 -> B4; A3 -> B6; A3 -> B6
evaluate_set_interpretation() should prune the two results depending on A2, as we don't have a sufficient set to match all of B. We end up with two results, formatted using the placeholder set nodes with nodenorm UUIDs defined in the TRAPI spec.
A1 -> B(4,5,6) 
A3 -> B(4,5,6)
In the general case of a multi-interpretation n-node query, I believe it's a safe approach to attempt to evaluate nodes marked ALL for set assembly and pruning, and then evaluate nodes marked MANY to further group results as applicable.

Okay that makes more sense. I think MANY needs more flushing out in the TRAPI specification as it isn't clear at all what we want from it at the moment. ALL makes more sense now, but after we implement this I think it would be good to add more detail to this in the specification as it isn't clear to an outsider how it works in my opinion. I'll go ahead and take a stab at implementing this

tokebe · 2025-12-09T21:56:30Z

MANY essentially only attempts to maximally group results, but doesn't prune. So in the above example, if B were MANY instead, we'd end up with 3 results:

A1 -> B(4,5,6) 
A2 -> B(4,5)
A3 -> B(4,5,6)

…etation-support

ctrl-schaff added 5 commits December 8, 2025 14:19

Add the set_interpretation evalution to lookup

d8b754d

Initial implementation for [BATCH,ALL] set_interpretation

d284966

Add initial tests for set_interpretation support

b14add0

Merge remote-tracking branch 'origin/main' into add-trapi-set-interpr…

4a679a7

…etation-support

Add tests for set_interpretation : ALL

8ed62b7

ctrl-schaff requested review from everaldorodrigo, shuchenliu and tokebe December 8, 2025 20:44

ctrl-schaff self-assigned this Dec 8, 2025

ctrl-schaff added the enhancement New feature or request label Dec 8, 2025

tokebe reviewed Dec 9, 2025

View reviewed changes

ctrl-schaff added 3 commits December 22, 2025 14:26

Merge remote-tracking branch 'origin/main' into add-trapi-set-interpr…

3d92abb

…etation-support

Add testing logic for set interpretation eval

a78af04

Add set interpretation logic for result pruning

b8bffff

ctrl-schaff marked this pull request as ready for review January 5, 2026 15:27

ctrl-schaff added 12 commits January 5, 2026 10:28

Add additional type support and linting

9b7398c

Iteratively improving type annotations

6a5f9b5

Iteratively improving type annotations

deb89e6

Iteratively improving type annotations

7478dfc

Iteratively improving type annotations

da7479c

Iteratively improving type annotations

67b0aff

Iteratively improving type annotations

9aac36b

Finalize type annotations

e517516

Refactor to pass linting requirements

430c7b4

Merge remote-tracking branch 'origin/main' into add-trapi-set-interpr…

dcbe823

…etation-support

Update docstring documentation

9b19047

Bugfix connectivity evaluation and add more tests

8f96248

ctrl-schaff added 3 commits January 5, 2026 21:43

Improve set interpretation test robustness

76bc4ed

Add tests for malformed queries

deb8077

Add one more edge case for set-identifier value

b7a5b9f

ctrl-schaff requested a review from tokebe January 7, 2026 16:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add trapi set interpretation support #77

Add trapi set interpretation support #77

Uh oh!

ctrl-schaff commented Dec 8, 2025

Uh oh!

sentry bot commented Dec 8, 2025 •

edited

Loading

Uh oh!

tokebe left a comment •

edited

Loading

Uh oh!

ctrl-schaff commented Dec 9, 2025

Uh oh!

tokebe commented Dec 9, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Add trapi set interpretation support #77

Are you sure you want to change the base?

Add trapi set interpretation support #77

Uh oh!

Conversation

ctrl-schaff commented Dec 8, 2025

Uh oh!

sentry bot commented Dec 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

tokebe left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ctrl-schaff commented Dec 9, 2025

Uh oh!

tokebe commented Dec 9, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

sentry bot commented Dec 8, 2025 •

edited

Loading

tokebe left a comment •

edited

Loading