Context
From adversarial review of v0.4.0b1 (W7).
Problem
The sendInteractions endpoint validates that datasetUri is a syntactically valid AT-URI with the correct collection (science.alt.dataset.entry), but it never checks that the referenced dataset actually exists in the entries table. By contrast, publishLabel calls query_get_entry(pool, d_did, d_rkey) and returns 400 if the entry is not found.
Without this check, the analytics tables accumulate orphan events for nonexistent datasets, which could pollute analytics dashboards and waste storage.
Trade-offs
- Adding an existence check means a DB query per interaction item (up to 100 per batch), which increases latency for a fire-and-forget endpoint
- Could batch the existence checks with a single query_get_entries call for the whole batch instead of per-item lookups
- Alternatively, could do a soft check (log a warning but still record) to avoid rejecting valid interactions for recently-deleted datasets
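
A minimal sketch of the batched option, in Python and under stated assumptions: the interaction items are dicts with a datasetUri key, and lookup_existing is a hypothetical stand-in for a batch query such as query_get_entries, whose real signature is not given here. It takes a set of (did, rkey) pairs and returns the subset that exists. The helper only splits the batch, so the caller can either reject the orphans or log a warning and record them anyway.

```python
from typing import Callable, Iterable, Optional

COLLECTION = "science.alt.dataset.entry"

Key = tuple[str, str]  # (did, rkey)


def parse_dataset_uri(uri: str) -> Optional[Key]:
    """Split at://<did>/<collection>/<rkey> into (did, rkey); None if malformed."""
    prefix = "at://"
    if not uri.startswith(prefix):
        return None
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or parts[1] != COLLECTION or not parts[0] or not parts[2]:
        return None
    return parts[0], parts[2]


def partition_by_existence(
    interactions: Iterable[dict],
    lookup_existing: Callable[[set[Key]], set[Key]],  # hypothetical batch lookup
) -> tuple[list[dict], list[dict]]:
    """Return (known, orphans) using one lookup call for the whole batch."""
    keyed = [(item, parse_dataset_uri(item["datasetUri"])) for item in interactions]

    # Deduplicate so a batch that repeats one dataset costs a single key.
    wanted = {key for _, key in keyed if key is not None}
    existing = lookup_existing(wanted) if wanted else set()

    known: list[dict] = []
    orphans: list[dict] = []
    for item, key in keyed:
        # Malformed URIs (key is None) are treated as orphans here.
        (known if key in existing else orphans).append(item)
    return known, orphans
```

Because the keys are deduplicated first, a batch of 100 items costs one lookup regardless of how many distinct datasets it references.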
Acceptance criteria
- Interactions referencing nonexistent datasets are either rejected or flagged
- Performance impact is minimal (batch lookup preferred over per-item)
- Tests cover both existing and nonexistent dataset URIs