@rdhyee rdhyee commented Oct 31, 2025

Summary

Replaces the previous UNION-based query approach with Eric Kansa's exact query implementations from open-context-py/isamples_explore.py.

Queries Implemented

1. get_samples_at_geo_cord_location_via_sample_event

  • Path 1 only (backward walk from GeospatialCoordLocation via sample_location)
  • INNER JOIN for site (required)
  • Orders by has_thumbnail DESC
  • Returns: sample details, geo coordinates, site context
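The backward walk described above can be sketched in plain Python. This is a toy illustration only, not the query from isamples_explore.py: the edge shape (subject, predicate, list of objects), the `sampling_site` and `site_location` predicate names, and all pids are assumptions made for the example.

```python
# Hypothetical toy graph; each edge is (subject, predicate, list of objects).
EDGES = [
    {"s": "event_1", "p": "sample_location", "o": ["geo_A"]},
    {"s": "event_2", "p": "sample_location", "o": ["geo_A"]},
    {"s": "event_1", "p": "sampling_site", "o": ["site_1"]},  # event_2 has no site
    {"s": "sample_1", "p": "produced_by", "o": ["event_1"]},
    {"s": "sample_3", "p": "produced_by", "o": ["event_1"]},
    {"s": "sample_2", "p": "produced_by", "o": ["event_2"]},
    {"s": "site_1", "p": "site_location", "o": ["geo_B"]},  # site marker (Path 2 only)
]
HAS_THUMBNAIL = {"sample_1": False, "sample_2": True, "sample_3": True}


def subjects_pointing_at(target, predicate):
    """Backward step: subjects whose object list contains the target."""
    return [e["s"] for e in EDGES if e["p"] == predicate and target in e["o"]]


def objects_of(subject, predicate):
    """Forward step: objects reachable from the subject via the predicate."""
    return [o for e in EDGES if e["s"] == subject and e["p"] == predicate for o in e["o"]]


def samples_at_geo(geo_pid):
    rows = []
    # geo <- sample_location <- event
    for event in subjects_pointing_at(geo_pid, "sample_location"):
        sites = objects_of(event, "sampling_site")
        if not sites:
            continue  # mimics the INNER JOIN: events without a site are dropped
        # event <- produced_by <- sample
        for sample in subjects_pointing_at(event, "produced_by"):
            for site in sites:
                rows.append({"sample": sample, "site": site,
                             "has_thumbnail": HAS_THUMBNAIL.get(sample, False)})
    # ORDER BY has_thumbnail DESC (stable, so ties keep discovery order)
    rows.sort(key=lambda r: r["has_thumbnail"], reverse=True)
    return rows


print([r["sample"] for r in samples_at_geo("geo_A")])  # ['sample_3', 'sample_1']
print(samples_at_geo("geo_B"))  # []
```

In this toy data, `sample_2` is excluded because its event has no site, and `geo_B` yields nothing because it is only reachable through a site.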

2. get_sample_data_via_sample_pid

  • Get full sample metadata including geo and site info
  • Forward walk: sample → produced_by → event → sample_location → geo
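As a rough illustration of the forward walk, the sketch below follows a chain of predicates from a sample to its geo location. The dictionary layout, the `follow` helper, the `sampling_site` predicate name, and the pids are hypothetical; the actual query runs as SQL over the parquet data.

```python
# Hypothetical adjacency map: (subject, predicate) -> list of objects.
EDGES = {
    ("sample_1", "produced_by"): ["event_1"],
    ("event_1", "sample_location"): ["geo_A"],
    ("event_1", "sampling_site"): ["site_1"],
}


def follow(start, *predicates):
    """Follow each predicate in turn, starting from a single node."""
    frontier = [start]
    for predicate in predicates:
        frontier = [
            obj
            for node in frontier
            for obj in EDGES.get((node, predicate), [])
        ]
    return frontier


# sample -> produced_by -> event -> sample_location -> geo
print(follow("sample_1", "produced_by", "sample_location"))  # ['geo_A']
# sample -> produced_by -> event -> sampling_site -> site
print(follow("sample_1", "produced_by", "sampling_site"))  # ['site_1']
```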

3. get_sample_data_agents_sample_pid

  • Get agent information (who collected/registered)
  • Returns responsibility and registrant predicates

4. get_sample_types_and_keywords_via_sample_pid

  • Get classification keywords and types
  • Returns keywords, has_sample_object_type, has_material_category

Key Changes

Query Strategy

  • Old: UNION of Path 1 + Path 2 (combined direct and site-based location paths)
  • New: Path 1 only (exact match to Eric's implementation)

Behavior Changes

  • Site marker locations (Path 2 only) now correctly return 0 results
  • Example: geoloc_7a05216d388682536f3e2abd8bd2ee3fb286e461 (Larnaka site marker) returns 0
  • Field collection points (Path 1) return samples as expected
  • Example: geoloc_04d6e816218b1a8798fa90b3d1d43bf4c043a57f (PKAP) returns 5 samples
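The difference between the old UNION strategy and the Path 1-only strategy can be reproduced on a small relational stand-in. This sketch uses Python's built-in sqlite3 with a hypothetical three-table schema and made-up pids; it is not the DuckDB SQL in tutorials/parquet_cesium.qmd, only an illustration of why a site marker location now returns 0 rows.

```python
import sqlite3

# Hypothetical relational stand-in for the graph; all names are illustrative.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE site (site_id TEXT, geo_pid TEXT);
CREATE TABLE event (event_id TEXT, geo_pid TEXT, site_id TEXT);
CREATE TABLE sample (sample_pid TEXT, event_id TEXT);
INSERT INTO site VALUES ('site_1', 'geo_site_marker');
INSERT INTO event VALUES ('event_1', 'geo_field_point', 'site_1');
INSERT INTO sample VALUES ('sample_1', 'event_1'), ('sample_2', 'event_1');
""")

# Path 1: the geo location is the event's own sample location.
PATH_1 = """
SELECT s.sample_pid FROM sample s
JOIN event e ON e.event_id = s.event_id
JOIN site st ON st.site_id = e.site_id
WHERE e.geo_pid = ?
"""

# Path 2: the geo location is the location of the event's site.
PATH_2 = """
SELECT s.sample_pid FROM sample s
JOIN event e ON e.event_id = s.event_id
JOIN site st ON st.site_id = e.site_id
WHERE st.geo_pid = ?
"""

OLD_UNION = PATH_1 + " UNION " + PATH_2


def count(sql, geo_pid, n_params=1):
    """Run the query with geo_pid bound to every placeholder; return row count."""
    return len(con.execute(sql, (geo_pid,) * n_params).fetchall())


print(count(OLD_UNION, "geo_site_marker", 2))  # 2 (old behavior)
print(count(PATH_1, "geo_site_marker"))        # 0 (new behavior)
print(count(PATH_1, "geo_field_point"))        # 2 (unchanged for field points)
```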

Testing

Tested all 4 queries with PKAP survey area data:

# Test Case 1: Geo location with Path 1 connections
geo_pid = "geoloc_04d6e816218b1a8798fa90b3d1d43bf4c043a57f"
# Returns 5 samples (PKAP Survey Area)

# Test Case 2: Sample metadata
sample_pid = "ark:/28722/k2wq0b20z"
# Returns 1 record with full metadata

# Test Case 3: Agent data
# Returns 3 agents (R. Scott Moore, etc.)

# Test Case 4: Keywords
# Returns 4 keywords (pottery, amphora, Artifact, ceramic)

Files Changed

  • tutorials/parquet_cesium.qmd (+196, -54 lines)

🤖 Generated with Claude Code

Replace previous UNION-based query with Eric's exact implementations
from open-context-py/isamples_explore.py:

1. get_samples_at_geo_cord_location_via_sample_event
   - Path 1 only (backward walk from geo via sample_location)
   - INNER JOIN for site (required)
   - Orders by has_thumbnail DESC
   - Returns sample details, geo coords, site context

2. get_sample_data_via_sample_pid
   - Get full sample metadata including geo and site
   - Forward walk from sample via produced_by → event → sample_location

3. get_sample_data_agents_sample_pid
   - Get agent info (who collected/registered)
   - Returns responsibility and registrant predicates

4. get_sample_types_and_keywords_via_sample_pid
   - Get classification keywords and types
   - Returns keywords, has_sample_object_type, has_material_category

Key changes:
- Removed UNION approach (was combining Path 1 + Path 2)
- Now matches Eric's Path 1-only strategy exactly
- Uses list_contains() for backward edge traversal
- Updated documentation to explain Path 1-only behavior
- Site markers (Path 2 only) correctly return 0 results

Tested with:
- geoloc_04d6e816218b1a8798fa90b3d1d43bf4c043a57f (PKAP, returns 5 samples)
- ark:/28722/k2wq0b20z (sample with 3 agents, 4 keywords)
- Python tests verify SQL correctness

Source: https://github.com/ekansa/open-context-py/blob/staging/opencontext_py/apps/all_items/isamples/isamples_explore.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@rdhyee rdhyee merged commit 33d6906 into isamplesorg:main Oct 31, 2025
1 check passed