[WIP] Turn local cache into catalog#716

Draft
BenGalewsky wants to merge 3 commits into master from catalog

@BenGalewsky BenGalewsky commented Feb 12, 2026

This PR sketches how we might implement a results catalog for local caches of ServiceX results.

Concepts:

  1. Catalog - This represents an output directory from the ServiceX client. It contains downloaded results in subdirectories named after the request ID, along with a JSON database in a hidden .servicex directory that holds the metadata for each run.
  2. Sample - This is a specific transform query and dataset, with the Sample's title as the primary key. A sample can have multiple runs, each of which may be identified by an optional version string.
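As a rough illustration of the two concepts above, here is a minimal in-memory sketch. The class names `SampleRun` and `Catalog` and all of their methods are hypothetical and are not taken from servicex/catalog.py; the field names simply mirror the columns shown in the example output (SHA, RequestID, Submit Time, Version).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class SampleRun:
    """One run of a sample (hypothetical model, not the PR's actual class)."""
    sha: str
    request_id: str
    submit_time: datetime
    version: Optional[str] = None  # the version string is optional


@dataclass
class Catalog:
    """Maps a sample title (the primary key) to all recorded runs of it."""
    runs: Dict[str, List[SampleRun]] = field(default_factory=dict)

    def add_run(self, title: str, run: SampleRun) -> None:
        self.runs.setdefault(title, []).append(run)

    def samples(self) -> List[str]:
        return sorted(self.runs)

    def versions(self, title: str) -> List[str]:
        # Runs without a version string are skipped
        return [r.version for r in self.runs.get(title, []) if r.version is not None]

    def latest(self, title: str) -> Optional[SampleRun]:
        runs = self.runs.get(title, [])
        return max(runs, key=lambda r: r.submit_time) if runs else None


# Populate with runs matching the example output in this PR
catalog = Catalog()
catalog.add_run("UprootRaw_YAML", SampleRun(
    "210065e3", "de99473f-2300-4094-85f3-badf03c8d3a1",
    datetime(2026, 2, 12, 19, 7, 25, tzinfo=timezone.utc), "1.0"))
catalog.add_run("UprootRaw_YAML", SampleRun(
    "b7c48b86", "d38d002a-d8a9-4a5a-b9c3-010d9e0a7d11",
    datetime(2026, 2, 12, 21, 33, 36, tzinfo=timezone.utc), "1.1"))
catalog.add_run("UprootRaw_YAML", SampleRun(
    "042f69c9", "2316760d-75b8-49ea-a253-4a031a5d28a7",
    datetime(2026, 2, 12, 21, 38, 40, tzinfo=timezone.utc)))
```

Keying on the title keeps lookups by sample name simple, while the unversioned third run shows why the version has to stay optional.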

Sample Versions

To make it easier to work with multiple runs of a sample, we introduce a version property for each sample run. This optional property can be set with the new --version CLI option for the deliver command. The same string is also now an argument to the deliver function in servicex_client.

Notes:

  1. I added the version to the hash, so you can always force a new sample run to be recorded by incrementing the version.
  2. I did consider making the version part of the General section of the spec, so that it can be tied to source control.
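To show why folding the version into the hash forces a new cache entry, here is a minimal sketch. The function `sample_hash` and the fields it hashes (`dataset`, `query`) are assumptions made for illustration; the actual hash inputs used by the ServiceX client may differ.

```python
import hashlib
import json
from typing import Optional


def sample_hash(dataset: str, query: str, version: Optional[str] = None) -> str:
    """Hypothetical cache key: a SHA-256 over the request fields plus the version."""
    payload = {"dataset": dataset, "query": query, "version": version}
    # sort_keys makes the serialization deterministic, so equal inputs give equal hashes
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


# Placeholder dataset and query strings, for illustration only
h_v10 = sample_hash("some-dataset", "some-query", "1.0")
h_v11 = sample_hash("some-dataset", "some-query", "1.1")
h_none = sample_hash("some-dataset", "some-query")
```

With this scheme, an otherwise identical request with a different version string produces a different key, so it is treated as a new run rather than a cache hit.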

Example

I added cat_demo.py to the examples directory to show how this works. Running it produces output like:

        Catalog        
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ Sample Name         ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ Uproot_FuncADL_YAML │
│ UprootRaw_YAML      │
└─────────────────────┘
     UprootRaw_YAML      
┏━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Version               ┃
┡━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1.0                   │
│ 1.1                   │
└───────────────────────┘
                                      Uproot Catalog Runs                                       
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ SHA      ┃ RequestID                            ┃ Submit Time                      ┃ Version ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ 210065e3 │ de99473f-2300-4094-85f3-badf03c8d3a1 │ 2026-02-12 19:07:25.319945+00:00 │ 1.0     │
│ b7c48b86 │ d38d002a-d8a9-4a5a-b9c3-010d9e0a7d11 │ 2026-02-12 21:33:36.406181+00:00 │ 1.1     │
│ 042f69c9 │ 2316760d-75b8-49ea-a253-4a031a5d28a7 │ 2026-02-12 21:38:40.028354+00:00 │         │
└──────────┴──────────────────────────────────────┴──────────────────────────────────┴─────────┘
        Latest Run Timestamp        
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Submit Time                      ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 2026-02-12 21:38:40.028354+00:00 │
└──────────────────────────────────┘
                                                                   Files for Version 1.0                                                                   
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ File Path                                                                                                                                               ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ /Users/bengal1/dev/IRIS-HEP/ServiceX_Client/cache-dir/de99473f-2300-4094-85f3-badf03c8d3a1/_d4e734cb5c29b16f5fc2261a0d3c87226e2e0175_000003.pool.root.1 │
│ /Users/bengal1/dev/IRIS-HEP/ServiceX_Client/cache-dir/de99473f-2300-4094-85f3-badf03c8d3a1/_dbbf75d3f660214b6c129eb2a91ed592c85cfe8e_000002.pool.root.1 │
│ /Users/bengal1/dev/IRIS-HEP/ServiceX_Client/cache-dir/de99473f-2300-4094-85f3-badf03c8d3a1/_853c79b751467bdf9e6b1bd02e743a5aa51e4516_000001.pool.root.1 │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

codecov bot commented Feb 16, 2026

Codecov Report

❌ Patch coverage is 14.28571% with 36 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.77%. Comparing base (df3d0b7) to head (bde9cd5).

Files with missing lines    Patch %    Lines
servicex/catalog.py         0.00%      35 Missing ⚠️
servicex/models.py          83.33%     1 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #716      +/-   ##
==========================================
- Coverage   98.35%   96.77%   -1.59%     
==========================================
  Files          30       31       +1     
  Lines        2190     2232      +42     
==========================================
+ Hits         2154     2160       +6     
- Misses         36       72      +36     
Flag         Coverage Δ
unittests    96.77% <14.28%> (-1.59%) ⬇️
