Skip to content

Add code to reconstitute deliver() from a cache area#713

Open
ponyisi wants to merge 3 commits intomasterfrom
servicex-from-concentrate
Open

Add code to reconstitute deliver() from a cache area#713
ponyisi wants to merge 3 commits intomasterfrom
servicex-from-concentrate

Conversation

@ponyisi
Copy link
Collaborator

@ponyisi ponyisi commented Feb 9, 2026

Addresses #710 . New function servicex.read_dir (name is up for discussion) which gives a mock deliver() return value corresponding to data in a cache area. Assumes cache is in fact valid (i.e. doesn't validate that files or URLs actually work).

@codecov
Copy link

codecov bot commented Feb 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.37%. Comparing base (aafbbf0) to head (f4d00f2).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #713      +/-   ##
==========================================
+ Coverage   98.35%   98.37%   +0.02%     
==========================================
  Files          30       31       +1     
  Lines        2190     2218      +28     
==========================================
+ Hits         2154     2182      +28     
  Misses         36       36              
Flag Coverage Δ
unittests 98.37% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ArturU043
Copy link
Contributor

Hi

Is there a reason for wanting to set local_preferred to Fasle?
If both URLs and local downloads are available, we could always default to local and clear up some of the code.

Maybe I am missing something about this point.

@ponyisi
Copy link
Collaborator Author

ponyisi commented Feb 9, 2026

Hi @ArturU043 - I tend to prefer not to lock choices in too early if we don't have to - there could certainly be debugging situations where we would want to use the URLs, or even a situation where you would want to test locally with downloads but then parallelize on nodes which need to use URLs instead.

(Note that we don't actually verify whether the files exist locally before returning the list...)

@ponyisi
Copy link
Collaborator Author

ponyisi commented Feb 9, 2026

After some discussions and thought I might prefer this function to be called servicex.load (especially since you can call it with zero arguments, in order to pick up the cache implied by the servicex.yaml). It might also be reasonable to rename the first argument from path to cache_dir to match the argument in deliver (though probably very few people would actually use the name for the argument I would guess). Anyway happy to get feedback.


if local_preferred:
return {
_[0]: GuardList(_[1].file_list if _[1].file_list else _[1].signed_url_list)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be refactored into functions stored in servicex_client.py (that is already loaded into this file and is where the duplicated dictionary comprehensions are stored)?

f"{path} does not contain a valid ServiceX download area"
)
config = Configuration(api_endpoints=[], cache_path=str(path))
cache = QueryCache(config)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the cache object be closed as in the tests (sidebar: should QueryCache be a context manager?)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants