Extend hermeto generic pm support for artifacts.lock.yaml#2453
Extend hermeto generic pm support for artifacts.lock.yaml#2453fbladilo wants to merge 1 commit intoopenshift-eng:mainfrom
Conversation
Adds support for custom destination filenames in `artifact_lockfile.resources` configuration. This allows specifying cleaner filenames for artifacts with complex URL paths.
- **Schema**: Accept both string URLs and objects with optional `filename` attribute
- **Lockfile Generator**: Use custom filename when provided, otherwise extract from URL
- **Image Metadata**: Normalize both formats to consistent dict structure
```yaml
konflux:
cachi2:
artifact_lockfile:
enabled: true
resources:
# Existing format (still works)
- https://example.com/simple-file.jar
# New format with custom filename
- url: https://indy.corp.redhat.com/.../java-analyzer-bundle.core-8.1.0.CR1-redhat-00003.jar
filename: java-analyzer-bundle.jar
The custom filename is injected in artifacts.lock.yaml and read by Hermeto during prefetch. Hermeto already supports this, we are only adding support on ART tooling's side.
Fully backward compatible - existing string URL configurations continue to work unchanged. New object format is opt-in.
Signed-off-by: Franco Bladilo <fbladilo@redhat.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/hold |
WalkthroughThe pull request refactors artifact resource handling to use normalized dictionaries containing Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes ✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@doozer/doozerlib/image.py`:
- Around line 1205-1229: In get_required_artifacts, resources can include Model
instances that are skipped by the current isinstance(resource, dict) branch;
update the loop to detect model-like objects (e.g., hasattr(resource, "dict") or
"model_dump") and convert them to a plain dict (assign back to resource) before
the dict-handling branch, then validate shape (ensure 'url' exists) and raise a
clear exception (e.g., TypeError/ValueError) for unsupported types instead of
silently dropping entries; keep the existing string handling and the normalized
list building (variable names: get_required_artifacts, resources, normalized).
In `@doozer/tests/test_lockfile.py`:
- Around line 827-832: Tests use hardcoded Path("/tmp/...") for mock_dest_dir
which can collide; replace each occurrence (e.g., the mock_dest_dir assignment
near mock_session setup and other occurrences at lines noted) with a temporary
directory created via tempfile.mkdtemp() wrapped in Path(...), register cleanup
with self.addCleanup(lambda: shutil.rmtree(tmpdir)) to remove it after tests,
and add imports for tempfile and shutil if missing; ensure the variable names
(mock_dest_dir or similar) are updated to use the Path(tempdir) value so
existing references (mock_session, mock_session_class, and related mocks)
continue to work.
🧹 Nitpick comments (1)
doozer/tests/test_lockfile.py (1)
763-766: Preserve legacy string-format coverage.The updated test now exercises only the dict form. Add a small test (or parameterize this one) that still passes a list of URL strings to ensure the backward-compatible path remains covered.
| def get_required_artifacts(self) -> list: | ||
| """Get list of required artifact URLs from image config.""" | ||
| """ | ||
| Get list of required artifacts from image config. | ||
|
|
||
| Returns: | ||
| list[dict]: List of dicts with 'url' and 'filename' keys. | ||
| Normalizes both string URLs and object formats. | ||
| """ | ||
| if not self.is_artifact_lockfile_enabled(): | ||
| return [] | ||
|
|
||
| resource_urls = self.config.konflux.cachi2.artifact_lockfile.resources | ||
| if resource_urls in [Missing, None]: | ||
| resources = self.config.konflux.cachi2.artifact_lockfile.resources | ||
| if resources in [Missing, None]: | ||
| return [] | ||
|
|
||
| return resource_urls # Direct URL list | ||
| # Normalize to list of dicts | ||
| normalized = [] | ||
| for resource in resources: | ||
| if isinstance(resource, str): | ||
| # Simple URL string - filename will be extracted from URL | ||
| normalized.append({'url': resource, 'filename': None}) | ||
| elif isinstance(resource, dict): | ||
| # Object format with url and optional filename | ||
| normalized.append({'url': resource['url'], 'filename': resource.get('filename', None)}) | ||
|
|
There was a problem hiding this comment.
Normalize Model-based resource entries (otherwise object-form artifacts get dropped).
If resource objects come through as Model instances, the current isinstance(resource, dict) check will skip them silently, leading to empty artifacts despite valid config. That breaks the new object-format path. Normalize Model to primitives and fail fast on unsupported shapes so hermetic validation doesn’t produce false negatives.
🔧 Proposed fix
- for resource in resources:
- if isinstance(resource, str):
- # Simple URL string - filename will be extracted from URL
- normalized.append({'url': resource, 'filename': None})
- elif isinstance(resource, dict):
- # Object format with url and optional filename
- normalized.append({'url': resource['url'], 'filename': resource.get('filename', None)})
+ for resource in resources:
+ if isinstance(resource, str):
+ normalized.append({'url': resource, 'filename': None})
+ continue
+ if isinstance(resource, Model):
+ resource = resource.primitive()
+ if isinstance(resource, dict):
+ url = resource.get('url')
+ if not url:
+ raise ValueError("artifact_lockfile.resources entry missing required 'url'")
+ normalized.append({'url': url, 'filename': resource.get('filename')})
+ continue
+ raise ValueError(f"Unsupported artifact resource type: {type(resource).__name__}")🤖 Prompt for AI Agents
In `@doozer/doozerlib/image.py` around lines 1205 - 1229, In
get_required_artifacts, resources can include Model instances that are skipped
by the current isinstance(resource, dict) branch; update the loop to detect
model-like objects (e.g., hasattr(resource, "dict") or "model_dump") and convert
them to a plain dict (assign back to resource) before the dict-handling branch,
then validate shape (ensure 'url' exists) and raise a clear exception (e.g.,
TypeError/ValueError) for unsupported types instead of silently dropping
entries; keep the existing string handling and the normalized list building
(variable names: get_required_artifacts, resources, normalized).
| mock_dest_dir = Path("/tmp/test") | ||
|
|
||
| # Mock session and download behavior | ||
| mock_session = MagicMock() | ||
| mock_session_class.return_value.__aenter__.return_value = mock_session | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's check the file structure and find the test file
git ls-files | grep -i lockfile | head -20Repository: openshift-eng/art-tools
Length of output: 125
🏁 Script executed:
# Check the specific lines mentioned in the review
if [ -f "doozer/tests/test_lockfile.py" ]; then
echo "File exists. Reading lines around 827-832..."
sed -n '820,840p' doozer/tests/test_lockfile.py
fiRepository: openshift-eng/art-tools
Length of output: 1015
🏁 Script executed:
# Let's also check the test class structure to understand the test framework
sed -n '1,50p' doozer/tests/test_lockfile.py | head -40Repository: openshift-eng/art-tools
Length of output: 1518
🏁 Script executed:
# Search for other hardcoded /tmp paths in the test file
rg '/tmp' doozer/tests/test_lockfile.pyRepository: openshift-eng/art-tools
Length of output: 415
🏁 Script executed:
# Find the test method containing line 827-832
sed -n '750,832p' doozer/tests/test_lockfile.py | tail -100 | head -60Repository: openshift-eng/art-tools
Length of output: 2705
🏁 Script executed:
# Check what class contains this test method
rg -B 50 'mock_dest_dir = Path\("/tmp/test"\)' doozer/tests/test_lockfile.py | head -60Repository: openshift-eng/art-tools
Length of output: 2693
🏁 Script executed:
# Search for the entire test method definition
sed -n '800,835p' doozer/tests/test_lockfile.pyRepository: openshift-eng/art-tools
Length of output: 1793
🏁 Script executed:
# Check if there are any existing uses of tempfile in the test file
rg 'tempfile|mkdtemp|TemporaryDirectory' doozer/tests/test_lockfile.pyRepository: openshift-eng/art-tools
Length of output: 49
🏁 Script executed:
# Check how mock_dest_dir is actually used in the test methods
rg -A 10 'mock_dest_dir = Path\("/tmp/test"\)' doozer/tests/test_lockfile.py | head -40Repository: openshift-eng/art-tools
Length of output: 1594
🏁 Script executed:
# Check if _write_yaml actually uses the path to create real directories
sed -n '100,200p' doozer/tests/test_lockfile.py | grep -A 5 "_write_yaml"Repository: openshift-eng/art-tools
Length of output: 49
🏁 Script executed:
# Find the full test method that contains line 827-832
awk '/async def test_generate_artifact_lockfile_custom_filename/,/^ (async )?def |^class / {print NR": "$0}' doozer/tests/test_lockfile.py | head -80Repository: openshift-eng/art-tools
Length of output: 177
🏁 Script executed:
# Get line count to understand file size better
wc -l doozer/tests/test_lockfile.pyRepository: openshift-eng/art-tools
Length of output: 100
🏁 Script executed:
# Check the actual assertion/usage of mock_dest_dir in these tests
sed -n '820,900p' doozer/tests/test_lockfile.pyRepository: openshift-eng/art-tools
Length of output: 3643
🏁 Script executed:
# Count occurrences of hardcoded /tmp paths in this test file
rg '/tmp' doozer/tests/test_lockfile.py | wc -lRepository: openshift-eng/art-tools
Length of output: 68
🏁 Script executed:
# Check the complete list of all hardcoded /tmp paths
rg -n '/tmp' doozer/tests/test_lockfile.pyRepository: openshift-eng/art-tools
Length of output: 451
🏁 Script executed:
# Verify the test class inheritance to confirm self.addCleanup is available
sed -n '790,810p' doozer/tests/test_lockfile.pyRepository: openshift-eng/art-tools
Length of output: 1068
🏁 Script executed:
# Check if the mock_dest_dir is actually used to create files or if it's purely mocked
rg -B 5 -A 15 'def generate_artifact_lockfile' doozer/doozerlib/lockfile.py | head -60Repository: openshift-eng/art-tools
Length of output: 955
🏁 Script executed:
# Check how the test class is structured and if setUp/tearDown is used
sed -n '813,825p' doozer/tests/test_lockfile.pyRepository: openshift-eng/art-tools
Length of output: 830
🏁 Script executed:
# Check if there are other tests in the file using temporary directories properly
rg -n 'TemporaryDirectory|tempfile|mktemp' doozer/tests/test_lockfile.pyRepository: openshift-eng/art-tools
Length of output: 49
🏁 Script executed:
# Verify S108 is indeed the Bandit rule for hardcoded tmp
web_search S108 bandit security hardcoded temp pathRepository: openshift-eng/art-tools
Length of output: 117
🌐 Web query:
S108 bandit hardcoded temporary directory
💡 Result:
S108 (flake8-bandit / Ruff) / B108 (Bandit) flags hardcoded temporary file or directory paths like "/tmp/...", "/var/tmp/...", "/dev/shm/...", because hardcoding temp locations can enable attacks (e.g., symlink/race) and cause unexpected interference from other processes. (docs.astral.sh)
What triggers it
Examples that typically trigger S108/B108:
open("/tmp/foo.txt", "w")
Path("/var/tmp/app/cache").mkdir()Bandit’s B108 specifically matches string literals starting with configurable temp dir prefixes (default includes /tmp, /var/tmp, /dev/shm). (bandit.readthedocs.io)
Recommended fixes
Use the standard library tempfile APIs instead of hardcoded paths:
import tempfile
with tempfile.NamedTemporaryFile() as f:
f.write(b"data")or:
import tempfile
with tempfile.TemporaryDirectory() as d:
...These are the recommended alternatives in the rule guidance. (docs.astral.sh)
If you must keep a fixed temp path
- Bandit config: customize what B108 considers “tmp” via
hardcoded_tmp_directory.tmp_dirs. (bandit.readthedocs.io) - Suppress with justification: add an inline ignore (e.g.,
# nosec) only when you’re sure it’s safe and you can explain why (project policy varies). (bandit.readthedocs.io)
Citations:
- 1: https://docs.astral.sh/ruff/rules/hardcoded-temp-file/?utm_source=openai
- 2: https://bandit.readthedocs.io/en/1.5.1/plugins/b108_hardcoded_tmp_directory.html?utm_source=openai
- 3: https://docs.astral.sh/ruff/rules/hardcoded-temp-file/?utm_source=openai
- 4: https://bandit.readthedocs.io/en/1.5.1/plugins/b108_hardcoded_tmp_directory.html?utm_source=openai
- 5: https://bandit.readthedocs.io/en/1.5.1/plugins/b108_hardcoded_tmp_directory.html?utm_source=openai
Use tempfile.mkdtemp() instead of hardcoded /tmp paths.
This pattern appears multiple times in the test file (lines 768, 827, 882, 898) and triggers S108. While these specific mocks don't write actual files, hardcoding temp paths can cause test collisions in parallel execution and invites security issues. Use tempfile.mkdtemp() with self.addCleanup() for proper cleanup.
♻️ Suggested change
- mock_dest_dir = Path("/tmp/test")
+ import tempfile
+ import shutil
+ mock_dest_dir = Path(tempfile.mkdtemp(prefix="artifact-lockfile-"))
+ self.addCleanup(shutil.rmtree, mock_dest_dir, ignore_errors=True)🧰 Tools
🪛 Ruff (0.14.14)
[error] 827-827: Probable insecure usage of temporary file or directory: "/tmp/test"
(S108)
🤖 Prompt for AI Agents
In `@doozer/tests/test_lockfile.py` around lines 827 - 832, Tests use hardcoded
Path("/tmp/...") for mock_dest_dir which can collide; replace each occurrence
(e.g., the mock_dest_dir assignment near mock_session setup and other
occurrences at lines noted) with a temporary directory created via
tempfile.mkdtemp() wrapped in Path(...), register cleanup with
self.addCleanup(lambda: shutil.rmtree(tmpdir)) to remove it after tests, and add
imports for tempfile and shutil if missing; ensure the variable names
(mock_dest_dir or similar) are updated to use the Path(tempdir) value so
existing references (mock_session, mock_session_class, and related mocks)
continue to work.
|
@fbladilo: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/unhold |
Adds support for custom destination filenames in
artifact_lockfile.resourcesconfiguration. This allows specifying cleaner filenames for artifacts with complex URL paths.Schema: Accept both string URLs and objects with optional
filenameattributeLockfile Generator: Use custom filename when provided, otherwise extract from URL
Image Metadata: Normalize both formats to consistent dict structure
The custom filename is injected in artifacts.lock.yaml and read by Hermeto during prefetch. Hermeto already supports this, we are only adding support on ART tooling's side.
Fully backward compatible - existing string URL configurations continue to work unchanged. New object format is opt-in.