
Extend hermeto generic pm support for artifacts.lock.yaml #2453

Open
fbladilo wants to merge 1 commit into openshift-eng:main from fbladilo:extend_generic_pm

Conversation

@fbladilo
Contributor

@fbladilo fbladilo commented Feb 6, 2026


Adds support for custom destination filenames in `artifact_lockfile.resources` configuration. This allows specifying cleaner filenames for artifacts with complex URL paths.

- **Schema**: Accept both string URLs and objects with optional `filename` attribute
- **Lockfile Generator**: Use custom filename when provided, otherwise extract from URL
- **Image Metadata**: Normalize both formats to consistent dict structure

  ```yaml
  konflux:
    cachi2:
      artifact_lockfile:
        enabled: true
        resources:
          # Existing format (still works)
          - https://example.com/simple-file.jar

          # New format with custom filename
          - url: https://indy.corp.redhat.com/.../java-analyzer-bundle.core-8.1.0.CR1-redhat-00003.jar
            filename: java-analyzer-bundle.jar
  ```

The custom filename is injected into `artifacts.lock.yaml` and read by Hermeto during prefetch. Hermeto already supports this; we are only adding support on the ART tooling side.

Fully backward compatible: existing string URL configurations continue to work unchanged. The new object format is opt-in.
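For illustration, here is a minimal, standalone sketch of the filename selection described above (the helper name `destination_filename` is hypothetical, not the actual doozer function):

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def destination_filename(resource: dict) -> str:
    """Pick the destination name: the custom 'filename' when provided,
    otherwise the last path segment of the URL."""
    if resource.get("filename"):
        return resource["filename"]
    return PurePosixPath(urlparse(resource["url"]).path).name


# Object form with a custom filename keeps the clean name
print(destination_filename({
    "url": "https://example.com/downloads/java-analyzer-bundle.core-8.1.0.CR1-redhat-00003.jar",
    "filename": "java-analyzer-bundle.jar",
}))  # -> java-analyzer-bundle.jar

# Plain-string form (normalized to filename=None) falls back to the URL basename
print(destination_filename({
    "url": "https://example.com/simple-file.jar",
    "filename": None,
}))  # -> simple-file.jar
```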

Signed-off-by: Franco Bladilo <fbladilo@redhat.com>
@openshift-ci
Contributor

openshift-ci bot commented Feb 6, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign thegreyd for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@fbladilo
Contributor Author

fbladilo commented Feb 6, 2026

/hold

@openshift-ci openshift-ci bot added the `do-not-merge/hold` label (indicates that a PR should not merge because someone has issued a /hold command) on Feb 6, 2026
@coderabbitai
Contributor

coderabbitai bot commented Feb 6, 2026

Walkthrough

The pull request refactors artifact resource handling to use normalized dictionaries containing url and optional filename keys instead of plain URL strings, enabling metadata-aware artifact processing throughout the codebase.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Artifact Normalization**: `doozer/doozerlib/image.py`, `doozer/tests/test_image.py` | Added `get_required_artifacts` normalization to convert artifact resources (strings or dicts) into uniform dicts with `url` and `filename` keys, with updated docstring and test expectations. |
| **Artifact Download & Lockfile Processing**: `doozer/doozerlib/lockfile.py`, `doozer/tests/test_lockfile.py` | Modified artifact download loop to iterate over artifact dicts with `url` and `filename` keys; updated `_download_and_compute_checksum` to support custom filenames; added test for custom filename handling in artifact lockfiles. |
| **Artifact Resource Consumption**: `doozer/doozerlib/backend/rebaser.py` | Updated loop to extract artifact URLs from dict structure (`artifact['url']`) instead of direct string values. |
| **Schema Definition**: `ocp-build-data-validator/validator/json_schemas/image_config.base.schema.json` | Extended artifact resource schema to accept either a plain URL string or an object with required `url` and optional `filename` fields (a schema sketch follows this table). |
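To make the schema change concrete, here is a hedged sketch of the kind of `oneOf` fragment described above, written as a Python dict and checked with the `jsonschema` package; the actual fragment in `image_config.base.schema.json` may differ in naming and nesting.

```python
from jsonschema import validate  # pip install jsonschema

# Illustrative fragment: each resource is either a URL string or an object
# with a required 'url' and an optional 'filename'.
RESOURCE_SCHEMA = {
    "oneOf": [
        {"type": "string"},
        {
            "type": "object",
            "required": ["url"],
            "properties": {
                "url": {"type": "string"},
                "filename": {"type": "string"},
            },
            "additionalProperties": False,
        },
    ]
}

# Both config forms validate against the fragment.
validate("https://example.com/simple-file.jar", RESOURCE_SCHEMA)
validate({"url": "https://example.com/bundle-1.0.jar", "filename": "bundle.jar"}, RESOURCE_SCHEMA)
```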

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@doozer/doozerlib/image.py`:
- Around line 1205-1229: In get_required_artifacts, resources can include Model
instances that are skipped by the current isinstance(resource, dict) branch;
update the loop to detect model-like objects (e.g., hasattr(resource, "dict") or
"model_dump") and convert them to a plain dict (assign back to resource) before
the dict-handling branch, then validate shape (ensure 'url' exists) and raise a
clear exception (e.g., TypeError/ValueError) for unsupported types instead of
silently dropping entries; keep the existing string handling and the normalized
list building (variable names: get_required_artifacts, resources, normalized).

In `@doozer/tests/test_lockfile.py`:
- Around line 827-832: Tests use hardcoded Path("/tmp/...") for mock_dest_dir
which can collide; replace each occurrence (e.g., the mock_dest_dir assignment
near mock_session setup and other occurrences at lines noted) with a temporary
directory created via tempfile.mkdtemp() wrapped in Path(...), register cleanup
with self.addCleanup(lambda: shutil.rmtree(tmpdir)) to remove it after tests,
and add imports for tempfile and shutil if missing; ensure the variable names
(mock_dest_dir or similar) are updated to use the Path(tempdir) value so
existing references (mock_session, mock_session_class, and related mocks)
continue to work.
🧹 Nitpick comments (1)
doozer/tests/test_lockfile.py (1)

763-766: Preserve legacy string-format coverage.

The updated test now exercises only the dict form. Add a small test (or parameterize this one) that still passes a list of URL strings to ensure the backward-compatible path remains covered.
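As a rough illustration of the coverage being requested, here is a standalone sketch using `unittest.subTest`; `normalize_resources` below is a stand-in for the normalization under test (the real logic lives in `doozer/doozerlib/image.py::get_required_artifacts`), not doozer's actual code:

```python
import unittest


def normalize_resources(resources):
    """Stand-in for the normalization under test: accepts strings or dicts."""
    out = []
    for r in resources:
        if isinstance(r, str):
            out.append({"url": r, "filename": None})
        else:
            out.append({"url": r["url"], "filename": r.get("filename")})
    return out


class TestResourceNormalization(unittest.TestCase):
    def test_both_resource_formats(self):
        cases = [
            # legacy string form
            (["https://example.com/simple-file.jar"],
             [{"url": "https://example.com/simple-file.jar", "filename": None}]),
            # new object form with a custom filename
            ([{"url": "https://example.com/long-name-1.0.jar", "filename": "short.jar"}],
             [{"url": "https://example.com/long-name-1.0.jar", "filename": "short.jar"}]),
        ]
        for resources, expected in cases:
            with self.subTest(resources=resources):
                self.assertEqual(normalize_resources(resources), expected)


if __name__ == "__main__":
    unittest.main()
```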

Comment on lines 1205 to +1229
```diff
     def get_required_artifacts(self) -> list:
-        """Get list of required artifact URLs from image config."""
+        """
+        Get list of required artifacts from image config.
+
+        Returns:
+            list[dict]: List of dicts with 'url' and 'filename' keys.
+            Normalizes both string URLs and object formats.
+        """
         if not self.is_artifact_lockfile_enabled():
             return []
 
-        resource_urls = self.config.konflux.cachi2.artifact_lockfile.resources
-        if resource_urls in [Missing, None]:
+        resources = self.config.konflux.cachi2.artifact_lockfile.resources
+        if resources in [Missing, None]:
             return []
 
-        return resource_urls # Direct URL list
+        # Normalize to list of dicts
+        normalized = []
+        for resource in resources:
+            if isinstance(resource, str):
+                # Simple URL string - filename will be extracted from URL
+                normalized.append({'url': resource, 'filename': None})
+            elif isinstance(resource, dict):
+                # Object format with url and optional filename
+                normalized.append({'url': resource['url'], 'filename': resource.get('filename', None)})
```

Contributor

⚠️ Potential issue | 🟠 Major

Normalize Model-based resource entries (otherwise object-form artifacts get dropped).

If resource objects come through as Model instances, the current isinstance(resource, dict) check will skip them silently, leading to empty artifacts despite valid config. That breaks the new object-format path. Normalize Model to primitives and fail fast on unsupported shapes so hermetic validation doesn’t produce false negatives.

🔧 Proposed fix
```diff
-        for resource in resources:
-            if isinstance(resource, str):
-                # Simple URL string - filename will be extracted from URL
-                normalized.append({'url': resource, 'filename': None})
-            elif isinstance(resource, dict):
-                # Object format with url and optional filename
-                normalized.append({'url': resource['url'], 'filename': resource.get('filename', None)})
+        for resource in resources:
+            if isinstance(resource, str):
+                normalized.append({'url': resource, 'filename': None})
+                continue
+            if isinstance(resource, Model):
+                resource = resource.primitive()
+            if isinstance(resource, dict):
+                url = resource.get('url')
+                if not url:
+                    raise ValueError("artifact_lockfile.resources entry missing required 'url'")
+                normalized.append({'url': url, 'filename': resource.get('filename')})
+                continue
+            raise ValueError(f"Unsupported artifact resource type: {type(resource).__name__}")
```
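For context, a self-contained sketch of the fallback being proposed; `StubModel` is a hypothetical stand-in for the Model type referenced in the diff (assumed here to expose a `primitive()` method), not the real class:

```python
class StubModel:
    """Hypothetical stand-in for a Model-like config node with a primitive() method."""

    def __init__(self, data):
        self._data = data

    def primitive(self):
        return dict(self._data)


def normalize(resources):
    normalized = []
    for resource in resources:
        if isinstance(resource, str):
            normalized.append({"url": resource, "filename": None})
            continue
        if isinstance(resource, StubModel):  # real code would check the actual Model type
            resource = resource.primitive()
        if isinstance(resource, dict):
            url = resource.get("url")
            if not url:
                raise ValueError("artifact_lockfile.resources entry missing required 'url'")
            normalized.append({"url": url, "filename": resource.get("filename")})
            continue
        raise ValueError(f"Unsupported artifact resource type: {type(resource).__name__}")
    return normalized


# Both forms, including a Model-like entry, normalize to the same dict shape
print(normalize([
    "https://example.com/simple-file.jar",
    StubModel({"url": "https://example.com/bundle-1.0.jar", "filename": "bundle.jar"}),
]))
```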

Comment on lines +827 to +832
```python
        mock_dest_dir = Path("/tmp/test")

        # Mock session and download behavior
        mock_session = MagicMock()
        mock_session_class.return_value.__aenter__.return_value = mock_session
```

Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Scripts executed (repository: openshift-eng/art-tools):

```bash
# First, let's check the file structure and find the test file
git ls-files | grep -i lockfile | head -20

# Check the specific lines mentioned in the review
if [ -f "doozer/tests/test_lockfile.py" ]; then
  echo "File exists. Reading lines around 827-832..."
  sed -n '820,840p' doozer/tests/test_lockfile.py
fi

# Let's also check the test class structure to understand the test framework
sed -n '1,50p' doozer/tests/test_lockfile.py | head -40

# Search for other hardcoded /tmp paths in the test file
rg '/tmp' doozer/tests/test_lockfile.py

# Find the test method containing line 827-832
sed -n '750,832p' doozer/tests/test_lockfile.py | tail -100 | head -60

# Check what class contains this test method
rg -B 50 'mock_dest_dir = Path\("/tmp/test"\)' doozer/tests/test_lockfile.py | head -60

# Search for the entire test method definition
sed -n '800,835p' doozer/tests/test_lockfile.py

# Check if there are any existing uses of tempfile in the test file
rg 'tempfile|mkdtemp|TemporaryDirectory' doozer/tests/test_lockfile.py

# Check how mock_dest_dir is actually used in the test methods
rg -A 10 'mock_dest_dir = Path\("/tmp/test"\)' doozer/tests/test_lockfile.py | head -40

# Check if _write_yaml actually uses the path to create real directories
sed -n '100,200p' doozer/tests/test_lockfile.py | grep -A 5 "_write_yaml"

# Find the full test method that contains line 827-832
awk '/async def test_generate_artifact_lockfile_custom_filename/,/^    (async )?def |^class / {print NR": "$0}' doozer/tests/test_lockfile.py | head -80

# Get line count to understand file size better
wc -l doozer/tests/test_lockfile.py

# Check the actual assertion/usage of mock_dest_dir in these tests
sed -n '820,900p' doozer/tests/test_lockfile.py

# Count occurrences of hardcoded /tmp paths in this test file
rg '/tmp' doozer/tests/test_lockfile.py | wc -l

# Check the complete list of all hardcoded /tmp paths
rg -n '/tmp' doozer/tests/test_lockfile.py

# Verify the test class inheritance to confirm self.addCleanup is available
sed -n '790,810p' doozer/tests/test_lockfile.py

# Check if the mock_dest_dir is actually used to create files or if it's purely mocked
rg -B 5 -A 15 'def generate_artifact_lockfile' doozer/doozerlib/lockfile.py | head -60

# Check how the test class is structured and if setUp/tearDown is used
sed -n '813,825p' doozer/tests/test_lockfile.py

# Check if there are other tests in the file using temporary directories properly
rg -n 'TemporaryDirectory|tempfile|mktemp' doozer/tests/test_lockfile.py

# Verify S108 is indeed the Bandit rule for hardcoded tmp
web_search S108 bandit security hardcoded temp path
```

🌐 Web query: S108 bandit hardcoded temporary directory

💡 Result:

S108 (flake8-bandit / Ruff) / B108 (Bandit) flags hardcoded temporary file or directory paths like "/tmp/...", "/var/tmp/...", "/dev/shm/...", because hardcoding temp locations can enable attacks (e.g., symlink/race) and cause unexpected interference from other processes. (docs.astral.sh)

What triggers it

Examples that typically trigger S108/B108:

open("/tmp/foo.txt", "w")
Path("/var/tmp/app/cache").mkdir()

Bandit’s B108 specifically matches string literals starting with configurable temp dir prefixes (default includes /tmp, /var/tmp, /dev/shm). (bandit.readthedocs.io)

Recommended fixes

Use the standard library tempfile APIs instead of hardcoded paths:

import tempfile

with tempfile.NamedTemporaryFile() as f:
    f.write(b"data")

or:

import tempfile

with tempfile.TemporaryDirectory() as d:
    ...

These are the recommended alternatives in the rule guidance. (docs.astral.sh)

If you must keep a fixed temp path

  • Bandit config: customize what B108 considers “tmp” via hardcoded_tmp_directory.tmp_dirs. (bandit.readthedocs.io)
  • Suppress with justification: add an inline ignore (e.g., # nosec) only when you’re sure it’s safe and you can explain why (project policy varies). (bandit.readthedocs.io)


Use tempfile.mkdtemp() instead of hardcoded /tmp paths.

This pattern appears multiple times in the test file (lines 768, 827, 882, 898) and triggers S108. While these specific mocks don't write actual files, hardcoding temp paths can cause test collisions in parallel execution and invites security issues. Use tempfile.mkdtemp() with self.addCleanup() for proper cleanup.

♻️ Suggested change
```diff
-        mock_dest_dir = Path("/tmp/test")
+        import tempfile
+        import shutil
+        mock_dest_dir = Path(tempfile.mkdtemp(prefix="artifact-lockfile-"))
+        self.addCleanup(shutil.rmtree, mock_dest_dir, ignore_errors=True)
```
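A self-contained sketch of the pattern being suggested (a per-test temporary directory registered with `addCleanup`); the test body below is illustrative only, not the actual doozer test:

```python
import shutil
import tempfile
import unittest
from pathlib import Path


class TestArtifactDownloadDir(unittest.TestCase):
    def setUp(self):
        # Unique per-test directory instead of a hardcoded /tmp path (avoids S108/B108)
        self.dest_dir = Path(tempfile.mkdtemp(prefix="artifact-lockfile-"))
        self.addCleanup(shutil.rmtree, self.dest_dir, ignore_errors=True)

    def test_dest_dir_is_usable(self):
        # Illustrative assertion: the directory exists and can hold downloaded artifacts
        (self.dest_dir / "java-analyzer-bundle.jar").write_bytes(b"fake jar contents")
        self.assertTrue((self.dest_dir / "java-analyzer-bundle.jar").exists())


if __name__ == "__main__":
    unittest.main()
```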
🧰 Tools
🪛 Ruff (0.14.14)

[error] 827-827: Probable insecure usage of temporary file or directory: "/tmp/test"

(S108)


@openshift-ci
Contributor

openshift-ci bot commented Feb 6, 2026

@fbladilo: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/security | c8a2767 | link | false | /test security |

Full PR test history. Your PR dashboard.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@fbladilo
Contributor Author

/unhold

@openshift-ci openshift-ci bot removed the `do-not-merge/hold` label (indicates that a PR should not merge because someone has issued a /hold command) on Feb 12, 2026