Conversation

@raptorsun raptorsun commented Jan 22, 2026

Description

Enable hermetic builds.
Add tools to regenerate the requirement files.
Ruby gems are installed with Bundler, because Konflux only supports Bundler for installing Ruby gems in hermetic builds.
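
A minimal usage sketch (assuming the Make target added in this PR is available on your branch):

```bash
# Regenerate the hermetic requirements/hash manifests; wraps scripts/konflux_requirements.sh
make konflux-requirements
```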

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

  • Related Issue # LCORE-791
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Multi-arch builds (x86_64 & ARM64) and optional source-image builds
    • Hermetic prefetching for RPM, Python, and Bundler dependencies
    • Embedding model download integrated into the image build
  • Chores

    • Added reproducible, hash‑pinned dependency manifests and an automated requirements generation script
    • Introduced Ruby/Bundler build flow, new Make targets, expanded image metadata, and UBI repo config
    • Updated ML dependency pins and build tooling versions

✏️ Tip: You can customize this high-level summary in your review settings.

coderabbitai bot commented Jan 22, 2026

Walkthrough

Adds hermetic prefetching and multi-arch build support to Tekton PipelineRuns (including an arm64 target and optional source-image build), introduces pinned/hash‑verified Python wheel/source manifests and Bundler support, adds UBI repo and RPM list, and provides scripts/Make targets to generate and inject requirement artifacts into the pipeline.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Tekton pipeline**<br>.tekton/rag-tool-pull-request.yaml, .tekton/rag-tool-push.yaml | Add build-source-image and prefetch-input parameters; set hermetic: true; embed complex prefetch payload (rpm/pip/bundler); add linux-c6gd2xlarge/arm64 to build-platforms; reorder/propagate params into PipelineRun. |
| **Hermetic dependency artifacts**<br>requirements-build.txt, requirements.hashes.source.txt, requirements.hashes.wheel.pypi.txt, requirements.hashes.wheel.txt, requirements.hermetic.txt, requirements.overrides.txt | Add autogenerated, pinned requirements and hash manifests for reproducible hermetic installs and override pins for selected packages. |
| **Container image & Ruby tooling**<br>Containerfile, Gemfile, pyproject.toml, Makefile | Add Ruby/Bundler installation and Gemfile, relax/upgrade some Python pins, add embedding-model download and image LABELs, and add konflux-requirements and ruby-bundler Make targets. |
| **RPMs & repo config**<br>rpms.in.yaml, ubi.repo | Add RPM package list for build images and UBI9 repository definitions covering x86_64 and aarch64. |
| **Build tooling script**<br>scripts/konflux_requirements.sh | New script that generates consolidated requirements, partitions wheel vs source packages, writes hash manifests, pins build deps, appends extra wheels, and patches Tekton YAMLs with computed package lists. |
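
For orientation, a sketch of running the generator directly and listing the manifests it maintains (file names come from this PR; exact contents depend on the resolved dependency set):

```bash
./scripts/konflux_requirements.sh
ls -l requirements-build.txt \
      requirements.hashes.wheel.txt \
      requirements.hashes.wheel.pypi.txt \
      requirements.hashes.source.txt
```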

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant Git as Git/PR
    participant Script as konflux_requirements.sh
    participant Tekton as Tekton Pipeline
    participant Prefetch as Prefetch Task
    participant Builder as Build Matrix
    participant Registry as Image Registry

    Dev->>Git: push PR (requirements, Containerfile, Tekton changes)
    Git->>Script: invoke konflux_requirements.sh (local or CI task)
    Script->>Git: commit generated hash manifests & patched Tekton YAMLs
    Git->>Tekton: trigger PipelineRun (includes prefetch-input, build-source-image)
    Tekton->>Prefetch: run hermetic prefetch (rpm, pip hashes, bundler)
    Prefetch-->>Tekton: return artifacts/metadata
    Tekton->>Builder: run build matrix (x86_64, aarch64, optional source image)
    Builder->>Registry: push images
    Registry-->>Tekton: return digests/status
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • tisnik
  • umago
  • lpiwowar
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the main objective of enabling hermetic builds with Konflux configuration, though it uses '[WIP]' which is standard for in-progress work. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@rpms.in.yaml`:
- Around line 1-11: The packages list in rpms.in.yaml is missing rubygem-bundler
even though the Containerfile installs it and rpms.lock.yaml contains it; update
the packages array to include rubygem-bundler alongside rubygems (i.e., add the
symbol "rubygem-bundler" to the packages list) so the input manifest explicitly
declares the bundler RPM for hermetic builds and matches the lockfile and
Containerfile.
🧹 Nitpick comments (4)
pyproject.toml (1)

90-91: Align pip pin with the hermetic toolchain (or document the split)

Dev deps pin pip==24.3.1 while the hermetic toolchain pins pip==25.3. Divergent resolver behavior can yield different lock outputs. Consider aligning these or adding a short note explaining the intentional split.
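
A quick way to see the split locally (a sketch; both files are part of this PR):

```bash
# Lists every pip pin across the two inputs; ideally the versions agree
grep -n "pip==" pyproject.toml requirements.hermetic.txt
```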

Containerfile (1)

13-13: Consider pinning uv to an exact version for hermetic builds.

For a hermetic build aiming at reproducibility, using uv>=0.7.20 allows any newer version to be installed. Pin to an exact version (e.g., uv==0.7.20) to ensure consistent builds.

Suggested fix
-RUN pip3.12 install uv>=0.7.20
+RUN pip3.12 install uv==0.7.20
requirements.overrides.txt (1)

1-11: Add trailing newline for POSIX compliance.

The file is missing a trailing newline. Many tools expect text files to end with a newline character.

Suggested fix
 faiss-cpu==1.12.0
+
Makefile (1)

104-105: Add .PHONY declaration for consistency.

Other targets in this Makefile use .PHONY declarations. For consistency and to prevent issues if a file named konflux-requirements is ever created, add the phony declaration.

Suggested fix
+.PHONY: konflux-requirements
 konflux-requirements:	## generate hermetic requirements.*.txt file for konflux build
 	./scripts/konflux_requirements.sh

Comment on lines +1 to +11
packages:
[
gcc,
cmake,
cargo,
rubygems,
python3.12,
python3.12-devel,
python3.12-pip,
git,
]

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's find and examine the Containerfile and rpms.in.yaml files
fd -t f "Containerfile|Dockerfile" -x cat {}

Repository: lightspeed-core/rag-content

Length of output: 4262


🏁 Script executed:

# Search for rubygem-bundler and bundler references
rg -i "rubygem-bundler|bundler" -B 2 -A 2

Repository: lightspeed-core/rag-content

Length of output: 1442


🏁 Script executed:

# Check the rpms.in.yaml file to see its full content
cat rpms.in.yaml

Repository: lightspeed-core/rag-content

Length of output: 269


Add rubygem-bundler to the packages list.

The Containerfile installs both rubygems and rubygem-bundler (line 9), and rubygem-bundler is already listed in rpms.lock.yaml for both architectures. For consistency and clarity in hermetic builds, rubygem-bundler should be explicitly included in the input packages.

🤖 Prompt for AI Agents
In `@rpms.in.yaml` around lines 1 - 11, The packages list in rpms.in.yaml is
missing rubygem-bundler even though the Containerfile installs it and
rpms.lock.yaml contains it; update the packages array to include rubygem-bundler
alongside rubygems (i.e., add the symbol "rubygem-bundler" to the packages list)
so the input manifest explicitly declares the bundler RPM for hermetic builds
and matches the lockfile and Containerfile.
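
A small verification sketch for this finding (assumes rpms.lock.yaml sits at the repo root, as referenced above):

```bash
# bundler should appear in the Containerfile and the lockfile, and in rpms.in.yaml once the fix lands
grep -n "rubygem-bundler" Containerfile rpms.lock.yaml rpms.in.yaml || true
```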

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@Containerfile`:
- Line 13: The RUN pip install line uses an unquoted package spec containing ">"
so the shell treats ">" as redirection; update the Dockerfile RUN command that
currently says RUN pip3.12 install uv>=0.7.20 to quote or escape the version
spec (e.g., use single or double quotes around uv>=0.7.20 or escape the >) so
the version constraint is preserved and no file is created by shell redirection.

In `@requirements-build.txt`:
- Around line 65-70: The requirements file contains conflicting pins for
hatchling (hatchling==1.26.3 and hatchling==1.28.0); resolve the duplicate
requirement by choosing a single hatchling version and updating the pins
accordingly (either bump dependencies that require 1.26.3 or downgrade those
requiring 1.28.0) so only one hatchling spec remains (remove or replace the
other occurrence of hatchling==1.26.3 or hatchling==1.28.0); ensure any direct
consumers (e.g., hatch-fancy-pypi-readme, openai, banks, bracex, cibuildwheel,
filelock, fsspec) are compatible with the chosen version before committing.
🧹 Nitpick comments (1)
Makefile (1)

104-105: Mark konflux-requirements as phony to avoid file collisions.

If a file named konflux-requirements appears, this target won’t run. Consider adding a .PHONY entry.

♻️ Proposed change
+.PHONY: konflux-requirements
 konflux-requirements:	## generate hermetic requirements.*.txt file for konflux build
 	./scripts/konflux_requirements.sh


# Install uv package manager
RUN pip3.12 install uv==0.7.20
RUN pip3.12 install uv>=0.7.20

⚠️ Potential issue | 🟠 Major

Quote the version spec to avoid shell redirection.

Line 13 will redirect output because > is parsed by the shell. This drops the version constraint and creates a stray file.

🐛 Proposed fix
-RUN pip3.12 install uv>=0.7.20
+RUN pip3.12 install "uv>=0.7.20"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
RUN pip3.12 install uv>=0.7.20
RUN pip3.12 install "uv>=0.7.20"
🤖 Prompt for AI Agents
In `@Containerfile` at line 13, The RUN pip install line uses an unquoted package
spec containing ">" so the shell treats ">" as redirection; update the
Dockerfile RUN command that currently says RUN pip3.12 install uv>=0.7.20 to
quote or escape the version spec (e.g., use single or double quotes around
uv>=0.7.20 or escape the >) so the version constraint is preserved and no file
is created by shell redirection.

Comment on lines +65 to +70
hatchling==1.26.3
# via
# hatch-fancy-pypi-readme
# openai
hatchling==1.28.0
# via

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, check the actual content of requirements-build.txt around lines 65-70
sed -n '60,75p' requirements-build.txt

Repository: lightspeed-core/rag-content

Length of output: 355


🏁 Script executed:

# Then run the duplicate detection script mentioned in the review
python - <<'PY'
import re
from collections import Counter

pkgs = []
with open("requirements-build.txt", "r", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if " @ " in line:
            name = line.split(" @ ", 1)[0]
        else:
            name = re.split(r"[<=>]", line, 1)[0]
        pkgs.append(name.lower())

dups = [p for p, c in Counter(pkgs).items() if c > 1]
print("Duplicates found:", ", ".join(dups) if dups else "none")
print("\nFull package counts:")
for pkg, count in sorted(Counter(pkgs).items()):
    if count > 1:
        print(f"  {pkg}: {count} occurrences")
PY

Repository: lightspeed-core/rag-content

Length of output: 147


🏁 Script executed:

# Also check if there are any other versions of hatchling in the file
grep -n "hatchling" requirements-build.txt

Repository: lightspeed-core/rag-content

Length of output: 204


Resolve duplicate hatchling pins to prevent pip failure.

hatchling is pinned to two different versions: 1.26.3 at line 65 (required by hatch-fancy-pypi-readme, openai) and 1.28.0 at line 69 (required by banks, bracex, cibuildwheel, filelock, fsspec). Pip will reject this with a "double requirement given" error. Consolidate to a single version by either upgrading/downgrading the conflicting dependencies or manually resolving the pins to a common version.

🤖 Prompt for AI Agents
In `@requirements-build.txt` around lines 65 - 70, The requirements file contains
conflicting pins for hatchling (hatchling==1.26.3 and hatchling==1.28.0);
resolve the duplicate requirement by choosing a single hatchling version and
updating the pins accordingly (either bump dependencies that require 1.26.3 or
downgrade those requiring 1.28.0) so only one hatchling spec remains (remove or
replace the other occurrence of hatchling==1.26.3 or hatchling==1.28.0); ensure
any direct consumers (e.g., hatch-fancy-pypi-readme, openai, banks, bracex,
cibuildwheel, filelock, fsspec) are compatible with the chosen version before
committing.
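
A quick check of the duplicate pin before and after consolidation (a sketch):

```bash
# Should print exactly one line once the pins are consolidated
grep -n "^hatchling==" requirements-build.txt
```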

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@scripts/konflux_requirements.sh`:
- Around line 89-90: The sed substitution that pins maturin only matches
digits/dots and will break for suffixes like .post1 or rc; update the regex in
the sed command that touches BUILD_FILE (the line using sed and the token
"maturin==") to match the entire version token (e.g., use a non-whitespace token
matcher such as [^[:space:]]* or \S* instead of [0-9.]*) so the replacement
always replaces the full version token with maturin==1.10.2.
- Around line 41-86: The script runs `uv pip compile` on $WHEEL_FILE_PYPI even
when it contains only comments, causing failure; add a guard before the `uv pip
compile "$WHEEL_FILE_PYPI" ... > "$WHEEL_HASH_FILE_PYPI"` call that checks for
any non-comment/non-empty package lines (e.g. grep -v '^[[:space:]]*#'
"$WHEEL_FILE_PYPI" | grep -q '\S') and if none are found, skip the compile and
create/write an empty or placeholder $WHEEL_HASH_FILE_PYPI (or touch it) so
downstream steps don’t fail; update the block around the WHEEL_FILE_PYPI compile
invocation to use this conditional and keep the other compile commands
unchanged.
♻️ Duplicate comments (3)
rpms.in.yaml (1)

1-11: Add rubygem-bundler to the packages list.

Bundler is installed in the Containerfile; include the RPM here so the hermetic input manifest matches the build dependencies.

🐛 Proposed fix
     rubygems,
+    rubygem-bundler,
     python3.12,
Containerfile (1)

12-13: Quote the version spec to avoid shell redirection.

Unquoted > is parsed by the shell, so the constraint is dropped and a file is created.

🐛 Proposed fix
-RUN pip3.12 install uv>=0.7.20
+RUN pip3.12 install "uv>=0.7.20"
requirements-build.txt (1)

65-76: Resolve duplicate hatchling pins to avoid pip conflicts.

Two hatchling versions are pinned; pip will reject this. Consolidate to a single compatible version and adjust dependents as needed.

🐛 Example consolidation (remove the older pin)
-hatchling==1.26.3
-    # via
-    #   hatch-fancy-pypi-readme
-    #   openai
🧹 Nitpick comments (2)
Makefile (1)

104-105: Declare konflux-requirements as phony.

Avoids conflicts if a file named konflux-requirements appears and keeps parity with other utility targets.

♻️ Proposed change
+.PHONY: konflux-requirements
 konflux-requirements:	## generate hermetic requirements.*.txt file for konflux build
 	./scripts/konflux_requirements.sh
.tekton/rag-tool-push.yaml (1)

27-59: Align prefetch binary architectures with build platforms.

build-platforms is x86_64 only, but prefetch binary.arch includes aarch64. If multi-arch isn’t intended here, this fetches extra artifacts; if it is, consider adding aarch64 to build-platforms.

♻️ One possible alignment (x86_64-only)
-            "arch": "x86_64,aarch64",
+            "arch": "x86_64",

Comment on lines +41 to +86
echo "# Packages from pypi.org to be fetched as wheels" > "$WHEEL_FILE_PYPI"
echo "# This file was autogenerated by konflux_requirements.sh" >> "$WHEEL_FILE_PYPI"

current_package=""

while IFS= read -r line || [[ -n "$line" ]]; do
# Check if this is a package line (starts with a letter/digit, not whitespace or #)
if [[ "$line" =~ ^[a-zA-Z0-9] ]]; then
current_package="$line"
# Check if this is a "# from" annotation line
elif [[ "$line" =~ ^[[:space:]]*#[[:space:]]*from[[:space:]]+(.*) ]]; then
index_url="${BASH_REMATCH[1]}"

if [[ -n "$current_package" ]]; then
if [[ "$index_url" == "https://pypi.org/simple/" ]]; then
# Extract package name (without version) for comparison
package_name=$(echo "$current_package" | sed 's/[=<>!].*//')
if echo ",${PYPI_WHEELS}," | grep -qF ",${package_name},"; then
echo "$current_package" >> "$WHEEL_FILE_PYPI"
else
echo "$current_package" >> "$SOURCE_FILE"
fi
elif [[ "$index_url" == "https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/cpu-ubi9/simple/" ]]; then
echo "$current_package" >> "$WHEEL_FILE"
fi
current_package=""
fi
fi
done < "$RAW_REQ_FILE"

# replace the list of binary packages in konflux pipeline configuration
# only the package names, not the versions, delimited by commas
wheel_packages=$(grep -v "^[#-]" "$WHEEL_FILE" | sed 's/==.*//' | tr '\n' ',' | sed 's/,$//')
# append extra wheels to the list
wheel_packages="$wheel_packages,$EXTRA_WHEELS,$PYPI_WHEELS"
sed -i 's/"packages": "[^"]*"/"packages": "'"$wheel_packages"'"/' .tekton/rag-tool-pull-request.yaml
sed -i 's/"packages": "[^"]*"/"packages": "'"$wheel_packages"'"/' .tekton/rag-tool-push.yaml

echo "Packages from pypi.org written to: $SOURCE_FILE ($(wc -l < "$SOURCE_FILE") packages)"
echo "Packages from console.redhat.com written to: $WHEEL_FILE ($(wc -l < "$WHEEL_FILE") packages)"


# Use stdout redirection instead of -o flag to work around uv bug where -o reuses stale hashes from existing output file
uv pip compile "$WHEEL_FILE" --refresh --generate-hashes --index-url https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/cpu-ubi9/simple/ --python-version 3.12 --emit-index-url --no-deps --no-annotate --universal > "$WHEEL_HASH_FILE"
uv pip compile "$WHEEL_FILE_PYPI" --refresh --generate-hashes --python-version 3.12 --emit-index-url --no-deps --no-annotate > "$WHEEL_HASH_FILE_PYPI"
uv pip compile "$SOURCE_FILE" --refresh --generate-hashes --python-version 3.12 --emit-index-url --no-deps --no-annotate > "$SOURCE_HASH_FILE"

⚠️ Potential issue | 🟠 Major

Guard against empty PyPI wheel lists before compiling hashes.

If none of the PYPI_WHEELS entries are present in the compiled requirements, $WHEEL_FILE_PYPI will contain only comments. uv pip compile typically exits with “No requirements specified,” which breaks the script under set -e even though an empty wheel list is valid. Consider skipping compilation (or writing an empty hash file) when no package lines exist.

🔧 Proposed fix
-uv pip compile "$WHEEL_FILE_PYPI" --refresh --generate-hashes --python-version 3.12 --emit-index-url --no-deps --no-annotate > "$WHEEL_HASH_FILE_PYPI"
+if grep -qEv '^[[:space:]]*($|#)' "$WHEEL_FILE_PYPI"; then
+  uv pip compile "$WHEEL_FILE_PYPI" --refresh --generate-hashes --python-version 3.12 --emit-index-url --no-deps --no-annotate > "$WHEEL_HASH_FILE_PYPI"
+else
+  echo "# No PyPI wheel requirements" > "$WHEEL_HASH_FILE_PYPI"
+fi
🤖 Prompt for AI Agents
In `@scripts/konflux_requirements.sh` around lines 41 - 86, The script runs `uv
pip compile` on $WHEEL_FILE_PYPI even when it contains only comments, causing
failure; add a guard before the `uv pip compile "$WHEEL_FILE_PYPI" ... >
"$WHEEL_HASH_FILE_PYPI"` call that checks for any non-comment/non-empty package
lines (e.g. grep -v '^[[:space:]]*#' "$WHEEL_FILE_PYPI" | grep -q '\S') and if
none are found, skip the compile and create/write an empty or placeholder
$WHEEL_HASH_FILE_PYPI (or touch it) so downstream steps don’t fail; update the
block around the WHEEL_FILE_PYPI compile invocation to use this conditional and
keep the other compile commands unchanged.

Comment on lines +89 to +90
# pin maturin to the version available in the Red Hat registry
sed -i 's/maturin==[0-9.]*/maturin==1.10.2/' "$BUILD_FILE"

⚠️ Potential issue | 🟡 Minor

Make the maturin pin resilient to non-numeric version suffixes.

The current regex only matches digits and dots; if the resolved version includes a suffix (e.g., .post1, rc), the substitution can produce an invalid version string. Match the full token instead.

🔧 Proposed fix
-sed -i 's/maturin==[0-9.]*/maturin==1.10.2/' "$BUILD_FILE"
+sed -i 's/maturin==[^[:space:]]*/maturin==1.10.2/' "$BUILD_FILE"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# pin maturin to the version available in the Red Hat registry
sed -i 's/maturin==[0-9.]*/maturin==1.10.2/' "$BUILD_FILE"
sed -i 's/maturin==[^[:space:]]*/maturin==1.10.2/' "$BUILD_FILE"
🤖 Prompt for AI Agents
In `@scripts/konflux_requirements.sh` around lines 89 - 90, The sed substitution
that pins maturin only matches digits/dots and will break for suffixes like
.post1 or rc; update the regex in the sed command that touches BUILD_FILE (the
line using sed and the token "maturin==") to match the entire version token
(e.g., use a non-whitespace token matcher such as [^[:space:]]* or \S* instead
of [0-9.]*) so the replacement always replaces the full version token with
maturin==1.10.2.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.tekton/rag-tool-push.yaml (1)

27-29: Remove aarch64 from prefetch-input arch field or add it to build-platforms.

The build-platforms specifies only linux/x86_64, but the prefetch-input pip binary configuration requests x86_64,aarch64. This causes unnecessary prefetching of aarch64 wheels that won't be used in this pipeline. Either align the arch field to x86_64 or add linux-c6gd2xlarge/arm64 to build-platforms if multi-arch support is intended.
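
If the x86_64-only alignment is chosen, one possible one-liner (a sketch; the quoted string matches the current arch value in this file):

```bash
sed -i 's/"arch": "x86_64,aarch64"/"arch": "x86_64"/' .tekton/rag-tool-push.yaml
```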

♻️ Duplicate comments (3)
scripts/konflux_requirements.sh (2)

83-86: Guard against empty requirements files before compiling hashes.

If $WHEEL_FILE_PYPI or other files contain only comments/headers, uv pip compile will fail with "No requirements specified", causing the script to exit under set -e. This concern was previously raised.

🔧 Proposed fix
 # Use stdout redirection instead of -o flag to work around uv bug where -o reuses stale hashes from existing output file
 uv pip compile "$WHEEL_FILE" --refresh --generate-hashes --index-url https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/cpu-ubi9/simple/ --python-version 3.12 --emit-index-url --no-deps --no-annotate --universal > "$WHEEL_HASH_FILE"
-uv pip compile "$WHEEL_FILE_PYPI" --refresh --generate-hashes --python-version 3.12 --emit-index-url --no-deps --no-annotate > "$WHEEL_HASH_FILE_PYPI"
+if grep -qEv '^[[:space:]]*($|#)' "$WHEEL_FILE_PYPI"; then
+  uv pip compile "$WHEEL_FILE_PYPI" --refresh --generate-hashes --python-version 3.12 --emit-index-url --no-deps --no-annotate > "$WHEEL_HASH_FILE_PYPI"
+else
+  echo "# No PyPI wheel requirements" > "$WHEEL_HASH_FILE_PYPI"
+fi
 uv pip compile "$SOURCE_FILE" --refresh --generate-hashes --python-version 3.12 --emit-index-url --no-deps --no-annotate > "$SOURCE_HASH_FILE"

89-90: Make the maturin pin resilient to non-numeric version suffixes.

The regex [0-9.]* won't match version suffixes like .post1 or rc1. This concern was previously raised.

🔧 Proposed fix
-sed -i 's/maturin==[0-9.]*/maturin==1.10.2/' "$BUILD_FILE"
+sed -i 's/maturin==[^[:space:]]*/maturin==1.10.2/' "$BUILD_FILE"
Containerfile (1)

13-13: Quote the version spec to avoid shell redirection.

The unquoted >= will be parsed by the shell as redirection, creating a file 0.7.20 and dropping the version constraint. This was previously flagged.

🐛 Proposed fix
-RUN pip3.12 install uv>=0.7.20
+RUN pip3.12 install "uv>=0.7.20"
🧹 Nitpick comments (2)
scripts/konflux_requirements.sh (1)

71-77: Consider escaping special characters in sed substitution.

If any package name contains characters special to sed (like /, &, or \), the substitution on lines 76-77 will fail or produce incorrect results. While unlikely with standard package names, this could be made more robust.

♻️ Proposed fix using a safer delimiter
-sed -i 's/"packages": "[^"]*"/"packages": "'"$wheel_packages"'"/' .tekton/rag-tool-pull-request.yaml
-sed -i 's/"packages": "[^"]*"/"packages": "'"$wheel_packages"'"/' .tekton/rag-tool-push.yaml
+# Escape any forward slashes in package names for sed
+escaped_packages=$(echo "$wheel_packages" | sed 's/[&/\]/\\&/g')
+sed -i 's/"packages": "[^"]*"/"packages": "'"$escaped_packages"'"/' .tekton/rag-tool-pull-request.yaml
+sed -i 's/"packages": "[^"]*"/"packages": "'"$escaped_packages"'"/' .tekton/rag-tool-push.yaml
Makefile (1)

104-105: Add .PHONY declaration for konflux-requirements target.

The konflux-requirements target doesn't produce a file of that name, so it should be declared as .PHONY to ensure it always runs when invoked (even if a file named konflux-requirements were to exist).

♻️ Proposed fix
+.PHONY: konflux-requirements
 konflux-requirements:	## generate hermetic requirements.*.txt file for konflux build
 	./scripts/konflux_requirements.sh

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@Containerfile`:
- Around line 8-10: Remove any invocations of "bundle update --bundler" in the
Containerfile (the RUN that references Ruby/Gems and later RUNs around lines
that touch Gemfile.lock) and stop rewriting Gemfile.lock; instead set bundler
env vars for locked, hermetic installs (for example export or ENV
BUNDLE_DEPLOYMENT=1 and ENV BUNDLE_FROZEN=1) and run bundle install in locked
mode (e.g., "bundle install --local" or "bundle install --jobs=... --retry=..."
so it uses the existing Gemfile.lock without contacting rubygems.org). Ensure
references to "Gemfile.lock" remain unchanged and remove the deprecated
"--deployment --frozen" flags in favor of the environment variables.

In `@Makefile`:
- Around line 104-106: The Makefile's konflux-requirements target lacks a .PHONY
declaration; add konflux-requirements to the Makefile's .PHONY list (or create a
.PHONY line if one doesn't exist) so the konflux-requirements target is always
executed rather than being skipped if a file/dir named konflux-requirements
exists; reference the konflux-requirements target name when updating the .PHONY
entries.

In `@pyproject.toml`:
- Around line 48-51: Update the stale explanatory comment above the pinned
dependencies so it matches the actual pinned versions: replace the line stating
"torch 2.5.x pairs with torchvision 0.20.x" with a comment that reflects the
current pins "torch>=2.8.0,<2.9.0" and "torchvision>=0.23.0,<0.24.0" (e.g.,
"torch 2.8.x pairs with torchvision 0.23.x") so the comment correctly documents
the version pairing.

In `@requirements.overrides.txt`:
- Around line 1-11: The file ends without a trailing newline; update
requirements.overrides.txt to add a single POSIX-compliant newline character at
the end of the file (i.e., ensure the file ends with a '\n' after the last
listed package such as "faiss-cpu==1.12.0") so text tools and linters treat it
as a proper POSIX text file.
♻️ Duplicate comments (2)
requirements-build.txt (1)

65-100: Duplicate hatchling pins will cause pip installation failure.

The file contains two conflicting pins for hatchling:

  • Line 65: hatchling==1.26.3
  • Line 69: hatchling==1.28.0

Pip will reject this with a "double requirement given" error. This needs to be resolved by consolidating to a single version that satisfies all dependents.

Containerfile (1)

13-13: Quote the uv version spec to avoid shell redirection.

Line 13 treats > as redirection, which drops the version constraint and creates a stray file. This was flagged previously and still applies.

🐛 Proposed fix
-RUN pip3.12 install uv>=0.7.20
+RUN pip3.12 install "uv>=0.7.20"

Containerfile Outdated
Comment on lines 8 to 10
# Install Ruby
RUN microdnf install -y rubygems rubygem-bundler && \
microdnf clean all

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

fd -type f -name "Containerfile" -o -name "Dockerfile"

Repository: lightspeed-core/rag-content

Length of output: 241


🏁 Script executed:

git ls-files | grep -iE "(Containerfile|Dockerfile)"

Repository: lightspeed-core/rag-content

Length of output: 102


🏁 Script executed:

cat -n Containerfile

Repository: lightspeed-core/rag-content

Length of output: 3049


🏁 Script executed:

cat -n Gemfile Gemfile.lock 2>/dev/null | head -50

Repository: lightspeed-core/rag-content

Length of output: 593


🌐 Web query:

bundler update --bundler flag documentation hermetic build

💡 Result:

bundle update --bundler is the Bundler-supported way to update (or set) the Bundler version recorded in your Gemfile.lock’s BUNDLED WITH section—i.e., the Bundler version your app is expected to use. The flag is documented on the official bundle update manpage as --bundler[=VERSION]. [1]

Why this matters for hermetic/reproducible builds:

  • Gemfile.lock exists to ensure the same resolved gem versions are used across machines, and it also records which Bundler version last wrote the lockfile under BUNDLED WITH. [2]
  • Starting with Bundler 2.3+ (with RubyGems 3.3+), bundle install will automatically use the exact Bundler version from BUNDLED WITH (installing it if needed). This makes builds more hermetic because the build uses the same Bundler version as the lockfile specifies. [3]
  • If you want to change the Bundler version your project is pinned to, run bundle update --bundler (or bundle update --bundler=VERSION) to regenerate the lockfile using a newer (or specified) Bundler version. [3][1]

Primary docs to cite internally for this behavior

  • Bundler manpage: bundle update options incl. --bundler[=VERSION]. [1]
  • Bundler guide: meaning of Gemfile.lock and BUNDLED WITH. [2]
  • Bundler blog: explains lockfile-pinned Bundler behavior and explicitly recommends bundle update --bundler to upgrade the pinned version. [3]

Sources: [1] [2] [3]


🏁 Script executed:

cat Gemfile.lock | grep -A 2 "BUNDLED WITH"

Repository: lightspeed-core/rag-content

Length of output: 91


🌐 Web query:

bundle install --deployment --frozen flags bundler documentation

💡 Result:

bundle install --deployment --frozen are Bundler “install-time behavior” switches:

  • --deployment: enables deployment mode (intended for production/CI). It requires a checked-in, up-to-date Gemfile.lock, and installs gems to vendor/bundle by default (unless you set a different path). It also errors if your Gemfile changed without a corresponding lockfile update. [1]
  • --frozen: prevents Bundler from updating Gemfile.lock; Bundler exits non-zero if it would need to change the lockfile. [1]

In modern Bundler, both flags are deprecated as CLI options (because they relied on Bundler “remembering” flags across runs). The docs recommend using persistent configuration instead:

  • bundle config set --local deployment true (or BUNDLE_DEPLOYMENT=1) [1]
  • bundle config set --local frozen true [1]

Bundler also notes that relying on “remembered options” from bundle install will change in Bundler 4, so using bundle config (or env vars) is the forward-compatible approach. [2]

Sources: Bundler bundle install man page [1], Bundler bundle config man page (remembered options / Bundler 4 note) [2].


Avoid bundle update --bundler in hermetic builds.

Line 25 reaches out to rubygems.org and rewrites Gemfile.lock (the BUNDLED WITH section), which breaks reproducibility and will fail under network isolation. Since Gemfile.lock is already copied and locked, this step is unnecessary. Remove it and use locked installation mode.

Instead of the deprecated --deployment --frozen flags, prefer environment variables for forward compatibility with Bundler 4+:

🔧 Proposed change
- RUN bundle update --bundler
- RUN bundle install
+ RUN BUNDLE_DEPLOYMENT=1 BUNDLE_FROZEN=1 bundle install

Also applies to: 17-17, 23-26

🤖 Prompt for AI Agents
In `@Containerfile` around lines 8 - 10, Remove any invocations of "bundle update
--bundler" in the Containerfile (the RUN that references Ruby/Gems and later
RUNs around lines that touch Gemfile.lock) and stop rewriting Gemfile.lock;
instead set bundler env vars for locked, hermetic installs (for example export
or ENV BUNDLE_DEPLOYMENT=1 and ENV BUNDLE_FROZEN=1) and run bundle install in
locked mode (e.g., "bundle install --local" or "bundle install --jobs=...
--retry=..." so it uses the existing Gemfile.lock without contacting
rubygems.org). Ensure references to "Gemfile.lock" remain unchanged and remove
the deprecated "--deployment --frozen" flags in favor of the environment
variables.

Makefile Outdated
Comment on lines 104 to 106
konflux-requirements: ## generate hermetic requirements.*.txt file for konflux build
./scripts/konflux_requirements.sh

⚠️ Potential issue | 🟡 Minor

Add .PHONY declaration for konflux-requirements target.

The konflux-requirements target is missing a .PHONY declaration. Without it, if a file or directory named konflux-requirements exists, make will skip execution. Other targets in this Makefile correctly use .PHONY.

📝 Suggested fix
+.PHONY: konflux-requirements
 konflux-requirements:	## generate hermetic requirements.*.txt file for konflux build
 	./scripts/konflux_requirements.sh
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
konflux-requirements: ## generate hermetic requirements.*.txt file for konflux build
./scripts/konflux_requirements.sh
.PHONY: konflux-requirements
konflux-requirements: ## generate hermetic requirements.*.txt file for konflux build
./scripts/konflux_requirements.sh
🤖 Prompt for AI Agents
In `@Makefile` around lines 104 - 106, The Makefile's konflux-requirements target
lacks a .PHONY declaration; add konflux-requirements to the Makefile's .PHONY
list (or create a .PHONY line if one doesn't exist) so the konflux-requirements
target is always executed rather than being skipped if a file/dir named
konflux-requirements exists; reference the konflux-requirements target name when
updating the .PHONY entries.

Comment on lines +1 to +11
# override these package to the version available on RHOAI wheels index:
# https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/cpu-ubi9/simple
transformers==4.57.3
tokenizers==0.22.1
scipy==1.16.3
aiohttp==3.13.2
aiosqlite==0.22.0
anyio==4.12.0
datasets==4.4.1
pillow==12.0.0
faiss-cpu==1.12.0 No newline at end of file

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -e

echo "=== Check 1: Trailing newline in requirements.overrides.txt ==="
if [ -f "requirements.overrides.txt" ]; then
  # Check if file ends with newline
  if [ -z "$(tail -c 1 requirements.overrides.txt)" ]; then
    echo "✓ File DOES end with newline"
  else
    echo "✗ File DOES NOT end with newline (missing trailing newline)"
  fi
  
  # Show last few bytes for confirmation
  echo "Last 20 bytes (hex):"
  tail -c 20 requirements.overrides.txt | od -An -tx1
else
  echo "✗ File not found"
fi

echo ""
echo "=== Check 2: RHOAI URL accessibility ==="
# Test the RHOAI index URL
URL="https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/cpu-ubi9/simple/"
HTTP_CODE=$(curl -s -o /tmp/rhoai_response.txt -w "%{http_code}" "$URL" --max-time 10 2>/dev/null || echo "000")
echo "HTTP Status: $HTTP_CODE"

if [ "$HTTP_CODE" = "200" ]; then
  echo "✓ RHOAI index is accessible"
  echo "First 500 chars of response:"
  head -c 500 /tmp/rhoai_response.txt
  echo ""
elif [ "$HTTP_CODE" = "000" ]; then
  echo "✗ Could not reach URL (network error or timeout)"
else
  echo "✗ URL returned HTTP $HTTP_CODE"
fi

Repository: lightspeed-core/rag-content

Length of output: 276


🏁 Script executed:

#!/bin/bash
set -e

echo "=== Check 1: Trailing newline confirmation ==="
if [ -f "requirements.overrides.txt" ]; then
  # Alternative method to show last bytes without od
  echo "File size and last characters:"
  wc -c requirements.overrides.txt
  
  echo "Last line:"
  tail -1 requirements.overrides.txt
  
  # Check with tail -c
  LAST_BYTE=$(tail -c 1 requirements.overrides.txt | xxd -p)
  echo "Last byte (hex): $LAST_BYTE"
  if [ "$LAST_BYTE" = "0a" ]; then
    echo "✓ Ends with newline (0x0a)"
  else
    echo "✗ Does NOT end with newline (last byte is 0x$LAST_BYTE)"
  fi
fi

echo ""
echo "=== Check 2: RHOAI URL accessibility ==="
URL="https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/cpu-ubi9/simple/"
echo "Testing URL: $URL"
HTTP_CODE=$(curl -s -o /tmp/rhoai_response.txt -w "%{http_code}" "$URL" --max-time 10 2>&1 || echo "000")
echo "HTTP Status: $HTTP_CODE"

if [ "$HTTP_CODE" = "200" ]; then
  echo "✓ RHOAI index is accessible"
  echo "Response size: $(wc -c < /tmp/rhoai_response.txt) bytes"
  echo "Sample of response:"
  head -c 800 /tmp/rhoai_response.txt
  echo ""
else
  echo "Response: $HTTP_CODE"
fi

Repository: lightspeed-core/rag-content

Length of output: 270


🏁 Script executed:

#!/bin/bash

echo "=== Check 1: Trailing newline verification ==="
if [ -f "requirements.overrides.txt" ]; then
  # Use Python to check last byte since xxd is not available
  python3 << 'PYTHON_EOF'
with open('requirements.overrides.txt', 'rb') as f:
  content = f.read()
  if len(content) > 0:
    last_byte = content[-1]
    print(f"File size: {len(content)} bytes")
    print(f"Last byte: 0x{last_byte:02x} ({repr(chr(last_byte))})")
    if last_byte == 0x0a:
      print("✓ File DOES end with newline (0x0a)")
    else:
      print("✗ File DOES NOT end with newline")
  else:
    print("File is empty")
PYTHON_EOF
fi

echo ""
echo "=== Check 2: Verify file content ==="
tail -3 requirements.overrides.txt
echo "[END OF FILE]"

echo ""
echo "=== Check 3: RHOAI URL accessibility ==="
URL="https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/cpu-ubi9/simple/"
echo "Testing URL: $URL"

python3 << 'PYTHON_EOF'
import urllib.request
import ssl

url = "https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/cpu-ubi9/simple/"
try:
  # Create unverified SSL context (for sandbox)
  context = ssl.create_default_context()
  context.check_hostname = False
  context.verify_mode = ssl.CERT_NONE
  
  with urllib.request.urlopen(url, context=context, timeout=10) as response:
    status = response.status
    print(f"HTTP Status: {status}")
    print("✓ URL is accessible")
    body = response.read(500).decode('utf-8', errors='ignore')
    print(f"Sample response: {body[:300]}")
except urllib.error.URLError as e:
  print(f"✗ URL error: {e}")
except Exception as e:
  print(f"✗ Error: {e}")
PYTHON_EOF

Repository: lightspeed-core/rag-content

Length of output: 778


🏁 Script executed:

#!/bin/bash

echo "=== Check 4: Verify package versions on RHOAI index ==="

python3 << 'PYTHON_EOF'
import urllib.request
import ssl
import json
import re

# Create unverified SSL context
context = ssl.create_default_context()
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE

base_url = "https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/cpu-ubi9/simple"

packages_to_check = [
    ("transformers", "4.57.3"),
    ("tokenizers", "0.22.1"),
    ("scipy", "1.16.3"),
    ("aiohttp", "3.13.2"),
    ("pillow", "12.0.0"),
    ("faiss-cpu", "1.12.0"),
]

print("Checking package availability on RHOAI index...\n")

for package_name, version in packages_to_check:
    url = f"{base_url}/{package_name}/"
    try:
        with urllib.request.urlopen(url, context=context, timeout=10) as response:
            body = response.read().decode('utf-8')
            # Look for the specific version
            if f"{package_name}-{version}" in body or f"{version}" in body:
                print(f"✓ {package_name}=={version}: Found on index")
            else:
                print(f"✗ {package_name}=={version}: NOT found on index")
                # Show what versions are available
                versions = re.findall(r'>([^<]+)\/<', body)
                if versions:
                    print(f"  Available versions: {versions[:5]}")
    except urllib.error.HTTPError as e:
        if e.code == 404:
            print(f"✗ {package_name}: Package not found on index (HTTP 404)")
        else:
            print(f"✗ {package_name}: HTTP {e.code}")
    except Exception as e:
        print(f"✗ {package_name}: Error - {e}")

PYTHON_EOF

Repository: lightspeed-core/rag-content

Length of output: 386


Add trailing newline for POSIX compliance.

The file is missing a trailing newline. POSIX text files should end with a newline character to ensure compatibility with various tools.

🤖 Prompt for AI Agents
In `@requirements.overrides.txt` around lines 1 - 11, The file ends without a
trailing newline; update requirements.overrides.txt to add a single
POSIX-compliant newline character at the end of the file (i.e., ensure the file
ends with a '\n' after the last listed package such as "faiss-cpu==1.12.0") so
text tools and linters treat it as a proper POSIX text file.
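
One way to apply the fix from the shell (a sketch; appends a newline only when the final byte is not already one):

```bash
[ -n "$(tail -c 1 requirements.overrides.txt)" ] && printf '\n' >> requirements.overrides.txt
```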

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.tekton/rag-tool-push.yaml:
- Around line 34-65: The prefetch-input config contains an invalid bundler field
and a mis-typed python version: remove the unsupported "allow_binary" entry from
the "bundler" object (the bundler block shown) and change the "py_version" value
inside the "binary" object from a quoted string to an integer (e.g., 312) so the
"binary" section uses an integer py_version and the bundler no longer contains
allow_binary.
♻️ Duplicate comments (5)
pyproject.toml (1)

48-51: Stale comment: version pairing no longer matches code.

The comment on line 49 states "torch 2.5.x pairs with torchvision 0.20.x" but the pins are now torch>=2.8.0,<2.9.0 and torchvision>=0.23.0,<0.24.0. Update the comment accordingly.

rpms.in.yaml (1)

1-11: Add rubygem-bundler to the packages list.

The Containerfile installs rubygem-bundler, and the PR enables Bundler-based hermetic builds. For consistency and to ensure all required RPMs are declared, add rubygem-bundler to the packages list.

requirements-build.txt (1)

65-101: Resolve duplicate hatchling pins to prevent pip failure.

hatchling is pinned twice with conflicting versions: 1.26.3 (line 65) and 1.28.0 (line 69). Pip will reject this with a "double requirement given" error. This needs to be resolved before the hermetic build can succeed.

Containerfile (2)

13-13: Quote the uv version spec to avoid shell redirection.

The shell interprets > as redirection, which drops the constraint and creates a stray file.

🐛 Proposed fix
-RUN pip3.12 install uv>=0.7.20
+RUN pip3.12 install "uv>=0.7.20"

23-26: Avoid bundle update --bundler in the image build (breaks hermeticity).

It rewrites Gemfile.lock and requires network access. Prefer a locked install using the existing lockfile.

🔧 Suggested adjustment
-ENV BUNDLE_PATH__SYSTEM=true
-RUN bundle update --bundler
-RUN bundle install
+ENV BUNDLE_PATH__SYSTEM=true \
+    BUNDLE_DEPLOYMENT=1 \
+    BUNDLE_FROZEN=1
+RUN bundle install
🧹 Nitpick comments (3)
pyproject.toml (1)

90-91: Consider aligning pip versions across files.

pip==24.3.1 is pinned here in dev dependencies, while requirements.hermetic.txt specifies pip==25.3. If both are used in the build process, this inconsistency could cause confusion or unexpected behavior. Consider using a consistent version across all requirement specifications.

.tekton/rag-tool-pull-request.yaml (1)

59-64: Long binary packages list reduces maintainability.

The packages string on line 60 contains 100+ comma-separated package names in a single line, making it difficult to review, diff, and maintain. Consider formatting this as a proper JSON array across multiple lines or managing the list in a separate file that gets injected during pipeline generation.
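
One possible direction, sketched with a hypothetical konflux-binary-packages.txt file (one package name per line) feeding the sed replacement the generator script already performs:

```bash
# Hypothetical separate file, joined into the comma-separated string the pipeline expects
wheel_packages=$(paste -sd, konflux-binary-packages.txt)
sed -i 's/"packages": "[^"]*"/"packages": "'"$wheel_packages"'"/' .tekton/rag-tool-pull-request.yaml
```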

.tekton/rag-tool-push.yaml (1)

27-29: Align prefetch arch list with build platforms.

build-platforms is linux/x86_64, but prefetch requests x86_64,aarch64. If multi‑arch is intended, add linux/aarch64 to build-platforms; otherwise drop aarch64 from prefetch to avoid unused downloads.

Also applies to: 58-59

Comment on lines 34 to 65
- name: prefetch-input
# no source available: torch, faiss-cpu
# hermeto prefetch problems: uv, pip, jiter, tiktoken,
# those need cmake to build: pyarrow
# those need cargo to build: jiter, tiktoken, cryptography, fastuuid, hf_xet, maturin, pydantic_core, rpds_py, safetensors, tokenizers
# to accelerate build:numpy, scipy, pandas, pillow, scikit_learn
value: |
[
{
"type": "rpm",
"path": "."
},
{
"type": "pip",
"path": ".",
"requirements_files": [
"requirements.hashes.wheel.txt",
"requirements.hashes.source.txt",
"requirements.hermetic.txt"
],
"requirements_build_files": ["requirements-build.txt"],
"binary": {
"packages": "accelerate,aiohappyeyeballs,aiohttp,aiosignal,aiosqlite,annotated-doc,annotated-types,anyio,asyncpg,attrs,beautifulsoup4,cffi,chardet,charset-normalizer,click,colorama,cryptography,dataclasses-json,defusedxml,distro,docling-ibm-models,einops,et-xmlfile,faiss-cpu,filetype,fire,frozenlist,googleapis-common-protos,greenlet,h11,hf-xet,httpcore,httpx,huggingface-hub,idna,jinja2,jiter,joblib,jsonlines,jsonref,jsonschema-specifications,latex2mathml,llama-stack-client,lxml,markdown-it-py,markupsafe,mdurl,mpire,mpmath,multidict,mypy-extensions,nest-asyncio,networkx,nltk,openpyxl,opentelemetry-api,opentelemetry-exporter-otlp-proto-common,opentelemetry-exporter-otlp-proto-http,opentelemetry-proto,opentelemetry-sdk,opentelemetry-semantic-conventions,packaging,pandas,pillow,platformdirs,pluggy,prompt-toolkit,propcache,psycopg2-binary,pyaml,pydantic,pydantic-core,pydantic-settings,pygments,pyjwt,pylatexenc,python-dateutil,python-docx,python-dotenv,python-multipart,python-pptx,pytz,pyyaml,referencing,requests,rich,rpds-py,rtree,safetensors,scikit-learn,scipy,semchunk,sentence-transformers,shapely,shellingham,six,sniffio,starlette,sympy,tabulate,tenacity,threadpoolctl,tiktoken,tokenizers,torch,torchvision,tqdm,transformers,triton,typer,typing-extensions,typing-inspect,typing-inspection,tzdata,wrapt,xlsxwriter,yarl,zipp,uv-build,uv,pip,maturin,opencv-python,rapidocr,sqlite-vec",
"os": "linux",
"arch": "x86_64,aarch64",
"py_version": "312"
}
},
{
"type": "bundler",
"allow_binary": "true"
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file context around the problematic area
cd / && find . -name "rag-tool-push.yaml" -type f 2>/dev/null | head -5

Repository: lightspeed-core/rag-content

Length of output: 117


🏁 Script executed:

# Search for cachi2 or prefetch references in the repo
rg "prefetch-dependencies-oci-ta|cachi2|prefetch-input" --type yaml -B 2 -A 5 2>/dev/null | head -100

Repository: lightspeed-core/rag-content

Length of output: 53


🏁 Script executed:

# Look for any configuration or documentation about prefetch schema
fd "cachi2|prefetch" -t f 2>/dev/null | head -20

Repository: lightspeed-core/rag-content

Length of output: 111


🏁 Script executed:

# Check the actual structure of the YAML file to understand context
if [ -f ".tekton/rag-tool-push.yaml" ]; then
  head -80 .tekton/rag-tool-push.yaml
fi

Repository: lightspeed-core/rag-content

Length of output: 4839


🌐 Web query:

cachi2 prefetch-dependencies-oci-ta schema documentation json format

💡 Result:

For the Konflux task-prefetch-dependencies-oci-ta (which runs Cachi2/Hermeto), the input parameter is a JSON string. In practice it is an object with:

  • packages: array of package specs
  • flags: optional array of strings (extra behavior flags)

Example seen in a real PipelineRun attestation:

{
  "packages": [
    { "path": "yq", "type": "gomod" },
    { "type": "rpm" }
  ],
  "flags": ["gomod-vendor"]
}

[1]

Package object shape (what goes inside packages[])

Each entry is a “package manager input” object. For example, for Go modules Hermeto documents:

{ "type": "gomod", "path": "." }

(path is relative to the source root and defaults to ".".) [2]

Hermeto has similar per-package-manager pages (npm/pip/yarn/rpm/etc.) describing the allowed keys for each type. [3]

Sources: [1] [2] [3]


🏁 Script executed:

cat scripts/generate_packages_to_prefetch.py

Repository: lightspeed-core/rag-content

Length of output: 7106


🌐 Web query:

cachi2 Hermeto pip binary packages arch py_version allow_binary schema format

💡 Result:

In Hermeto (the tool formerly packaged as cachi2), pip “binary packages” selection is configured via the binary object in the pip JSON input. There isn’t an allow_binary field in the current pip input schema; instead, you provide (or omit) binary. [1]

JSON input shape (pip)

{
  "type": "pip",
  "path": ".",
  "requirements_files": ["requirements.txt"],
  "requirements_build_files": ["requirements-build.txt"],
  "binary": {
    "packages": "tensorflow",
    "arch": "x86_64",
    "os": "linux",
    "py_version": 312
  }
}

[1]

binary filter fields (schema/format)

  • packages: comma-separated package names; special :all: means “all packages” (default :all:).
  • arch: comma-separated architectures (default "x86_64").
  • os: comma-separated OSes (default "linux").
  • py_version: single integer XY for Python X.Y (example: 312 for Python 3.12; default None).
    (Additional fields exist: py_impl, abi, platform.) [2]

Also, Hermeto describes three modes via binary presence/value: omit binary (no wheels), set packages=":all:" (prefer wheels, fall back to sdists), or set packages to specific names (only wheels for those packages; fail if no compatible wheel). [2]

Sources: Hermeto pip docs (JSON input + binary filters). [1] [2]


🌐 Web query:

Cachi2 pip package manager binary field specification

💡 Result:

For Cachi2/Hermeto’s pip package manager input, the optional binary field is a “binary (wheel) filter” object used to control which wheels are prefetched (otherwise Cachi2/Hermeto defaults to fetching sdists only). [1]

Where binary appears (pip JSON input)

{
  "type": "pip",
  "path": ".",
  "requirements_files": ["requirements.txt"],
  "requirements_build_files": ["requirements-build.txt"],
  "binary": { ... }
}

[1]

Behavior / “strategies” controlled by binary.packages

  • Default (no binaries): omit binary ⇒ source-only (sdists only). [1]
  • Prefer binaries (like pip --prefer-binary): set "binary": {"packages": ":all:"} ⇒ try compatible wheels for everything, fall back to sdists if none match. [1]
  • Only binaries for specific packages (like pip --only-binary for those pkgs): set "packages" to specific names (e.g. "numpy,pandas") ⇒ for those packages, fail if no matching wheel exists; others remain sdist-only. [1]

binary filter object fields

  • packages: comma-separated package names to apply binary filtering to. Default :all:. [1]
  • arch: comma-separated architectures. Default "x86_64". [1]
  • os: comma-separated OS values. Default "linux". [1]
  • py_version: integer MAJORMINOR (e.g. 312 for Python 3.12). Default None. [1]
  • py_impl: comma-separated Python implementations. Default "cp". [1]
  • abi: comma-separated ABIs. Default :all:. [1]
  • platform: regex matched against wheel platform tags. Default None. [1]

Filter logic rules

  • :all: or None ⇒ no filtering for that field. [1]
  • Multiple values within a field use OR; multiple fields combine with AND. [1]
  • arch/os are mutually exclusive with platform. [1]

Source: Hermeto/Cachi2 pip documentation page. [1]

References:
[1] (hermetoproject.github.io)

Citations:


Remove unsupported allow_binary field and fix py_version type in binary configuration.

The allow_binary field is not part of Hermeto's pip binary filter schema and must be removed. Additionally, py_version must be an integer (e.g., 312 for Python 3.12), not a quoted string. The packages and arch fields are correctly formatted as comma-separated strings per spec.

Required fixes
  "binary": {
    "packages": "accelerate,aiohappyeyeballs,aiohttp,...",
    "os": "linux",
    "arch": "x86_64,aarch64",
-   "py_version": "312"
+   "py_version": 312
  }

Also remove "allow_binary": "true" from the bundler block.

🤖 Prompt for AI Agents
In @.tekton/rag-tool-push.yaml around lines 34 - 65, The prefetch-input config
contains an invalid bundler field and a mis-typed python version: remove the
unsupported "allow_binary" entry from the "bundler" object (the bundler block
shown) and change the "py_version" value inside the "binary" object from a
quoted string to an integer (e.g., 312) so the "binary" section uses an integer
py_version and the bundler no longer contains allow_binary.

@raptorsun raptorsun force-pushed the hermetic branch 2 times, most recently from ca7a0a1 to 9169da9 on January 26, 2026 15:23
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.tekton/rag-tool-pull-request.yaml:
- Around lines 38-69: the prefetch-input pip binary schema uses a string for
"py_version" and an unsupported bundler field "allow_binary"; update the pip
block's "py_version" value from the quoted string to an integer (py_version:
312) and remove the "allow_binary" key from the bundler block (refer to the
"prefetch-input" object, the "py_version" field inside the pip -> binary map,
and the "bundler" object) so the schema matches Hermeto's expected types.
♻️ Duplicate comments (4)
pyproject.toml (1)

48-51: Update the stale torch/torchvision pairing comment.

Line 49 still references the old pairing, which no longer matches the pinned versions.

✏️ Proposed fix
-    # torch 2.5.x pairs with torchvision 0.20.x
+    # torch 2.8.x pairs with torchvision 0.23.x
Containerfile (2)

11-11: Quote the uv version spec to avoid shell redirection.

Line 11 uses > unquoted, so the shell treats it as redirection.

🛠️ Proposed fix
-RUN pip3.12 install uv>=0.7.20
+RUN pip3.12 install "uv>=0.7.20"

21-22: Use locked Bundler install for hermetic builds.

Line 22 runs bundle install without frozen/deployment mode; this can update the lockfile or reach the network. Prefer locked, offline-friendly install.

🔒 Proposed fix
-RUN BUNDLE_PATH__SYSTEM=true bundle install
+RUN BUNDLE_PATH__SYSTEM=true BUNDLE_DEPLOYMENT=1 BUNDLE_FROZEN=1 bundle install --local
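For reference, a rough bundle config equivalent of the same locked, offline install (a sketch, assuming Bundler 2.x):

# Install gems into the system gem path, refuse any Gemfile/lockfile drift,
# and resolve only from locally vendored/prefetched gems.
bundle config set --local path.system true
bundle config set --local deployment true
bundle config set --local frozen true
bundle install --local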
.tekton/rag-tool-push.yaml (1)

34-65: Fix Hermeto prefetch schema: remove allow_binary and use integer py_version.

Same issue as above: py_version should be an integer, and the bundler input doesn’t support allow_binary.

🛠️ Proposed fix
           "binary": {
             "packages": "accelerate,aiohappyeyeballs,aiohttp,aiosignal,aiosqlite,annotated-doc,annotated-types,anyio,asyncpg,attrs,beautifulsoup4,cffi,chardet,charset-normalizer,click,colorama,cryptography,dataclasses-json,defusedxml,distro,docling-ibm-models,einops,et-xmlfile,faiss-cpu,filetype,fire,frozenlist,googleapis-common-protos,greenlet,h11,hf-xet,httpcore,httpx,huggingface-hub,idna,jinja2,jiter,joblib,jsonlines,jsonref,jsonschema-specifications,latex2mathml,llama-stack-client,lxml,markdown-it-py,markupsafe,mdurl,mpire,mpmath,multidict,mypy-extensions,nest-asyncio,networkx,nltk,openpyxl,opentelemetry-api,opentelemetry-exporter-otlp-proto-common,opentelemetry-exporter-otlp-proto-http,opentelemetry-proto,opentelemetry-sdk,opentelemetry-semantic-conventions,packaging,pandas,pillow,platformdirs,pluggy,prompt-toolkit,propcache,psycopg2-binary,pyaml,pydantic,pydantic-core,pydantic-settings,pygments,pyjwt,pylatexenc,python-dateutil,python-docx,python-dotenv,python-multipart,python-pptx,pytz,pyyaml,referencing,requests,rich,rpds-py,rtree,safetensors,scikit-learn,scipy,semchunk,sentence-transformers,shapely,shellingham,six,sniffio,starlette,sympy,tabulate,tenacity,threadpoolctl,tiktoken,tokenizers,torch,torchvision,tqdm,transformers,triton,typer,typing-extensions,typing-inspect,typing-inspection,tzdata,wrapt,xlsxwriter,yarl,zipp,uv-build,uv,pip,maturin,opencv-python,rapidocr,sqlite-vec",
             "os": "linux",
             "arch": "x86_64,aarch64",
-            "py_version": "312"
+            "py_version": 312
           }
         },
         {
-          "type": "bundler",
-          "allow_binary": "true"
+          "type": "bundler"
         }

Comment on lines 38 to 69
  - name: prefetch-input
    # no source available: torch, faiss-cpu
    # hermeto prefetch problems: uv, pip, jiter, tiktoken,
    # those need cmake to build: pyarrow
    # those need cargo to build: jiter, tiktoken, cryptography, fastuuid, hf_xet, maturin, pydantic_core, rpds_py, safetensors, tokenizers
    # to accelerate build:numpy, scipy, pandas, pillow, scikit_learn
    value: |
      [
        {
          "type": "rpm",
          "path": "."
        },
        {
          "type": "pip",
          "path": ".",
          "requirements_files": [
            "requirements.hashes.wheel.txt",
            "requirements.hashes.source.txt",
            "requirements.hermetic.txt"
          ],
          "requirements_build_files": ["requirements-build.txt"],
          "binary": {
            "packages": "accelerate,aiohappyeyeballs,aiohttp,aiosignal,aiosqlite,annotated-doc,annotated-types,anyio,asyncpg,attrs,beautifulsoup4,cffi,chardet,charset-normalizer,click,colorama,cryptography,dataclasses-json,defusedxml,distro,docling-ibm-models,einops,et-xmlfile,faiss-cpu,filetype,fire,frozenlist,googleapis-common-protos,greenlet,h11,hf-xet,httpcore,httpx,huggingface-hub,idna,jinja2,jiter,joblib,jsonlines,jsonref,jsonschema-specifications,latex2mathml,llama-stack-client,lxml,markdown-it-py,markupsafe,mdurl,mpire,mpmath,multidict,mypy-extensions,nest-asyncio,networkx,nltk,openpyxl,opentelemetry-api,opentelemetry-exporter-otlp-proto-common,opentelemetry-exporter-otlp-proto-http,opentelemetry-proto,opentelemetry-sdk,opentelemetry-semantic-conventions,packaging,pandas,pillow,platformdirs,pluggy,prompt-toolkit,propcache,psycopg2-binary,pyaml,pydantic,pydantic-core,pydantic-settings,pygments,pyjwt,pylatexenc,python-dateutil,python-docx,python-dotenv,python-multipart,python-pptx,pytz,pyyaml,referencing,requests,rich,rpds-py,rtree,safetensors,scikit-learn,scipy,semchunk,sentence-transformers,shapely,shellingham,six,sniffio,starlette,sympy,tabulate,tenacity,threadpoolctl,tiktoken,tokenizers,torch,torchvision,tqdm,transformers,triton,typer,typing-extensions,typing-inspect,typing-inspection,tzdata,wrapt,xlsxwriter,yarl,zipp,uv-build,uv,pip,maturin,opencv-python,rapidocr,sqlite-vec",
            "os": "linux",
            "arch": "x86_64,aarch64",
            "py_version": "312"
          }
        },
        {
          "type": "bundler",
          "allow_binary": "true"
        }

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Hermeto cachi2 pip binary schema py_version integer type

💡 Result:

In Hermeto’s (Cachi2-derived) pip “binary” filter object, py_version is an integer: it must be the Python major and minor combined, e.g. 312 for Python 3.12 (and it’s shown in the JSON input example as py_version: 312). [1]

Example

{
  "type": "pip",
  "binary": { "os": "linux", "arch": "aarch64", "py_version": 312 }
}

[1]

Source(s):
[1] (hermetoproject.github.io)



🌐 Web query:

cachi2 bundler allow_binary parameter support

💡 Result:

allow_binary is not a supported parameter for the bundler package manager in cachi2 (Hermeto). The public docs describe Bundler support only in terms of reading Gemfile.lock/Gemfile and fetching the declared gems; they do not document any allow_binary option for Bundler input/config. [1]

Implication: if you need to control “binary vs source” behavior for Ruby gems, that must be done via Bundler/RubyGems configuration in the build (e.g., Bundler settings like force_ruby_platform), not via a cachi2 bundler/allow_binary parameter. [1]

Sources: [1]


🌐 Web query:

Hermeto prefetch pip schema requirements

💡 Result:

Hermeto’s pip prefetch (“hermeto fetch-deps”) accepts a JSON object with this schema (keys and defaults):

  • type: must be "pip" (required) (hermetoproject.github.io)
  • path: path to the Python package dir, relative to --source (default ".") (hermetoproject.github.io)
  • requirements_files: list of requirement lockfiles relative to path (default ["requirements.txt"], or [] if missing) (hermetoproject.github.io)
  • requirements_build_files: build-requirements lockfiles relative to path (default ["requirements-build.txt"], or [] if missing) (hermetoproject.github.io)
  • binary (optional): wheel-selection filter object (defaults to none; if omitted Hermeto fetches sdists by default): (hermetoproject.github.io)
    • packages (e.g., "tensorflow"), arch (e.g., "x86_64"), os (e.g., "linux"), py_version (e.g., 312) (hermetoproject.github.io)

Requirements file expectations: the requirements*.txt files must be “fully resolved” lockfiles (include transitive deps and pin exact versions). (hermetoproject.github.io)
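A minimal invocation sketch (the command shape is assumed to mirror the cachi2-style CLI; paths and the chosen lockfile are illustrative):

# Prefetch pip dependencies declared in a fully resolved, hash-pinned lockfile
# into ./hermeto-output for a later network-isolated build step.
hermeto fetch-deps \
  --source . \
  --output ./hermeto-output \
  '{"type": "pip", "path": ".", "requirements_files": ["requirements.hashes.wheel.txt"]}'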



Fix Hermeto prefetch schema: change py_version to integer and remove unsupported allow_binary.

The Hermeto pip binary schema requires py_version as an integer (e.g., 312 not "312"), and allow_binary is not a supported parameter for the bundler block.

Proposed fix
           "binary": {
             "packages": "accelerate,aiohappyeyeballs,aiohttp,aiosignal,aiosqlite,annotated-doc,annotated-types,anyio,asyncpg,attrs,beautifulsoup4,cffi,chardet,charset-normalizer,click,colorama,cryptography,dataclasses-json,defusedxml,distro,docling-ibm-models,einops,et-xmlfile,faiss-cpu,filetype,fire,frozenlist,googleapis-common-protos,greenlet,h11,hf-xet,httpcore,httpx,huggingface-hub,idna,jinja2,jiter,joblib,jsonlines,jsonref,jsonschema-specifications,latex2mathml,llama-stack-client,lxml,markdown-it-py,markupsafe,mdurl,mpire,mpmath,multidict,mypy-extensions,nest-asyncio,networkx,nltk,openpyxl,opentelemetry-api,opentelemetry-exporter-otlp-proto-common,opentelemetry-exporter-otlp-proto-http,opentelemetry-proto,opentelemetry-sdk,opentelemetry-semantic-conventions,packaging,pandas,pillow,platformdirs,pluggy,prompt-toolkit,propcache,psycopg2-binary,pyaml,pydantic,pydantic-core,pydantic-settings,pygments,pyjwt,pylatexenc,python-dateutil,python-docx,python-dotenv,python-multipart,python-pptx,pytz,pyyaml,referencing,requests,rich,rpds-py,rtree,safetensors,scikit-learn,scipy,semchunk,sentence-transformers,shapely,shellingham,six,sniffio,starlette,sympy,tabulate,tenacity,threadpoolctl,tiktoken,tokenizers,torch,torchvision,tqdm,transformers,triton,typer,typing-extensions,typing-inspect,typing-inspection,tzdata,wrapt,xlsxwriter,yarl,zipp,uv-build,uv,pip,maturin,opencv-python,rapidocr,sqlite-vec",
             "os": "linux",
             "arch": "x86_64,aarch64",
-            "py_version": "312"
+            "py_version": 312
           }
         },
         {
-          "type": "bundler",
-          "allow_binary": "true"
+          "type": "bundler"
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
  - name: prefetch-input
    # no source available: torch, faiss-cpu
    # hermeto prefetch problems: uv, pip, jiter, tiktoken,
    # those need cmake to build: pyarrow
    # those need cargo to build: jiter, tiktoken, cryptography, fastuuid, hf_xet, maturin, pydantic_core, rpds_py, safetensors, tokenizers
    # to accelerate build:numpy, scipy, pandas, pillow, scikit_learn
    value: |
      [
        {
          "type": "rpm",
          "path": "."
        },
        {
          "type": "pip",
          "path": ".",
          "requirements_files": [
            "requirements.hashes.wheel.txt",
            "requirements.hashes.source.txt",
            "requirements.hermetic.txt"
          ],
          "requirements_build_files": ["requirements-build.txt"],
          "binary": {
            "packages": "accelerate,aiohappyeyeballs,aiohttp,aiosignal,aiosqlite,annotated-doc,annotated-types,anyio,asyncpg,attrs,beautifulsoup4,cffi,chardet,charset-normalizer,click,colorama,cryptography,dataclasses-json,defusedxml,distro,docling-ibm-models,einops,et-xmlfile,faiss-cpu,filetype,fire,frozenlist,googleapis-common-protos,greenlet,h11,hf-xet,httpcore,httpx,huggingface-hub,idna,jinja2,jiter,joblib,jsonlines,jsonref,jsonschema-specifications,latex2mathml,llama-stack-client,lxml,markdown-it-py,markupsafe,mdurl,mpire,mpmath,multidict,mypy-extensions,nest-asyncio,networkx,nltk,openpyxl,opentelemetry-api,opentelemetry-exporter-otlp-proto-common,opentelemetry-exporter-otlp-proto-http,opentelemetry-proto,opentelemetry-sdk,opentelemetry-semantic-conventions,packaging,pandas,pillow,platformdirs,pluggy,prompt-toolkit,propcache,psycopg2-binary,pyaml,pydantic,pydantic-core,pydantic-settings,pygments,pyjwt,pylatexenc,python-dateutil,python-docx,python-dotenv,python-multipart,python-pptx,pytz,pyyaml,referencing,requests,rich,rpds-py,rtree,safetensors,scikit-learn,scipy,semchunk,sentence-transformers,shapely,shellingham,six,sniffio,starlette,sympy,tabulate,tenacity,threadpoolctl,tiktoken,tokenizers,torch,torchvision,tqdm,transformers,triton,typer,typing-extensions,typing-inspect,typing-inspection,tzdata,wrapt,xlsxwriter,yarl,zipp,uv-build,uv,pip,maturin,opencv-python,rapidocr,sqlite-vec",
            "os": "linux",
            "arch": "x86_64,aarch64",
-            "py_version": "312"
+            "py_version": 312
          }
        },
        {
-          "type": "bundler",
-          "allow_binary": "true"
+          "type": "bundler"
        }
🤖 Prompt for AI Agents
In @.tekton/rag-tool-pull-request.yaml around lines 38-69, the prefetch-input
pip binary schema uses a string for "py_version" and an unsupported bundler
field "allow_binary"; update the pip block's "py_version" value from the quoted
string to an integer (py_version: 312) and remove the "allow_binary" key from
the bundler block (refer to the "prefetch-input" object, the "py_version" field
inside the pip -> binary map, and the "bundler" object) so the schema matches
Hermeto's expected types.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@scripts/konflux_requirements.sh`:
- Around line 71-77: The package extraction can still include index-url lines
when they have leading whitespace; update the wheel_packages pipeline to
robustly ignore commented and dash-prefixed lines by changing the grep to
exclude lines that start with optional whitespace followed by '#' or '-' (e.g.,
use grep -v '^[[:space:]]*[-#]'), keeping the rest of the pipeline (sed
's/==.*//' | tr '\n' ',' | sed 's/,$//') and then continue using wheel_packages,
EXTRA_WHEELS, PYPI_WHEELS and the two sed replacements that update
.tekton/rag-tool-pull-request.yaml and .tekton/rag-tool-push.yaml.
♻️ Duplicate comments (5)
scripts/konflux_requirements.sh (2)

83-86: Guard against empty PyPI wheel list before compiling hashes.

If none of the PYPI_WHEELS entries are present in the compiled requirements, $WHEEL_FILE_PYPI will contain only comments. uv pip compile will fail with "No requirements specified," breaking the script under set -e.

🔧 Proposed fix
-uv pip compile "$WHEEL_FILE_PYPI" --refresh --generate-hashes --python-version 3.12 --emit-index-url --no-deps --no-annotate > "$WHEEL_HASH_FILE_PYPI"
+if grep -qEv '^[[:space:]]*($|#)' "$WHEEL_FILE_PYPI"; then
+  uv pip compile "$WHEEL_FILE_PYPI" --refresh --generate-hashes --python-version 3.12 --emit-index-url --no-deps --no-annotate > "$WHEEL_HASH_FILE_PYPI"
+else
+  echo "# No PyPI wheel requirements" > "$WHEEL_HASH_FILE_PYPI"
+fi
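A quick way to see what the guard treats as empty (hypothetical scratch file):

# A file containing only comments and blank lines has no effective requirements,
# so the grep finds nothing and the else-branch writes the placeholder comment.
printf '# only comments\n\n' > /tmp/wheels-pypi.txt
if grep -qEv '^[[:space:]]*($|#)' /tmp/wheels-pypi.txt; then
  echo "has requirements"
else
  echo "empty"   # this branch is taken
fi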

89-90: Make the maturin pin resilient to non-numeric version suffixes.

The current regex [0-9.]* only matches digits and dots. If the resolved version includes a suffix (e.g., .post1, rc1), the substitution will produce an invalid version string.

🔧 Proposed fix
-sed -i 's/maturin==[0-9.]*/maturin==1.10.2/' "$BUILD_FILE"
+sed -i 's/maturin==[^[:space:]]*/maturin==1.10.2/' "$BUILD_FILE"
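For example, on a hypothetical resolved line:

echo "maturin==1.9.0.post1" | sed 's/maturin==[0-9.]*/maturin==1.10.2/'
# -> maturin==1.10.2post1   (broken pin: the suffix survives)
echo "maturin==1.9.0.post1" | sed 's/maturin==[^[:space:]]*/maturin==1.10.2/'
# -> maturin==1.10.2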
pyproject.toml (1)

48-51: Stale comment: version pairing no longer matches code.

Line 49 states "torch 2.5.x pairs with torchvision 0.20.x" but the actual pins are now torch>=2.8.0,<2.9.0 and torchvision>=0.23.0,<0.24.0. Update the comment to reflect the current pairing.

📝 Proposed fix
     # Pin torch/torchvision to versions available as CPU wheels
-    # torch 2.5.x pairs with torchvision 0.20.x
+    # torch 2.8.x pairs with torchvision 0.23.x
     "torch>=2.8.0,<2.9.0",
     "torchvision>=0.23.0,<0.24.0",
.tekton/rag-tool-push.yaml (1)

62-65: Remove unsupported allow_binary field from bundler configuration.

The allow_binary field is not part of Hermeto's bundler package manager schema. This field should be removed to avoid prefetch failures.

🔧 Proposed fix
         {
           "type": "bundler",
-          "allow_binary": "true"
+          "path": "."
         }
Containerfile (1)

10-11: Quote the version spec to avoid shell redirection.

The > character in uv>=0.7.20 will be interpreted by the shell as output redirection, creating a file named =0.7.20 and installing uv without version constraints.

🐛 Proposed fix
 # Install uv package manager
-RUN pip3.12 install uv>=0.7.20
+RUN pip3.12 install "uv>=0.7.20"
🧹 Nitpick comments (4)
scripts/konflux_requirements.sh (2)

56-58: Package name extraction may fail for packages with extras.

The sed pattern s/[=<>!].*// won't correctly extract the package name if the requirement includes extras (e.g., package[extra]==1.0): the brackets would remain in the extracted name.

♻️ Proposed fix
-                package_name=$(echo "$current_package" | sed 's/[=<>!].*//')
+                package_name=$(echo "$current_package" | sed 's/\[.*\]//; s/[=<>!].*//')
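A quick check of the two-step sed on hypothetical requirement strings:

# Strip any [extras] first, then drop the version specifier.
echo "package[extra]==1.0" | sed 's/\[.*\]//; s/[=<>!].*//'   # -> package
echo "torch>=2.8.0,<2.9.0" | sed 's/\[.*\]//; s/[=<>!].*//'   # -> torch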

96-99: Version counting regex misses versions with suffixes.

The grep -Eo '==[0-9.]+' pattern won't match versions like ==1.0.post1 or ==2.0rc1, leading to inaccurate package counts in the summary output.

♻️ Proposed fix
-echo "Packages from pypi.org written to: $SOURCE_HASH_FILE ($( grep -Eo '==[0-9.]+' "$SOURCE_HASH_FILE" | wc -l) packages)"
-echo "Packages from console.redhat.com written to: $WHEEL_HASH_FILE ($(grep -Eo '==[0-9.]+' "$WHEEL_HASH_FILE" | wc -l) packages)"
-echo "Packages from pypi.org (wheels) written to: $WHEEL_HASH_FILE_PYPI ($(grep -Eo '==[0-9.]+' "$WHEEL_HASH_FILE_PYPI" | wc -l) packages)"
-echo "Build dependencies written to: $BUILD_FILE ($(grep -Eo '==[0-9.]+' "$BUILD_FILE" | wc -l) packages)"
+echo "Packages from pypi.org written to: $SOURCE_HASH_FILE ($( grep -Eo '==[^[:space:]]+' "$SOURCE_HASH_FILE" | wc -l) packages)"
+echo "Packages from console.redhat.com written to: $WHEEL_HASH_FILE ($(grep -Eo '==[^[:space:]]+' "$WHEEL_HASH_FILE" | wc -l) packages)"
+echo "Packages from pypi.org (wheels) written to: $WHEEL_HASH_FILE_PYPI ($(grep -Eo '==[^[:space:]]+' "$WHEEL_HASH_FILE_PYPI" | wc -l) packages)"
+echo "Build dependencies written to: $BUILD_FILE ($(grep -Eo '==[^[:space:]]+' "$BUILD_FILE" | wc -l) packages)"
.tekton/rag-tool-push.yaml (1)

27-29: Platform mismatch between build-platforms and binary.arch.

build-platforms specifies only linux/x86_64, but binary.arch includes x86_64,aarch64. This could cause prefetch to download aarch64 wheels that won't be used during the build, increasing artifact size unnecessarily.

Consider either:

  1. Adding linux/aarch64 to build-platforms if multi-arch builds are intended
  2. Removing aarch64 from binary.arch if only x86_64 builds are needed
♻️ Option 2: Align arch with build-platforms
           "binary": {
             "packages": "accelerate,aiohappyeyeballs,...",
             "os": "linux",
-            "arch": "x86_64,aarch64",
+            "arch": "x86_64",
             "py_version": 312
           }

Also applies to: 55-59
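For reference, a sketch of option 1, assuming the platform naming used for the ARM64 builder elsewhere in this PR (linux-c6gd2xlarge/arm64):

  - name: build-platforms
    value:
    - linux/x86_64
    - linux-c6gd2xlarge/arm64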

requirements.hermetic.txt (1)

1-4: Inconsistent version pinning strategy.

uv-build, uv, and pip use exact pins (==), but ninja uses a lower bound (>=1.5). For fully reproducible hermetic builds, consider pinning ninja to an exact version.

♻️ Proposed fix
 uv-build==0.9.26
 uv==0.9.26
 pip==25.3
-ninja>=1.5
+ninja==1.13.0

Comment on lines +71 to +77
# replace the list of binary packages in konflux pipeline configuration
# only the package names, not the versions, delimited by commas
wheel_packages=$(grep -v "^[#-]" "$WHEEL_FILE" | sed 's/==.*//' | tr '\n' ',' | sed 's/,$//')
# append extra wheels to the list
wheel_packages="$wheel_packages,$EXTRA_WHEELS,$PYPI_WHEELS"
sed -i 's/"packages": "[^"]*"/"packages": "'"$wheel_packages"'"/' .tekton/rag-tool-pull-request.yaml
sed -i 's/"packages": "[^"]*"/"packages": "'"$wheel_packages"'"/' .tekton/rag-tool-push.yaml

⚠️ Potential issue | 🟡 Minor

Package extraction may include option or blank lines.

The grep -v "^[#-]" pattern only filters lines whose first character is # or -, so option and continuation lines that begin with whitespace (for example indented --index-url or --hash entries) and blank lines slip through and end up in the comma-separated package list.

🔧 Proposed fix
-wheel_packages=$(grep -v "^[#-]" "$WHEEL_FILE" | sed 's/==.*//' | tr '\n' ',' | sed 's/,$//')
+wheel_packages=$(grep -Ev '^[[:space:]]*(#|--|$)' "$WHEEL_FILE" | sed 's/==.*//' | tr '\n' ',' | sed 's/,$//')
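Shown on hypothetical compiled-requirements content:

# The stricter filter drops comments, option lines such as --index-url, indented
# --hash continuations, and blank lines, keeping only real requirement entries.
printf '%s\n' '# header' '--index-url https://example.invalid/simple' \
  '    --hash=sha256:0000' '' 'numpy==2.1.0' \
  | grep -Ev '^[[:space:]]*(#|--|$)'
# -> numpy==2.1.0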
🤖 Prompt for AI Agents
In `@scripts/konflux_requirements.sh` around lines 71-77, the package extraction
can still include index-url lines when they have leading whitespace; update the
wheel_packages pipeline to robustly ignore commented and dash-prefixed lines by
changing the grep to exclude lines that start with optional whitespace followed
by '#' or '-' (e.g., use grep -v '^[[:space:]]*[-#]'), keeping the rest of the
pipeline (sed 's/==.*//' | tr '\n' ',' | sed 's/,$//') and then continue using
wheel_packages, EXTRA_WHEELS, PYPI_WHEELS and the two sed replacements that
update .tekton/rag-tool-pull-request.yaml and .tekton/rag-tool-push.yaml.
