@frobware frobware commented Jan 6, 2026

This replaces the current init container approach, which uses a fedora-minimal image for its findmnt and mount utilities, with a --mount-bpffs mode built into bpfman-agent.

The findmnt solution introduced in #477 correctly fixed the overlay mount problem by checking the filesystem type rather than grepping for a specific source name. This PR preserves that fix while eliminating the external image dependency.

Why change?

The current approach has two drawbacks:

  1. Extra image pull - we pull fedora-minimal just for findmnt and mount, even though the agent image is already being pulled
  2. Minimal base image compatibility - images like ubi9-minimal don't include these utilities, which matters for downstream distributions

By moving this functionality into bpfman-agent itself, downstream distributions no longer need a custom init container image, and we don't need to parameterise the init container image reference.

How it works

The agent gains a --mount-bpffs flag that:

  1. Parses /proc/self/mountinfo to check for existing bpf mounts (using the same approach as libmount from util-linux)
  2. Calls syscall.Mount if needed

This is functionally equivalent to:

if ! findmnt --noheadings --types bpf /sys/fs/bpf; then
  mount bpffs /sys/fs/bpf -t bpf
fi

The init container now just runs:

command: ["/bpfman-agent", "--mount-bpffs"]

Testing

A --mount-bpffs-remount flag is also available for testing: it unmounts bpffs first if it is already mounted, then mounts it again, so the mount code path is always exercised.

Tested on a kind cluster, with the daemon pod starting and running successfully.

@frobware frobware requested a review from dave-tucker January 6, 2026 13:23
@frobware frobware mentioned this pull request Jan 6, 2026
@frobware frobware force-pushed the agent-mount-bpffs-v2 branch from f00d8ba to 0076d94 Compare January 6, 2026 14:45
dave-tucker previously approved these changes Jan 9, 2026

mergify bot commented Jan 9, 2026

@frobware, this pull request is now in conflict and requires a rebase.

@mergify mergify bot added the needs-rebase label Jan 9, 2026

codecov bot commented Jan 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0.00%. Comparing base (c1a9276) to head (0076d94).
⚠️ Report is 12 commits behind head on main.

The BpffsInitImage field in DaemonSpec allowed overriding the init
container image that mounts bpffs. This configurability is no longer
required as the init container will use the agent image directly in
subsequent changes.

Remove the field from the API, the corresponding override logic from
the controller, and update the tests to only cover CSI registrar image
overrides.

Signed-off-by: Andrew McDermott <amcdermo@redhat.com>

Regenerate the Config CRD and OLM bundle manifests following the
removal of the BpffsInitImage field from the DaemonSpec API.

Signed-off-by: Andrew McDermott <amcdermo@redhat.com>

The init container that ensures bpffs is mounted before the main
containers start previously used a fedora-minimal image with findmnt
and mount utilities. This had two drawbacks: pulling an additional
container image on startup, and incompatibility with minimal base
images like ubi9-minimal that lack these utilities. By moving this
functionality into the agent, downstream bpfman distributions no
longer need a custom init container image, and the operator does not
need to parameterise the init container image reference.

Extend bpfman-agent with a --mount-bpffs flag to handle this directly.
The agent now parses /proc/self/mountinfo to check for existing bpf
mounts (matching libmount's parsing approach from util-linux) and uses
syscall.Mount when needed. This eliminates the external image
dependency since the agent image is already pulled for the main
container.

Add internal/bpffs package with:
- IsMounted: parse mountinfo to detect bpf mounts at a given path
- Mount: create bpffs mount, creating the directory if needed
- Unmount: remove a bpffs mount
- EnsureMounted: idempotent helper for the common check-then-mount pattern

The --mount-bpffs-remount flag forces a fresh mount by unmounting
first if already mounted, useful for testing the mount code path.

Both modes handle the race where another process might mount between
check and mount by treating mount failures as success if the
filesystem is now mounted.

Signed-off-by: Andrew McDermott <amcdermo@redhat.com>

The configureBpfmanDs function was only updating images for the main
containers, leaving the mount-bpffs init container with a hardcoded
image reference. This caused CI failures because the init container
would use quay.io/bpfman/bpfman-agent:latest instead of the freshly
built test image, meaning the --mount-bpffs functionality was not
available.

Add logic to update the init container image from
config.Spec.Agent.Image, consistent with how the bpfman-agent
container image is configured.
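
A minimal sketch of the idea, using simplified stand-in types instead of
the real Kubernetes structs so that it runs on its own. The types and the
setAgentImage helper are hypothetical and do not reproduce
configureBpfmanDs; only the container names and image references come from
this change.

```go
package main

import "fmt"

// container and podSpec are simplified stand-ins for the Kubernetes types.
type container struct {
	Name  string
	Image string
}

type podSpec struct {
	InitContainers []container
	Containers     []container
}

// setAgentImage applies the configured agent image to both the mount-bpffs
// init container and the bpfman-agent container, so neither is left with a
// hardcoded reference.
func setAgentImage(spec *podSpec, agentImage string) {
	for i := range spec.InitContainers {
		if spec.InitContainers[i].Name == "mount-bpffs" {
			spec.InitContainers[i].Image = agentImage
		}
	}
	for i := range spec.Containers {
		if spec.Containers[i].Name == "bpfman-agent" {
			spec.Containers[i].Image = agentImage
		}
	}
}

func main() {
	spec := podSpec{
		InitContainers: []container{{Name: "mount-bpffs", Image: "quay.io/bpfman/bpfman-agent:latest"}},
		Containers:     []container{{Name: "bpfman-agent", Image: "quay.io/bpfman/bpfman-agent:latest"}},
	}
	setAgentImage(&spec, "quay.io/bpfman/bpfman-agent:int-test")
	fmt.Println(spec.InitContainers[0].Image)
	fmt.Println(spec.Containers[0].Image)
}
```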

Signed-off-by: Andrew McDermott <amcdermo@redhat.com>

@frobware
Contributor Author

/hold

This PR layers on top of #491 which removes the bpffsInitImage configuration option. #491 needs to merge first.

@mergify mergify bot removed the needs-rebase label Jan 12, 2026

The lifecycle test was failing because it used hardcoded :latest image
tags when creating a new Config, but CI only loads :int-test images
into the kind cluster. The init container now uses the agent image
from the Config, so when the test created a Config with :latest
images, the pods would fail to start.

Read image tags from environment variables (BPFMAN_IMG,
BPFMAN_AGENT_IMG) with fallback to :latest defaults for local testing.
This aligns with how the integration tests handle images.

Also fix the error formatting in waitUntilCondition, which was using
ctx.Err() instead of timeoutCTX.Err(), producing malformed error
messages like "%!w(<nil>)" when the timeout fired.

Signed-off-by: Andrew McDermott <amcdermo@redhat.com>

@frobware frobware force-pushed the agent-mount-bpffs-v2 branch from e8515ec to 093ba4b Compare January 12, 2026 11:11