Skip to content

Conversation

@edolstra
Copy link
Collaborator

@edolstra edolstra commented Jan 21, 2026

Motivation

This is an updated version of upstream NixOS#11749.

Nix historically has been bad at being able to answer the question "where did this store path come from", i.e. to provide traceability from a store path back to the Nix expression from which is was built. Nix tracks the "deriver" of a store path (the .drv file that built it) but that's pretty useless in practice, since it doesn't link back to the Nix expressions.

So this PR adds a "provenance" field (a JSON object) to the ValidPaths table and to .narinfo files that describes where the store path came from and how it can be reproduced.

There are currently the following types of provenance:

  • copied: Records that the store path was copied or substituted from another store (typically a binary cache). Its "from" field is the URL of the origin store. Its "provenance" field propagates the provenance of the store path on the origin store.

  • build: Records that the store path was produced by building a derivation. This is equivalent for the "deriver" field, but it has a nested "provenance" field that records how the .drv file was created.

  • tree: The store path is the result of a call to fetchTree (e.g. it's a flake or flake input). It includes the fetcher attributes.

  • subpath: The store path was created by copying the subpath of some other provenance (e.g. /foo/builder.sh from a tree) .

  • flake: Records that the store path was created during the evaluation of a flake output.

  • fetchurl: The store path is the result of a builtins.fetchurl call.

Context

Summary by CodeRabbit

  • New Features

    • Experimental provenance tracking: origin metadata is now propagated and exposed across fetches, builds, store operations, flakes, and worker protocol exchanges.
    • Build results, stored paths, fetch accessors, and flake outputs can carry provenance when the feature is enabled.
  • Documentation

    • Store object schema extended with an optional provenance field to carry provenance JSON.
  • Bug Fixes

    • Flake prefetch output now preserves the "__final" attribute.
  • Tests

    • New functional test validating provenance propagation and substitution.

@coderabbitai
Copy link

coderabbitai bot commented Jan 21, 2026

📝 Walkthrough

Walkthrough

Adds a provenance subsystem and threads provenance metadata across fetchers, source accessors, flakes/eval, store write/copy flows, worker/daemon protocol, DB schema, and JSON schemas to record and transmit origin information.

Changes

Cohort / File(s) Summary
Schema
doc/manual/source/protocols/json/schema/store-object-info-v2.yaml
Added optional provenance (null
Provenance core & registry
src/libutil/include/nix/util/provenance.hh, src/libutil/provenance.cc
New Provenance abstraction, registry, Unknown/SubpathProvenance, JSON (de)serializers, factory/registration helpers.
Source accessor surface
src/libutil/include/nix/util/source-accessor.hh, src/libutil/source-accessor.cc, src/libutil/include/nix/util/source-path.hh, src/libutil/mounted-source-accessor.cc, src/libutil/union-source-accessor.cc
Added provenance member and getProvenance API; SourcePath exposes getProvenance; accessors wrap/forward provenance.
Fetchers & fetch provenance
src/libfetchers/include/nix/fetchers/provenance.hh, src/libfetchers/provenance.cc, src/libfetchers/tarball.cc, src/libfetchers/fetchers.cc, src/libfetchers/filtering-source-accessor.{cc,hh}
New TreeProvenance/FetchurlProvenance; assign provenance to accessors post-fetch/substitution; FilteringSourceAccessor delegates/overrides provenance.
Eval / Flake integration
src/libexpr/include/nix/expr/eval.hh, src/libexpr/eval.cc, src/libcmd/include/nix/cmd/installable-flake.hh, src/libcmd/installable-flake.cc, src/libflake/include/nix/flake/provenance.hh, src/libflake/provenance.cc, src/libflake/flake.cc
EvalState stores rootProvenance with accessors; InstallableFlake::makeProvenance added/used; Flake holds provenance; FlakeProvenance implemented.
Async writer & derivation plumbing
src/libstore/include/nix/store/async-path-writer.hh, src/libstore/async-path-writer.cc, src/libstore/derivations.cc, src/libstore/include/nix/store/derivations.hh, src/libexpr/primops.cc
addPath/writeDerivation/addToStoreFromDump signatures extended to accept provenance; async queue items carry provenance; write calls forward provenance.
Store API, PathInfo & provenance types
src/libstore/include/nix/store/store-api.hh, src/libstore/include/nix/store/path-info.hh, src/libstore/include/nix/store/provenance.hh, src/libstore/provenance.cc
Store API gains provenance params and includeInProvenance(); ValidPathInfo gains provenance; new BuildProvenance/CopiedProvenance types and JSON handling.
Store implementations & DB
src/libstore/local-store.cc, src/libstore/include/nix/store/local-store.hh, src/libstore/remote-store.cc, src/libstore/include/nix/store/remote-store.hh, src/libstore/restricted-store.cc, src/libstore/dummy-store.cc, src/libstore/binary-cache-store.cc, src/libstore/include/nix/store/binary-cache-store.hh, src/libstore/ssh-store.cc, src/libstore/include/nix/store/legacy-ssh-store.hh
Threaded provenance through addToStoreFromDump/addCAToStore/copy paths; LocalStore DB schema and queries extended to store provenance; remote handshake/transmit updated.
Copy semantics & store-api logic
src/libstore/store-api.cc
copyStorePath now returns shared ValidPathInfo and wraps provenance via CopiedProvenance for copied paths.
Build/substitution wiring
src/libstore/build/derivation-building-goal.cc, src/libstore/include/nix/store/build/derivation-builder.hh, src/libstore/include/nix/store/build/derivation-building-goal.hh, src/libstore/build/substitution-goal.cc, src/libstore/include/nix/store/build/substitution-goal.hh, src/libstore/unix/build/derivation-builder.cc
Derivation provenance threaded into DerivationBuilderParams/BuildResult; substitution and async copy flows return/forward provenance (promise type changes).
Daemon, worker protocol & RPC
src/libstore/daemon.cc, src/libstore/include/nix/store/worker-protocol.hh, src/libstore/include/nix/store/worker-protocol-connection.hh, src/libstore/worker-protocol.cc, src/libstore/worker-protocol-connection.cc
WorkerProto featureProvenance added; handshake filters/propogates provenance capability; UnkeyedValidPathInfo wire format conditionally includes provenance.
NarInfo, PathInfo, BuildResult JSON & feature flag
src/libstore/nar-info.cc, src/libstore/path-info.cc, src/libstore/build-result.cc, src/libutil/experimental-features.cc, src/libutil/include/nix/util/experimental-features.hh
NarInfo/PathInfo/BuildResult JSON extended to include provenance when experimental feature enabled; ExperimentalFeature::Provenance added.
Docs/tests/build files
tests/functional/flakes/provenance.sh, tests/functional/common/init.sh, various meson.build and header lists
New functional provenance test; tests use expanded experimental-features env var; meson/header lists updated to include provenance sources/headers. Check test assumptions and meson exports.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Daemon
  participant Store
  participant Provenance

  Client->>Daemon: AddToStore(data, optional provenance JSON)
  Daemon->>Store: addToStoreFromDump(..., provenance?)
  Note right of Store: create ValidPathInfo\ninfo.provenance = provenance
  Store->>Provenance: possibly wrap/chain (Copied/Build) provenance
  Store-->Daemon: result (path + provenance if negotiated)
  Daemon-->Client: response (path + provenance if negotiated)
Loading
sequenceDiagram
  participant Fetcher
  participant Input
  participant Accessor
  participant Store
  participant Provenance

  Fetcher->>Input: fetchToStore(...)
  Input-->>Accessor: return accessor (with attrs)
  Accessor->>Accessor: accessor.provenance = TreeProvenance(input.attrs)
  Fetcher->>Store: addToStoreFromDump(..., provenance = accessor.getProvenance(path))
  Store->>Provenance: record BuildProvenance/CopiedProvenance chain
  Store-->Fetcher: store path (with provenance)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

flake-regression-test

Suggested reviewers

  • cole-h
  • grahamc

Poem

"I hopped through code with a nibble and grin,
Breadcrumbs of origin tucked safely within.
From fetch to store and flake to dish,
Provenance trails fulfill every wish.
A rabbit cheers for metadata wins!" 🐇

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.92% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'Provenance' directly describes the main feature being added: comprehensive provenance tracking for store paths throughout the Nix ecosystem.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch provenance-detsys

Important

Action Needed: IP Allowlist Update

If your organization protects your Git platform with IP whitelisting, please add the new CodeRabbit IP address to your allowlist:

  • 136.113.208.247/32 (new)
  • 34.170.211.100/32
  • 35.222.179.152/32

Reviews will stop working after February 8, 2026 if the new IP is not added to your allowlist.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/libstore/dummy-store.cc (1)

235-304: Provenance parameter is accepted but not used in addToStoreFromDump.

The provenance parameter is added to the signature but is never used in the implementation. The ValidPathInfo created on lines 276–288 and stored in contents doesn't include the provenance. Other stores handle this correctly: BinaryCacheStore::addToStoreFromDump (line 382) and LocalStore::addToStoreFromDump (line 1331) both assign info.provenance = provenance.

Add the provenance assignment:

Suggested fix
         auto info = ValidPathInfo::makeFromCA(
             *this,
             name,
             ContentAddressWithReferences::fromParts(
                 hashMethod,
                 std::move(hash),
                 {
                     .others = references,
                     // caller is not capable of creating a self-reference, because
                     // this is content-addressed without modulus
                     .self = false,
                 }),
             std::move(narHash.first));

         info.narSize = narHash.second.value();
+        info.provenance = std::move(provenance);
🤖 Fix all issues with AI agents
In `@src/libexpr/include/nix/expr/eval.hh`:
- Around line 1134-1147: Members are declared in the wrong order which breaks
the intended destruction order: move std::shared_ptr<const Provenance>
rootProvenance so it is declared before ref<Executor> executor (or alternatively
move ref<Executor> executor to be the final data member) to ensure executor is
destroyed last; update the declarations around the class fields that contain
rootProvenance, executor and the public methods
setRootProvenance/getRootProvenance accordingly so destructor ordering is
preserved.

In `@src/libfetchers/filtering-source-accessor.cc`:
- Around line 71-76: FilteringSourceAccessor::getProvenance currently bypasses
checkAccess() and the SubpathProvenance wrapping from
SourceAccessor::getProvenance, so update this method to first call
checkAccess(path) (or the appropriate access-checking call used by
SourceAccessor), then if the local provenance member exists return it, otherwise
obtain the child provenance via next->getProvenance(prefix / path) and wrap that
result in a SubpathProvenance constructed with the returned provenance and the
prefix before returning; also add an `#include` for the SubpathProvenance header
if not already present.

In `@src/libfetchers/tarball.cc`:
- Around line 106-109: The code currently stores the raw URL in provenance
(info.provenance = std::make_shared<FetchurlProvenance>(url)) which can leak
credentials or signed tokens; before creating FetchurlProvenance and calling
store.addToStore(info, source, ...), sanitize/redact the URL (remove userinfo
and sensitive query params) using the existing redaction helper if available (or
add a small helper that strips user:pass and known signed token query keys),
then pass the redacted string into FetchurlProvenance instead of the original
url so no secrets are persisted in narinfo/metadata.
- Around line 12-34: Add a Provenance::Register for the FetchurlProvenance type
so deserialization of {"type":"fetchurl"} returns a FetchurlProvenance instance:
register a Provenance::Register (e.g. registerFetchurlProvenance) that uses
getObject(json), valueAt(obj, "url"), getString(...) and
make_ref<FetchurlProvenance>(...) to construct the object; place this
registration in this file (or src/libfetchers/provenance.cc) so the type is
recognized at load time. Also sanitize/redact sensitive components of the url
before storing/serializing in FetchurlProvenance::to_json (e.g. strip
credentials or tokens) to avoid leaking credentials in provenance.

In `@src/libflake/include/nix/flake/flake.hh`:
- Around line 98-101: Flake::~Flake() in flake.cc needs the complete type for
Provenance because it destroys the std::shared_ptr<const Provenance> member
provenance; add `#include` "nix/flake/provenance.hh" to src/libflake/flake.cc so
the Provenance definition is visible when instantiating the destructor for Flake
(referencing the provenance member and the Flake::~Flake() destructor).

In `@src/libstore/include/nix/store/path-info.hh`:
- Around line 127-132: The header now exposes std::shared_ptr via the member
"provenance" (type std::shared_ptr<const Provenance>), so add a direct include
for <memory> at the top of this header to avoid relying on transitive includes;
ensure the include is placed with the other standard headers and before any code
that references std::shared_ptr/Provenance.

In `@src/libstore/include/nix/store/provenance.hh`:
- Around line 9-36: DerivationProvenance is missing a from_json registration so
Provenance::from_json() cannot reconstruct derivation nodes; add a registration
call analogous to CopiedProvenance using Provenance::Register to register
"derivation" (or the existing type tag) with a factory that constructs a
DerivationProvenance from JSON, ensuring the registration happens during module
init (the same translation unit where other registrations like CopiedProvenance
are registered) so Provenance::from_json can deserialize DerivationProvenance
objects.

In `@src/libstore/nar-info.cc`:
- Around line 134-136: The text serialization in NarInfo (in
src/libstore/nar-info.cc) appends provenance unconditionally; update the
NarInfo::to_string (or the function that builds `res`) to only append
"Provenance: " + provenance->to_json_str() when
experimentalFeatureSettings.isEnabled(Xp::Provenance) is true, mirroring the
gating used during parsing and the UnkeyedValidPathInfo::toJSON() behavior;
locate the block that currently does `if (provenance) res += "Provenance: " +
provenance->to_json_str() + "\n";` and change it to check the feature flag
before using `provenance`.

In `@src/libutil/experimental-features.cc`:
- Around line 323-330: The provenance experimental-feature entry currently has
an empty .trackingUrl which leads to a broken link in documentation; update the
documentation generation (documentExperimentalFeatures) to check
feature.trackingUrl (or equivalent field) and only emit the tracking/issue link
when it is non-empty, or alternatively set a real URL for the Xp::Provenance
entry; reference the Xp::Provenance feature record (name "provenance") and the
trackingUrl field when making the change so the link is suppressed for empty
values.
🧹 Nitpick comments (5)
src/libstore/include/nix/store/remote-store.hh (1)

91-99: Consider restoring default arguments for RemoteStore callers.

C++ default args aren’t inherited, so removing them here can break direct RemoteStore usages even if Store still provides defaults. If this isn’t intentional, consider mirroring the base defaults to preserve API ergonomics.

♻️ Proposed adjustment
-        FileSerialisationMethod dumpMethod,
-        ContentAddressMethod hashMethod,
-        HashAlgorithm hashAlgo,
-        const StorePathSet & references,
-        RepairFlag repair,
-        std::shared_ptr<const Provenance> provenance) override;
+        FileSerialisationMethod dumpMethod = FileSerialisationMethod::NixArchive,
+        ContentAddressMethod hashMethod = ContentAddressMethod::Raw::NixArchive,
+        HashAlgorithm hashAlgo = HashAlgorithm::SHA256,
+        const StorePathSet & references = StorePathSet(),
+        RepairFlag repair = NoRepair,
+        std::shared_ptr<const Provenance> provenance = nullptr) override;
src/libstore/provenance.cc (1)

6-25: Inconsistent next field serialization between provenance types.

DerivationProvenance::to_json() always includes the "next" field (as null when absent), while CopiedProvenance::to_json() only includes it when present. This inconsistency may cause confusion or issues during deserialization.

Consider making the serialization consistent. Based on the FlakeProvenance pattern from src/libflake/provenance.cc, which always emits "next", aligning with that approach would be more consistent:

♻️ Suggested fix for consistency
 nlohmann::json CopiedProvenance::to_json() const
 {
-    nlohmann::json j{
+    return nlohmann::json{
         {"type", "copied"},
         {"from", from},
+        {"next", next ? next->to_json() : nlohmann::json(nullptr)},
     };
-    if (next)
-        j["next"] = next->to_json();
-    return j;
 }
src/libstore/build/derivation-building-goal.cc (1)

443-447: Consider capturing provenance for the AlreadyValid shortcut.
Provenance is fetched after the early return at Line 349, so AlreadyValid results will report null provenance even when it exists. If you want consistent provenance in build results, consider querying it before that shortcut or in the AlreadyValid branch.

src/libstore/store-api.cc (1)

888-895: Consider returning existing path info on the early-valid shortcut.
With the new nullptr return when the destination already has the path (Line 893–895), callers that want provenance (or other info) need an extra query. If that data is important in the “already present” case, consider returning dstStore.queryPathInfo(storePath) or explicitly documenting the contract.

Also applies to: 907-942

src/libutil/include/nix/util/provenance.hh (1)

12-38: Add a virtual destructor to Provenance as best practice.
While the base class is polymorphic and owns derived types through std::shared_ptr (which correctly captures concrete type deletes at construction), explicitly marking a polymorphic base class with a virtual destructor clarifies intent and ensures safety if the code evolves. This follows standard C++ polymorphism conventions.

Proposed fix
 struct Provenance
 {
+    virtual ~Provenance() = default;
     static ref<const Provenance> from_json_str(std::string_view);

Comment on lines +12 to +34
#include "nix/util/provenance.hh"

#include <nlohmann/json.hpp>

namespace nix::fetchers {

struct FetchurlProvenance : Provenance
{
std::string url;

FetchurlProvenance(const std::string & url)
: url(url)
{
}

nlohmann::json to_json() const override
{
return nlohmann::json{
{"type", "fetchurl"},
{"url", url},
};
}
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for "fetchurl" references in the codebase
echo "=== Searching for 'fetchurl' ==="
rg -n "fetchurl" --type-list | head -5
rg -n "fetchurl" 2>/dev/null | head -20

echo -e "\n=== Searching for Provenance deserialization patterns ==="
rg -n "from_json|Provenance::from_json" 2>/dev/null | head -20

echo -e "\n=== Searching for type dispatch/registry in provenance ==="
rg -n "\"type\"" src/libutil src/libfetchers src/libstore 2>/dev/null | grep -i provenance | head -20

Repository: DeterminateSystems/nix-src

Length of output: 4928


🏁 Script executed:

#!/bin/bash
# Look for Provenance class definition and from_json implementation
echo "=== Finding Provenance class definition ==="
fd -e "\.hh" -e "\.cc" | xargs rg -l "class Provenance" 2>/dev/null

echo -e "\n=== Search for from_json in provenance files ==="
fd -e "\.hh" -e "\.cc" | xargs rg -A 10 "from_json.*Provenance" 2>/dev/null | head -40

Repository: DeterminateSystems/nix-src

Length of output: 163


🏁 Script executed:

#!/bin/bash
# Look for JSON type dispatching for provenance
echo "=== Searching for provenance type mapping ==="
rg -n "type.*provenance\|provenance.*type" -i src/ 2>/dev/null | head -20

echo -e "\n=== Checking for nlohmann json deserialization ==="
rg -n "nlohmann::json\|\.get\(\)" src/libfetchers src/libutil -A 2 2>/dev/null | grep -A 2 "type\|fetchurl" | head -30

Repository: DeterminateSystems/nix-src

Length of output: 167


🏁 Script executed:

#!/bin/bash
# Read the from_json implementation
echo "=== src/libutil/provenance.cc from_json implementation ==="
sed -n '32,80p' src/libutil/provenance.cc

echo -e "\n=== Check full provenance.cc ==="
wc -l src/libutil/provenance.cc src/libfetchers/provenance.cc src/libstore/provenance.cc 2>/dev/null

echo -e "\n=== Read src/libutil/provenance.cc entirely ==="
cat -n src/libutil/provenance.cc

Repository: DeterminateSystems/nix-src

Length of output: 2603


🏁 Script executed:

#!/bin/bash
# Check what files include/define FetchurlProvenance
echo "=== Files mentioning FetchurlProvenance ==="
rg -n "FetchurlProvenance" --type cc --type hh 2>/dev/null

echo -e "\n=== Check src/libfetchers/provenance.cc ==="
cat -n src/libfetchers/provenance.cc 2>/dev/null || echo "File not found"

echo -e "\n=== Check all provenance related headers ==="
fd -e ".hh" | xargs rg -l "Provenance" 2>/dev/null | head -10

Repository: DeterminateSystems/nix-src

Length of output: 1529


🏁 Script executed:

#!/bin/bash
# Read provenance headers
echo "=== src/libutil/include/nix/util/provenance.hh ==="
cat -n src/libutil/include/nix/util/provenance.hh

echo -e "\n=== src/libstore/include/nix/store/provenance.hh ==="
cat -n src/libstore/include/nix/store/provenance.hh

Repository: DeterminateSystems/nix-src

Length of output: 3463


🏁 Script executed:

#!/bin/bash
# Check for registration patterns
echo "=== Looking for RegisteredTypes registration ==="
rg -n "registeredTypes\|RegisteredTypes\|register.*[Pp]rovenance" -A 3 -B 1 --type cc --type hh 2>/dev/null | head -60

echo -e "\n=== Check tarball.cc for FetchurlProvenance context ==="
sed -n '1,50p' src/libfetchers/tarball.cc

Repository: DeterminateSystems/nix-src

Length of output: 1265


🏁 Script executed:

#!/bin/bash
# Search for Provenance::Register usage pattern
echo "=== Looking for Provenance::Register instantiation ==="
rg -n "Provenance::Register\|Register.*type" -B 2 -A 5 --type cc --type hh 2>/dev/null

echo -e "\n=== Check full src/libstore/provenance.cc ==="
cat -n src/libstore/provenance.cc

echo -e "\n=== Search for registration in fetchers/provenance ==="
cat -n src/libfetchers/provenance.cc

Repository: DeterminateSystems/nix-src

Length of output: 2218


🏁 Script executed:

#!/bin/bash
# Look for how other provenance types register themselves
echo "=== Check for registrations across all provenance implementations ==="
find . -name "provenance*" -type f \( -name "*.cc" -o -name "*.hh" \) 2>/dev/null | xargs cat -n 2>/dev/null | grep -A 10 "Register"

echo -e "\n=== Full tarball.cc provenance section ==="
sed -n '12,100p' src/libfetchers/tarball.cc

Repository: DeterminateSystems/nix-src

Length of output: 3819


Register FetchurlProvenance for deserialization.

The "fetchurl" type lacks a Provenance::Register instantiation. Without it, deserialization of provenance with "type": "fetchurl" will fall back to UnknownProvenance (line 40 of src/libutil/provenance.cc), losing the typed information. Add a registration in this file or src/libfetchers/provenance.cc:

Provenance::Register registerFetchurlProvenance("fetchurl", [](nlohmann::json json) {
    auto & obj = getObject(json);
    return make_ref<FetchurlProvenance>(getString(valueAt(obj, "url")));
});

Additionally, URLs in provenance may contain credentials or signed tokens; consider redacting sensitive components before serialization.

🤖 Prompt for AI Agents
In `@src/libfetchers/tarball.cc` around lines 12 - 34, Add a Provenance::Register
for the FetchurlProvenance type so deserialization of {"type":"fetchurl"}
returns a FetchurlProvenance instance: register a Provenance::Register (e.g.
registerFetchurlProvenance) that uses getObject(json), valueAt(obj, "url"),
getString(...) and make_ref<FetchurlProvenance>(...) to construct the object;
place this registration in this file (or src/libfetchers/provenance.cc) so the
type is recognized at load time. Also sanitize/redact sensitive components of
the url before storing/serializing in FetchurlProvenance::to_json (e.g. strip
credentials or tokens) to avoid leaking credentials in provenance.

Comment on lines 9 to 36
struct DerivationProvenance : Provenance
{
/**
* The derivation that built this path.
*/
StorePath drvPath;

/**
* The output of the derivation that corresponds to this path.
*/
OutputName output;

/**
* The provenance of the derivation, if known.
*/
std::shared_ptr<const Provenance> next;

// FIXME: do we need anything extra for CA derivations?

DerivationProvenance(const StorePath & drvPath, const OutputName & output, std::shared_ptr<const Provenance> next)
: drvPath(drvPath)
, output(output)
, next(std::move(next))
{
}

nlohmann::json to_json() const override;
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for provenance registrations in the codebase
rg -n "from_json|nlohmann_define|nlohmann::json" src/libstore/ --type cpp -B 2 -A 2 | head -100

Repository: DeterminateSystems/nix-src

Length of output: 5808


🏁 Script executed:

# Find and examine provenance.cc
find . -name "provenance.cc" -type f

Repository: DeterminateSystems/nix-src

Length of output: 187


🏁 Script executed:

# Check the size and content of provenance.cc
if [ -f "src/libstore/provenance.cc" ]; then
  wc -l src/libstore/provenance.cc
  echo "---"
  cat -n src/libstore/provenance.cc | head -50
fi

Repository: DeterminateSystems/nix-src

Length of output: 1250


🏁 Script executed:

# Search specifically for Derived and Copied provenance patterns
rg -n "Derived|Copied" src/libstore/provenance.cc -B 2 -A 2

Repository: DeterminateSystems/nix-src

Length of output: 534


🏁 Script executed:

#!/bin/bash
# Search for all Provenance::Register entries to see the pattern
rg -n "Provenance::Register" --type cpp

Repository: DeterminateSystems/nix-src

Length of output: 345


🏁 Script executed:

# Search for provenance types in the header files
find . -name "*.hh" -path "*provenance*" -type f | xargs cat -n | head -100

Repository: DeterminateSystems/nix-src

Length of output: 2867


🏁 Script executed:

# Check for any reference to "derivation" type in JSON context
rg -n '"derivation"' --type cpp -B 2 -A 2

Repository: DeterminateSystems/nix-src

Length of output: 9701


🏁 Script executed:

#!/bin/bash
# Search for all Provenance registration types across all libmodules
rg -n "Provenance::Register|registerTree|registerFlake" --type cpp src/lib*/

Repository: DeterminateSystems/nix-src

Length of output: 345


🏁 Script executed:

# Check the actual implementation files for Tree and Flake provenance
find . -name "provenance.cc" -type f | xargs wc -l

Repository: DeterminateSystems/nix-src

Length of output: 218


🏁 Script executed:

# Check libfetchers/provenance.cc for TreeProvenance registration
if [ -f "src/libfetchers/provenance.cc" ]; then
  cat -n src/libfetchers/provenance.cc
fi

Repository: DeterminateSystems/nix-src

Length of output: 914


🏁 Script executed:

# Check libflake/provenance.cc for FlakeProvenance registration
if [ -f "src/libflake/provenance.cc" ]; then
  cat -n src/libflake/provenance.cc
fi

Repository: DeterminateSystems/nix-src

Length of output: 503


Add from_json registration for DerivationProvenance.

CopiedProvenance has a registration in provenance.cc that enables deserialization from JSON (via Provenance::Register at line 27), but DerivationProvenance lacks one. Since Provenance::from_json() is actively called during deserialization in nar-info.cc, worker-protocol.cc, and build-result.cc, a missing registration for DerivationProvenance would cause failures when reconstructing provenance chains containing derivations from JSON.

🤖 Prompt for AI Agents
In `@src/libstore/include/nix/store/provenance.hh` around lines 9 - 36,
DerivationProvenance is missing a from_json registration so
Provenance::from_json() cannot reconstruct derivation nodes; add a registration
call analogous to CopiedProvenance using Provenance::Register to register
"derivation" (or the existing type tag) with a factory that constructs a
DerivationProvenance from JSON, ensuring the registration happens during module
init (the same translation unit where other registrations like CopiedProvenance
are registered) so Provenance::from_json can deserialize DerivationProvenance
objects.

@edolstra edolstra changed the title Provenance detsys Provenance Jan 22, 2026
@github-actions
Copy link

github-actions bot commented Jan 22, 2026

@github-actions github-actions bot temporarily deployed to pull request January 22, 2026 14:14 Inactive
@github-actions github-actions bot temporarily deployed to pull request January 22, 2026 14:43 Inactive
@edolstra edolstra force-pushed the provenance-detsys branch 6 times, most recently from fb14a66 to e8599c8 Compare January 22, 2026 14:53
@github-actions github-actions bot temporarily deployed to pull request January 22, 2026 14:55 Inactive
@github-actions github-actions bot temporarily deployed to pull request January 22, 2026 14:55 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/libstore/include/nix/store/worker-protocol.hh (1)

66-84: Verify provenance handling in hardcoded protocol version 16 cases only—main negotiation path is correct.

The primary code paths using WorkerProto::BasicClientConnection and WorkerProto::BasicServerConnection correctly set the provenance flag based on negotiated features. However, three direct constructions with hardcoded version = 16 do not set provenance and will silently default to false:

  • src/libstore/store-api.cc:204 in addMultipleToStore() (marked FIXME)
  • src/libstore/export-import.cc:147 and :60 for nario format import/export

These are non-negotiated contexts marked as technical debt. If provenance should flow through these paths, they need explicit handling. Otherwise, document that these legacy/special-format paths intentionally bypass provenance.

src/libstore/include/nix/store/store-api.hh (1)

948-957: Handle nullptr return from copyStorePath.
The function can return nullptr when the path already exists in the destination store. At least one call site in src/libstore/build/substitution-goal.cc:258 sets a promise value with the return directly without guarding against nullptr, which could cause issues downstream.

🤖 Fix all issues with AI agents
In `@src/libstore/build/derivation-building-goal.cc`:
- Around line 443-447: The AlreadyValid fast‑path returns BuildResult::Success
before provenance is fetched, causing missing provenance; hoist the
maybeQueryPathInfo(drvPath) lookup (calling worker.evalStore.maybeQueryPathInfo
and capturing info->provenance into a std::shared_ptr<const Provenance>
provenance) before the early return for the AlreadyValid case and reuse that
provenance when constructing/returning BuildResult::Success so provenance is
present for the fast path as well as later build paths.

In `@src/libstore/dummy-store.cc`:
- Around line 235-243: The addToStoreFromDump implementation accepts a
provenance parameter but never attaches it to the created ValidPathInfo, so
provenance is dropped; update the function (addToStoreFromDump) to set
info.provenance = provenance (or equivalent member) on the ValidPathInfo
instance before calling whatever inserts the entry (e.g., insertPath /
_store.emplace or the existing insertion code) so the saved path records the
provided provenance.

In `@src/libstore/include/nix/store/legacy-ssh-store.hh`:
- Around line 76-79: The isUsefulProvenance() override currently returns true
but LegacySSHStore (ServeProto) cannot transmit provenance; change the method
implementation in the LegacySSHStore class so isUsefulProvenance() returns false
to accurately reflect protocol capabilities (or alternatively implement
provenance support in ServeProto/ReadConn/WriteConn and UnkeyedValidPathInfo
before keeping true) — locate the isUsefulProvenance() override and update its
return value to false.

In `@src/libutil/include/nix/util/provenance.hh`:
- Around line 3-8: Add explicit standard headers for the used types: include
<map>, <string>, <string_view>, and <memory> at the top of the header so that
uses of std::map, std::string, std::string_view, and std::shared_ptr are no
longer reliant on transitive includes; update the include block near the
existing includes ("nix/util/ref.hh", "nix/util/canon-path.hh", <functional>,
<nlohmann/json_fwd.hpp>) to add these four headers.
♻️ Duplicate comments (6)
src/libexpr/include/nix/expr/eval.hh (1)

1125-1147: Keep executor as the final data member to preserve destruction order.
This places rootProvenance after executor, which violates the “keep this last” guarantee. Please move rootProvenance before executor (or move executor back to the end).

src/libfetchers/tarball.cc (2)

18-34: Register fetchurl provenance for deserialization.
Without a Provenance::Register for "fetchurl", deserialization will fall back to UnknownProvenance, losing typed data.

🧩 Suggested registration (example)
+static Provenance::Register registerFetchurlProvenance(
+    "fetchurl",
+    [](nlohmann::json json) {
+        auto & obj = getObject(json);
+        return make_ref<FetchurlProvenance>(getString(valueAt(obj, "url")));
+    });

107-108: Avoid persisting credentials or signed tokens in provenance URLs.
Storing the raw URL in provenance risks leaking userinfo or signed query params into .narinfo/metadata. Please redact/sanitize before storing.

🛡️ Example guard (pseudo)
-        info.provenance = std::make_shared<FetchurlProvenance>(url);
+        info.provenance = std::make_shared<FetchurlProvenance>(redactSensitiveUrl(url));
src/libutil/experimental-features.cc (1)

323-330: Suppress tracking links when trackingUrl is empty.
documentExperimentalFeatures() always emits a tracking link; empty URL yields a broken link for provenance.

🧾 Suggested guard
-        docOss << fmt(
-            "\nRefer to [%1% tracking issue](%2%) for feature tracking.", xpFeature.name, xpFeature.trackingUrl);
+        if (!xpFeature.trackingUrl.empty()) {
+            docOss << fmt(
+                "\nRefer to [%1% tracking issue](%2%) for feature tracking.", xpFeature.name, xpFeature.trackingUrl);
+        }
src/libstore/include/nix/store/path-info.hh (1)

127-131: Add <memory> include for std::shared_ptr (duplicate).

This header now stores std::shared_ptr, so it should include <memory> directly to avoid reliance on transitive includes.

🔧 Suggested fix
 `#include` <string>
 `#include` <optional>
+#include <memory>
src/libfetchers/filtering-source-accessor.cc (1)

71-76: Preserve access checks and subpath provenance (Line 71-76).

This override skips checkAccess() and drops SubpathProvenance wrapping for non-root paths, so callers can obtain root provenance for any subpath. Consider mirroring SourceAccessor::getProvenance.

🧩 Proposed fix
 std::shared_ptr<const Provenance> FilteringSourceAccessor::getProvenance(const CanonPath & path)
 {
+    checkAccess(path);
     if (provenance)
-        return provenance;
+        return path.isRoot() ? provenance : std::make_shared<SubpathProvenance>(provenance, path);
     return next->getProvenance(prefix / path);
 }

If needed, add the SubpathProvenance include to this file.

🧹 Nitpick comments (3)
src/libstore/path-info.cc (1)

219-220: Consider limiting provenance to JSON format V2 (Line 219-220, Line 296-300).

To avoid surprising strict V1 consumers, consider gating provenance emission and parsing to PathInfoJsonFormat::V2 only.

♻️ Suggested adjustment
-        if (experimentalFeatureSettings.isEnabled(Xp::Provenance))
+        if (experimentalFeatureSettings.isEnabled(Xp::Provenance) && format == PathInfoJsonFormat::V2)
             jsonObject["provenance"] = provenance ? provenance->to_json() : nullptr;
-    if (experimentalFeatureSettings.isEnabled(Xp::Provenance)) {
+    if (experimentalFeatureSettings.isEnabled(Xp::Provenance) && format == PathInfoJsonFormat::V2) {
         auto prov = json.find("provenance");
         if (prov != json.end() && !prov->second.is_null())
             res.provenance = Provenance::from_json(prov->second);
     }

Also applies to: 296-300

src/libstore/provenance.cc (1)

6-25: Inconsistent handling of optional next field in JSON serialization.

DerivationProvenance::to_json() always emits the next field (with null if absent), while CopiedProvenance::to_json() only emits next when present. This asymmetry may cause issues with JSON comparison, diffing, or consumers that expect consistent structure.

Consider aligning the behavior—either always emit next (with null) or conditionally emit it in both classes.

Option A: Always emit `next` in CopiedProvenance (matching DerivationProvenance)
 nlohmann::json CopiedProvenance::to_json() const
 {
-    nlohmann::json j{
+    return nlohmann::json{
         {"type", "copied"},
         {"from", from},
+        {"next", next ? next->to_json() : nlohmann::json(nullptr)},
     };
-    if (next)
-        j["next"] = next->to_json();
-    return j;
 }
Option B: Conditionally emit `next` in DerivationProvenance (matching CopiedProvenance)
 nlohmann::json DerivationProvenance::to_json() const
 {
-    return nlohmann::json{
+    nlohmann::json j{
         {"type", "derivation"},
         {"drv", drvPath.to_string()},
         {"output", output},
-        {"next", next ? next->to_json() : nlohmann::json(nullptr)},
     };
+    if (next)
+        j["next"] = next->to_json();
+    return j;
 }
src/libstore/build/substitution-goal.cc (1)

72-72: Consider returning provenance for AlreadyValid paths.
Right now, AlreadyValid reports nullptr even though provenance may exist in the store. If consumers rely on it, consider fetching it here.

💡 Optional improvement
-    if (!repair && worker.store.isValidPath(storePath)) {
-        co_return doneSuccess(BuildResult::Success::AlreadyValid, nullptr);
-    }
+    if (!repair && worker.store.isValidPath(storePath)) {
+        auto info = worker.store.maybeQueryPathInfo(storePath);
+        co_return doneSuccess(BuildResult::Success::AlreadyValid, info ? info->provenance : nullptr);
+    }

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/libutil/include/nix/util/provenance.hh`:
- Around line 12-23: The Provenance polymorphic base class lacks a virtual
destructor; add a public virtual destructor to Provenance (e.g., declare a
virtual ~Provenance() = default) so derived instances (referenced by
std::shared_ptr<const Provenance> and used by functions like from_json_str,
from_json_str_optional, from_json, to_json_str, and the pure virtual to_json)
are safely destroyed when deleted via a base pointer.
♻️ Duplicate comments (1)
src/libutil/include/nix/util/provenance.hh (1)

3-8: Add missing standard library headers.

This header uses std::map, std::string, std::string_view, and std::shared_ptr but relies on transitive includes. Add explicit headers for better portability.

 `#include` "nix/util/ref.hh"
 `#include` "nix/util/canon-path.hh"
 
 `#include` <functional>
+#include <map>
+#include <memory>
+#include <string>
+#include <string_view>
 
 `#include` <nlohmann/json_fwd.hpp>
🧹 Nitpick comments (1)
src/libutil/include/nix/util/provenance.hh (1)

34-40: Consider logging or asserting on duplicate type registration.

insert_or_assign silently overwrites existing registrations. If two modules accidentally register the same type name, debugging would be difficult. Consider using insert and checking the return value, or at least logging when a type is overwritten.

@github-actions github-actions bot temporarily deployed to pull request January 22, 2026 18:42 Inactive

auto attrPath = attr->getAttrPathStr();

state->setRootProvenance(makeProvenance(attrPath));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than having to call this, is there any way we could maybe make this automatic?

(Mostly thinking about protecting from developer error -- if I have to write code here for whatever reason, and add a new path where I should have called setRootProvenance but forgot to, that's probably not great. But if it's integrated into the construction of e.g. attrPath, that means we can't forget it because it'll always happen)

@cole-h
Copy link
Member

cole-h commented Jan 22, 2026

(Also if you could go through the coderabbit comments and see which ones are real things to be resolved or just LLM hallucinations and close them accordingly, that would be great)

Example of a substitution event:

  {
    "action": "result",
    "id": 0,
    "payload": {
      "builtOutputs": {},
      "path": "3bb116cnl86svn2lgc41a3i4a9qblgsf-libtool-2.4.7",
      "provenance": {
        "from": "https://cache.nixos.org",
        "type": "copied"
      },
      "startTime": 0,
      "status": "Substituted",
      "stopTime": 0,
      "success": true,
      "timesBuilt": 0
    },
    "type": 110
  }

Example of a derivation event:

  {
    "action": "result",
    "id": 3381333262860569,
    "payload": {
      "builtOutputs": {
        "out": {
          "dependentRealisations": {},
          "id": "sha256:deb37b0f322203d852a27010200f08e2dd739cb02b51d77999bd7f3162cdfe39!out",
          "outPath": "6b9w3gdjnbdvi50c0h0b9xg91hq6aryl-patchelf-0.18.0",
          "signatures": []
        }
      },
      "cpuSystem": 2118013,
      "cpuUser": 11471586,
      "path": {
        "drvPath": "fm8zrgh4dazysyz3imcva658h0iv34k0-patchelf-0.18.0.drv",
        "outputs": [
          "*"
        ]
      },
      "provenance": {
        "flakeOutput": "packages.x86_64-linux.default",
        "next": {
          "attrs": {
            "dirtyRev": "bb2f1eb3c1e4dc9c4523642a3e39d55806fc9a81-dirty",
            "dirtyShortRev": "bb2f1eb-dirty",
            "lastModified": 1768573749,
            "type": "git",
            "url": "file:///home/eelco/Dev/patchelf"
          },
          "type": "tree"
        },
        "type": "flake"
      },
      "startTime": 1768993105,
      "status": "Built",
      "stopTime": 1768993120,
      "success": true,
      "timesBuilt": 1
    },
    "type": 110
  }
@github-actions github-actions bot temporarily deployed to pull request January 23, 2026 12:50 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/libstore/provenance.cc`:
- Around line 27-33: Add a Provenance::Register for the "derivation" type
mirroring registerCopiedProvenance so Provenance::from_json returns a
DerivationProvenance instead of UnknownProvenance; create a new static register
(e.g. registerDerivationProvenance) that uses getObject(json), reads any "next"
via optionalValueAt(obj, "next") and deserializes it with Provenance::from_json,
and extracts required fields with valueAt/getString to construct and return
make_ref<DerivationProvenance>(...) using the same pattern as
registerCopiedProvenance.
♻️ Duplicate comments (12)
src/libexpr/include/nix/expr/eval.hh (1)

1140-1153: Member ordering violates destruction order requirement.

The comment on lines 1133-1138 explicitly states that executor must be the last data member to ensure it's destroyed first. However, rootProvenance is now declared after executor (line 1144), making executor no longer the last member.

In C++, members are destroyed in reverse declaration order. With this ordering, rootProvenance will be destroyed before executor, meaning executor is destroyed second—not first as required.

Move rootProvenance before executor to preserve the intended destruction order.

src/libstore/include/nix/store/path-info.hh (1)

127-132: Include <memory> for std::shared_ptr in this public header.

This header now exposes std::shared_ptr, so it should include <memory> directly to avoid relying on transitive includes.

🔧 Proposed fix
 `#include` <string>
 `#include` <optional>
+#include <memory>
src/libutil/experimental-features.cc (1)

342-349: Avoid empty tracking URLs to prevent broken documentation links.

documentExperimentalFeatures() always emits a tracking link, so an empty URL becomes a broken link. Consider adding a real URL or suppressing link emission when empty.

🧾 Suggested guard in documentation generation
-        docOss << fmt(
-            "\nRefer to [%1% tracking issue](%2%) for feature tracking.", xpFeature.name, xpFeature.trackingUrl);
+        if (!xpFeature.trackingUrl.empty()) {
+            docOss << fmt(
+                "\nRefer to [%1% tracking issue](%2%) for feature tracking.", xpFeature.name, xpFeature.trackingUrl);
+        }
src/libstore/include/nix/store/provenance.hh (1)

9-58: Verify JSON registration for the new provenance variants.

Please confirm DerivationProvenance and CopiedProvenance are registered for JSON deserialization; otherwise Provenance::from_json*() will fail on these types.

#!/bin/bash
# Locate provenance registrations for derivation/copied variants
fd -a "provenance.cc" --type f -x rg -n "Provenance::Register|DerivationProvenance|CopiedProvenance" -C2 {}
src/libstore/include/nix/store/legacy-ssh-store.hh (1)

76-79: isUsefulProvenance() returns true but ServeProto does not transmit provenance.

This store uses ServeProto, which lacks provenance support in its serialization. Returning true misleads callers into attaching provenance metadata that will be silently dropped. Either return false to accurately reflect protocol capabilities, or extend ServeProto to include provenance support.

src/libutil/include/nix/util/provenance.hh (2)

3-8: Add explicit standard library headers for used types.

The header uses std::map, std::string, std::string_view, and std::shared_ptr but relies on transitive includes from ref.hh and canon-path.hh. Add explicit headers for better portability and self-containment.

♻️ Suggested includes
 `#include` "nix/util/ref.hh"
 `#include` "nix/util/canon-path.hh"

 `#include` <functional>
+#include <map>
+#include <memory>
+#include <string>
+#include <string_view>

 `#include` <nlohmann/json_fwd.hpp>

12-23: Add virtual destructor to polymorphic base class.

Provenance is used polymorphically (stored via std::shared_ptr<const Provenance> in SubpathProvenance::next and throughout the codebase). Deleting a derived object through a base pointer without a virtual destructor is undefined behavior.

🐛 Proposed fix
 struct Provenance
 {
+    virtual ~Provenance() = default;
+
     static ref<const Provenance> from_json_str(std::string_view);
src/libflake/include/nix/flake/flake.hh (1)

98-101: Provenance member addition looks good.

The std::shared_ptr<const Provenance> member with documentation is well-structured. The explicit destructor declaration on line 123 ensures the complete type will be available where ~Flake() is defined.

As noted in a previous review, ensure flake.cc includes the full provenance.hh header where Flake::~Flake() is defined, since std::shared_ptr destruction requires the complete type.

src/libfetchers/tarball.cc (2)

18-34: Register FetchurlProvenance for deserialization.

The FetchurlProvenance type defines to_json() but lacks a corresponding Provenance::Register instantiation. Without registration, deserializing provenance JSON with "type": "fetchurl" will fall back to UnknownProvenance, losing the typed information.


107-107: Consider sanitizing URL before storing in provenance.

Storing the raw URL in provenance can persist credentials (userinfo) or signed query parameters into store metadata and .narinfo files. Consider redacting sensitive URL components before creating FetchurlProvenance.

src/libfetchers/filtering-source-accessor.cc (1)

71-76: Enforce access checks and subpath provenance wrapping.

This override skips checkAccess() and returns root provenance for subpaths, which drops subpath context and can expose provenance for disallowed paths. If SubpathProvenance isn’t already visible here, add the appropriate provenance header.

🛠️ Suggested fix
 std::shared_ptr<const Provenance> FilteringSourceAccessor::getProvenance(const CanonPath & path)
 {
+    checkAccess(path);
     if (provenance)
-        return provenance;
+        return path.isRoot() ? provenance : std::make_shared<SubpathProvenance>(provenance, path);
     return next->getProvenance(prefix / path);
 }
src/libstore/dummy-store.cc (1)

235-243: Provenance parameter is currently dropped in addToStoreFromDump.

provenance is accepted but never assigned to the generated ValidPathInfo, so dummy-store paths lose provenance.

🛠️ Suggested fix
     info.narSize = narHash.second.value();
+    info.provenance = provenance;

Also applies to: 286-291

🧹 Nitpick comments (2)
src/libstore/provenance.cc (1)

6-14: DerivationProvenance always includes next as null, unlike other types.

DerivationProvenance::to_json() always includes the "next" key (with nullptr when absent), while SubpathProvenance and CopiedProvenance conditionally omit it. Consider aligning the serialization pattern for consistency.

♻️ Optional: align with conditional pattern
 nlohmann::json DerivationProvenance::to_json() const
 {
-    return nlohmann::json{
+    nlohmann::json j{
         {"type", "derivation"},
         {"drv", drvPath.to_string()},
         {"output", output},
-        {"next", next ? next->to_json() : nlohmann::json(nullptr)},
     };
+    if (next)
+        j["next"] = next->to_json();
+    return j;
 }
src/libstore/build-result.cc (1)

103-165: Guard provenance serialization for AlreadyValid results.

When AlreadyValid or ResolvesToAlreadyValid status is present, provenance should not be serialized since the existing outputs may have been built with different provenance than the current derivation. Currently, to_json() will emit provenance if present regardless of status, creating a risk of serializing misleading provenance if accidentally set. Add defensive guards during serialization and deserialization to prevent this.

♻️ Suggested guards
-                if (success.provenance)
+                if (success.provenance
+                    && success.status != BuildResult::Success::AlreadyValid
+                    && success.status != BuildResult::Success::ResolvesToAlreadyValid)
                     res["provenance"] = success.provenance->to_json();
-        s.provenance = provenanceFromJson(optionalValueAt(json, "provenance"));
+        s.provenance = provenanceFromJson(optionalValueAt(json, "provenance"));
+        if (s.status == BuildResult::Success::AlreadyValid
+            || s.status == BuildResult::Success::ResolvesToAlreadyValid)
+            s.provenance.reset();

@github-actions github-actions bot temporarily deployed to pull request January 23, 2026 14:35 Inactive
Makes it clearer that this store path is not a derivation itself, but
the result of building a derivation.
@github-actions github-actions bot temporarily deployed to pull request January 23, 2026 14:41 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/libstore/provenance.cc`:
- Around line 27-33: Add a Provenance::Register for the "build" type so
Provenance::from_json() can deserialize BuildProvenance; mirror the existing
registerCopiedProvenance pattern: register a Provenance::Register("build",
[](nlohmann::json json){ ... }) that calls getObject(json), extracts optional
next via optionalValueAt and converts it with Provenance::from_json, and
constructs and returns make_ref<BuildProvenance>(...) passing the string/fields
read from valueAt/getString for the keys BuildProvenance expects (match the
BuildProvenance constructor parameters). Ensure the lambda uses the same helper
functions (getObject, optionalValueAt, valueAt, getString) and returns a shared
ref to BuildProvenance.
🧹 Nitpick comments (1)
src/libstore/provenance.cc (1)

6-14: Inconsistent handling of next field compared to CopiedProvenance.

BuildProvenance::to_json() always emits "next" (as null if absent), while CopiedProvenance::to_json() omits the field entirely when next is null. Consider aligning the pattern for consistency—either both emit null or both omit the field.

♻️ Suggestion to align with CopiedProvenance pattern
 nlohmann::json BuildProvenance::to_json() const
 {
-    return nlohmann::json{
+    nlohmann::json j{
         {"type", "build"},
         {"drv", drvPath.to_string()},
         {"output", output},
-        {"next", next ? next->to_json() : nlohmann::json(nullptr)},
     };
+    if (next)
+        j["next"] = next->to_json();
+    return j;
 }

Comment on lines +27 to +33
Provenance::Register registerCopiedProvenance("copied", [](nlohmann::json json) {
auto & obj = getObject(json);
std::shared_ptr<const Provenance> next;
if (auto prov = optionalValueAt(obj, "next"))
next = Provenance::from_json(*prov);
return make_ref<CopiedProvenance>(getString(valueAt(obj, "from")), next);
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for existing "build" provenance registration across the codebase
rg -n 'Provenance::Register.*"build"' --type cpp

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

cat -n src/libstore/provenance.cc | head -60

Repository: DeterminateSystems/nix-src

Length of output: 1206


🏁 Script executed:

rg -n 'Provenance::Register' --type cpp -A 2

Repository: DeterminateSystems/nix-src

Length of output: 595


🏁 Script executed:

# Search for BuildProvenance constructor usage and creation
rg -n 'BuildProvenance' --type cpp -B 2 -A 2

Repository: DeterminateSystems/nix-src

Length of output: 1563


🏁 Script executed:

# Search for from_json calls on provenance
rg -n 'from_json.*provenance|Provenance::from_json' --type cpp -B 2 -A 2

Repository: DeterminateSystems/nix-src

Length of output: 3959


Add missing Provenance::Register for BuildProvenance ("build" type).

BuildProvenance is constructed during the build process (in derivation-builder.cc) and serialized with "type": "build", but lacks a deserialization registration. Without this registration, Provenance::from_json() returns UnknownProvenance when deserializing from .narinfo files, the ValidPaths table, and daemon communications, breaking round-trip serialization.

🛠️ Proposed fix: Add registration for BuildProvenance
 Provenance::Register registerCopiedProvenance("copied", [](nlohmann::json json) {
     auto & obj = getObject(json);
     std::shared_ptr<const Provenance> next;
     if (auto prov = optionalValueAt(obj, "next"))
         next = Provenance::from_json(*prov);
     return make_ref<CopiedProvenance>(getString(valueAt(obj, "from")), next);
 });
+
+Provenance::Register registerBuildProvenance("build", [](nlohmann::json json) {
+    auto & obj = getObject(json);
+    std::shared_ptr<const Provenance> next;
+    if (auto prov = optionalValueAt(obj, "next"))
+        next = Provenance::from_json(*prov);
+    return make_ref<BuildProvenance>(
+        StorePath(getString(valueAt(obj, "drv"))),
+        getString(valueAt(obj, "output")),
+        next);
+});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Provenance::Register registerCopiedProvenance("copied", [](nlohmann::json json) {
auto & obj = getObject(json);
std::shared_ptr<const Provenance> next;
if (auto prov = optionalValueAt(obj, "next"))
next = Provenance::from_json(*prov);
return make_ref<CopiedProvenance>(getString(valueAt(obj, "from")), next);
});
Provenance::Register registerCopiedProvenance("copied", [](nlohmann::json json) {
auto & obj = getObject(json);
std::shared_ptr<const Provenance> next;
if (auto prov = optionalValueAt(obj, "next"))
next = Provenance::from_json(*prov);
return make_ref<CopiedProvenance>(getString(valueAt(obj, "from")), next);
});
Provenance::Register registerBuildProvenance("build", [](nlohmann::json json) {
auto & obj = getObject(json);
std::shared_ptr<const Provenance> next;
if (auto prov = optionalValueAt(obj, "next"))
next = Provenance::from_json(*prov);
return make_ref<BuildProvenance>(
StorePath(getString(valueAt(obj, "drv"))),
getString(valueAt(obj, "output")),
next);
});
🤖 Prompt for AI Agents
In `@src/libstore/provenance.cc` around lines 27 - 33, Add a Provenance::Register
for the "build" type so Provenance::from_json() can deserialize BuildProvenance;
mirror the existing registerCopiedProvenance pattern: register a
Provenance::Register("build", [](nlohmann::json json){ ... }) that calls
getObject(json), extracts optional next via optionalValueAt and converts it with
Provenance::from_json, and constructs and returns make_ref<BuildProvenance>(...)
passing the string/fields read from valueAt/getString for the keys
BuildProvenance expects (match the BuildProvenance constructor parameters).
Ensure the lambda uses the same helper functions (getObject, optionalValueAt,
valueAt, getString) and returns a shared ref to BuildProvenance.

@github-actions github-actions bot temporarily deployed to pull request January 23, 2026 16:00 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/libstore/store-api.cc`:
- Around line 875-886: Replace the use of srcStore.config.getHumanReadableURI()
in addCopiedProvenance with the direct stable renderer: call
srcStore.config.getReference().render(/*withParams=*/false) when constructing
the CopiedProvenance so the persisted provenance uses the same stable identifier
pattern as other stores (e.g., binary-cache-store.cc) and avoids relying on
getHumanReadableURI() indirection.
♻️ Duplicate comments (2)
src/libstore/include/nix/store/legacy-ssh-store.hh (1)

76-79: Verify if returning true is intentional given ServeProto limitations.

A past review raised concerns that ServeProto doesn't support provenance transmission in its ReadConn/WriteConn structures. The method was renamed from isUsefulProvenance() to includeInProvenance(), but the return value remained true.

If includeInProvenance() means "should this store be recorded in provenance when paths are copied FROM it", then returning true is appropriate. If it means "can this store persist/transmit provenance data", then it may be misleading.

src/libfetchers/tarball.cc (1)

12-34: Confirm fetchurl provenance is registered for deserialization.

I don’t see the Provenance::Register hookup in this file; if it’s missing elsewhere, "type":"fetchurl" will deserialize as UnknownProvenance. Please verify it exists (or add it in a fetchers provenance registry).

#!/bin/bash
# Verify fetchurl provenance registration exists somewhere in the repo.
rg -nP 'Provenance::Register|register.*Provenance' src/libfetchers src/libutil src/libstore
rg -nP '"fetchurl"' src/libfetchers src/libutil src/libstore

Comment on lines +875 to +886
/**
* Wrap upstream provenance in a "copied" provenance record to record
* where the path was copied from. But uninformative origins like
* LocalStore are omitted.
*/
static std::shared_ptr<const Provenance>
addCopiedProvenance(std::shared_ptr<const Provenance> provenance, Store & srcStore)
{
if (!srcStore.includeInProvenance())
return provenance;
return std::make_shared<const CopiedProvenance>(srcStore.config.getHumanReadableURI(), provenance);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how getHumanReadableURI is used in provenance contexts and whether there's a stable alternative

# Find other usages of CopiedProvenance to see what URI format is expected
echo "=== CopiedProvenance usages ==="
rg -n "CopiedProvenance" --type cpp -A 3 -B 1

echo ""
echo "=== getReference().render() usages ==="
# Check if there's a pattern for stable store URIs
rg -n "getReference\(\).render" --type cpp -C 2

echo ""
echo "=== getHumanReadableURI usages (context) ==="
# Find all usages of getHumanReadableURI to understand the pattern
rg -n "getHumanReadableURI\(\)" --type cpp -B 1 -A 2 | head -50

Repository: DeterminateSystems/nix-src

Length of output: 14078


Use getReference().render(/*withParams=*/false) instead of getHumanReadableURI() for consistency with provenance storage.

While getHumanReadableURI() currently delegates to getReference().render(/*withParams=*/false) and thus produces a stable value, provenance records are persisted in JSON and the codebase pattern for stable, persistent identifiers consistently uses getReference().render() directly (see binary-cache-store.cc, store-api.cc). Using the direct method avoids unnecessary indirection and ensures the code won't break if getHumanReadableURI() behavior changes in the future.

🤖 Prompt for AI Agents
In `@src/libstore/store-api.cc` around lines 875 - 886, Replace the use of
srcStore.config.getHumanReadableURI() in addCopiedProvenance with the direct
stable renderer: call
srcStore.config.getReference().render(/*withParams=*/false) when constructing
the CopiedProvenance so the persisted provenance uses the same stable identifier
pattern as other stores (e.g., binary-cache-store.cc) and avoids relying on
getHumanReadableURI() indirection.

@github-actions github-actions bot temporarily deployed to pull request January 25, 2026 21:34 Inactive
@github-actions github-actions bot temporarily deployed to pull request February 4, 2026 10:06 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants