Sweep transactions (1-in-1-out) are incorrectly scored in isolation — privacy is inherited, not generated #72

@4rkad

Summary

A 1-input, 1-output transaction (sweep) is currently evaluated as an independent event. The tool reports zero entropy and a deterministic link (severity: low, impact: 0). This is technically accurate but analytically meaningless: zero entropy is a structural property of every 1-in-1-out transaction, not a privacy deficiency that the sweep introduces.

This issue argues that:

  1. A sweep does not degrade the privacy of the spent UTXO
  2. A sweep introduces ownership ambiguity that did not exist before
  3. The correct evaluation requires inspecting the parent transaction

On-chain properties of a 1-in-1-out sweep

These are observable facts, not interpretations:

| Property | Applies? | Why |
|---|---|---|
| Common Input Ownership (CIOH) | No | Single input: no address linkage possible |
| Change detection | No | Single output: no change to identify |
| Consolidation | No | No UTXO set or balance revealed |
| Round amount heuristic | No | Only one output: nothing to compare against |
| Entropy | 0 bits | One possible interpretation of fund flow |
| Script type fingerprint | Observable | Input and output script types are visible |

Every heuristic that the tool uses to penalize transactions requires multiple inputs, multiple outputs, or both. A sweep has neither. Apart from the script types involved, the only observable fact is that funds moved from address A to address B.
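The structural argument can be sketched as a precondition check. This is a minimal illustration, not the tool's code: `TxShape`, `isSweep`, and `applicableHeuristics` are hypothetical names, and the mapping of heuristics to preconditions simply mirrors the table above.

```typescript
// Hypothetical minimal transaction shape; field names are illustrative,
// not the tool's actual types.
interface TxShape {
  inputCount: number;
  outputCount: number;
}

type Heuristic = "cioh" | "consolidation" | "changeDetection" | "roundAmount";

// A sweep in the sense used in this issue: exactly one input, one output.
function isSweep(tx: TxShape): boolean {
  return tx.inputCount === 1 && tx.outputCount === 1;
}

// Returns the heuristics whose structural preconditions the transaction meets.
function applicableHeuristics(tx: TxShape): Heuristic[] {
  const result: Heuristic[] = [];
  // CIOH and consolidation need at least two inputs to link.
  if (tx.inputCount > 1) result.push("cioh", "consolidation");
  // Change detection and round-amount comparison need at least two outputs.
  if (tx.outputCount > 1) result.push("changeDetection", "roundAmount");
  return result;
}
```

For a sweep, `applicableHeuristics` returns an empty list, which is the point of the table: nothing the tool penalizes can fire.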

The sweep does not degrade privacy

The UTXO at address B carries exactly the same history, taint, and cluster associations as it had at address A. No new information about the owner's UTXO set, spending patterns, or balance is revealed by the sweep itself.

This is verifiable: take any 1-in-1-out transaction and compare the cluster graph before and after. The sweep adds one edge (A → B) but does not expand the cluster — because CIOH cannot fire on a single input.
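The claim that CIOH cannot expand a cluster on a single input can be checked with a small union-find sketch. `AddressClusters` and `applyCioh` are hypothetical names introduced for illustration, and real clustering code would track far more than address strings.

```typescript
// Minimal union-find over address strings (illustrative only).
class AddressClusters {
  private parent = new Map<string, string>();

  // Returns the representative address of the cluster containing addr.
  find(addr: string): string {
    if (!this.parent.has(addr)) this.parent.set(addr, addr);
    let root = addr;
    let next = this.parent.get(root);
    while (next !== undefined && next !== root) {
      root = next;
      next = this.parent.get(root);
    }
    this.parent.set(addr, root); // shortcut this address to its root
    return root;
  }

  // CIOH: merge all input addresses of one transaction into one cluster.
  // Returns how many merges actually happened.
  applyCioh(inputAddresses: string[]): number {
    let merges = 0;
    for (let i = 1; i < inputAddresses.length; i++) {
      const a = this.find(inputAddresses[0] as string);
      const b = this.find(inputAddresses[i] as string);
      if (a !== b) {
        this.parent.set(a, b);
        merges++;
      }
    }
    return merges;
  }
}
```

With a single input address the loop body never runs, so `applyCioh(["A"])` performs zero merges and B stays outside A's cluster unless some other transaction links them.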

The sweep introduces ownership ambiguity

Before the sweep, whoever holds the keys for address A controls the UTXO. There is no ambiguity about which address holds the funds.

After the sweep, an observer knows A sent to B, but cannot determine whether:

  • B is the same entity as A (self-transfer, wallet migration)
  • B is a different entity (payment)

This is a form of plausible deniability that did not exist while the UTXO sat unspent. The ambiguity is irresolvable using only on-chain data, unless:

  • B belongs to a known entity (exchange, merchant, tagged address)
  • B already exists in a cluster linked to a different entity via prior CIOH or consolidation
  • The subsequent spend from B consolidates with UTXOs from a known different cluster

In the absence of these signals, the observer cannot assign ownership of B with certainty.

The ambiguity is real even though most sweeps are statistically self-transfers

In practice, exact payments without change are infrequent — most 1-in-1-out sweeps correspond to self-transfers, wallet migrations, or service deposits. A chain analysis professional knows this and will use it as a statistical prior.

However, the possibility that it is a payment exists, and that is enough to prevent the question from being resolved with certainty using on-chain data alone.

An analyst can combine weak signals to reinforce a suspicion (wallet fingerprint, script type, subsequent behavior of B), but neither individually nor combined do they constitute deterministic proof. They guide the investigation; they do not conclude it.

Combinable signals that guide suspicion

These signals are implementable with data the tool already has or can obtain with one extra API call:

Suggests self-transfer:

  • Same script type between input and output (bc1q → bc1q, bc1p → bc1p)
  • B subsequently consolidates with UTXOs that were already in A's cluster
  • Subsequent spend from B shows the same wallet fingerprint

Suggests change of ownership:

  • B belongs to a known entity
  • B already existed in a different entity's cluster
  • B consolidates with UTXOs from a different cluster than A's

Ambiguous (does not resolve):

  • Different wallet fingerprint in subsequent spend — could be a different owner or simply a software change
  • Different script type (bc1q → bc1p) — could be Taproot migration or payment

The tool should present these signals for what they are: probabilistic indicators that guide, not conclusions.
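One way to keep these signals as indicators rather than verdicts is to report each one with the direction it leans and never collapse them into a single answer. The sketch below assumes every field of `SweepContext` has been computed elsewhere; all names here are hypothetical and not part of the tool.

```typescript
type Lean = "self-transfer" | "ownership-change" | "ambiguous";

interface SweepSignal {
  lean: Lean;
  reason: string;
}

// Hypothetical inputs; each field is assumed to be derived by other code.
interface SweepContext {
  inputScriptType: string;
  outputScriptType: string;
  outputIsKnownEntity: boolean;
  outputInForeignCluster: boolean;
  // Only set when the child spend from B has been inspected.
  childConsolidatesWith?: "same-cluster" | "different-cluster";
  childFingerprintMatches?: boolean;
}

// Collects every signal that is present; it deliberately returns a list
// and does not pick a winner.
function collectSignals(ctx: SweepContext): SweepSignal[] {
  const signals: SweepSignal[] = [];
  if (ctx.inputScriptType === ctx.outputScriptType) {
    signals.push({ lean: "self-transfer", reason: "same script type on input and output" });
  } else {
    signals.push({ lean: "ambiguous", reason: "script type changed (migration or payment)" });
  }
  if (ctx.outputIsKnownEntity) {
    signals.push({ lean: "ownership-change", reason: "B belongs to a known entity" });
  }
  if (ctx.outputInForeignCluster) {
    signals.push({ lean: "ownership-change", reason: "B already in a different entity's cluster" });
  }
  if (ctx.childConsolidatesWith === "same-cluster") {
    signals.push({ lean: "self-transfer", reason: "B later consolidates with A's cluster" });
  } else if (ctx.childConsolidatesWith === "different-cluster") {
    signals.push({ lean: "ownership-change", reason: "B later consolidates with another cluster" });
  }
  if (ctx.childFingerprintMatches === true) {
    signals.push({ lean: "self-transfer", reason: "same wallet fingerprint in child spend" });
  } else if (ctx.childFingerprintMatches === false) {
    signals.push({ lean: "ambiguous", reason: "different fingerprint (new owner or new software)" });
  }
  return signals;
}
```

Returning the full list lets the UI show conflicting signals side by side, which matches the framing of guidance rather than conclusion.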

Proposal: inherit privacy from the parent transaction

Since the sweep neither generates nor destroys privacy, the score should reflect the privacy context of the UTXO being spent — i.e., the parent transaction that created it.

| Parent transaction type | Inherited evaluation | Rationale |
|---|---|---|
| CoinJoin equal-output (Whirlpool, JoinMarket) | High score | Anon set reduces from N to 1 upon spending; note this, but the UTXO retains strong privacy |
| CoinJoin variable-output (WabiSabi) | High score | No anon set based on amount equality to degrade |
| Tx with detected change or round amount | Inherits parent penalties | These weaknesses pre-existed; the sweep did not cause them |
| Known entity origin (KYC exchange) | Low score (inherited) | The taint was present before the sweep |
| Escrow release (HodlHodl 2-of-3, Bisq 2-of-2) | Inherits escrow detection | The tool already identifies these via multisig + fee address patterns |

For sweep chains (A→B→C→D), resolve recursively up to a defined limit until a non-sweep transaction is found.
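The recursive resolution could look roughly like the sketch below. It is written iteratively and takes the transaction lookup as a parameter so that it does not assume any particular API client; `ChainTx`, `TxLookup`, `resolveSweepOrigin`, and the default of 5 hops are all assumptions made for illustration.

```typescript
// Hypothetical shape; not the tool's actual transaction type.
interface ChainTx {
  txid: string;
  // txid of the transaction that created each input's UTXO
  // (assumed empty for coinbase transactions).
  inputParents: string[];
  outputCount: number;
}

// Injected lookup, e.g. a cached wrapper around a block explorer API.
type TxLookup = (txid: string) => Promise<ChainTx | undefined>;

interface Resolution {
  origin: ChainTx | undefined; // first non-sweep ancestor, if found
  hops: number;                // parent lookups performed
  limitReached: boolean;
}

// Returns the parent txid if tx is a 1-in-1-out sweep, otherwise undefined.
function soleParent(tx: ChainTx): string | undefined {
  return tx.inputParents.length === 1 && tx.outputCount === 1
    ? tx.inputParents[0]
    : undefined;
}

// Walks backwards through a sweep chain until a non-sweep transaction
// is found, a lookup fails, or maxHops lookups have been spent.
async function resolveSweepOrigin(
  start: ChainTx,
  lookup: TxLookup,
  maxHops = 5,
): Promise<Resolution> {
  let current = start;
  let hops = 0;
  let parentTxid = soleParent(current);
  while (parentTxid !== undefined) {
    if (hops >= maxHops) return { origin: undefined, hops, limitReached: true };
    const parent = await lookup(parentTxid);
    if (parent === undefined) return { origin: undefined, hops, limitReached: false };
    current = parent;
    hops++;
    parentTxid = soleParent(current);
  }
  return { origin: current, hops, limitReached: false };
}
```

When the limit is reached the function reports that fact instead of guessing, so the caller can fall back to the current isolated evaluation and say why.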

Known limitations

Transaction-level vs UTXO-level scoring

This is the principal implementation challenge. The tool currently scores whole transactions. If the parent transaction has two outputs — one identified as payment, one as change — their privacy properties differ:

  • Sweep spends the change output → inherits sender-side linkage
  • Sweep spends the payment output → inherits receiver-side privacy

Without distinguishing which output the sweep spends, the inherited score may be inaccurate. The existing change detection heuristic could serve as a first approximation, with the caveat that change detection itself is probabilistic, not deterministic.
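A sketch of UTXO-level inheritance under these caveats: the sweep's input names the parent output it spends, and the inherited context is only assigned when change detection is reasonably sure of that output's role. The types, the function name, and the 0.8 confidence threshold are assumptions for illustration, not values taken from the tool.

```typescript
type OutputRole = "change" | "payment" | "unknown";

// Hypothetical per-output result of the parent's change detection.
interface ParentOutput {
  index: number;          // vout index within the parent transaction
  role: OutputRole;
  roleConfidence: number; // 0..1, as estimated by change detection
}

// The outpoint that the sweep's single input spends.
interface SpentOutpoint {
  parentTxid: string;
  vout: number;
}

type InheritedContext =
  | "sender-side linkage"
  | "receiver-side privacy"
  | "undetermined";

// Maps the spent parent output to the context the sweep should inherit.
// Falls back to "undetermined" when the role is unknown or uncertain,
// because change detection is probabilistic.
function inheritedContext(
  spent: SpentOutpoint,
  parentOutputs: ParentOutput[],
  minConfidence = 0.8, // assumed threshold, to be tuned
): InheritedContext {
  const out = parentOutputs.find((o) => o.index === spent.vout);
  if (out === undefined || out.role === "unknown" || out.roleConfidence < minConfidence) {
    return "undetermined";
  }
  return out.role === "change" ? "sender-side linkage" : "receiver-side privacy";
}
```

Returning "undetermined" keeps a weak change guess from silently turning into an inherited penalty.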

API cost and recursion depth

Each parent transaction lookup requires one API call to mempool.space. Sweep chains require recursive resolution. A limit (e.g., 5 hops) caps the cost at that many extra calls per analysis and should cover most real-world sweep chains.

Ownership ambiguity is not quantifiable

The plausible deniability benefit is real but difficult to express as a numeric score modifier. It may be better represented as a qualitative finding ("ownership ambiguity: irresolvable from on-chain data alone") rather than a numeric score bonus.
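A qualitative finding could be represented as an entry with zero score impact. The `Finding` shape below is a guess at what such an entry might look like, not the tool's actual type; only the description string comes from this issue.

```typescript
// Hypothetical finding shape; the tool's real type may differ.
interface Finding {
  id: string;
  severity: "info" | "low" | "medium" | "high";
  scoreImpact: number;
  description: string;
}

// Emits the qualitative finding only when B is not a known entity,
// since a known-entity tag resolves the ambiguity.
function ownershipAmbiguityFinding(outputIsKnownEntity: boolean): Finding | undefined {
  if (outputIsKnownEntity) return undefined;
  return {
    id: "sweep-ownership-ambiguity",
    severity: "info",
    scoreImpact: 0, // informational only; does not move the score
    description: "ownership ambiguity: irresolvable from on-chain data alone",
  };
}
```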

Future improvement: child transaction analysis

Analysis of the subsequent spend from B (wallet fingerprint, consolidation with other clusters) is not proposed as mandatory in this iteration. It requires an extra API call and cross-transaction fingerprint comparison. It is mentioned as a future improvement that would enrich the informational context without modifying the score.

Verification

A chain analysis professional can verify the core claims:

  1. The sweep does not expand clusters: Take any 1-in-1-out tx. Run CIOH. Confirm no new address linkage is produced.
  2. Privacy is inherited: Compare the cluster graph of the input UTXO before and after the sweep. Confirm that the only addition is the A → B edge and that no cluster expands.
  3. Ownership ambiguity exists: Take a 1-in-1-out tx where neither address belongs to a known entity. Attempt to determine ownership of the output using only on-chain data. Confirm it is indeterminate.

References

  • Current implementation: src/lib/analysis/heuristics/change-detection.ts (L40-68), src/lib/analysis/heuristics/entropy.ts (L42-62)
  • Existing backward tracing: src/lib/analysis/chain/
  • Meiklejohn et al., "A Fistful of Bitcoins" (2013) — CIOH requires multiple inputs
  • Nick, "Data-Driven De-Anonymization in Bitcoin" (2015) — change detection requires multiple outputs

Metadata

Labels

  • analysis: Heuristics, scoring, detection, Boltzmann, chain tracing
  • enhancement: Improve something that works
  • needs-design: Ambiguous, needs discussion before implementation
