
Multi-copy upload via SP-to-SP fetch #494

@rvagg

Summary

Implement multi-copy storage where the client uploads once to an endorsed SP and secondary SPs then fetch the piece via HTTP, rather than receiving duplicate uploads.

Background

The current multi-copy approach requires parallel uploads of the same data. This is inefficient for large pieces, complicates streaming and our ability to retry uploads, forces awkward API choices, and makes it more likely that we expose error conditions to users than the alternatives do.

See https://www.notion.so/filecoindev/Multi-copy-Synapse-Upload-2bddc41950c1803d9fb6c8830c8a7a28 for background. Option 4 is what's
being outlined here.

This flow is also a small step toward the MSP model that's currently being designed and that we'll be working on early next year.

New Flow

Depends on: filecoin-project/curio#828 (SP-to-SP fetch endpoint)

  1. Upload to endorsed SP (streaming supported)
    • uploadPiece() -> findPiece()
    • Failure here = hard fail, user must retry if they want
  2. Construct retrieval URL from endorsed SP
  3. Request secondary SP(s) to fetch piece
    • POST /pdp/piece/fetch (new Curio endpoint)
    • Poll until complete
    • On failure: retry with different SP (if auto-selected)
  4. AddPieces on all SPs in parallel when all have the piece
  5. Return result
    • ALL AddPieces failed = throw
    • Endorsed SP's AddPieces failed but a secondary succeeded = throw? (open question: the product requirement isn't met)
    • At least endorsed SP succeeded = return result object (don't throw)
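
For illustration, the whole flow might look roughly like this from the SDK side. Everything below is a placeholder sketch: the PDPClient/SP shapes, the fetchPieces/getFetchStatus signatures (see "API Changes" below) and the polling strategy are assumptions, not final API.

```ts
// Placeholder sketch of the flow above; none of these shapes are final.
type PieceCID = string

interface PDPClient {
  uploadPiece(data: Uint8Array): Promise<PieceCID>
  findPiece(pieceCid: PieceCID): Promise<void>
  // Proposed additions (see "API Changes"); argument and response shapes are guesses
  // pending the Curio endpoint in filecoin-project/curio#828.
  fetchPieces(
    pieces: { pieceCid: PieceCID; sourceUrl: string }[],
    recordKeeper: string,
    extraData: string
  ): Promise<{ fetchId: string }>
  getFetchStatus(fetchId: string): Promise<{ complete: boolean; failed: boolean }>
  addPieces(pieceCids: PieceCID[]): Promise<{ txHash: string }>
}

interface SP {
  pdp: PDPClient
  retrievalBaseUrl: string
}

// Poll a secondary SP until its fetch of the piece completes or fails.
async function pollUntilComplete(pdp: PDPClient, fetchId: string): Promise<void> {
  for (;;) {
    const status = await pdp.getFetchStatus(fetchId)
    if (status.failed) throw new Error(`fetch ${fetchId} failed`)
    if (status.complete) return
    await new Promise((resolve) => setTimeout(resolve, 5_000))
  }
}

async function multiCopyUpload(
  data: Uint8Array,
  endorsed: SP,
  secondaries: SP[],
  recordKeeper: string,
  extraData: string
) {
  // 1. Upload once to the endorsed SP; any failure here is a hard fail for the caller.
  const pieceCid = await endorsed.pdp.uploadPiece(data)
  await endorsed.pdp.findPiece(pieceCid) // confirm the piece has been parked

  // 2. Construct the retrieval URL the secondaries will fetch from.
  const sourceUrl = `${endorsed.retrievalBaseUrl}/piece/${pieceCid}`

  // 3. Ask each secondary SP to fetch the piece, then poll until complete.
  //    (With auto-selected SPs, a failure here can be retried against a different SP.)
  await Promise.all(
    secondaries.map(async (sp) => {
      const { fetchId } = await sp.pdp.fetchPieces([{ pieceCid, sourceUrl }], recordKeeper, extraData)
      await pollUntilComplete(sp.pdp, fetchId)
    })
  )

  // 4. AddPieces on all SPs in parallel now that every SP has the piece.
  const commits = await Promise.allSettled(
    [endorsed, ...secondaries].map((sp) => sp.pdp.addPieces([pieceCid]))
  )

  // 5. Throw only if every AddPieces failed; otherwise return per-copy results.
  if (commits.every((c) => c.status === 'rejected')) {
    throw new Error('AddPieces failed on all SPs')
  }
  return { pieceCid, commits }
}
```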

As per the linked document, the benefits of this flow include:

  • Streaming preserved: client sends data once, no need to buffer or replay
  • Simpler retries: secondary SP failures just mean "try another SP" with no re-upload, and we have additional retry options we can implement in future iterations
  • Reduced client bandwidth: upload once instead of N times
  • Clear failure semantics: endorsed SP failure = hard fail early, secondary failures = recoverable
  • Cleaner API: easier to document and explain, no Promise.allSettled gymnastics; solves for "don't make me understand your system"

API Changes

Simplified options and simplified internal resolver (separate issue):

  • dataSetIds and providerIds mutually exclusive
  • Counts must match the count option
  • No options = smart select (endorsed first)
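
Roughly, the option constraints could be expressed like this (a sketch; field names follow the existing options but the exact union shape is open):

```ts
// Sketch: dataSetIds and providerIds are mutually exclusive, and whichever is
// supplied must match `count`. Names are illustrative, not final.
type MultiCopyOptions =
  | { count: number }                        // no explicit targets: smart select, endorsed first
  | { count: number; dataSetIds: number[] }  // dataSetIds.length must equal count
  | { count: number; providerIds: number[] } // providerIds.length must equal count

function validateOptions(opts: MultiCopyOptions): void {
  if ('dataSetIds' in opts && opts.dataSetIds.length !== opts.count) {
    throw new Error('dataSetIds length must match count')
  }
  if ('providerIds' in opts && opts.providerIds.length !== opts.count) {
    throw new Error('providerIds length must match count')
  }
}
```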

New PDPServer methods:

  • fetchPieces(pieces, recordKeeper, extraData): Promise<FetchPiecesResponse>
  • getFetchStatus(fetchId): Promise<FetchStatusResponse>: it's unclear whether this reports the status of a specific fetch or a general pieceCid status, or whether we could just reuse fetchPieces as an idempotent call.

Return type for upload():

  • Per-copy status and additional metadata, including transaction hashes and piece IDs
  • Throw only if all copies fail
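
For example, the result shape could look something like this (illustrative only):

```ts
// Sketch of a possible upload() result; field names are placeholders.
interface CopyResult {
  providerId: number
  dataSetId: number
  pieceId?: number    // assigned piece ID, when AddPieces succeeded
  txHash?: string     // AddPieces transaction hash
  status: 'success' | 'failed'
  error?: string
}

interface UploadResult {
  pieceCid: string
  copies: CopyResult[] // one entry per requested copy; throw only if all failed
}
```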

Additionally, in StorageContext, we split what's currently bundled into upload() into separate operations. upload() uses these internally, or they can be used on their own (see below):

  • store({ data: Uint8Array | ReadableStream, pieceCid?: PieceCID }): Promise<PieceCID>: send the data to the SP for staging / parking and wait until confirmed
  • pull({ pieceCids: PieceCID | PieceCID[], from: StorageContext }): Promise<void>: ask the SP to fetch the piece(s) from another SP. This should also take optional metadata and nonce parameters: we need them to construct a valid extraData, and letting the user pass them through gives us an idempotent API.
  • commit({ pieceCids: PieceCID | PieceCID[], metadata?: MetadataEntry[][] }): Promise<CommitResult>: AddPieces on-chain

With this split, we can make multi-piece, multi-provider storage a user concern, and it would be relatively easy to document:

  1. createContexts() (same algorithm as upload() uses)
  2. store() with first SP, repeat for N pieces
  3. pull() with second SP (and third etc.)
  4. commit() on all SPs (can be parallel, e.g. via Promise.all)
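
Sketched as user code (method shapes follow the proposal above; createContexts() and its placement on the storage manager are assumptions, not existing API):

```ts
// Placeholder sketch of the documented flow using the proposed store()/pull()/commit() split.
// `synapse` is an initialised Synapse instance; createContexts() is proposed, not existing API.
declare const synapse: any
declare const data: Uint8Array

// 1. Pick contexts with the same algorithm upload() uses (endorsed first)
const [primary, ...secondaries] = await synapse.storage.createContexts({ count: 3 })

// 2. Stage the piece on the first (endorsed) SP
const pieceCid = await primary.store({ data })

// 3. Ask the remaining SPs to fetch it from the first one
await Promise.all(secondaries.map((sp) => sp.pull({ pieceCids: pieceCid, from: primary })))

// 4. AddPieces on every SP in parallel
await Promise.all([primary, ...secondaries].map((sp) => sp.commit({ pieceCids: pieceCid })))
```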

In the process we could take this opportunity to kill the complex batched multi-piece handling inside StorageContext once we make it easy for the user to do this themselves.
