
Multi-copy upload via SP-to-SP fetch #494

@rvagg

Summary

Implement multi-copy storage where the client uploads once to an endorsed SP and secondary SPs then fetch the piece via HTTP, rather than receiving duplicate uploads.

Background

The current multi-copy approach requires parallel uploads of the same data. This is inefficient for large pieces, complicates streaming and our ability to retry uploads, forces awkward API choices, and makes it more likely that we expose error conditions to users than the alternatives do.

See https://www.notion.so/filecoindev/Multi-copy-Synapse-Upload-2bddc41950c1803d9fb6c8830c8a7a28 for background. Option 4 is what's
being outlined here.

This flow is also a small step toward the MSP model that's currently being designed and that we'll be working on early next year.

New Flow

Depends on: filecoin-project/curio#828 (SP-to-SP fetch endpoint)

  1. Upload to endorsed SP (streaming supported)
    • uploadPiece() -> findPiece()
    • Failure here = hard fail, user must retry if they want
  2. Construct retrieval URL from endorsed SP
  3. Request secondary SP(s) to fetch piece
    • POST /pdp/piece/fetch (new Curio endpoint)
    • Poll until complete
    • On failure: retry with different SP (if auto-selected)
  4. AddPieces on all SPs in parallel when all have the piece
  5. Return result
    • ALL AddPieces failed = throw
    • Endorsed SP's AddPieces failed but a secondary succeeded = throw? (open question: the product requirement isn't met)
    • At least endorsed SP succeeded = return result object (don't throw)
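
For illustration, the whole flow might look roughly like this from the SDK side. Everything below is a placeholder sketch: the PDPClient/SP shapes, the fetchPieces/getFetchStatus signatures (see "API Changes" below) and the polling strategy are assumptions, not final API.

```ts
// Placeholder sketch of the flow above; none of these shapes are final.
type PieceCID = string

interface PDPClient {
  uploadPiece(data: Uint8Array): Promise<PieceCID>
  findPiece(pieceCid: PieceCID): Promise<void>
  // Proposed additions (see "API Changes"); argument and response shapes are guesses
  // pending the Curio endpoint in filecoin-project/curio#828.
  fetchPieces(
    pieces: { pieceCid: PieceCID; sourceUrl: string }[],
    recordKeeper: string,
    extraData: string
  ): Promise<{ fetchId: string }>
  getFetchStatus(fetchId: string): Promise<{ complete: boolean; failed: boolean }>
  addPieces(pieceCids: PieceCID[]): Promise<{ txHash: string }>
}

interface SP {
  pdp: PDPClient
  retrievalBaseUrl: string
}

// Poll a secondary SP until its fetch of the piece completes or fails.
async function pollUntilComplete(pdp: PDPClient, fetchId: string): Promise<void> {
  for (;;) {
    const status = await pdp.getFetchStatus(fetchId)
    if (status.failed) throw new Error(`fetch ${fetchId} failed`)
    if (status.complete) return
    await new Promise((resolve) => setTimeout(resolve, 5_000))
  }
}

async function multiCopyUpload(
  data: Uint8Array,
  endorsed: SP,
  secondaries: SP[],
  recordKeeper: string,
  extraData: string
) {
  // 1. Upload once to the endorsed SP; any failure here is a hard fail for the caller.
  const pieceCid = await endorsed.pdp.uploadPiece(data)
  await endorsed.pdp.findPiece(pieceCid) // confirm the piece has been parked

  // 2. Construct the retrieval URL the secondaries will fetch from.
  const sourceUrl = `${endorsed.retrievalBaseUrl}/piece/${pieceCid}`

  // 3. Ask each secondary SP to fetch the piece, then poll until complete.
  //    (With auto-selected SPs, a failure here can be retried against a different SP.)
  await Promise.all(
    secondaries.map(async (sp) => {
      const { fetchId } = await sp.pdp.fetchPieces([{ pieceCid, sourceUrl }], recordKeeper, extraData)
      await pollUntilComplete(sp.pdp, fetchId)
    })
  )

  // 4. AddPieces on all SPs in parallel now that every SP has the piece.
  const commits = await Promise.allSettled(
    [endorsed, ...secondaries].map((sp) => sp.pdp.addPieces([pieceCid]))
  )

  // 5. Throw only if every AddPieces failed; otherwise return per-copy results.
  if (commits.every((c) => c.status === 'rejected')) {
    throw new Error('AddPieces failed on all SPs')
  }
  return { pieceCid, commits }
}
```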

As per the linked document, the benefits of this flow include:

  • Streaming preserved: client sends data once, no need to buffer or replay
  • Simpler retries: secondary SP failures just mean "try another SP" with no re-upload, and we have additional retry options we can implement in future iterations
  • Reduced client bandwidth: upload once instead of N times
  • Clear failure semantics: endorsed SP failure = hard fail early, secondary failures = recoverable
  • Cleaner API: easier to document and explain, no Promise.allSettled gymnastics; solves for "don't make me understand your system"

API Changes

Simplified options and simplified internal resolver (separate issue):

  • dataSetIds and providerIds mutually exclusive
  • Counts must match the count option
  • No options = smart select (endorsed first)
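
Roughly, the option constraints could be expressed like this (a sketch; field names follow the existing options but the exact union shape is open):

```ts
// Sketch: dataSetIds and providerIds are mutually exclusive, and whichever is
// supplied must match `count`. Names are illustrative, not final.
type MultiCopyOptions =
  | { count: number }                        // no explicit targets: smart select, endorsed first
  | { count: number; dataSetIds: number[] }  // dataSetIds.length must equal count
  | { count: number; providerIds: number[] } // providerIds.length must equal count

function validateOptions(opts: MultiCopyOptions): void {
  if ('dataSetIds' in opts && opts.dataSetIds.length !== opts.count) {
    throw new Error('dataSetIds length must match count')
  }
  if ('providerIds' in opts && opts.providerIds.length !== opts.count) {
    throw new Error('providerIds length must match count')
  }
}
```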

New PDPServer methods:

  • fetchPieces(pieces, recordKeeper, extraData): Promise<FetchPiecesResponse>
  • getFetchStatus(fetchId): Promise<FetchStatusResponse>: it's unclear whether this reports the status of a specific fetch or a general pieceCid status, or whether we could just reuse fetchPieces as an idempotent call.

Return type for upload():

  • Per-copy status and additional metadata, including transaction hashes and piece IDs
  • Throw only if all copies fail
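
For example, the result shape could look something like this (illustrative only):

```ts
// Sketch of a possible upload() result; field names are placeholders.
interface CopyResult {
  providerId: number
  dataSetId: number
  pieceId?: number    // assigned piece ID, when AddPieces succeeded
  txHash?: string     // AddPieces transaction hash
  status: 'success' | 'failed'
  error?: string
}

interface UploadResult {
  pieceCid: string
  copies: CopyResult[] // one entry per requested copy; throw only if all failed
}
```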

Additionally, in StorageContext, we split what's currently bundled into upload() into separate operations. upload() uses these internally, or they can be used on their own (see below):

  • store({ data: Uint8Array | ReadableStream, pieceCid?: PieceCID }): Promise<PieceCID>: send the data to the SP for staging / parking and wait until confirmed
  • pull({ pieceCids: PieceCID | PieceCID[], from: StorageContext }): Promise<void>: ask the SP to fetch the piece(s) from another SP. This should also take optional metadata and nonce parameters: we need them to construct a valid extraData, and letting the user pass them through gives us an idempotent API.
  • commit({ pieceCids: PieceCID | PieceCID[], metadata?: MetadataEntry[][] }): Promise<CommitResult>: AddPieces on-chain

With this split, we can make multi-piece, multi-provider storage a user concern, and it would be relatively easy to document:

  1. createContexts() (same algorithm as upload() uses)
  2. store() with first SP, repeat for N pieces
  3. pull() with second SP (and third etc.)
  4. commit() on all SPs (can be parallel, e.g. via Promise.all)
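
Sketched as user code (method shapes follow the proposal above; createContexts() and its placement on the storage manager are assumptions, not existing API):

```ts
// Placeholder sketch of the documented flow using the proposed store()/pull()/commit() split.
// `synapse` is an initialised Synapse instance; createContexts() is proposed, not existing API.
declare const synapse: any
declare const data: Uint8Array

// 1. Pick contexts with the same algorithm upload() uses (endorsed first)
const [primary, ...secondaries] = await synapse.storage.createContexts({ count: 3 })

// 2. Stage the piece on the first (endorsed) SP
const pieceCid = await primary.store({ data })

// 3. Ask the remaining SPs to fetch it from the first one
await Promise.all(secondaries.map((sp) => sp.pull({ pieceCids: pieceCid, from: primary })))

// 4. AddPieces on every SP in parallel
await Promise.all([primary, ...secondaries].map((sp) => sp.commit({ pieceCids: pieceCid })))
```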

In the process we could take this opportunity to kill the complex batched multi-piece handling inside StorageContext once we make it easy for the user to do this themselves.
