Consolidate remaining v2 design issues around semiring backend contracts #20

@shinaoka

Description

cc @tensor4all-meta

Summary

This issue consolidates the remaining design-v2 questions after the recent placement cleanup. The main unresolved theme is how custom semiring backends should relate to primitive vocabulary, trait contracts, and general execution engines.

Remaining issues

1. Custom semiring backend minimum contract is still inconsistent

Two different contracts are described today.

  • primitive-catalog.md treats tensor-structural operations like permute, reshape, broadcast, and diagonal as tensor-layer views / metadata transforms, so the backend-facing strict minimum is close to BatchedGemm + ReduceSum (with diagonal-related behavior depending on whether diagonal stays in the tensor layer).
  • tenferro-internal-design.md instead puts Transpose, Reshape, and BroadcastInDim into SemiringOpKind / SemiringOps, which makes them part of the required semiring contract.

These are materially different obligations for custom backends. The design needs one source of truth.
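To make the difference concrete, here is a rough sketch of the two contract shapes. All names (`View`, `MinimalSemiringBackend`, `StructuralSemiringBackend`) and all signatures are hypothetical and only illustrate the distinction; they are not taken from either design document. In the first shape, permute is a stride shuffle in the tensor layer and the backend never sees it; in the second, the backend must also implement structural ops.

```rust
/// Hypothetical tensor-layer view: shape and strides over a shared flat buffer.
#[derive(Clone, Debug, PartialEq)]
pub struct View {
    pub shape: Vec<usize>,
    pub strides: Vec<usize>,
}

impl View {
    /// Row-major contiguous view for the given shape.
    pub fn contiguous(shape: &[usize]) -> Self {
        let mut strides = vec![1; shape.len()];
        for i in (0..shape.len().saturating_sub(1)).rev() {
            strides[i] = strides[i + 1] * shape[i + 1];
        }
        View { shape: shape.to_vec(), strides }
    }

    /// Permute as a pure metadata transform: no backend call, no data movement.
    pub fn permute(&self, perm: &[usize]) -> Self {
        View {
            shape: perm.iter().map(|&p| self.shape[p]).collect(),
            strides: perm.iter().map(|&p| self.strides[p]).collect(),
        }
    }

    /// Flat buffer offset of a multi-index under this view.
    pub fn offset(&self, idx: &[usize]) -> usize {
        idx.iter().zip(&self.strides).map(|(i, s)| i * s).sum()
    }
}

/// Shape A (assumed): the backend only provides contraction and reduction.
pub trait MinimalSemiringBackend {
    fn batched_gemm(&self, a: &[f64], b: &[f64], batch: usize, m: usize, k: usize, n: usize) -> Vec<f64>;
    fn reduce_sum(&self, data: &[f64], view: &View, axis: usize) -> Vec<f64>;
}

/// Shape B (assumed): structural ops are part of the required contract too.
pub trait StructuralSemiringBackend: MinimalSemiringBackend {
    fn transpose(&self, data: &[f64], view: &View, perm: &[usize]) -> Vec<f64>;
}
```

Under shape A, a custom backend author writes two kernels and inherits permute for free; under shape B, every custom backend carries extra obligations. That is the gap the two documents currently leave open.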

2. Compile-cache identity rules are still inconsistent

  • computegraph-design.md says compiled programs are cached using GlobalValKey-based structure, which includes InputKey.
  • tensor-api-pseudocode.md says the cache key is based on normalized graph topology and should ignore concrete InputKey / DiffPassId values so repeated differentiate calls hit the same cache entry.

This needs an explicit decision:

  • either tenferro owns a second normalized compile-cache layer above computegraph
  • or computegraph itself changes its cache contract

Without that, higher-order AD cache behavior is underspecified.
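As a rough illustration of the first option (a normalized cache layer that sits above the graph), the sketch below renumbers inputs by order of first appearance before hashing, so two graphs that differ only in concrete input keys share one entry. `Node`, `NormNode`, `normalize`, and `CompileCache` are hypothetical stand-ins, not the real computegraph or tenferro types, and the "compiled program" is reduced to an integer id.

```rust
use std::collections::HashMap;

/// Hypothetical graph node; `key` plays the role of a concrete InputKey.
#[derive(Clone, Debug)]
pub enum Node {
    Input { key: u64 },
    /// An operation applied to earlier nodes, referenced by position.
    Op { name: &'static str, args: Vec<usize> },
}

/// Normalized node: concrete input keys are replaced by dense slot numbers.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum NormNode {
    Input { slot: usize },
    Op { name: &'static str, args: Vec<usize> },
}

/// Renumber inputs by order of first appearance; topology is kept as-is.
pub fn normalize(graph: &[Node]) -> Vec<NormNode> {
    let mut slots: HashMap<u64, usize> = HashMap::new();
    graph
        .iter()
        .map(|n| match n {
            Node::Input { key } => {
                let next = slots.len();
                let slot = *slots.entry(*key).or_insert(next);
                NormNode::Input { slot }
            }
            Node::Op { name, args } => NormNode::Op { name: *name, args: args.clone() },
        })
        .collect()
}

/// Cache keyed on normalized topology; values are stand-in program ids.
pub struct CompileCache {
    entries: HashMap<Vec<NormNode>, usize>,
    pub compiles: usize,
}

impl CompileCache {
    pub fn new() -> Self {
        CompileCache { entries: HashMap::new(), compiles: 0 }
    }

    /// Returns the cached program id, "compiling" only on a miss.
    pub fn get_or_compile(&mut self, graph: &[Node]) -> usize {
        let key = normalize(graph);
        if let Some(&id) = self.entries.get(&key) {
            return id;
        }
        self.compiles += 1;
        let id = self.entries.len();
        self.entries.insert(key, id);
        id
    }
}
```

Note that the normalization still distinguishes a graph that uses one input twice from a graph that uses two distinct inputs, so sharing is limited to graphs that are structurally identical.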

3. StdTensorOp is not fully a single source of truth yet

There are still mismatches between the vocabulary and the lowering tables. Examples:

  • Add, Mul, and DotGeneral are described as direct StdTensorOp variants in some places and as StdTensorOp::Semiring(SemiringOpKind) in others
  • Cholesky is described as a custom call in one place and as a direct stablehlo.cholesky lowering in another
  • LuFullPivot and StdTensorOp::CustomCall appear in later sections even though they are not present in the main enum definition

The enum definition, lowering table, and extensibility story should be unified.

Current idea

A cleaner direction is to separate Tenferro primitives from the traits that execute them.

Proposed layering

  1. Primitive descriptors

    • Keep a primitive vocabulary crate/module that only describes operations and their planning/execution descriptors.
    • Examples: semiring core descriptors, scalar descriptors, linalg descriptors, transfer descriptors.
  2. Executor traits

    • Define small family-specific traits that know how to plan/execute those descriptors.
    • Backends implement these traits, not a single giant monolithic backend trait.
  3. General engines

    • Build generic interpreters / engines that consume primitive descriptors and dispatch through the executor traits.
    • Einsum becomes one such engine. Standard tensor execution could become another.
  4. Backend implementations

    • CPU / CUDA / custom algebra backends only implement the trait families they actually support.
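A minimal sketch of how the four layers could fit together is shown below. Every name here (`SemiringDesc`, `ScalarDesc`, `SemiringExecutor`, `ScalarExecutor`, `Step`, `run`, `run_semiring`, `CpuF64`, `MaxPlus`) is hypothetical and chosen only for illustration; real descriptors would carry shapes, dtypes, and planning information rather than operate on flat `f64` slices.

```rust
/// 1. Primitive descriptors: plain data, no execution logic.
#[derive(Clone, Debug, PartialEq)]
pub enum SemiringDesc {
    Add, // elementwise semiring addition
    Mul, // elementwise semiring multiplication
}

#[derive(Clone, Debug, PartialEq)]
pub enum ScalarDesc {
    Scale(f64),
}

/// 2. Executor traits: one small trait per descriptor family.
pub trait SemiringExecutor {
    fn exec_semiring(&self, d: &SemiringDesc, a: &[f64], b: &[f64]) -> Vec<f64>;
}

pub trait ScalarExecutor {
    fn exec_scalar(&self, d: &ScalarDesc, a: &[f64]) -> Vec<f64>;
}

/// 3. General engines: interpret descriptors and dispatch through the traits.
pub enum Step {
    Semiring(SemiringDesc),
    Scalar(ScalarDesc),
}

/// Engine that needs both families.
pub fn run<B: SemiringExecutor + ScalarExecutor>(b: &B, steps: &[Step], x: &[f64], y: &[f64]) -> Vec<f64> {
    let mut acc = x.to_vec();
    for s in steps {
        acc = match s {
            Step::Semiring(d) => b.exec_semiring(d, &acc, y),
            Step::Scalar(d) => b.exec_scalar(d, &acc),
        };
    }
    acc
}

/// Engine that only needs the semiring family.
pub fn run_semiring<B: SemiringExecutor>(b: &B, descs: &[SemiringDesc], x: &[f64], y: &[f64]) -> Vec<f64> {
    descs.iter().fold(x.to_vec(), |acc, d| b.exec_semiring(d, &acc, y))
}

/// 4. Backends implement only the families they support.
pub struct CpuF64;

impl SemiringExecutor for CpuF64 {
    fn exec_semiring(&self, d: &SemiringDesc, a: &[f64], b: &[f64]) -> Vec<f64> {
        a.iter().zip(b).map(|(&x, &y)| match d {
            SemiringDesc::Add => x + y,
            SemiringDesc::Mul => x * y,
        }).collect()
    }
}

impl ScalarExecutor for CpuF64 {
    fn exec_scalar(&self, d: &ScalarDesc, a: &[f64]) -> Vec<f64> {
        match d {
            ScalarDesc::Scale(c) => a.iter().map(|&x| x * c).collect(),
        }
    }
}

/// Custom algebra backend: max-plus, semiring family only.
pub struct MaxPlus;

impl SemiringExecutor for MaxPlus {
    fn exec_semiring(&self, d: &SemiringDesc, a: &[f64], b: &[f64]) -> Vec<f64> {
        a.iter().zip(b).map(|(&x, &y)| match d {
            SemiringDesc::Add => x.max(y),
            SemiringDesc::Mul => x + y,
        }).collect()
    }
}
```

In this sketch `MaxPlus` can be passed to `run_semiring` but not to `run`, and the compiler reports the missing capability as an unsatisfied trait bound rather than a runtime error.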

Why this looks promising

This is close to what origin/main already does for semiring execution:

  • Semiring defines the algebra
  • TensorSemiringCore<Alg> is the required semiring execution contract
  • TensorSemiringFastPath<Alg> is the optional optimization contract
  • EinsumBackend<Alg> is just the composition of those traits
  • tenferro-einsum is the general engine that interprets einsum plans through those traits

That pattern is attractive because backend authors implement capabilities, while high-level algorithms live in reusable engines.
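The composition described above can be sketched roughly as follows. The trait names come from the list above, but every method, signature, and helper type here (`dot`, `try_fast_dot`, `contract`, `Naive`, `PlusTimes`, `MaxPlus`) is an assumption made for illustration and will not match the actual definitions on origin/main.

```rust
/// The algebra (method set is assumed, not copied from the repository).
pub trait Semiring {
    type Elem: Copy;
    fn zero() -> Self::Elem;
    fn add(a: Self::Elem, b: Self::Elem) -> Self::Elem;
    fn mul(a: Self::Elem, b: Self::Elem) -> Self::Elem;
}

/// Required execution contract (a single hypothetical primitive).
pub trait TensorSemiringCore<Alg: Semiring> {
    fn dot(&self, a: &[Alg::Elem], b: &[Alg::Elem]) -> Alg::Elem;
}

/// Optional optimization contract: the default declines every request.
pub trait TensorSemiringFastPath<Alg: Semiring> {
    fn try_fast_dot(&self, _a: &[Alg::Elem], _b: &[Alg::Elem]) -> Option<Alg::Elem> {
        None
    }
}

/// Pure composition: anything implementing both traits is an EinsumBackend.
pub trait EinsumBackend<Alg: Semiring>: TensorSemiringCore<Alg> + TensorSemiringFastPath<Alg> {}

impl<Alg: Semiring, B> EinsumBackend<Alg> for B where
    B: TensorSemiringCore<Alg> + TensorSemiringFastPath<Alg>
{
}

/// Engine side: try the fast path first, fall back to the required core.
pub fn contract<Alg: Semiring, B: EinsumBackend<Alg>>(backend: &B, a: &[Alg::Elem], b: &[Alg::Elem]) -> Alg::Elem {
    backend.try_fast_dot(a, b).unwrap_or_else(|| backend.dot(a, b))
}

/// Two example algebras.
pub struct PlusTimes;
impl Semiring for PlusTimes {
    type Elem = f64;
    fn zero() -> f64 { 0.0 }
    fn add(a: f64, b: f64) -> f64 { a + b }
    fn mul(a: f64, b: f64) -> f64 { a * b }
}

pub struct MaxPlus;
impl Semiring for MaxPlus {
    type Elem = f64;
    fn zero() -> f64 { f64::NEG_INFINITY }
    fn add(a: f64, b: f64) -> f64 { a.max(b) }
    fn mul(a: f64, b: f64) -> f64 { a + b }
}

/// A backend that implements the core generically and opts out of fast paths.
pub struct Naive;

impl<Alg: Semiring> TensorSemiringCore<Alg> for Naive {
    fn dot(&self, a: &[Alg::Elem], b: &[Alg::Elem]) -> Alg::Elem {
        a.iter()
            .zip(b)
            .fold(Alg::zero(), |acc, (&x, &y)| Alg::add(acc, Alg::mul(x, y)))
    }
}

impl<Alg: Semiring> TensorSemiringFastPath<Alg> for Naive {}
```

The engine function `contract` is written once and works for any algebra and any backend that satisfies the composed bound, which is the property that makes the pattern a candidate reference model for design-v2.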

Design questions to resolve next

  • Should design-v2 explicitly adopt the origin/main descriptor + plan/execute pattern as the reference model?
  • For custom semiring backends, what is the true required core: only contraction/reduction primitives, or also tensor-structural view-like ops?
  • Should diagonal / trace behavior stay in the semiring core, or be normalized away earlier?
  • Should the standard backend story also move from Backend<Op> toward capability-family traits plus one or more general engines?

Suggested resolution order

  1. Fix the custom semiring minimum contract
  2. Fix compile-cache identity semantics
  3. Make StdTensorOp + lowering tables a single source of truth
  4. Then decide how far the primitive-descriptor / executor-trait / engine split should become the v2 architectural pattern
