Build Ergot's shared media processing plane for photogrammetry, Histrio, and future cluster workflows #3

@willgriffin

Summary

Turn Ergot into the shared media processing plane for all Ergot-managed assets, not just the first local photogrammetry workflow.

Ergot is already the MAM and ingest system for raw media coming from capture nodes and syncing to the cloud distro. Photogrammetry is the first transformation family, but Histrio and future media products will need the same durable processing substrate for image, video, audio, and metadata workflows.

We should build the real processing plane here in ergot.io so downstream systems can submit workflow requests against Ergot-managed assets and receive derived artifacts with durable lineage.

Problem

The current implementation already has the beginnings of the right shape:

  • versioned workflow definitions
  • durable output requests
  • durable processing runs
  • a local pipeline agent
  • a ComfyUI runtime adapter

But it is still biased toward the first photogrammetry slice and local ComfyUI execution.

We now need Ergot to support:

  • multiple transformation families beyond photogrammetry
  • multiple runtimes beyond ComfyUI
  • durable local and remote execution
  • lineage from raw ingest through every derived artifact
  • a stable contract for downstream consumers like Histrio

Goals

  • Make Ergot the shared media processing plane for all Ergot-managed assets.
  • Support both local and remote/cluster execution with the same logical workflow definitions.
  • Preserve asset lineage, workflow versioning, runtime metadata, and execution logs for every run.
  • Support Histrio as a first downstream consumer without making the processing plane Histrio-specific.
  • Support future image, video, audio, and metadata workflows from the same orchestration model.

Non-Goals

  • Do not move Histrio editorial, review, or publishing state into Ergot.
  • Do not make ComfyUI the orchestrator.
  • Do not make Studio Server the control plane. It should remain a runtime/backend behind the orchestrator.
  • Do not keep the core processing model photogrammetry-specific.

Product Direction

Treat Ergot as:

  • the system of record for ingesting and managing source media
  • the durable workflow and execution control plane for media transformation
  • the lineage graph from raw capture through every derived asset

Treat photogrammetry as the first implemented transformation family, not the permanent boundary of the system.

Treat Histrio as a downstream consumer of Ergot-managed assets and Ergot-executed media workflows.

Required Scope

1. Generalize the workflow model

Extend the workflow definition system so it is runtime-neutral rather than implicitly Comfy-only.

Support workflow definitions with:

  • immutable versioning
  • runtime type
  • capability tags
  • input schema
  • output schema
  • artifact map
  • execution-plan metadata
  • support flags for local and remote execution
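
As a rough illustration of the fields listed above, here is a minimal sketch of a runtime-neutral definition plus a registry that enforces immutable versioning. Every name here (`WorkflowDefinitionSketch`, `WorkflowRegistrySketch`, the `RuntimeType` values, the field names) is an assumption made for illustration; it is not the existing `WorkflowDefinition` model.

```typescript
// Hypothetical sketch: type and field names are assumptions, not the
// existing WorkflowDefinition model in packages/ergotio-core.
type RuntimeType = "comfyui" | "tts" | "face-embedding" | "transcription" | "ffmpeg";
type FieldType = "string" | "number" | "boolean" | "assetRef";

interface WorkflowDefinitionSketch {
  key: string;                              // stable logical name
  version: number;                          // never reused once registered
  runtimeType: RuntimeType;
  capabilityTags: string[];
  inputSchema: Record<string, FieldType>;
  outputSchema: Record<string, FieldType>;
  artifactMap: Record<string, string>;      // output name -> artifact kind
  executionPlan: { stages: string[] };
  supportsLocal: boolean;
  supportsRemote: boolean;
}

class WorkflowRegistrySketch {
  private defs: WorkflowDefinitionSketch[] = [];

  // Registering an existing key/version pair is rejected, so a published
  // version can never be changed in place.
  register(def: WorkflowDefinitionSketch): void {
    if (this.get(def.key, def.version) !== undefined) {
      throw new Error("workflow " + def.key + "@" + def.version + " already exists; publish a new version");
    }
    // Shallow freeze only: nested objects are not frozen in this sketch.
    this.defs.push(Object.freeze({ ...def }));
  }

  get(key: string, version: number): WorkflowDefinitionSketch | undefined {
    return this.defs.filter((d) => d.key === key && d.version === version)[0];
  }

  // Highest registered version for a key, or undefined if none exist.
  latest(key: string): WorkflowDefinitionSketch | undefined {
    const matches = this.defs.filter((d) => d.key === key);
    if (matches.length === 0) return undefined;
    return matches.reduce((best, d) => (d.version > best.version ? d : best));
  }
}
```

Keeping `runtimeType` and the local/remote flags on the definition itself is one way to let the same logical workflow be routed to different backends without a Comfy-specific code path.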

2. Generalize the request/run model

Refactor or extend the current request/run modeling so processing can be requested against Ergot-managed assets in a generic way, not only source package versions tied to the first photogrammetry flow.

The model needs to support:

  • source asset refs
  • request context
  • workflow settings
  • execution target
  • priority
  • retries
  • cancellation
  • partial success
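
One possible shape for such a generic request, together with a small run state machine covering retries, cancellation, and partial success, is sketched below. The names (`ProcessingRequestSketch`, `RunStatus`, `transition`, `settle`) and the specific set of allowed transitions are assumptions for illustration, not the current `OutputRequest` / `ProcessingRun` behavior.

```typescript
// Hypothetical sketch: names and the transition table are assumptions.
type RunStatus =
  | "queued" | "running" | "succeeded" | "partially_succeeded" | "failed" | "cancelled";

interface ProcessingRequestSketch {
  sourceAssetRefs: string[];                 // any Ergot-managed asset, not only source packages
  workflow: { key: string; version: number };
  context: Record<string, string>;           // opaque caller context, e.g. a consumer's own ids
  settings: Record<string, unknown>;
  executionTarget: "local" | "remote" | "any";
  priority: number;
  maxRetries: number;
}

interface RunSketch { request: ProcessingRequestSketch; status: RunStatus; attempt: number }

const ALLOWED_TRANSITIONS: Record<RunStatus, RunStatus[]> = {
  queued: ["running", "cancelled"],
  // running -> queued models an automatic retry after a lost worker.
  running: ["queued", "succeeded", "partially_succeeded", "failed", "cancelled"],
  succeeded: [],
  partially_succeeded: [],
  failed: ["queued"],                        // explicit retry of a failed run
  cancelled: [],
};

function newRun(request: ProcessingRequestSketch): RunSketch {
  return { request, status: "queued", attempt: 0 };
}

// Returns the next status, or throws if the move is not allowed.
function transition(current: RunStatus, next: RunStatus): RunStatus {
  if (ALLOWED_TRANSITIONS[current].indexOf(next) === -1) {
    throw new Error("illegal transition " + current + " -> " + next);
  }
  return next;
}

// Derives the final status of a multi-output run from per-output results.
function settle(outputs: { name: string; ok: boolean }[]): RunStatus {
  const okCount = outputs.filter((o) => o.ok).length;
  if (outputs.length === 0 || okCount === 0) return "failed";
  return okCount === outputs.length ? "succeeded" : "partially_succeeded";
}
```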

3. Add runtime target and worker modeling

Introduce first-class runtime target and worker concepts so Ergot can route work to available execution backends.

Support targets such as:

  • local ComfyUI runtimes
  • cluster ComfyUI runtimes
  • TTS runtimes
  • face embedding runtimes
  • transcription runtimes
  • FFmpeg/video assembly runtimes

Track for each target:

  • capabilities
  • health
  • concurrency
  • auth configuration
  • locality
  • queue affinity or routing hints
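
To make the routing idea concrete, the following sketch picks a target from a registry using the tracked attributes. `RuntimeTargetSketch`, `pickTarget`, and the ranking rule (affinity match first, then most free slots) are assumptions; auth configuration is left out to keep the example short.

```typescript
// Hypothetical sketch: names and the ranking rule are assumptions.
interface RuntimeTargetSketch {
  id: string;
  capabilities: string[];
  healthy: boolean;
  maxConcurrency: number;
  activeRuns: number;
  locality: "local" | "cluster";
  queueAffinity?: string;
}

interface RoutingNeed {
  capability: string;
  locality?: "local" | "cluster";
  queueAffinity?: string;
}

// Returns the best eligible target, or undefined when nothing can take the work.
function pickTarget(targets: RuntimeTargetSketch[], need: RoutingNeed): RuntimeTargetSketch | undefined {
  const eligible = targets.filter(
    (t) =>
      t.healthy &&
      t.capabilities.indexOf(need.capability) !== -1 &&
      t.activeRuns < t.maxConcurrency &&
      (need.locality === undefined || t.locality === need.locality),
  );
  const matchesAffinity = (t: RuntimeTargetSketch): boolean =>
    need.queueAffinity !== undefined && t.queueAffinity === need.queueAffinity;
  const freeSlots = (t: RuntimeTargetSketch): number => t.maxConcurrency - t.activeRuns;

  // Affinity is treated as a preference, not a hard filter.
  const ranked = eligible.slice().sort((a, b) => {
    const byAffinity = Number(matchesAffinity(b)) - Number(matchesAffinity(a));
    return byAffinity !== 0 ? byAffinity : freeSlots(b) - freeSlots(a);
  });
  return ranked[0];
}
```

Treating affinity as a soft preference keeps work flowing when the preferred target is saturated; a stricter deployment could turn it into a filter instead.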

4. Add durable orchestration behavior

Build the missing execution-plane behavior:

  • queueing
  • claiming
  • heartbeats
  • retries
  • timeout handling
  • cancellation
  • failure recording
  • structured run events/logs
  • runtime ids and prompt/job ids
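
The claim, heartbeat, timeout, and retry behaviors above can be sketched with an in-memory queue. This is only an illustration under simplifying assumptions: `RunQueueSketch` and its method names are hypothetical, time is passed in explicitly as `nowMs`, and a durable implementation would persist runs and make claims atomic in the backing store.

```typescript
// Hypothetical sketch: in-memory only; names are assumptions.
type QueueStatus = "queued" | "running" | "succeeded" | "failed" | "cancelled";

interface RunEvent { atMs: number; type: string; detail?: string }

interface QueuedRun {
  id: string;
  status: QueueStatus;
  attempts: number;
  maxAttempts: number;
  claimedBy?: string;
  lastHeartbeatMs?: number;
  events: RunEvent[];                       // structured, append-only run log
}

class RunQueueSketch {
  private runs: QueuedRun[] = [];
  private heartbeatTimeoutMs: number;

  constructor(heartbeatTimeoutMs: number) {
    this.heartbeatTimeoutMs = heartbeatTimeoutMs;
  }

  get(id: string): QueuedRun | undefined {
    return this.runs.filter((r) => r.id === id)[0];
  }

  enqueue(id: string, maxAttempts: number, nowMs: number): void {
    this.runs.push({ id, status: "queued", attempts: 0, maxAttempts, events: [{ atMs: nowMs, type: "queued" }] });
  }

  // Hands the oldest queued run to a worker and counts the attempt.
  claim(workerId: string, nowMs: number): QueuedRun | undefined {
    const run = this.runs.filter((r) => r.status === "queued")[0];
    if (run === undefined) return undefined;
    run.status = "running";
    run.attempts += 1;
    run.claimedBy = workerId;
    run.lastHeartbeatMs = nowMs;
    run.events.push({ atMs: nowMs, type: "claimed", detail: workerId });
    return run;
  }

  // Only the worker that holds the claim may refresh the heartbeat.
  heartbeat(id: string, workerId: string, nowMs: number): boolean {
    const run = this.get(id);
    if (run === undefined || run.status !== "running" || run.claimedBy !== workerId) return false;
    run.lastHeartbeatMs = nowMs;
    return true;
  }

  cancel(id: string, nowMs: number): boolean {
    const run = this.get(id);
    if (run === undefined || (run.status !== "queued" && run.status !== "running")) return false;
    run.status = "cancelled";
    delete run.claimedBy;
    run.events.push({ atMs: nowMs, type: "cancelled" });
    return true;
  }

  // Requeues running runs whose heartbeat is stale, or fails them once
  // their attempts are used up. Returns the ids that were touched.
  reapTimedOut(nowMs: number): string[] {
    const reaped: string[] = [];
    for (const run of this.runs) {
      if (run.status !== "running" || run.lastHeartbeatMs === undefined) continue;
      if (nowMs - run.lastHeartbeatMs <= this.heartbeatTimeoutMs) continue;
      const canRetry = run.attempts < run.maxAttempts;
      run.status = canRetry ? "queued" : "failed";
      delete run.claimedBy;
      run.events.push({ atMs: nowMs, type: canRetry ? "timeout_requeued" : "timeout_failed" });
      reaped.push(run.id);
    }
    return reaped;
  }
}
```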

5. Add runtime adapters beyond ComfyUI

Keep ComfyUI as one runtime adapter, then add adapters for:

  • Studio-style TTS / voice prompt extraction
  • face embedding
  • transcription
  • FFmpeg/video assembly

The orchestration layer should be able to execute a workflow against one runtime or multiple stages/runtimes while preserving one durable run record.
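
A minimal sketch of that idea follows: each runtime sits behind a common adapter interface, and a multi-stage plan is executed while writing into a single run record. `RuntimeAdapterSketch`, `executeRun`, and the record shape are assumptions for illustration. Adapters are synchronous here only to keep the example short; real adapters that call ComfyUI, Studio, or FFmpeg would almost certainly be asynchronous.

```typescript
// Hypothetical sketch: names and shapes are assumptions; adapters are
// synchronous here purely for brevity.
interface StageResult {
  ok: boolean;
  runtimeJobId: string;                     // e.g. a prompt id or job id from the runtime
  outputs: Record<string, string>;          // output name -> artifact ref
  error?: string;
}

interface RuntimeAdapterSketch {
  runtimeType: string;
  execute(stage: string, inputs: Record<string, string>): StageResult;
}

interface StageRecord { stage: string; runtimeType: string; runtimeJobId: string; ok: boolean; error?: string }

interface RunRecordSketch {
  runId: string;
  status: "succeeded" | "failed";
  stages: StageRecord[];
  outputs: Record<string, string>;
}

// Runs each stage in order against the adapter for its runtime type.
// Outputs of earlier stages are made available as inputs to later ones.
function executeRun(
  runId: string,
  plan: { stage: string; runtimeType: string }[],
  adapters: RuntimeAdapterSketch[],
  inputs: Record<string, string>,
): RunRecordSketch {
  const record: RunRecordSketch = { runId, status: "succeeded", stages: [], outputs: {} };
  let available: Record<string, string> = { ...inputs };

  for (const step of plan) {
    const adapter = adapters.filter((a) => a.runtimeType === step.runtimeType)[0];
    if (adapter === undefined) {
      record.stages.push({ stage: step.stage, runtimeType: step.runtimeType, runtimeJobId: "", ok: false, error: "no adapter registered" });
      record.status = "failed";
      break;
    }
    const result = adapter.execute(step.stage, available);
    const stageRecord: StageRecord = { stage: step.stage, runtimeType: step.runtimeType, runtimeJobId: result.runtimeJobId, ok: result.ok };
    if (result.error !== undefined) stageRecord.error = result.error;
    record.stages.push(stageRecord);
    if (!result.ok) {
      record.status = "failed";
      break;
    }
    record.outputs = { ...record.outputs, ...result.outputs };
    available = { ...available, ...result.outputs };
  }
  return record;
}
```

Because the orchestrator only sees the adapter interface, adding a new runtime in this sketch means registering another adapter rather than changing the run model.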

6. Add artifact and lineage support for multi-output workflows

Support workflows that produce multiple derived outputs from the same request.

Requirements:

  • multi-artifact output mapping
  • first-class stored artifacts or MAM files
  • parent/child lineage between source and derived files
  • promotion of derived artifacts into reusable references/libraries
  • stable refs for external consumers
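
The parent/child lineage requirement can be illustrated with a small in-memory graph. `LineageGraphSketch`, `ArtifactSketch`, and the `asset://` ref format in the usage note are assumptions, not an existing Ergot API; promotion into libraries is not modeled here.

```typescript
// Hypothetical sketch: names and ref format are assumptions.
interface ArtifactSketch {
  ref: string;                              // stable ref handed to external consumers
  kind: string;
  parentRefs: string[];                     // empty for raw source media
  producedByRunId?: string;
}

class LineageGraphSketch {
  private artifacts: ArtifactSketch[] = [];

  private find(ref: string): ArtifactSketch | undefined {
    return this.artifacts.filter((a) => a.ref === ref)[0];
  }

  addSource(ref: string, kind: string): void {
    if (this.find(ref) !== undefined) throw new Error("duplicate ref " + ref);
    this.artifacts.push({ ref, kind, parentRefs: [] });
  }

  // Parents must already exist, which also keeps the graph acyclic.
  addDerived(ref: string, kind: string, parentRefs: string[], runId: string): void {
    if (this.find(ref) !== undefined) throw new Error("duplicate ref " + ref);
    for (const parent of parentRefs) {
      if (this.find(parent) === undefined) throw new Error("unknown parent " + parent);
    }
    this.artifacts.push({ ref, kind, parentRefs: parentRefs.slice(), producedByRunId: runId });
  }

  // Direct children, in insertion order.
  children(ref: string): string[] {
    return this.artifacts.filter((a) => a.parentRefs.indexOf(ref) !== -1).map((a) => a.ref);
  }

  // All ancestors back to raw sources, nearest first (breadth-first).
  ancestors(ref: string): string[] {
    const seen: string[] = [];
    let frontier: string[] = [ref];
    while (frontier.length > 0) {
      const next: string[] = [];
      for (const current of frontier) {
        const artifact = this.find(current);
        if (artifact === undefined) continue;
        for (const parent of artifact.parentRefs) {
          if (seen.indexOf(parent) === -1) {
            seen.push(parent);
            next.push(parent);
          }
        }
      }
      frontier = next;
    }
    return seen;
  }
}
```

For example, after registering a raw capture, a mesh and a texture derived from it, and a thumbnail derived from the mesh, `ancestors` on the thumbnail returns the mesh followed by the raw capture.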

7. Add external consumer API

Expose a stable API for downstream systems like Histrio to:

  • submit processing requests
  • poll run status
  • list runs
  • fetch logs/events
  • retry or cancel a run
  • fetch produced artifact refs

This contract must stay generic. Histrio-specific editorial concerns remain in Histrio.
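
One way to picture the consumer-facing contract is as a small interface with no consumer-specific fields. The sketch below pairs a hypothetical `ProcessingPlaneApiSketch` with an in-memory stand-in so the operations can be exercised. All names are assumptions, the transport (HTTP, RPC, or otherwise) is deliberately left open, and `finish` is a test hook standing in for a worker rather than part of the proposed contract.

```typescript
// Hypothetical sketch: names are assumptions; no transport is implied.
type ApiRunStatus = "queued" | "running" | "succeeded" | "failed" | "cancelled";

interface SubmitRequest { workflowKey: string; sourceAssetRefs: string[] }

interface RunView {
  runId: string;
  status: ApiRunStatus;
  workflowKey: string;
  sourceAssetRefs: string[];
  artifactRefs: string[];
  events: string[];
}

interface ProcessingPlaneApiSketch {
  submit(req: SubmitRequest): RunView;
  getRun(runId: string): RunView | undefined;
  listRuns(status?: ApiRunStatus): RunView[];
  getEvents(runId: string): string[];
  retry(runId: string): boolean;
  cancel(runId: string): boolean;
  getArtifacts(runId: string): string[];
}

class InMemoryProcessingPlane implements ProcessingPlaneApiSketch {
  private runs: RunView[] = [];

  submit(req: SubmitRequest): RunView {
    const run: RunView = {
      runId: "run-" + (this.runs.length + 1),
      status: "queued",
      workflowKey: req.workflowKey,
      sourceAssetRefs: req.sourceAssetRefs.slice(),
      artifactRefs: [],
      events: ["queued"],
    };
    this.runs.push(run);
    return run;
  }

  getRun(runId: string): RunView | undefined {
    return this.runs.filter((r) => r.runId === runId)[0];
  }

  listRuns(status?: ApiRunStatus): RunView[] {
    return this.runs.filter((r) => status === undefined || r.status === status);
  }

  getEvents(runId: string): string[] {
    const run = this.getRun(runId);
    return run === undefined ? [] : run.events.slice();
  }

  getArtifacts(runId: string): string[] {
    const run = this.getRun(runId);
    return run === undefined ? [] : run.artifactRefs.slice();
  }

  // Only failed or cancelled runs can be retried in this sketch.
  retry(runId: string): boolean {
    const run = this.getRun(runId);
    if (run === undefined || (run.status !== "failed" && run.status !== "cancelled")) return false;
    run.status = "queued";
    run.events.push("retried");
    return true;
  }

  cancel(runId: string): boolean {
    const run = this.getRun(runId);
    if (run === undefined || (run.status !== "queued" && run.status !== "running")) return false;
    run.status = "cancelled";
    run.events.push("cancelled");
    return true;
  }

  // Test hook standing in for a worker; not part of the consumer contract.
  finish(runId: string, ok: boolean, artifactRefs: string[]): void {
    const run = this.getRun(runId);
    if (run === undefined) return;
    run.status = ok ? "succeeded" : "failed";
    run.artifactRefs = artifactRefs.slice();
    run.events.push(run.status);
  }
}
```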

8. Add operator UI

Extend the dashboard so operators can manage and observe the processing plane:

  • workflow definitions
  • queued/running/completed runs
  • run detail and logs
  • runtime target health
  • artifact lineage

9. Add remote/cluster execution support

Remote execution is already implied by the model. Build the real path so cloud-synced assets can be processed off-node with the same workflow contract.

10. Add Histrio starter workflows

Add the first Histrio-oriented workflow family so the processing plane proves it can serve more than photogrammetry.

Initial workflow set:

  • face embedding
  • voice prompt extraction
  • TTS synthesis
  • seed/scene prep
  • motion generation
  • segment assembly
  • poster extraction
  • QC

Acceptance Criteria

  • A processing request can be submitted against Ergot-managed media without assuming a photogrammetry-specific source model.
  • The same workflow definition model can represent both ComfyUI and non-Comfy execution.
  • At least one non-Comfy runtime executes successfully through the same orchestration system.
  • A completed run stores logs, runtime metadata, prompt/job ids, and produced artifacts with full lineage.
  • Produced outputs are queryable as Ergot-managed assets/artifacts.
  • Local and remote execution targets are both supported by the same logical processing model.
  • Histrio can request a run from Ergot and consume result artifacts through a stable contract without owning runtime orchestration.
  • The dashboard exposes workflow definitions, queued/running/completed runs, runtime health, and artifact lineage.

Suggested Implementation Checklist

  • Generalize WorkflowDefinition beyond Comfy-only definitions.
  • Generalize OutputRequest / ProcessingRun into a shared processing-plane contract, or add a neutral layer above them.
  • Add runtime target and worker registry models.
  • Add execution event/log storage.
  • Add queue claim/heartbeat/retry/cancel behavior.
  • Refactor the existing Comfy runtime into a pluggable runtime adapter.
  • Add Studio runtime adapters for TTS, face embedding, and transcription.
  • Add FFmpeg/video assembly runtime adapter.
  • Add artifact/result mapping and lineage recording for multi-output runs.
  • Add remote execution path for cluster/cloud processing.
  • Add processing-plane APIs for submit/status/retry/cancel/artifacts.
  • Add dashboard screens for workflows, queue, runs, runtime health, and lineage.
  • Add Histrio starter workflows and request/response contract.
  • Add end-to-end tests for local and remote execution.
  • Document the processing-plane architecture and runtime authoring model.

Existing Anchors In This Repo

Build on these existing pieces rather than replacing them:

  • packages/ergotio-core/src/models/WorkflowDefinition.ts
  • packages/ergotio-core/src/models/OutputRequest.ts
  • packages/ergotio-core/src/models/ProcessingRun.ts
  • packages/ergotio-core/src/workflows/definitions.ts
  • agents/ergotio-pipeline/src/ergotio-pipeline.ts
  • agents/ergotio-pipeline/src/services/comfy-runtime.ts
  • README.md

Suggested Follow-Up Issues

Once this lands, split the implementation into follow-up tickets for:

  1. Generic workflow/run model refactor
  2. Runtime target registry and worker orchestration
  3. Studio/FFmpeg runtime adapters
  4. Remote cluster execution path
  5. Histrio integration against Ergot processing APIs

Metadata

Labels: enhancement