Problem: PREMIS XML validation fails after /tmp cleanup #170

@jraddaoui

Describe the bug

The preprocessing-sfa worker’s custom XML validation activity writes an embedded PREMIS XSD to /tmp and caches the generated file path in an activity struct field. If /tmp is cleaned, the cached path remains set and the activity does not recreate the file, so validation fails until the worker restarts.

To Reproduce

  1. Run the preprocessing-sfa worker and execute a workflow that reaches the PREMIS XML validation step.
  2. Confirm the activity generates an XSD file in /tmp (e.g. premis-v3-*) and validation succeeds.
  3. Clean /tmp (or delete the generated premis-v3-* file).
  4. Run the validation step again without restarting the preprocessing worker.
  5. Observe validation failing because the activity continues using the cached /tmp path but does not recreate the missing file.
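The steps above can be approximated outside the worker with a small Go sketch. Everything here is hypothetical: `validateActivity`, `schemaPath`, `reproduce`, and the placeholder `premisXSD` are illustrative stand-ins for the pattern described in this issue, not the actual preprocessing-sfa code. A temporary directory plays the role of /tmp.

```go
package main

import "os"

// premisXSD is a placeholder; the real activity embeds the full PREMIS schema.
var premisXSD = []byte(`<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>`)

// validateActivity is a hypothetical stand-in for the custom activity.
type validateActivity struct {
	xsdPath string // cached after the first write and never re-checked
}

// schemaPath writes the XSD on first use and returns the cached path on
// every later call, even if the file has been deleted in the meantime.
func (a *validateActivity) schemaPath(dir string) (string, error) {
	if a.xsdPath != "" {
		return a.xsdPath, nil // the problematic shortcut
	}
	f, err := os.CreateTemp(dir, "premis-v3-*")
	if err != nil {
		return "", err
	}
	defer f.Close()
	if _, err := f.Write(premisXSD); err != nil {
		return "", err
	}
	a.xsdPath = f.Name()
	return a.xsdPath, nil
}

// exists reports whether a file can be stat'ed at path p.
func exists(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}

// reproduce reports whether the schema file exists before and after a
// simulated cleanup of the temporary directory contents.
func reproduce() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "repro-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	a := &validateActivity{}
	p, err := a.schemaPath(dir)
	if err != nil {
		return false, false, err
	}
	before := exists(p)

	// Simulate the /tmp cleanup by deleting the generated file.
	if err := os.Remove(p); err != nil {
		return before, false, err
	}

	// The second call returns the stale cached path without recreating it.
	p, err = a.schemaPath(dir)
	if err != nil {
		return before, false, err
	}
	return before, exists(p), nil
}
```

In this sketch, `reproduce` finds the file present before the cleanup and missing afterwards, which mirrors steps 2 and 5.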

Expected behavior

  • The preprocessing-sfa worker should not fail PREMIS XML validation due to /tmp cleanup.
  • If the XSD file used for validation is missing, the activity should recover automatically without requiring a worker restart.
  • Schema validation should not rely on a non-configurable /tmp location.
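One possible way to meet these expectations, shown here only as a sketch, is to check that the cached file still exists before each use and rewrite it when it is missing, with the target directory supplied by configuration. The names `schemaFile`, `Path`, `recoversAfterCleanup`, and the placeholder `premisXSD` are assumptions made for illustration and do not come from the preprocessing-sfa codebase.

```go
package main

import (
	"errors"
	"io/fs"
	"os"
	"sync"
)

// premisXSD is a placeholder; the real activity embeds the full PREMIS schema.
var premisXSD = []byte(`<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>`)

// schemaFile is a hypothetical helper that keeps an on-disk copy of the XSD.
type schemaFile struct {
	dir  string // configurable; empty falls back to os.TempDir()
	mu   sync.Mutex
	path string
}

// Path returns the location of the XSD file, recreating the file if the
// cached copy has been removed since the last call.
func (s *schemaFile) Path() (string, error) {
	s.mu.Lock()
	defer s.mu.Unlock()

	if s.path != "" {
		_, err := os.Stat(s.path)
		if err == nil {
			return s.path, nil
		}
		if !errors.Is(err, fs.ErrNotExist) {
			return "", err
		}
		// The cached file is gone: fall through and write a new one.
	}
	if s.dir != "" {
		// The configured directory itself may have been cleaned up.
		if err := os.MkdirAll(s.dir, 0o700); err != nil {
			return "", err
		}
	}
	f, err := os.CreateTemp(s.dir, "premis-v3-*")
	if err != nil {
		return "", err
	}
	if _, err := f.Write(premisXSD); err != nil {
		f.Close()
		os.Remove(f.Name())
		return "", err
	}
	if err := f.Close(); err != nil {
		os.Remove(f.Name())
		return "", err
	}
	s.path = f.Name()
	return s.path, nil
}

// recoversAfterCleanup deletes the generated file and reports whether the
// next Path call returns a file that exists again.
func recoversAfterCleanup() (bool, error) {
	dir, err := os.MkdirTemp("", "schema-demo-")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	s := &schemaFile{dir: dir}
	first, err := s.Path()
	if err != nil {
		return false, err
	}
	if err := os.Remove(first); err != nil {
		return false, err
	}
	second, err := s.Path()
	if err != nil {
		return false, err
	}
	if _, err := os.Stat(second); err != nil {
		return false, err
	}
	return true, nil
}
```

A check like this narrows the window but does not close it: the file can still be deleted between the `os.Stat` call and the moment the validator opens it. Validating against the embedded schema content directly, as proposed under Additional context, would avoid the temporary file altogether.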

Additional context

This issue is related to a temporal-activities feature request to extend the existing xmlvalidate activity so it can support both filesystem XSD paths and embedded XSD content. Once that is available, preprocessing-sfa could switch to the shared activity and drop the custom temp file implementation (see temporal-activities issue).
