
feat(reindex): add EXPORT-based merge ingestion for reindex using file-ready Kafka events and S3-backed record files #914

Open
psmagin wants to merge 9 commits into master from MSEARCH-1175


@psmagin psmagin (Collaborator) commented Mar 2, 2026

Purpose

Implement reindex EXPORT flow support and complete related refactoring/documentation updates for mod-search reindex processing.

This change is needed to support file-based merge ingestion (inventory.reindex.file-ready) alongside the existing publish-based ingestion, and to document and configure the new runtime settings.

Approach

  • Added EXPORT-mode processing path for merge ingestion:
    • Consume ReindexFileReadyEvent
    • Read exported records from S3-compatible storage
    • Persist records to intermediate tables in batches
  • Added trace_id support in merge ranges (merge_range.trace_id) and wired it into export request payloads.
  • Introduced remote storage configuration/bean binding gated by folio.reindex.reindex-type=EXPORT.
  • Added Kafka listener/container config for file-ready events and removed duplication across the reindex Kafka container factories.
  • Refactored orchestration code to reduce duplication in merge success/failure handling and batch save logic.
  • Updated diagrams and feature docs for full/upload flow, including PUBLISH vs EXPORT branch.
  • Updated the module descriptor and the README environment variable documentation.
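
For reviewers who want a mental model of the EXPORT path before reading the diff, the read-then-persist-in-batches step can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: `ExportBatchSketch`, `FileReadyEvent` and its fields, and the `ingest` helper are hypothetical names, the file is assumed to hold one record per line, and an in-memory string stands in for the S3 object stream.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: streams records from a file and hands them to a sink in fixed-size batches. */
final class ExportBatchSketch {

  /** Illustrative stand-in for the file-ready event; the field names are assumptions. */
  record FileReadyEvent(String objectKey, String traceId) { }

  /**
   * Reads one record per line (an assumed file layout) and passes batches of at most
   * batchSize records to the sink. Returns the total number of records read.
   */
  static int ingest(BufferedReader reader, int batchSize, Consumer<List<String>> sink) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<String> batch = new ArrayList<>(batchSize);
    int total = 0;
    try {
      String line;
      while ((line = reader.readLine()) != null) {
        if (line.isBlank()) {
          continue; // skip empty lines rather than persisting empty records
        }
        batch.add(line);
        total++;
        if (batch.size() == batchSize) {
          sink.accept(List.copyOf(batch)); // hand over an immutable snapshot
          batch.clear();
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    if (!batch.isEmpty()) {
      sink.accept(List.copyOf(batch)); // flush the final partial batch
    }
    return total;
  }

  public static void main(String[] args) {
    FileReadyEvent event = new FileReadyEvent("instances/range-1.ndjson", "trace-123");
    // An in-memory body stands in for the object that would be read from remote storage.
    String body = "{\"id\":\"a\"}\n{\"id\":\"b\"}\n{\"id\":\"c\"}\n";
    List<List<String>> saved = new ArrayList<>(); // stands in for the intermediate tables
    int count = ingest(new BufferedReader(new StringReader(body)), 2, saved::add);
    System.out.println(event.objectKey() + ": " + count + " records in " + saved.size() + " batches");
  }
}
```

Streaming line by line keeps memory bounded by the batch size rather than the file size, which is the main reason to batch here; the real listener, storage client, and repository calls are in the diff.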

Changes Checklist

  • API Changes: No new public REST paths; existing reindex APIs unchanged.
  • Database Schema Changes: Added trace_id column to merge_range via Liquibase (changes/v6.0/add-trace-id-range-column.xml).
  • Interface Version Changes: None.
  • Interface Dependencies: Added org.folio:folio-s3-client dependency.
  • Permissions: No permission changes.
  • Logging: Added/updated logs for new reindex event paths and storage initialization.
  • Unit Testing: Updated/ran reindex orchestration unit tests.
  • Integration Testing: Existing integration tests updated in codebase; full suite not run in this update session.
  • Manual Testing: Not performed in this update session.
  • NEWS: Not updated in this PR.

Related Issues

MSEARCH-1175

Learning and Resources (if applicable)

  • FOLIO RFC-0003 Breaking Changes guidance was used to assess impact classification.

Screenshots (if applicable)

N/A

@psmagin psmagin changed the title from "Msearch 1175" to "feat(reindex): add EXPORT-based merge ingestion for reindex using file-ready Kafka events and S3-backed record files" Mar 2, 2026
@psmagin psmagin self-assigned this Mar 3, 2026
@psmagin psmagin marked this pull request as ready for review March 3, 2026 09:08
@psmagin psmagin requested a review from a team as a code owner March 3, 2026 09:08
sonarqubecloud bot commented Mar 3, 2026
