Implement Multi-Threaded Producer–Consumer Architecture for Large-Scale Entity Generation and Batch Persistence #61

@yyytir777

The current single-threaded entity generation and persistence process becomes a performance bottleneck as data volume increases.
When testing with large-scale entity creation, the following results were observed:

| Instance Count | Elapsed Time | Result |
|---|---|---|
| 10,000 | ~4 seconds | ✅ Successful |
| 100,000 | ~21 seconds | ✅ Successful |
| 1,000,000 | n/a | ❌ Application crashed due to memory exhaustion |

Problem

The existing implementation holds all generated entities in memory before persisting them, leading to excessive heap usage and, at high scale, an OutOfMemoryError.
This design cannot handle large datasets efficiently and lacks parallelism between entity creation and database writing.

Expected Improvements

  • Enable large-scale entity generation of more than 100 million records without memory exhaustion by using a streaming, producer–consumer architecture.
  • Achieve near-linear scalability by separating CPU (generation) and I/O (persistence) workloads.
  • Reduce total processing time significantly by parallelizing entity creation and batch persistence.
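
The proposed architecture could look roughly like the sketch below. It is a minimal illustration and not the project's actual code: the `Entity` class, the `BatchWriter` interface, and the `run` method are hypothetical names, and the writer stands in for whatever batch insert the persistence layer really performs. A bounded `ArrayBlockingQueue` caps the number of entities held in memory, so the producer blocks whenever the consumer falls behind.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class ProducerConsumerSketch {

    // Hypothetical entity type; stands in for the real generated entity.
    static final class Entity {
        final long id;
        Entity(long id) { this.id = id; }
    }

    // Sentinel that tells the consumer no more entities will arrive.
    private static final Entity POISON_PILL = new Entity(-1L);

    // Hypothetical persistence hook; a real one might run a JDBC batch insert.
    interface BatchWriter {
        void write(List<Entity> batch);
    }

    // Generates `total` entities on one thread and persists them in batches on
    // another. Returns the number of entities handed to the writer.
    static long run(long total, int queueCapacity, int batchSize, BatchWriter writer) {
        // Bounded queue: at most `queueCapacity` entities wait in memory.
        BlockingQueue<Entity> queue = new ArrayBlockingQueue<>(queueCapacity);
        AtomicLong persisted = new AtomicLong();

        Thread producer = new Thread(() -> {
            try {
                for (long i = 0; i < total; i++) {
                    queue.put(new Entity(i)); // blocks while the queue is full
                }
                queue.put(POISON_PILL);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            List<Entity> batch = new ArrayList<>(batchSize);
            try {
                while (true) {
                    Entity next = queue.take(); // blocks while the queue is empty
                    if (next == POISON_PILL) {
                        break;
                    }
                    batch.add(next);
                    if (batch.size() == batchSize) {
                        writer.write(batch);
                        persisted.addAndGet(batch.size());
                        batch.clear(); // reuse the list for the next batch
                    }
                }
                if (!batch.isEmpty()) { // flush the final partial batch
                    writer.write(batch);
                    persisted.addAndGet(batch.size());
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", ex);
        }
        return persisted.get();
    }
}
```

Calling `ProducerConsumerSketch.run(1_000_000L, 10_000, 1_000, batch -> {})` would stream one million entities through a queue that never holds more than 10,000 of them. The sketch uses a single producer and a single consumer and does not handle a writer that throws; a production version would need several workers on each side, error propagation, and tuning of the queue capacity and batch size.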
