Merge Duplicate Factory Implementations #60

@SoulEvill

Description

Consolidate the duplicate factory implementations (factory.py and factory_.py) into a single, well-organized module to eliminate code duplication and improve maintainability.

Background

The Starfish codebase currently has two factory implementations with overlapping functionality:

  • src/starfish/data_factory/factory.py - Contains the @data_factory decorator interface and high-level API (smaller file)
  • src/starfish/data_factory/factory_.py - Contains the main Factory class implementation with 547 lines and 20+ methods

This duplication creates several problems:

  • Developer confusion about which file to modify
  • Increased maintenance burden
  • No clear single source of truth
  • Potential for divergent implementations over time

Both files serve essential purposes, but their separation is artificial and makes the codebase harder to understand and maintain.

Prerequisites

  • Familiarity with the current factory implementation and its usage patterns
  • Understanding of the @data_factory decorator public API
  • Access to run the full test suite

Task

Merge both factory files into a single cohesive module that eliminates duplication while preserving all existing functionality. The developer should choose the most appropriate approach for consolidation, considering factors like:

  • Code organization and readability
  • Import dependencies throughout the codebase
  • Backward compatibility requirements
  • Testing implications

The implementation approach is flexible: whether to merge into factory.py, merge into factory_.py, or create a new structure entirely is at the developer's discretion.
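
If the old import path has to keep working for a transition period, one possible approach is to leave a thin forwarding shim where the retired module used to be. The sketch below is only an illustration of that idea using a module-level `__getattr__` hook (PEP 562). It builds two in-memory stand-in modules so it can run on its own; the names `demo_factory`, `demo_factory_` and the empty `Factory` class are hypothetical placeholders, not the real Starfish code. Whether such a shim is acceptable under the "Single Source File" criterion is a decision for the maintainers.

```python
import sys
import types
import warnings

# Stand-in for the consolidated module (hypothetical name). In the real
# codebase this would be whichever factory file survives the merge.
merged = types.ModuleType("demo_factory")


class Factory:
    """Placeholder for the real Factory class."""


merged.Factory = Factory
sys.modules["demo_factory"] = merged

# Stand-in for the retired module, kept only as a forwarding shim.
legacy = types.ModuleType("demo_factory_")


def _forward(name):
    """Resolve unknown attributes from the merged module, with a warning."""
    try:
        value = getattr(merged, name)
    except AttributeError:
        # Preserve normal behaviour for names that exist in neither module.
        raise AttributeError(
            f"module 'demo_factory_' has no attribute {name!r}"
        ) from None
    warnings.warn(
        "demo_factory_ is deprecated; import from demo_factory instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return value


# Python consults a module's own __getattr__ when normal lookup fails.
legacy.__getattr__ = _forward
sys.modules["demo_factory_"] = legacy
```

With this in place, `legacy.Factory` returns the same class object as `merged.Factory` and emits a `DeprecationWarning`, so existing callers keep working while being pointed at the new location.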

Acceptance Criteria

  1. Single Source File: Only one factory implementation file exists after completion
  2. Backward Compatibility: All existing public APIs continue to work without changes
  3. Test Compatibility: All existing tests pass; the only permitted changes to tests are import updates
  4. Import Consistency: All internal imports throughout the codebase are updated and functional
  5. Functionality Preservation: No existing functionality is lost or altered
  6. Clean Exports: The module clearly defines its public interface via __all__ or equivalent
  7. Documentation: Module has clear docstrings explaining its purpose and usage
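
Criteria 6 and 7 can be checked mechanically. The following is a minimal sketch, not part of the Starfish codebase: `check_public_interface` is a hypothetical helper that reports a missing module docstring, a missing `__all__`, names listed in `__all__` that are not defined, and exported callables without docstrings. The demo module and its `data_factory` and `Factory` members are placeholders whose signatures are assumptions.

```python
import types


def check_public_interface(module):
    """Return a list of problems with a module's declared public interface."""
    problems = []
    if not (module.__doc__ or "").strip():
        problems.append("module has no docstring")
    exported = getattr(module, "__all__", None)
    if exported is None:
        problems.append("module does not define __all__")
        return problems
    for name in exported:
        if not hasattr(module, name):
            problems.append(f"{name!r} is listed in __all__ but not defined")
            continue
        obj = getattr(module, name)
        if callable(obj) and not (getattr(obj, "__doc__", None) or "").strip():
            problems.append(f"{name!r} has no docstring")
    return problems


# Stand-in for the merged module; the real one would be imported instead.
demo = types.ModuleType("demo_factory", "Stand-in for the merged factory module.")


def data_factory(func):
    """Placeholder decorator; the real signature is not shown here."""
    return func


class Factory:
    """Placeholder for the real Factory class."""


demo.data_factory = data_factory
demo.Factory = Factory
demo.__all__ = ["data_factory", "Factory"]

problems = check_public_interface(demo)
```

An empty list means the module passes these checks. The same function could be pointed at the real merged module inside a test once the consolidation is done.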

Notes

  • Consider running a comprehensive search for all imports before making changes
  • Test early and often during the merge process
  • The goal is elimination of duplication, not a complete rewrite of functionality
  • Update any relevant documentation or examples that reference the old file structure
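
The import search suggested in the first note can be sketched with the standard-library `ast` module, which finds real import statements rather than every textual mention. The helper names `find_imports` and `scan_tree` are hypothetical. Matching is done on the last dotted segment only, so the result deliberately over-reports (for example, an unrelated object that happens to be called `factory`) and should be reviewed by hand.

```python
import ast
from pathlib import Path

TARGETS = {"factory", "factory_"}


def find_imports(source, targets=TARGETS):
    """Return sorted (line, dotted_name) pairs for imports of target modules."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            base = node.module or ""  # empty for "from . import x"
            names = [base] + [
                f"{base}.{alias.name}" if base else alias.name
                for alias in node.names
            ]
        else:
            continue
        for name in names:
            # Compare only the final segment, e.g. "pkg.sub.factory_".
            if name.split(".")[-1] in targets:
                hits.add((node.lineno, name))
    return sorted(hits)


def scan_tree(root, targets=TARGETS):
    """Map each .py file under `root` to its matching imports."""
    results = {}
    for path in Path(root).rglob("*.py"):
        try:
            found = find_imports(path.read_text(encoding="utf-8"), targets)
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be parsed
        if found:
            results[str(path)] = found
    return results


sample = (
    "import os\n"
    "from starfish.data_factory.factory_ import Factory\n"
    "from . import factory\n"
)
sample_hits = find_imports(sample)
```

Running `scan_tree("src")` and `scan_tree("tests")` before and after the merge gives a checklist of files whose imports need updating.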
