Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Feb 5, 2026

Description

Added copy_parallel() method to FileHandler class for concurrent file copying using multiprocessing.Pool. Achieves 2.3x speedup over serial copying with large files (tested with 10×50MB files).

Implementation

  • FileHandler.copy_parallel(filelist, num_processes=None) - Public static method
  • _copy_files_parallel() - Internal implementation with validation and error handling
  • _copy_single_file() - Worker function for pool execution
  • Pre-validates all source files before starting copy operations
  • Propagates any copy error to fail-fast entire operation

Usage

from wxflow import FileHandler

filelist = [
    ['/path/to/source1.txt', '/path/to/dest1.txt'],
    ['/path/to/source2.txt', '/path/to/dest2.txt'],
]

# Use all available CPUs
FileHandler.copy_parallel(filelist)

# Or specify process count
FileHandler.copy_parallel(filelist, num_processes=4)

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • pynorms
  • pytests

Added 6 test cases covering:

  • Basic parallel copy with multiple files
  • Configurable process count
  • Error propagation from failed copies
  • Missing source file handling
  • File integrity verification (SHA-256)
  • Invalid input format validation

Performance validated: 10×50MB files showed 0.333s serial vs 0.145s parallel (2.3x speedup).

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published
Original prompt

This section details on the original issue you should resolve

<issue_title>Add a parallel copy method to FileHandler</issue_title>
<issue_description>Description
A function should be added to the FileHandler class that utilizes multiprocessing.Pool to distribute the copying to multiple cores.

Requirements and acceptance criteria

  • File copies are able to run in parallel.
  • The operation is faster than serial copying when working with large files.
  • Errors in one copy will cause the parent call to fail.
  • Tests are written to exercise this feature.
  • Parallel copies are identical to their sources.</issue_description>

Comments on the Issue (you are @copilot in this section)


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Copilot AI and others added 2 commits February 5, 2026 17:43
Co-authored-by: DavidHuber-NOAA <69919478+DavidHuber-NOAA@users.noreply.github.com>
…port

Co-authored-by: DavidHuber-NOAA <69919478+DavidHuber-NOAA@users.noreply.github.com>
Copilot AI changed the title [WIP] Add parallel copy method to FileHandler Add parallel copy method to FileHandler Feb 5, 2026
Copilot AI requested a review from DavidHuber-NOAA February 5, 2026 17:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a parallel copy method to FileHandler

2 participants