Skip to content

Handle small tactical/openings NPZ batches#102

Merged
lukifer23 merged 1 commit intomasterfrom
codex/clamp-draw-size-and-add-regression-tests
Oct 12, 2025
Merged

Handle small tactical/openings NPZ batches#102
lukifer23 merged 1 commit intomasterfrom
codex/clamp-draw-size-and-add-regression-tests

Conversation

@lukifer23
Copy link
Owner

Summary

  • clamp tactical and openings sampling to the available curriculum data, enabling replacement sampling only when needed
  • skip empty tactical/openings shards to avoid malformed outputs
  • add regression tests that cover sampling from undersized tactical and openings NPZ files

Testing

  • pytest tests/test_data_manager_batches.py

https://chatgpt.com/codex/tasks/task_e_68e6f991ddcc83239164246fad36d7cc

Copilot AI review requested due to automatic review settings October 12, 2025 21:55
@lukifer23 lukifer23 merged commit d3109a5 into master Oct 12, 2025
1 check failed
@lukifer23 lukifer23 deleted the codex/clamp-draw-size-and-add-regression-tests branch October 12, 2025 21:55
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses the handling of small tactical and openings NPZ batch files to prevent runtime errors when sampling more data than available. The changes enable robust sampling from undersized curriculum data sources and skip empty data files.

Key changes include:

  • Clamping tactical and openings sampling to the available curriculum data size
  • Enabling replacement sampling only when the batch size exceeds the available data
  • Adding empty data validation to skip malformed outputs
  • Including regression tests for undersized NPZ file scenarios

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
tests/test_data_manager_batches.py Adds regression tests that verify batch sampling from undersized tactical and openings NPZ files
azchess/data_manager.py Updates tactical and openings batch methods to handle small files with proper size validation and replacement sampling

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +457 to +459
draw_size = min(batch_size, total_positions)
replace = batch_size > total_positions
target_size = batch_size if replace else draw_size
Copy link

Copilot AI Oct 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The variable draw_size is misleading since it represents the sample size when replacement is disabled, not the size being drawn. Consider renaming to sample_size_no_replace or clamped_size for clarity.

Suggested change
draw_size = min(batch_size, total_positions)
replace = batch_size > total_positions
target_size = batch_size if replace else draw_size
sample_size_no_replace = min(batch_size, total_positions)
replace = batch_size > total_positions
target_size = batch_size if replace else sample_size_no_replace

Copilot uses AI. Check for mistakes.
Comment on lines +510 to +512
draw_size = min(batch_size, total_positions)
replace = batch_size > total_positions
target_size = batch_size if replace else draw_size
Copy link

Copilot AI Oct 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The variable draw_size is misleading since it represents the sample size when replacement is disabled, not the size being drawn. Consider renaming to sample_size_no_replace or clamped_size for clarity.

Suggested change
draw_size = min(batch_size, total_positions)
replace = batch_size > total_positions
target_size = batch_size if replace else draw_size
clamped_size = min(batch_size, total_positions)
replace = batch_size > total_positions
target_size = batch_size if replace else clamped_size

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants