Collaborator

@neuralsorcerer neuralsorcerer commented Dec 31, 2025

  • Added --outcome_columns to the CLI to explicitly choose outcome columns and preserved ordered defaults when omitted.
  • Updated CLI input validation to require provided outcome columns.
  • Added CLI tests covering explicit outcome column selection, default inference ordering (including sample column), and missing outcome column validation.
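As a rough illustration of the first two bullets, the sketch below shows one way an optional `--outcome_columns` flag could be parsed and validated with `argparse`. It is not the code from this PR: the comma-separated value format, the helper names `build_parser`, `parse_outcome_columns`, and `validate_outcome_columns`, and the error message are all assumptions made for the example.

```python
import argparse
from typing import List, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in parser; the real balance CLI defines many more flags.
    parser = argparse.ArgumentParser(description="Illustrative CLI sketch")
    parser.add_argument(
        "--outcome_columns",
        required=False,
        default=None,
        help="Comma-separated outcome column names (format assumed for this sketch).",
    )
    return parser


def parse_outcome_columns(raw: Optional[str]) -> Optional[List[str]]:
    # None means the flag was omitted, so the caller falls back to default inference.
    if raw is None:
        return None
    return [name.strip() for name in raw.split(",") if name.strip()]


def validate_outcome_columns(
    requested: Optional[Sequence[str]], available: Sequence[str]
) -> None:
    # Fail early if any explicitly requested outcome column is absent from the input.
    if requested is None:
        return
    missing = [name for name in requested if name not in available]
    if missing:
        raise ValueError(f"Outcome columns not found in input: {missing}")
```

Returning `None` for an omitted flag keeps "not specified" distinct from "specified as empty", which is what lets the default inference path stay unchanged.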

Why?

Copilot AI review requested due to automatic review settings December 31, 2025 19:16
@meta-cla meta-cla bot added the cla signed label Dec 31, 2025
Copilot AI left a comment

Pull request overview

This PR adds the --outcome_columns CLI argument to allow explicit selection of which columns should be treated as outcomes, addressing the need to exclude certain columns from outcome statistics without removing them from the input data.

Key Changes:

  • Added --outcome_columns optional CLI argument with validation
  • Modified outcome column inference in process_batch to preserve DataFrame column order when using defaults
  • Added comprehensive test coverage for explicit selection, default inference with order preservation, and missing column validation
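The order-preservation point can be sketched in a few lines. The example assumes that default inference treats every column not already reserved for another role as an outcome; the function name `infer_outcome_columns` and the column names are hypothetical, and the actual `process_batch` logic may differ.

```python
from typing import Iterable, List, Sequence


def infer_outcome_columns(columns: Sequence[str], reserved: Iterable[str]) -> List[str]:
    # A list comprehension walks the columns in their original order, whereas
    # a set difference such as set(columns) - set(reserved) gives no ordering guarantee.
    reserved_set = set(reserved)
    return [name for name in columns if name not in reserved_set]


columns = ["id", "weight", "age", "z_outcome", "a_outcome"]
reserved = ["id", "weight", "age"]
outcomes = infer_outcome_columns(columns, reserved)
# outcomes keeps the input order: ["z_outcome", "a_outcome"]
```

The set is used only for membership checks, so the result order depends solely on the input column order.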

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| balance/cli.py | Added --outcome_columns argument, helper methods has_outcome_columns() and outcome_columns(), input validation for user-specified outcome columns, and modified process_batch to use list comprehension to preserve column order during default inference |
| tests/test_cli.py | Added helper methods _make_cli(), _make_batch_df(), and _recording_sample_cls(), plus three tests covering default inference with order preservation, explicit column selection, and missing column validation |
| CHANGELOG.md | Documented the new --outcome_columns feature under "New Features" section |

Contributor

@talgalili talgalili left a comment

Super cool.
Could you please make sure that, if outcome columns are specified, all remaining columns that are not ID, weight, covariate, or outcome columns go to the ignore columns (which you added before)? And please test that it works.
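The requested behavior could look roughly like the sketch below: once outcome columns are given explicitly, any column without an assigned role is routed to the ignored list. The function `partition_columns`, its parameters, and the returned dictionary layout are hypothetical and are not taken from the balance codebase.

```python
from typing import Dict, List, Sequence


def partition_columns(
    columns: Sequence[str],
    id_column: str,
    weight_column: str,
    covariate_columns: Sequence[str],
    outcome_columns: Sequence[str],
) -> Dict[str, List[str]]:
    # Every column with an explicit role is "assigned"; everything else is ignored.
    assigned = {id_column, weight_column, *covariate_columns, *outcome_columns}
    ignored = [name for name in columns if name not in assigned]
    return {"outcomes": list(outcome_columns), "ignored": ignored}


result = partition_columns(
    columns=["id", "weight", "age", "happiness", "notes", "spend"],
    id_column="id",
    weight_column="weight",
    covariate_columns=["age"],
    outcome_columns=["happiness"],
)
# "notes" and "spend" have no role here, so they end up under "ignored".
```

A test for this behavior would assert both that the explicit outcomes are passed through unchanged and that the leftover columns appear in the ignored list in input order.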

Contributor

@talgalili talgalili left a comment

LGTM, thanks 👍

meta-codesync bot commented Jan 1, 2026

@talgalili has imported this pull request. If you are a Meta employee, you can view this in D89977120.

meta-codesync bot commented Jan 1, 2026

@talgalili merged this pull request in 54e0b34.

@neuralsorcerer neuralsorcerer deleted the cli branch January 1, 2026 12:25
Development

Successfully merging this pull request may close these issues.

[FEATURE] Add to the cli the ability to indicate the outcome columns

3 participants