Prepare sample database script test cases #128

Open
AwaisKamran wants to merge 38 commits into main from prepare-sample-database-script-test-cases
@AwaisKamran
Contributor

Description

This PR corresponds to Write-Test-Cases-For-Prepare-Sample-Dataset.

@AwaisKamran AwaisKamran self-assigned this May 22, 2025
Copilot AI review requested due to automatic review settings May 22, 2025 19:25
@AwaisKamran AwaisKamran added the enhancement New feature or request label May 22, 2025

Copilot AI left a comment

Pull Request Overview

This PR adds unit tests for the sample dataset preparation pipeline and updates the signature and call sites of the add_schema_used function.

  • Added tests for functions such as get_train_file_path, create_train_file, copy_bird_train_file, get_train_data, and add_schema_used.
  • Updated add_schema_used to accept an additional train_file parameter and modified calls accordingly.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| server/test/preprocess/test_prepare_sample_dataset.py | Adds comprehensive test cases for sample dataset functions. |
| server/preprocess/prepare_sample_dataset.py | Updates the add_schema_used function signature and call sites to pass train_file as a Path object; adjusts file path handling. |
| server/__init__.py | Introduces package documentation for the server package. |


    if train_data:
-        add_schema_used(train_data, dataset_type)
+        add_schema_used(train_data, dataset_type, Path(train_file))

Copilot AI May 22, 2025

Both the public function and its caller share the same name 'add_schema_used' but now with an added parameter, which could lead to confusion or unintended recursion; consider renaming one of these to clearly differentiate their responsibilities.

Copilot uses AI. Check for mistakes.
  train_file = get_train_file_path()
  dataset_type = PATH_CONFIG.sample_dataset_type
- train_data = get_train_data(train_file)
+ train_data = get_train_data(Path(train_file))

Copilot AI May 22, 2025

[nitpick] Consider using consistent types for file paths across your functions; either update get_train_data to accept a Path object or convert the Path to a string before passing it.

Suggested change
- train_data = get_train_data(Path(train_file))
+ train_data = get_train_data(Path(train_file))  # Ensure get_train_data supports Path objects
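
One way to act on the path-type nitpick is to normalize the argument inside the function, so that callers may pass either a string or a Path. The following is a minimal sketch, not the project's code: this get_train_data is a hypothetical stand-in, and it assumes the train file is JSON and that a missing file yields an empty list.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Union


# Hypothetical stand-in for the project's get_train_data; the real
# implementation is not shown in this PR excerpt.
def get_train_data(train_file: Union[str, os.PathLike]) -> list:
    """Load training records from a JSON file given as a str or a Path."""
    path = Path(train_file)  # Path() accepts both str and Path inputs
    if not path.exists():
        return []  # Assumption: a missing file means "no data"
    with path.open("r", encoding="utf-8") as handle:
        return json.load(handle)


# Usage: the same call works with either argument type.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "train.json"
    sample.write_text(json.dumps([{"question_id": 1}]), encoding="utf-8")
    from_path = get_train_data(sample)
    from_str = get_train_data(str(sample))
```

With normalization done in one place, call sites would not need to wrap the argument in Path(...) themselves, and the type annotation documents what is accepted.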

@patch('os.path.exists', return_value=True)
@patch('os.makedirs')
@patch('shutil.copyfile')
@patch('server.preprocess.prepare_sample_dataset.add_sequential_ids_to_questions') # Mocking API key during test

Copilot AI May 22, 2025

[nitpick] The comment 'Mocking API key during test' is misleading for add_sequential_ids_to_questions; please update it to accurately reflect the mocked functionality.
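
As an illustration of the point about accurate comments, here is a minimal sketch of stacked patch decorators in which each trailing comment names what that patch actually replaces. The helper copy_if_present and the wrapper run_copy_check are hypothetical and exist only to give the patches something to act on. The sketch leaves out add_sequential_ids_to_questions, since its behavior is not shown in this PR excerpt.

```python
import os
import shutil
from unittest.mock import patch


# Hypothetical helper, not part of the PR.
def copy_if_present(src: str, dst_dir: str) -> bool:
    """Copy src into dst_dir when src exists; report whether a copy happened."""
    if not os.path.exists(src):
        return False
    os.makedirs(dst_dir, exist_ok=True)
    shutil.copyfile(src, os.path.join(dst_dir, os.path.basename(src)))
    return True


@patch("os.path.exists", return_value=True)  # Pretend the source file exists
@patch("os.makedirs")                        # Avoid creating real directories
@patch("shutil.copyfile")                    # Avoid copying real files
def run_copy_check(mock_copyfile, mock_makedirs, mock_exists):
    # Mocks are passed bottom-up: the decorator nearest the function comes first.
    copied = copy_if_present("data/train.json", "out")
    return copied, mock_exists.call_count, mock_makedirs.call_count, mock_copyfile.call_args


result = run_copy_check()
```

The argument order matters here: swapping mock_copyfile and mock_exists in the signature would attach assertions to the wrong mock without raising an error.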

2 participants