Leads 212 remove unittest mocking #148
Conversation
Walkthrough

This pull request expands static analysis to include the tests directory, establishes pytest configuration across test modules, adds type annotations to script files, and refactors test files to use explicit type hints with pytest fixtures rather than implicit fixture injection.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Actionable comments posted: 11
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
tests/unit/core/config/test_models.py (1)
19-82: Remove lint-suppression comments; use typing casts instead.
`# pylint: disable` and `# pyright: ignore` are disallowed here; please address the underlying typing issue instead.

✅ Proposed fix (replace suppressions with casts)

```diff
+from typing import cast
+
 import pytest
 from pydantic import ValidationError
@@
     assert turn.contexts is not None
-    assert (
-        turn.contexts[0]  # pylint: disable=unsubscriptable-object
-        == "Python context"
-    )
+    contexts = cast(list[str], turn.contexts)
+    assert contexts[0] == "Python context"
@@
-        TurnData(
+        invalid_contexts = cast(list[str], [{"title": "No content field"}])
+        TurnData(
             turn_id="1",
             query="Valid query",
             response="Valid response",
-            contexts=[
-                {"title": "No content field"}
-            ],  # pyright: ignore[reportArgumentType]
+            contexts=invalid_contexts,
         )
@@
     assert len(turn.contexts) == 3
-    assert (
-        turn.contexts[0]  # pylint: disable=unsubscriptable-object
-        == "First context"
-    )
-    assert (
-        turn.contexts[1]  # pylint: disable=unsubscriptable-object
-        == "Second context"
-    )
-    assert (
-        turn.contexts[2]  # pylint: disable=unsubscriptable-object
-        == "Third context"
-    )
+    contexts = cast(list[str], turn.contexts)
+    assert contexts[0] == "First context"
+    assert contexts[1] == "Second context"
+    assert contexts[2] == "Third context"
```

As per coding guidelines, "Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead."
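The cast-based narrowing suggested above can be sketched in isolation. This is a minimal, self-contained illustration; the `Turn` class here is a stand-in, not the real pydantic `TurnData` model:

```python
from typing import Optional, cast


class Turn:
    """Stand-in for a model with an Optional list field."""

    def __init__(self, contexts: Optional[list[str]]) -> None:
        self.contexts = contexts


turn = Turn(contexts=["Python context", "Second context"])
assert turn.contexts is not None
# cast() records the narrowing for the type checker, so subscripting the
# value no longer needs a pylint or pyright suppression comment.
contexts = cast(list[str], turn.contexts)
assert contexts[0] == "Python context"
assert len(contexts) == 2
```

The cast has no runtime effect; it only tells static analyzers that the `Optional` has been narrowed after the `is not None` assertion.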
tests/unit/core/models/test_api_additional.py (1)
93-108: Remove the pylint disable by using explicit type narrowing with `cast()`.

Instead of suppressing the lint rule, use `cast()` to explicitly narrow the type after the `is not None` assertion. This follows the coding guideline that lint warnings should be fixed, not disabled.

✅ Suggested fix

```diff
 import pytest
 from pydantic import ValidationError
+from typing import cast

 from lightspeed_evaluation.core.models.api import (
     RAGChunk,
     AttachmentData,
     APIRequest,
     APIResponse,
 )
@@
     assert request.attachments is not None
     assert len(request.attachments) == 2
-    assert (
-        request.attachments[0].content  # pylint: disable=unsubscriptable-object
-        == "file1"
-    )
+    attachments = cast(list[AttachmentData], request.attachments)
+    assert attachments[0].content == "file1"
```

script/run_multi_provider_eval.py (1)
153-153: Pylint disable comments for broad-except violate coding guidelines.

The `# pylint: disable=broad-except` comments at lines 153, 497, 606, and 1299 conflict with the coding guideline: "Do not disable lint warnings with # pylint: disable comments - fix the underlying issue instead."

Consider catching more specific exceptions or using a custom exception handler. For a script that needs to handle various failure modes gracefully, you could catch `Exception` and configure pylint globally to allow this pattern in scripts, or document the specific exceptions that are expected.

Also applies to: 497-497, 606-606, 1299-1299
tests/unit/pipeline/evaluation/test_evaluator.py (1)
490-499: `# type: ignore` comment violates coding guidelines.

The `# type: ignore[no-untyped-def]` comment should be avoided. Instead, add proper type annotations to the function.

🔧 Proposed fix to add type annotations

```diff
-    def create_mock_handler(  # type: ignore[no-untyped-def]
-        mocker: MockerFixture,
-        mock_return: tuple[float, str] | list[tuple[float, str]],
-    ):
+    def create_mock_handler(
+        mocker: MockerFixture,
+        mock_return: tuple[float, str] | list[tuple[float, str]],
+    ) -> mocker.Mock:
```

Note: You may need to use `Any` as the return type if `mocker.Mock` causes issues:

```python
from typing import Any

def create_mock_handler(
    mocker: MockerFixture,
    mock_return: tuple[float, str] | list[tuple[float, str]],
) -> Any:
```
🤖 Fix all issues with AI agents
In `@tests/script/conftest.py`:
- Line 1: Remove the top-line pylint suppression for redefined-outer-name and
fix the underlying naming conflicts in conftest.py: locate any fixture or helper
function names in conftest.py that shadow outer/module-level names (e.g.,
fixture functions or variables with the same identifier as imports or globals),
rename those functions/fixtures to unique descriptive names, update all test
imports/uses to the new names, and ensure no identifiers in conftest.py
duplicate outer scope names so the redefined-outer-name lint is no longer
triggered.
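For illustration, the shadowing that `redefined-outer-name` flags looks like this in plain Python. The names below are generic stand-ins, not the conftest's actual fixtures:

```python
# Module-level name that a function parameter then shadows.
config = {"mode": "default"}


def load_settings(config: dict) -> dict:  # pylint W0621: shadows outer `config`
    return dict(config)


def load_settings_fixed(raw_config: dict) -> dict:
    """Renamed parameter: no shadowing, so no redefined-outer-name warning."""
    return dict(raw_config)
```

The fix the prompt describes is the second form: give the inner name a distinct, descriptive identifier and update every call site.
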
In `@tests/script/test_compare_evaluations.py`:
- Line 2: Remove the file-level "# pylint: disable=protected-access" in
tests/script/test_compare_evaluations.py and fix any tests that access protected
members; locate uses of underscored attributes or methods in this file (e.g.,
any references like object._foo, module._bar or TestClass._internal) and replace
them with the public API (call public methods, access properties, or add a small
public helper on the tested module to expose the necessary state), or refactor
the test to use fixtures/mocks to obtain the needed state instead of reaching
into protected members.
In `@tests/script/test_run_multi_provider_eval.py`:
- Line 2: Remove the file-level "# pylint:
disable=protected-access,too-few-public-methods" line and fix the root causes:
stop accessing protected members (avoid names with a single leading underscore)
by switching tests to use the public API or adding explicit helper functions on
the module under test instead of referencing internals (search for any
references to _private attributes or methods in this test module), and for any
tiny test classes that triggered "too-few-public-methods" either convert them
into module-level test functions or add a meaningful public method to the class
so it has a proper public API; then delete the disable comment.
- Around line 397-405: The inline pylint disable on consider-using-with in the
helper track_temp_file should be removed; instead, refactor track_temp_file to
manage the temporary file lifecycle via a context manager/ExitStack or finalizer
so you don't instantiate NamedTemporaryFile without a with-block. Specifically,
replace the direct call to original_named_temp_file in track_temp_file by
opening the temp file using a context manager (or registering it with a
contextlib.ExitStack or an appropriate cleanup callback) so created_temp_path is
captured while ensuring the file is properly closed/cleaned up, then delete the
"# pylint: disable=consider-using-with" comment and adjust any callers to rely
on the managed resource from track_temp_file.
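A minimal, runnable sketch of that refactor follows. The helper name `make_tracking_factory` is illustrative, not the test module's actual code:

```python
import tempfile
from contextlib import ExitStack
from typing import Any, Callable


def make_tracking_factory(
    stack: ExitStack,
) -> tuple[Callable[..., Any], list[str]]:
    """Wrap NamedTemporaryFile so every created file is registered for cleanup."""
    created_paths: list[str] = []

    def track_temp_file(*args: Any, **kwargs: Any) -> Any:
        # enter_context() closes the file when the stack exits, so there is
        # no bare NamedTemporaryFile call and no consider-using-with warning.
        temp_file = stack.enter_context(
            tempfile.NamedTemporaryFile(*args, **kwargs)
        )
        created_paths.append(temp_file.name)
        return temp_file

    return track_temp_file, created_paths


with ExitStack() as stack:
    factory, paths = make_tracking_factory(stack)
    handle = factory(mode="w", suffix=".yaml")
    handle.write("provider: test")
    assert len(paths) == 1
# Leaving the with-block closed (and, by default, deleted) the temp file.
assert handle.closed
```

In a pytest test, the `ExitStack` could instead be closed via `request.addfinalizer`, as the suggested diff later in this review shows.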
In `@tests/unit/core/api/test_client.py`:
- Line 1: Remove the file-level "# pylint: disable=protected-access" in
tests/unit/core/api/test_client.py and refactor tests to avoid accessing
protected members directly: update tests that reference protected symbols (e.g.,
Client._internal_state or APIClient._do_something) to exercise the same behavior
via the public API (constructor, public methods like Client.get_state or
APIClient.perform_action) or add narrow, test-only helpers on the production
module (e.g., a public accessor or factory) that expose the necessary state for
assertions; if mutation is needed, use pytest fixtures or monkeypatch to set
internal conditions instead of referencing protected attributes.
In `@tests/unit/core/metrics/test_geval.py`:
- Line 1: Remove the file-level pragma "# pylint:
disable=too-many-public-methods,protected-access" and refactor the tests in the
test_geval module to eliminate the need for these disables: split large test
classes into smaller focused test classes (so you no longer trigger
too-many-public-methods), replace direct calls to protected/private members with
tests that exercise the public API, or explicitly expose internal helpers via a
small public wrapper/helper function to be tested instead of using
protected-access; update any test class names or helper names you adjust
accordingly and run lint to verify the disables are no longer required.
In `@tests/unit/core/metrics/test_nlp.py`:
- Line 1: Remove the file-level pylint disables in
tests/unit/core/metrics/test_nlp.py and refactor the tests to satisfy lint
rules: consolidate related fixtures into one or more helper fixtures (e.g.,
create a fixture object or factory to hold shared setup), split large test
classes into smaller classes per metric function (referencing the existing test
classes in test_nlp.py), and convert tests that cause
too-many-arguments/positional-arguments into parameterized or helper functions
to reduce argument count; ensure each new test class has few public methods to
avoid the too-few-public-methods warning and delete the top-of-file "# pylint:
disable=..." line once refactoring is complete.
In `@tests/unit/core/models/test_api_additional.py`:
- Around line 31-38: Update the test_rag_chunk_extra_field_forbidden test to
remove the pyright ignore and validate via the Pydantic v2 API: call
RAGChunk.model_validate with a dict including content, source, and extra_field
and assert it raises ValidationError; reference the RAGChunk class and its
model_validate method so the extra field is rejected by
ConfigDict(extra="forbid") instead of silencing the linter.
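Assuming Pydantic v2 (which provides `ConfigDict` and `model_validate`), the suggested test shape looks roughly like this; `Chunk` is a stand-in for the real `RAGChunk`:

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class Chunk(BaseModel):
    """Stand-in model mirroring how RAGChunk might forbid extra fields."""

    model_config = ConfigDict(extra="forbid")

    content: str
    source: str


# model_validate rejects unknown keys when extra="forbid", so no
# pyright ignore is needed to exercise the failure path.
try:
    Chunk.model_validate(
        {"content": "text", "source": "doc.md", "extra_field": "nope"}
    )
    raised = False
except ValidationError:
    raised = True
assert raised
```

Because `model_validate` takes a plain `dict`, the type checker never sees an invalid keyword argument, which is what removes the need for the suppression.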
In `@tests/unit/core/models/test_system_additional.py`:
- Around line 194-200: The test test_custom_visualization_config uses a tuple
for figsize which forced a pyright ignore; update the figsize argument in the
VisualizationConfig instantiation to use a list (e.g., [12, 8]) to match the
expected type and remove the "# pyright: ignore" comment so the lint/type check
passes; locate the VisualizationConfig usage in test_custom_visualization_config
and replace the tuple with a list accordingly.
In `@tests/unit/core/output/test_final_coverage.py`:
- Line 1: Remove the file-level pylint disable and instead update
tests/unit/core/output/test_final_coverage.py to avoid protected-access and
class lint warnings: replace direct calls to the private helpers
_calculate_stats and _generate_csv_report by exercising their behavior through
the public API functions or by adding small public test helpers (e.g.,
calculate_stats, generate_csv_report) in the module under test; if the test
defines single-method test classes causing too-few-public-methods, convert them
to plain test functions or add a clear public-purpose method name to the test
class; ensure no # pylint: disable lines remain and update imports to use the
public symbols you added or the module's public functions when asserting
expected output.
In `@tests/unit/core/system/test_validator.py`:
- Line 1: Remove the file-level "# pylint: disable=protected-access" and stop
calling protected methods directly; either (A) add a public
DataValidator.validate_data(evaluation_data: list[EvaluationData]) -> bool that
wraps the current _validate_evaluation_data logic and update tests to call
DataValidator.validate_data(...) instead of _validate_evaluation_data, or (B)
refactor the tests in test_validator.py to exercise the public
DataValidator.load_evaluation_data(...) by creating temporary YAML files (use
existing test patterns) and asserting validation through the public API; ensure
no protected members (like _validate_evaluation_data) are referenced and update
imports/fixtures accordingly.
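Option (A) amounts to a one-line public wrapper. A hedged sketch with stand-in types (the real `DataValidator` and `EvaluationData` live in the project, not here):

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class EvaluationData:
    """Stand-in for the project's evaluation data model."""

    conversation_id: str
    turns: list[dict[str, Any]] = field(default_factory=list)


class DataValidator:
    def _validate_evaluation_data(self, data: list[EvaluationData]) -> bool:
        # Placeholder for the existing protected validation logic.
        return all(item.turns for item in data)

    def validate_data(self, evaluation_data: list[EvaluationData]) -> bool:
        """Public entry point that wraps the protected validation logic."""
        return self._validate_evaluation_data(evaluation_data)


validator = DataValidator()
assert validator.validate_data([EvaluationData("c1", [{"query": "hi"}])])
assert not validator.validate_data([EvaluationData("c2")])
```

Tests then call `validate_data(...)` only, and the protected method stays an implementation detail.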
🟡 Minor comments (14)
tests/unit/runner/test_evaluation.py-1-2 (1)
1-2: Remove the pylint disable comment and fix the underlying issue.

The file-level `# pylint: disable=unused-argument` violates the coding guidelines. The `capsys` fixture is declared but unused in three test methods:

- `test_run_evaluation_success` (line 33)
- `test_run_evaluation_with_output_dir_override` (line 103)
- `test_run_evaluation_closes_pipeline_on_exception` (line 266)

Remove the `capsys` parameter from those methods instead of suppressing the warning.

🔧 Proposed fix
Remove line 1:

```diff
-# pylint: disable=unused-argument
-
 """Unit tests for runner/evaluation.py."""
```

Then remove `capsys` from the affected test signatures:

```diff
     def test_run_evaluation_success(
         self,
         mocker: MockerFixture,
-        capsys: pytest.CaptureFixture,
     ) -> None:
@@
     def test_run_evaluation_with_output_dir_override(
         self,
         mocker: MockerFixture,
-        capsys: pytest.CaptureFixture,
     ) -> None:
@@
     def test_run_evaluation_closes_pipeline_on_exception(
         self,
         mocker: MockerFixture,
-        capsys: pytest.CaptureFixture,
     ) -> None:
```

As per coding guidelines: "Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead."
tests/unit/core/output/test_generator.py-46-51 (1)
46-51: Incorrect type annotation for `mock_system_config` parameter.

`MockerFixture` is the type of the pytest-mock `mocker` fixture itself, not the type of mock objects it creates. The `mock_system_config` fixture (from conftest.py) returns a `mocker.Mock()` object.

The same issue occurs at lines 141 and 167.

Consider using `Any` from `typing`, or define a protocol/type alias that represents the mock's structure. Alternatively, since the mock's structure matches a specific config interface, you could type it as that interface or leave it untyped.

🔧 Proposed fix

```diff
+from typing import Any
+
 from pytest_mock import MockerFixture
```

Then update the affected signatures:

```diff
     def test_generate_csv_report(
         self,
         tmp_path: Path,
         sample_results: list[EvaluationResult],
-        mock_system_config: MockerFixture,
+        mock_system_config: Any,
     ) -> None:
```

Apply similar changes to `test_generate_reports_creates_files` (line 141) and `test_generate_reports_with_empty_results` (line 167).

tests/unit/core/output/test_generator.py-1-2 (1)
1-2: Pylint disable comment violates coding guidelines.

As per coding guidelines for `**/*.py`: "Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead."

However, this is a unit test file that intentionally tests protected methods (`_calculate_stats`, `_generate_csv_report`, etc.) for thorough coverage. If testing protected methods is the accepted approach here, consider documenting this decision or restructuring tests to use only public APIs.

tests/unit/core/models/test_data.py-45-48 (1)
45-48: Remove the `# pyright: ignore[reportArgumentType]` comment.

This inline ignore violates the coding guidelines. While this test intentionally passes invalid input to verify validation behavior, the guideline requires fixing the underlying issue rather than suppressing the warning. Consider using `typing.cast` or `Any` typing to indicate an intentional type mismatch in test scenarios.

As per coding guidelines: "Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead"

♻️ Suggested approach

```diff
+from typing import Any, cast
+
 ...
-        expected_tool_calls=[  # pyright: ignore[reportArgumentType]
-            [{"tool_name": "test_tool", "arguments": {"key": "value"}}]
-        ],
+        expected_tool_calls=cast(
+            Any,
+            [[{"tool_name": "test_tool", "arguments": {"key": "value"}}]]
+        ),
```

tests/unit/core/models/test_data.py-53-65 (1)
53-65: Remove the `# pylint: disable=unsubscriptable-object` comments.

These inline disable comments violate the coding guidelines. The `expected` variable is typed as `Optional[list[list[list[dict[str, Any]]]]]`, and after the `assert expected is not None` check, pylint should recognize it's subscriptable. Consider adding explicit type narrowing or restructuring the assertions.

As per coding guidelines: "Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead"

♻️ Suggested approach

Extract the nested access into local variables with explicit type assertions, or use a helper function to simplify the deep access pattern:

```diff
 expected = turn_data.expected_tool_calls
 assert expected is not None
-assert (
-    len(expected) == 1  # pylint: disable=unsubscriptable-object
-)  # One alternative set
-assert (
-    len(expected[0]) == 1  # pylint: disable=unsubscriptable-object
-)  # One sequence in the set
-assert (
-    len(expected[0][0]) == 1  # pylint: disable=unsubscriptable-object
-)  # One tool call in the sequence
-assert (
-    expected[0][0][0]["tool_name"]  # pylint: disable=unsubscriptable-object
-    == "test_tool"
-)
+assert len(expected) == 1  # One alternative set
+first_set = expected[0]
+assert len(first_set) == 1  # One sequence in the set
+first_sequence = first_set[0]
+assert len(first_sequence) == 1  # One tool call in the sequence
+assert first_sequence[0]["tool_name"] == "test_tool"
```

tests/unit/core/llm/test_custom.py-1-1 (1)
1-1: Pylint disable comment violates coding guidelines.

The file-level `# pylint: disable=protected-access,disable=too-few-public-methods` comment should be removed. As per coding guidelines for `**/*.py`: "Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead."

For `protected-access`: if tests need to access `_token_callback`, consider whether this method should be part of the public API or if there's an alternative testing approach.

For `too-few-public-methods`: test classes inherently have few methods; this rule may need to be configured globally in pylint settings rather than suppressed per-file.

tests/unit/core/llm/conftest.py-9-16 (1)
9-16: Improve the type annotation for the fixture return type.

The return type should be `dict[str, Any]` instead of `dict` for consistency with typing conventions. The parameter naming (`max_completion_tokens`) is intentional: this fixture provides raw API parameters, and the manager layer explicitly maps `max_tokens` (LLMConfig field) to `max_completion_tokens` (API parameter name) as documented in `manager.py` line 100.

Suggested improvement

```diff
+from typing import Any
+
+
 @pytest.fixture
-def llm_params() -> dict:
+def llm_params() -> dict[str, Any]:
     """Create sample LLM parameters."""
```

tests/unit/core/metrics/test_manager.py-1-1 (1)
1-1: Remove the `# pylint: disable` comment.

As per coding guidelines, lint warnings should not be disabled with `# pylint: disable` comments. The `too-many-public-methods` warning indicates the test class may benefit from being split into smaller, focused test classes (e.g., `TestMetricManagerResolveMetrics`, `TestMetricManagerThresholds`, `TestMetricManagerMetadata`).

As per coding guidelines: "Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead"

tests/unit/core/metrics/conftest.py-1-1 (1)
tests/unit/core/metrics/conftest.py-1-1 (1)
1-1: Remove the `# pylint: disable` comment.

As per coding guidelines, lint warnings should not be disabled with `# pylint: disable` comments. The underlying issue should be fixed instead.

For the `redefined-outer-name` warning in pytest fixtures, the standard solution is to configure pylint to recognize pytest fixtures. This can be done in your pylint configuration (e.g., `pyproject.toml` or `.pylintrc`) by adding the `pylint-pytest` plugin or by excluding test files from this specific check.

🔧 Suggested approaches

Option 1: Use the `pylint-pytest` plugin. Add to your pylint configuration:

```toml
[tool.pylint.main]
load-plugins = ["pylint_pytest"]
```

Option 2: Configure in pyproject.toml for test files:

```toml
[tool.pylint."messages control"]
disable = ["redefined-outer-name"]
```

Either approach avoids inline disable comments while properly handling pytest's fixture pattern.

As per coding guidelines: "Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead"
tests/unit/core/output/conftest.py-80-95 (1)
80-95: Incorrect return type annotation.

The return type is annotated as `MockerFixture`, but the function returns `config`, which is a `Mock` object. The return type should reflect what the fixture actually returns.

🔧 Proposed fix

```diff
 @pytest.fixture
-def mock_system_config(mocker: MockerFixture) -> MockerFixture:
+def mock_system_config(mocker: MockerFixture) -> mocker.Mock:
     """Create mock system config."""
     config = mocker.Mock()
```

Alternatively, since the mock represents a `SystemConfig`, you could use a more specific type:

```python
from unittest.mock import MagicMock

# ...

def mock_system_config(mocker: MockerFixture) -> MagicMock:
```

tests/unit/core/metrics/conftest.py-95-144 (1)
95-144: Incorrect return type annotations on mock fixtures.

The return types for `mock_bleu_scorer`, `mock_rouge_scorer`, and `mock_similarity_scorer` are annotated as `MockerFixture`, but they return `Mock` instances (`mock_scorer_instance`).

🔧 Proposed fix

```diff
 @pytest.fixture
-def mock_bleu_scorer(mocker: MockerFixture) -> MockerFixture:
+def mock_bleu_scorer(mocker: MockerFixture) -> mocker.MagicMock:
     """Mock sacrebleu BLEU with configurable return value. ... """
     # ... implementation ...
     return mock_scorer_instance

 @pytest.fixture
-def mock_rouge_scorer(mocker: MockerFixture) -> MockerFixture:
+def mock_rouge_scorer(mocker: MockerFixture) -> mocker.MagicMock:
     """Mock RougeScore with configurable return value. ... """
     # ... implementation ...
     return mock_scorer_instance

 @pytest.fixture
-def mock_similarity_scorer(mocker: MockerFixture) -> MockerFixture:
+def mock_similarity_scorer(mocker: MockerFixture) -> mocker.MagicMock:
     """Mock NonLLMStringSimilarity with configurable return value."""
     # ... implementation ...
     return mock_scorer_instance
```

Alternatively, use `MagicMock` from `unittest.mock` for the type hint, or create a protocol/type alias for mock objects.

tests/unit/core/api/test_streaming_parser.py-227-232 (1)
227-232: Avoid `pyright: ignore`; use an `Any`-typed token instead.

Line 231 suppresses the type checker; per guidelines, avoid ignore comments and type the test input explicitly.

As per coding guidelines: Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead.

🔧 Suggested adjustment

```diff
-        token = "not a dict"
-        result = _parse_tool_call(token)  # pyright: ignore[reportArgumentType]
+        token: Any = "not a dict"
+        result = _parse_tool_call(token)
```

tests/unit/pipeline/evaluation/test_processor.py-1-2 (1)
1-2: File-level `# pylint: disable` comment violates coding guidelines.

Similar to `test_evaluator.py`, this file has a file-level pylint disable that should be addressed by fixing the underlying issues rather than suppressing warnings.

tests/unit/pipeline/evaluation/test_evaluator.py-1-2 (1)

1-2: File-level `# pylint: disable` comment violates coding guidelines.

The coding guidelines state: "Do not disable lint warnings with `# noqa`, `# type: ignore`, or `# pylint: disable` comments - fix the underlying issue instead."

Consider addressing the underlying issues:

- `protected-access`: extract helper methods or use public APIs for testing
- `redefined-outer-name`: rename fixtures or test parameters to avoid shadowing
- `too-many-arguments` / `too-many-positional-arguments`: consider using test helper classes or fixture composition to reduce parameter count
🧹 Nitpick comments (5)
tests/unit/core/metrics/custom/test_tool_eval.py (1)
53-55: Type ignore comment added for test scenario.

The `# pyright: ignore[reportArgumentType]` is used to intentionally pass invalid types to test error handling. While this conflicts with the coding guideline against type ignore comments, this is a common pattern for testing edge cases.

Consider using `typing.cast` or the `Any` type to avoid the ignore comment while maintaining the test intent:

Alternative approach

```diff
+from typing import Any
+
     def test_empty_pattern_match_primary(self) -> None:
         """Test empty pattern match as primary."""
-        expected: list[list[dict]] = [[]]  # Primary: no tools expected
-        actual: list = []
+        expected: Any = [[]]  # Primary: no tools expected
+        actual: Any = []

-        success, details = evaluate_tool_calls(
-            expected, actual  # pyright: ignore[reportArgumentType]
-        )
+        success, details = evaluate_tool_calls(expected, actual)
```

script/run_multi_provider_eval.py (1)
821-824: Type ignore comments can be avoided with a better type declaration.

The `# type: ignore[assignment]` comments are used because `confidence_interval` is being set to `None` after being initialized with a dict structure. This can be avoided by properly typing `score_stats` to include the optional nature of `confidence_interval`:

Suggested fix

```diff
     if all_scores:
         score_stats: dict[str, Any] = {
             "mean": float(np.mean(all_scores)),
             "median": float(np.median(all_scores)),
             "std": float(np.std(all_scores)),
             "min": float(np.min(all_scores)),
             "max": float(np.max(all_scores)),
             "count": len(all_scores),
+            "confidence_interval": None,  # Populated if scipy is available and len > 1
         }

         # Calculate confidence interval for mean score (95% confidence)
         if len(all_scores) > 1:
             try:
                 from scipy import stats as scipy_stats
                 # ... calculation ...
                 score_stats["confidence_interval"] = {
                     # ... dict contents ...
                 }
             except ImportError:
                 logger.warning(
                     "scipy not available, skipping confidence interval calculation"
                 )
-                score_stats["confidence_interval"] = None  # type: ignore[assignment]
-        else:
-            # Single score - no confidence interval
-            score_stats["confidence_interval"] = None  # type: ignore[assignment]
```

tests/unit/core/metrics/test_manager.py (1)
37-38: Consider using an explicit empty-list assertion.

The change from `assert metrics == []` to `assert not metrics` is functionally equivalent for lists, but the original assertion was more explicit about expecting an empty list specifically. The truthiness check would also pass for `None`, `0`, `""`, etc.

💡 Optional: More explicit assertion

```diff
         # Should return empty list
-        assert not metrics
+        assert metrics == []
```

tests/unit/pipeline/evaluation/conftest.py (1)
8-23: Consider consolidating imports from the same module.

There are two separate import statements from `lightspeed_evaluation.core.models` (lines 8-12 and line 16). These could be consolidated for cleaner organization.

💡 Optional: Consolidate imports

```diff
 from lightspeed_evaluation.core.models import (
     EvaluationData,
+    EvaluationRequest,
+    EvaluationResult,
     SystemConfig,
     TurnData,
 )
 from lightspeed_evaluation.core.system.loader import ConfigLoader
 from lightspeed_evaluation.core.metrics.manager import MetricManager
 from lightspeed_evaluation.core.script import ScriptExecutionManager
-from lightspeed_evaluation.core.models import EvaluationResult, EvaluationRequest
 from lightspeed_evaluation.pipeline.evaluation.amender import APIDataAmender
```

tests/unit/pipeline/evaluation/test_errors.py (1)
24-24: Inconsistent type annotation: use `list[str]` instead of bare `list`.

Line 24 uses `list` while line 122 uses `list[str]` for the same type. For consistency and clarity, use the more specific annotation.

♻️ Proposed fix

```diff
-        resolved_conversation_metrics: list = []
+        resolved_conversation_metrics: list[str] = []
```
```diff
@@ -0,0 +1,214 @@
+# pylint: disable=redefined-outer-name
```
Remove the pylint suppression for redefined-outer-name.
Line 1 disables a lint rule; please fix the underlying naming issue instead of suppressing it.
As per coding guidelines: Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead.
```diff
@@ -1,4 +1,6 @@
 #!/usr/bin/env python3
+# pylint: disable=protected-access
```
Remove the file-level pylint disable.
Line 2 suppresses protected-access linting; please fix the underlying issue instead of disabling it.
As per coding guidelines: Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead.
```diff
@@ -1,95 +1,27 @@
 #!/usr/bin/env python3
+# pylint: disable=protected-access,too-few-public-methods
```
Remove the file-level pylint disable and fix underlying issues.
Line 2 disables pylint rules across the file; please avoid suppressions and address the lint causes directly.
As per coding guidelines: Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead.
```diff
         # Track the temp file path that gets created
         created_temp_path = None
         original_named_temp_file = temp_module.NamedTemporaryFile

-        def track_temp_file(*args, **kwargs):
+        def track_temp_file(*args: Any, **kwargs: Any) -> Any:
             nonlocal created_temp_path
-            temp_file = original_named_temp_file(*args, **kwargs)
+            temp_file = original_named_temp_file(  # pylint: disable=consider-using-with
+                *args, **kwargs
+            )
```
Avoid inline pylint disable in track_temp_file.
Line 403 disables consider-using-with; refactor to manage the temp file via a context stack/finalizer instead of suppressing the warning.
As per coding guidelines: Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead.
🔧 One way to avoid the lint suppression
@@
-import logging
+import logging
+from contextlib import ExitStack
@@
- def test_temp_config_cleanup_on_yaml_dump_failure(
- self,
- runner: MultiProviderEvaluationRunner,
- mocker: MockerFixture,
- ) -> None:
+ def test_temp_config_cleanup_on_yaml_dump_failure(
+ self,
+ runner: MultiProviderEvaluationRunner,
+ mocker: MockerFixture,
+ request: pytest.FixtureRequest,
+ ) -> None:
@@
- original_named_temp_file = temp_module.NamedTemporaryFile
+ original_named_temp_file = temp_module.NamedTemporaryFile
+ exit_stack = ExitStack()
+ request.addfinalizer(exit_stack.close)
@@
- temp_file = original_named_temp_file( # pylint: disable=consider-using-with
- *args, **kwargs
- )
+ temp_file = exit_stack.enter_context(
+ original_named_temp_file(*args, **kwargs)
+ )🤖 Prompt for AI Agents
In `@tests/script/test_run_multi_provider_eval.py` around lines 397 - 405, The
inline pylint disable on consider-using-with in the helper track_temp_file
should be removed; instead, refactor track_temp_file to manage the temporary
file lifecycle via a context manager/ExitStack or finalizer so you don't
instantiate NamedTemporaryFile without a with-block. Specifically, replace the
direct call to original_named_temp_file in track_temp_file by opening the temp
file using a context manager (or registering it with a contextlib.ExitStack or
an appropriate cleanup callback) so created_temp_path is captured while ensuring
the file is properly closed/cleaned up, then delete the "# pylint:
disable=consider-using-with" comment and adjust any callers to rely on the
managed resource from track_temp_file.
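Outside the diff context, the ExitStack pattern the suggestion relies on can be shown self-contained. This is a stdlib-only sketch; the wrapper name mirrors the test's track_temp_file, but everything else is illustrative (in the real test, the stack would be closed via `request.addfinalizer`).

```python
# Stdlib-only sketch of the ExitStack approach: every temp file opened
# through the tracking wrapper is registered with the stack, so closing
# the stack cleans everything up in one place -- no consider-using-with
# suppression needed.
import tempfile
from contextlib import ExitStack
from typing import Any

created_paths: list[str] = []
exit_stack = ExitStack()  # in a test: request.addfinalizer(exit_stack.close)


def track_temp_file(*args: Any, **kwargs: Any) -> Any:
    temp_file = exit_stack.enter_context(
        tempfile.NamedTemporaryFile(*args, **kwargs)
    )
    created_paths.append(temp_file.name)
    return temp_file


f = track_temp_file(mode="w", suffix=".yaml")
f.write("providers: []\n")
exit_stack.close()  # closes (and deletes) every tracked temp file
```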
| @@ -1,45 +1,23 @@ | |||
| # pylint: disable=protected-access | |||
Remove the file-level pylint disable and address protected-access cleanly.
Line 1 disables pylint for protected access across the file; please refactor tests to avoid the suppression (e.g., exercise behavior via public methods or narrow helpers) instead of disabling the rule.
As per coding guidelines: Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead.
🤖 Prompt for AI Agents
In `@tests/unit/core/api/test_client.py` at line 1, Remove the file-level "#
pylint: disable=protected-access" in tests/unit/core/api/test_client.py and
refactor tests to avoid accessing protected members directly: update tests that
reference protected symbols (e.g., Client._internal_state or
APIClient._do_something) to exercise the same behavior via the public API
(constructor, public methods like Client.get_state or APIClient.perform_action)
or add narrow, test-only helpers on the production module (e.g., a public
accessor or factory) that expose the necessary state for assertions; if mutation
is needed, use pytest fixtures or monkeypatch to set internal conditions instead
of referencing protected attributes.
| @@ -1,3 +1,5 @@ | |||
| # pylint: disable=too-many-arguments,too-many-positional-arguments,disable=too-few-public-methods | |||
Remove file-level pylint disables and refactor to satisfy lint rules.
The blanket disable for argument counts / class size should be replaced by refactors (e.g., bundle related fixtures into a helper fixture object, split test classes by metric, or move to module-level tests) rather than suppressing lint.
As per coding guidelines: Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead.
🤖 Prompt for AI Agents
In `@tests/unit/core/metrics/test_nlp.py` at line 1, Remove the file-level pylint
disables in tests/unit/core/metrics/test_nlp.py and refactor the tests to
satisfy lint rules: consolidate related fixtures into one or more helper
fixtures (e.g., create a fixture object or factory to hold shared setup), split
large test classes into smaller classes per metric function (referencing the
existing test classes in test_nlp.py), and convert tests that cause
too-many-arguments/positional-arguments into parameterized or helper functions
to reduce argument count; ensure each new test class has few public methods to
avoid the too-few-public-methods warning and delete the top-of-file "# pylint:
disable=..." line once refactoring is complete.
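The fixture-bundling step suggested above can be sketched as follows. The names are made up, not the real test_nlp.py fixtures: the point is that grouping related inputs into one dataclass collapses a many-argument test signature to one or two parameters instead of suppressing too-many-arguments.

```python
# Illustrative sketch: related per-test inputs bundled into a single
# object. In pytest, make_case would become a fixture returning the
# bundle; here it is a plain function so the sketch stays runnable.
from dataclasses import dataclass


@dataclass
class NLPCase:
    reference: str
    candidate: str
    language: str = "en"
    lowercase: bool = True


def make_case() -> NLPCase:
    return NLPCase(reference="the cat sat", candidate="a cat sat")


def test_case_bundle_exposes_all_inputs() -> None:
    case = make_case()  # one parameter instead of four
    assert case.language == "en"
    assert case.reference != case.candidate
```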
| def test_rag_chunk_extra_field_forbidden(self) -> None: | ||
| """Test that extra fields are forbidden.""" | ||
| with pytest.raises(ValidationError): | ||
| RAGChunk(content="content", source="source", extra_field="not allowed") | ||
| RAGChunk( | ||
| content="content", | ||
| source="source", | ||
| extra_field="not allowed", # pyright: ignore[reportCallIssue] | ||
| ) |
🧩 Analysis chain
🏁 Script executed:
# Locate and examine the test file
find . -path "*/tests/unit/core/models/test_api_additional.py" -type f
Repository: lightspeed-core/lightspeed-evaluation
🏁 Script executed:
# Check for Pydantic version in project dependencies
fd -t f "(pyproject.toml|requirements|setup.py)" | head -20
🏁 Script executed:
# Search for RAGChunk model definition
rg "class RAGChunk" -A 10
🏁 Script executed:
# Read the specific test file to see the current state at lines 31-38
sed -n '25,45p' ./tests/unit/core/models/test_api_additional.py
🏁 Script executed:
# Check Pydantic version in pyproject.toml
cat pyproject.toml | grep -A 5 -B 5 "pydantic"
Remove the pyright ignore by using model_validate to validate the dictionary.
Pydantic v2.0+ is confirmed in dependencies. The RAGChunk model has ConfigDict(extra="forbid") configured, so using model_validate() will properly raise ValidationError for extra fields while removing the type-ignore comment that violates the coding guideline against disabling lint warnings.
✅ Suggested fix
- with pytest.raises(ValidationError):
- RAGChunk(
- content="content",
- source="source",
- extra_field="not allowed", # pyright: ignore[reportCallIssue]
- )
+ with pytest.raises(ValidationError):
+ RAGChunk.model_validate(
+ {
+ "content": "content",
+ "source": "source",
+ "extra_field": "not allowed",
+ }
+ )🤖 Prompt for AI Agents
In `@tests/unit/core/models/test_api_additional.py` around lines 31 - 38, Update
the test_rag_chunk_extra_field_forbidden test to remove the pyright ignore and
validate via the Pydantic v2 API: call RAGChunk.model_validate with a dict
including content, source, and extra_field and assert it raises ValidationError;
reference the RAGChunk class and its model_validate method so the extra field is
rejected by ConfigDict(extra="forbid") instead of silencing the linter.
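The mechanism behind the suggested fix can be illustrated without assuming pydantic is installed here. A plain dict passed to a validating constructor sidesteps the static keyword-argument check that forced the pyright ignore, while unknown keys are still rejected at runtime, which is the same effect `RAGChunk.model_validate` gives with `ConfigDict(extra="forbid")`. Stdlib analog:

```python
# Stdlib analog of the model_validate pattern: validation happens on a
# dict, so no call-site keyword check needs silencing, and extras still
# raise at runtime.
from dataclasses import dataclass, fields


@dataclass
class Chunk:
    content: str
    source: str

    @classmethod
    def validate(cls, data: dict) -> "Chunk":
        extras = set(data) - {f.name for f in fields(cls)}
        if extras:
            raise ValueError(f"extra fields forbidden: {sorted(extras)}")
        return cls(**data)


try:
    Chunk.validate({"content": "c", "source": "s", "extra_field": "x"})
except ValueError as err:
    print(err)  # -> extra fields forbidden: ['extra_field']
```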
| def test_custom_visualization_config(self) -> None: | ||
| """Test custom visualization configuration.""" | ||
| config = VisualizationConfig( | ||
| enabled_graphs=["pass_rates", "score_distribution"], | ||
| dpi=150, | ||
| figsize=(12, 8), | ||
| figsize=(12, 8), # pyright: ignore[reportArgumentType] | ||
| ) |
Avoid pyright: ignore by using the correct type for figsize.
Replace the tuple with a list to satisfy typing and remove the ignore.
As per coding guidelines, “Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead.”
✅ Suggested change
- figsize=(12, 8), # pyright: ignore[reportArgumentType]
+ figsize=[12, 8],📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def test_custom_visualization_config(self) -> None: | |
| """Test custom visualization configuration.""" | |
| config = VisualizationConfig( | |
| enabled_graphs=["pass_rates", "score_distribution"], | |
| dpi=150, | |
| figsize=(12, 8), | |
| figsize=(12, 8), # pyright: ignore[reportArgumentType] | |
| ) | |
| def test_custom_visualization_config(self) -> None: | |
| """Test custom visualization configuration.""" | |
| config = VisualizationConfig( | |
| enabled_graphs=["pass_rates", "score_distribution"], | |
| dpi=150, | |
| figsize=[12, 8], | |
| ) |
🤖 Prompt for AI Agents
In `@tests/unit/core/models/test_system_additional.py` around lines 194 - 200, The
test test_custom_visualization_config uses a tuple for figsize which forced a
pyright ignore; update the figsize argument in the VisualizationConfig
instantiation to use a list (e.g., [12, 8]) to match the expected type and
remove the "# pyright: ignore" comment so the lint/type check passes; locate the
VisualizationConfig usage in test_custom_visualization_config and replace the
tuple with a list accordingly.
| @@ -1,16 +1,23 @@ | |||
| # pylint: disable=protected-access,too-few-public-methods | |||
Remove file-level pylint disables and fix the underlying lint triggers.
The blanket disable hides issues (e.g., protected-access usage of _calculate_stats / _generate_csv_report and class lint warnings). Please refactor tests to avoid protected access and reduce lint violations instead of suppressing them.
As per coding guidelines: Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead.
🤖 Prompt for AI Agents
In `@tests/unit/core/output/test_final_coverage.py` at line 1, Remove the
file-level pylint disable and instead update
tests/unit/core/output/test_final_coverage.py to avoid protected-access and
class lint warnings: replace direct calls to the private helpers
_calculate_stats and _generate_csv_report by exercising their behavior through
the public API functions or by adding small public test helpers (e.g.,
calculate_stats, generate_csv_report) in the module under test; if the test
defines single-method test classes causing too-few-public-methods, convert them
to plain test functions or add a clear public-purpose method name to the test
class; ensure no # pylint: disable lines remain and update imports to use the
public symbols you added or the module's public functions when asserting
expected output.
| @@ -1,23 +1,27 @@ | |||
| # pylint: disable=protected-access | |||
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
python - <<'PY'
import ast, pathlib
files = list(pathlib.Path("src").rglob("validator.py"))
for f in files:
tree = ast.parse(f.read_text())
for node in tree.body:
if isinstance(node, ast.ClassDef) and node.name == "DataValidator":
print(f"\n{f}:")
for item in node.body:
if isinstance(item, ast.FunctionDef):
is_protected = item.name.startswith("_") and not item.name.startswith("__")
prefix = "[protected] " if is_protected else "[public] "
print(f" {prefix}{item.name}")
PY
🏁 Script executed:
cat -n tests/unit/core/system/test_validator.py | head -50
🏁 Script executed:
cat -n tests/unit/core/system/test_validator.py | tail -n +50 | head -100
🏁 Script executed:
wc -l tests/unit/core/system/test_validator.py
🏁 Script executed:
cat -n tests/unit/core/system/test_validator.py | tail -n +150 | head -100
🏁 Script executed:
# Count how many times protected methods are called in the test
rg '_validate_evaluation_data|_validate_metrics_availability|_validate_metric_requirements|_check_metric_requirements|_filter_by_scope|_validate_scripts|_validate_single_script' tests/unit/core/system/test_validator.py --count
🏁 Script executed:
cat -n src/lightspeed_evaluation/core/system/validator.py | head -100
🏁 Script executed:
cat -n src/lightspeed_evaluation/core/system/validator.py | grep -A 30 "def load_evaluation_data"
🏁 Script executed:
cat -n tests/unit/core/system/test_validator.py | grep -B 5 "def test_load_evaluation_data"
🏁 Script executed:
# Check if other tests create temp files for load_evaluation_data
rg "tempfile|\.dump|yaml\.dump" tests/unit/core/system/test_validator.py -A 3 -B 3
🏁 Script executed:
# Check the entire load_evaluation_data implementation
cat -n src/lightspeed_evaluation/core/system/validator.py | sed -n '104,250p'
🏁 Script executed:
# Check if there's a way to test without file I/O - look for any method that accepts data objects
cat -n src/lightspeed_evaluation/core/system/validator.py | grep -E "def |validation_errors|evaluate"
🏁 Script executed:
# Understand test pattern - count tests that use protected vs public methods
echo "=== Tests using _validate_evaluation_data directly ===" && \
rg "\.\_validate_evaluation_data\(" tests/unit/core/system/test_validator.py | wc -l && \
echo "=== Tests using load_evaluation_data ===" && \
rg "\.load_evaluation_data\(" tests/unit/core/system/test_validator.py | wc -l
🏁 Script executed:
# Check again with proper regex
echo "=== Tests using _validate_evaluation_data directly ===" && \
rg '_validate_evaluation_data\(' tests/unit/core/system/test_validator.py | wc -l && \
echo "=== Tests using load_evaluation_data ===" && \
rg 'load_evaluation_data\(' tests/unit/core/system/test_validator.py | wc -l
🏁 Script executed:
# View all test methods to understand the structure
rg "def test_" tests/unit/core/system/test_validator.py | head -30
🏁 Script executed:
# View more of the test file to understand test patterns
cat -n tests/unit/core/system/test_validator.py | sed -n '44,120p'
Remove file-level pylint disable and refactor tests to use public validator API.
The # pylint: disable=protected-access violates project guidelines: "Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments — fix the underlying issue instead."
The tests currently call protected methods like _validate_evaluation_data() directly on data objects. Instead, either:
- Add a public validate_data(evaluation_data: list[EvaluationData]) -> bool method to DataValidator for unit-testing validation logic in isolation, or
- Refactor tests to use the public load_evaluation_data() method with temporary YAML files (as some existing tests already do).
This aligns the test behavior with the designed public API contract.
🤖 Prompt for AI Agents
In `@tests/unit/core/system/test_validator.py` at line 1, Remove the file-level "#
pylint: disable=protected-access" and stop calling protected methods directly;
either (A) add a public DataValidator.validate_data(evaluation_data:
list[EvaluationData]) -> bool that wraps the current _validate_evaluation_data
logic and update tests to call DataValidator.validate_data(...) instead of
_validate_evaluation_data, or (B) refactor the tests in test_validator.py to
exercise the public DataValidator.load_evaluation_data(...) by creating
temporary YAML files (use existing test patterns) and asserting validation
through the public API; ensure no protected members (like
_validate_evaluation_data) are referenced and update imports/fixtures
accordingly.
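Option (B) above can be sketched end to end. This is a hypothetical stand-in, not the real DataValidator API: JSON substitutes for YAML so the sketch needs only the stdlib, but the shape of the test, writing a temp data file and asserting through the public loader, is the point.

```python
# Hypothetical sketch of option (B): validation is exercised through the
# public loader with a temporary file, never via _validate_evaluation_data.
# JSON stands in for YAML; the real DataValidator differs.
import json
import tempfile
from pathlib import Path


class DataValidator:
    def load_evaluation_data(self, path: str) -> list[dict]:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        for turn in data:
            if "query" not in turn:
                raise ValueError(f"turn missing 'query': {turn}")
        return data


def test_loader_rejects_turn_without_query() -> None:
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump([{"response": "orphan response"}], f)
    try:
        DataValidator().load_evaluation_data(f.name)
        raised = False
    except ValueError:
        raised = True
    finally:
        Path(f.name).unlink()
    assert raised  # validated entirely via the public API
```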
Description
Stacked on top of PR#146. Review after PR#146 is merged.
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
Tests
Chores