
ONNX MODULE UPDATE #6

Open

astridesa wants to merge 7 commits into FLock-io:main from astridesa:base-onnx

Conversation


@astridesa astridesa commented Aug 14, 2025

Summary by CodeRabbit

  • New Features

    • Introduced ONNX-based time-series forecasting validation with configurable metrics and HTTP-loaded datasets.
    • Added per-task module isolation and a test mode to prevent hard exits during testing.
    • Provided ready-to-use configuration and environment for ONNX validation.
  • Tests

    • Added an end-to-end ONNX validation test harness using mocked services and demo data.
  • Chores

    • Added editor workspace setting to ignore pull requests from the main branch.


coderabbitai bot commented Aug 14, 2025

Walkthrough

Introduces a new ONNX validation module for time-series forecasting, supporting model download, inference, and metrics; adds config and environment files; provides an SVR-to-ONNX training utility; updates the base interfaces and validation runner for task-scoped modules and test mode; and adds an end-to-end ONNX test and a VSCode workspace setting.

Changes

  • IDE Settings (.vscode/settings.json): Adds a VSCode workspace setting to ignore PRs from branch "main".
  • ONNX Config & Env (configs/onnx.json, validator/modules/onnx/environment.yml): Adds the ONNX validation JSON config and a Conda environment specifying numpy/pandas/sklearn/scipy/requests/pyarrow, plus pip installs for onnxruntime and huggingface-hub.
  • ONNX Validation Module (validator/modules/onnx/__init__.py): Adds ONNX time-series validation: config/input/metrics models, a module class with model/data loading, inference, metrics computation, error handling, cleanup, and the MODULE export.
  • ONNX Training Utility (validator/modules/onnx/train_svm.py): Adds SVR training with feature engineering, scaling, ONNX export via skl2onnx, and CLI orchestration (a rough sketch of the export step follows this list).
  • Base Module API (validator/modules/base.py): Adds BaseConfig/BaseInputData/BaseMetrics (frozen). Extends BaseValidationModule with metrics_schema and task_type; makes validate concrete (no-op).
  • Validation Runner Updates (validator/validation_runner.py): Adds a test_mode flag; switches to per-task module instances; passes task_id to validate; adjusts error handling to avoid exits in test mode; removes the BaseConfig import exposure.
  • ONNX Test Harness (test_onnx.py): Adds a data prep helper and an end-to-end ONNX validation test with mocked requests/FedLedger; constructs input data and executes ValidationRunner in test mode.
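
As a rough illustration of the train_svm.py export step referenced above, the snippet below sketches a minimal SVR-to-ONNX conversion with skl2onnx. It is a sketch only: the toy data, pipeline shape, feature width, and output path are assumptions, not details taken from the PR.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# Toy data standing in for the engineered features and target.
X = np.random.rand(200, 8).astype(np.float32)
y = np.random.rand(200).astype(np.float32)

# Scale features and fit the SVR, mirroring the scaling step described above.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
model.fit(X, y)

# Declare the input signature so ONNX Runtime knows the feature width.
initial_types = [("input", FloatTensorType([None, X.shape[1]]))]
onnx_model = convert_sklearn(model, initial_types=initial_types)

with open("svr_model.onnx", "wb") as f:  # illustrative output path
    f.write(onnx_model.SerializeToString())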

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant ValidationRunner
  participant ONNXModule
  participant HFHub
  participant HTTP as DataSource (HTTP)
  participant ORT as ONNX Runtime

  Caller->>ValidationRunner: perform_validation(task_id, module="onnx", input_data)
  ValidationRunner->>ONNXModule: instantiate (per task_id, config)
  ValidationRunner->>ONNXModule: validate(input_data, task_id)
  ONNXModule->>HFHub: download model (repo_id, filename, rev)
  HFHub-->>ONNXModule: model path
  ONNXModule->>ORT: create InferenceSession(model)
  ONNXModule->>HTTP: GET test_data_url
  HTTP-->>ONNXModule: CSV data
  ONNXModule->>ONNXModule: preprocess/split features, target
  ONNXModule->>ORT: run inference(features)
  ORT-->>ONNXModule: predictions
  ONNXModule->>ONNXModule: compute metrics (mae, rmse, mape, ...)
  ONNXModule-->>ValidationRunner: ONNXMetrics
  ValidationRunner-->>Caller: result
sequenceDiagram
  participant Test as test_onnx.py
  participant Runner as ValidationRunner(test_mode=True)
  participant Module as ONNXModule
  participant Mocks as Mocked deps

  Test->>Mocks: patch requests.get, FedLedger
  Test->>Runner: perform_validation(...)
  alt error occurs
    Module-->>Runner: raise Recoverable/ValueError
    Runner-->>Test: return None (no exit due to test_mode)
  else success
    Module-->>Runner: ONNXMetrics
    Runner-->>Test: metrics
  end
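
To make the first sequence diagram above concrete, here is a hedged sketch of what the ONNX module's validate flow could look like in Python. The helper name run_onnx_validation and the exact field/parameter names are assumptions; only the overall steps (Hub download, InferenceSession, HTTP CSV fetch, inference, metric computation) follow the diagram and the environment.yml dependencies.

import io

import numpy as np
import onnxruntime as ort
import pandas as pd
import requests
from huggingface_hub import hf_hub_download


def run_onnx_validation(repo_id: str, filename: str, revision: str,
                        test_data_url: str, target_column: str) -> dict:
    # 1. Download the model artifact from the Hugging Face Hub.
    model_path = hf_hub_download(repo_id=repo_id, filename=filename, revision=revision)
    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

    # 2. Fetch the test CSV over HTTP and split features from the target.
    response = requests.get(test_data_url, timeout=30)
    response.raise_for_status()
    df = pd.read_csv(io.StringIO(response.text))
    y_true = df[target_column].to_numpy(dtype=np.float32)
    features = df.drop(columns=[target_column]).to_numpy(dtype=np.float32)

    # 3. Run inference and flatten predictions to a 1-D vector.
    input_name = session.get_inputs()[0].name
    y_pred = np.asarray(session.run(None, {input_name: features})[0]).reshape(-1)

    # 4. Compute the metrics listed in configs/onnx.json.
    err = y_true - y_pred
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mape": float(np.mean(np.abs(err) / np.maximum(np.abs(y_true), 1e-8)) * 100),
        "smape": float(np.mean(2 * np.abs(err) / np.maximum(np.abs(y_true) + np.abs(y_pred), 1e-8)) * 100),
    }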

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

I thump my paws—new graphs to see,
ONNX hums with metrics free.
I nibble data, hop through bytes,
Forecast moons and carrot nights.
With test-mode trails and tidy logs,
I bound through models—happy hogs! 🥕🐇


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 14

🔭 Outside diff range comments (1)
validator/modules/base.py (1)

30-35: Validate return type consistency with base implementation

The method signature indicates it returns BaseMetrics, but the base implementation returns None (via pass). This could cause type checking issues. Consider either making it abstract or providing a proper base implementation.

-    @abstractmethod
     def validate(self, data: BaseInputData, **kwargs) -> BaseMetrics:
         """
         Download/prep the repo/revision, run validation, and return metrics parsed into a Pydantic model.
         """
-        pass
+        raise NotImplementedError("Subclasses must implement validate()")
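
For the other option mentioned above (keeping validate abstract rather than providing a runtime raise), a minimal sketch could look like this; the class and schema names follow the PR summary and may not match the file exactly:

from abc import ABC, abstractmethod


class BaseValidationModule(ABC):
    @abstractmethod
    def validate(self, data: "BaseInputData", **kwargs) -> "BaseMetrics":
        """Download/prep the repo/revision, run validation, and return metrics."""
        raise NotImplementedError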
🧹 Nitpick comments (12)
configs/onnx.json (1)

7-7: Add trailing newline to JSON file

JSON files should end with a newline character for better compatibility with various tools and to follow common conventions.

     "evaluation_metrics": ["mae", "rmse", "mape", "smape"],
     "output_dir": "/tmp/onnx_validation"
-}
+}
+
validator/modules/base.py (1)

17-21: Consider making task_type a class-level constant

The task_type attribute should be defined as a class-level constant (e.g., task_type: ClassVar[str]) since it's meant to identify the module type rather than being an instance attribute.

+from typing import ClassVar
 from abc import ABC, abstractmethod
 from pydantic import BaseModel

 class BaseValidationModule(ABC):
     config_schema: type[BaseConfig]
     input_data_schema: type[BaseInputData]
     metrics_schema: type[BaseMetrics]
-    task_type: str
+    task_type: ClassVar[str]
validator/modules/onnx/train_svm.py (2)

129-131: Redundant traceback import

The traceback module is imported twice (lines 129 and 161). Consider importing it once at the module level.

Add at the top of the file with other imports:

 import pandas as pd
 import numpy as np
+import traceback
 from sklearn.svm import SVR

Then remove the local imports:

     except Exception as e:
         print(f"Error: Could not convert to ONNX format: {e}")
-        import traceback
-
         traceback.print_exc()

And:

     except Exception as e:
         print(f"Error during training: {e}")
-        import traceback
-
         traceback.print_exc()

143-145: Consider using sys.exit for error conditions

When the training data is not found, the function returns silently. Consider using sys.exit(1) to indicate failure more explicitly.

+    import sys
+    
     if not csv_path.exists():
         print(f"Error: Training data not found at {csv_path}")
-        return
+        sys.exit(1)
validator/modules/onnx/__init__.py (5)

4-4: Remove unused imports flagged by static analysis

Both imports are unused and flagged by Ruff (F401). Drop them to keep the module clean.

-from pydantic import BaseModel
@@
-import os

Also applies to: 14-14


92-93: Force CPU provider to avoid environment-specific provider issues

Explicitly set providers to ensure consistent behavior (e.g., avoid CUDA provider selection on systems without GPU).

-            self.model = ort.InferenceSession(model_path)
+            self.model = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

74-79: Remove unused data_processor attribute and related cleanup

It’s never used; removing it simplifies the module and avoids confusion.

     def __init__(self, config: ONNXConfig, **kwargs):
         """Initialize the OONX validation module"""
         self.config = config
         self.model = None
-        self.data_processor = None
@@
     def cleanup(self):
         """Clean up resources"""
         if self.model is not None:
             del self.model
             self.model = None
-        if self.data_processor is not None:
-            del self.data_processor
-            self.data_processor = None

Also applies to: 284-287


243-271: Minor: Validate output and shape alignment before metric computation

Double-check shapes to avoid subtle mismatches before computing metrics. Optional, but protects against unexpected ONNX output shapes.

You already verify lengths match pre-inference; after inference, add an assert (or explicit check) that predictions shape matches ground truth shape exactly before flattening and computing metrics. If desired, I can provide a small helper to enforce this.
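
A small helper along those lines could look like this (a sketch only; it assumes predictions arrive as an array-like that should flatten to the same length as the ground truth):

import numpy as np

def check_prediction_shape(predictions, targets) -> np.ndarray:
    """Flatten ONNX outputs and verify they line up 1:1 with the ground truth."""
    preds = np.asarray(predictions).reshape(-1)
    truth = np.asarray(targets).reshape(-1)
    if preds.shape != truth.shape:
        raise ValueError(
            f"Prediction shape {preds.shape} does not match target shape {truth.shape}"
        )
    return preds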


33-46: Naming/consistency: keep module “task_type” consistent with input “task_type”

The module declares task_type = "time_series_forecasting" while ONNXInputData carries a free-form task_type string consumed in metric selection. Consider either:

  • Normalize incoming task_type values to a known set; or
  • Drive metric selection purely from config/required_metrics for determinism.

Also applies to: 66-73
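
A hedged sketch of the first option, normalizing incoming task_type values to a known set; the alias table and helper name here are illustrative, not taken from the module:

_TASK_TYPE_ALIASES = {
    "time_series_forecasting": "time_series_forecasting",
    "timeseries": "time_series_forecasting",
    "forecasting": "time_series_forecasting",
}

def normalize_task_type(raw: str) -> str:
    """Map free-form task_type strings onto the module's canonical value."""
    key = raw.strip().lower().replace("-", "_").replace(" ", "_")
    if key not in _TASK_TYPE_ALIASES:
        raise ValueError(f"Unsupported task_type: {raw!r}")
    return _TASK_TYPE_ALIASES[key]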

test_onnx.py (3)

1-1: Remove unused imports to satisfy Ruff and keep test lean

These imports are unused.

-import numpy as np
@@
-from validator.modules.onnx import (
-    ONNXValidationModule,
-    ONNXConfig,
-    ONNXInputData,
-    ONNXMetrics,
-)
+from validator.modules.onnx import ONNXInputData

Also applies to: 13-17


57-62: Prefer groupby+transform for rolling features to preserve index alignment

Using rolling().mean().values relies on positional alignment. transform keeps index alignment explicit and is more idiomatic.

-    for window in [3, 7, 14]:
-        df[f"number_sold_rolling_{window}"] = (
-            df.groupby(["store", "product"])["number_sold"]
-            .rolling(window=window)
-            .mean()
-            .values
-        )
+    for window in [3, 7, 14]:
+        df[f"number_sold_rolling_{window}"] = (
+            df.groupby(["store", "product"])["number_sold"]
+            .transform(lambda s: s.rolling(window=window).mean())
+        )

153-170: Strengthen assertions instead of print-based checks

Replace prints with assertions to make the test actionable and CI-friendly. For example, assert metrics is not None and individual metrics are finite/non-negative.

If you’d like, I can provide a pytest-style version that asserts MAE/RMSE are finite and within a loose bound.
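
A pytest-style version of those checks might look like the following; the metric names (mae, rmse) come from configs/onnx.json, while the helper name and attribute access pattern are assumptions:

import math

def assert_metrics_sane(metrics):
    """Replace print-based checks with assertions so CI fails loudly."""
    assert metrics is not None, "Validation returned no metrics"
    for name in ("mae", "rmse"):
        value = getattr(metrics, name)
        assert math.isfinite(value), f"{name} is not finite: {value}"
        assert value >= 0, f"{name} should be non-negative: {value}"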

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 817ef86 and f39c914.

⛔ Files ignored due to path filters (7)
  • .DS_Store is excluded by !**/.DS_Store
  • validator/.DS_Store is excluded by !**/.DS_Store
  • validator/modules/.DS_Store is excluded by !**/.DS_Store
  • validator/modules/onnx/.DS_Store is excluded by !**/.DS_Store
  • validator/modules/onnx/demo_data/.DS_Store is excluded by !**/.DS_Store
  • validator/modules/onnx/demo_data/test.csv is excluded by !**/*.csv
  • validator/modules/onnx/demo_data/train.csv is excluded by !**/*.csv
📒 Files selected for processing (8)
  • .vscode/settings.json (1 hunks)
  • configs/onnx.json (1 hunks)
  • test_onnx.py (1 hunks)
  • validator/modules/base.py (3 hunks)
  • validator/modules/onnx/__init__.py (1 hunks)
  • validator/modules/onnx/environment.yml (1 hunks)
  • validator/modules/onnx/train_svm.py (1 hunks)
  • validator/validation_runner.py (5 hunks)
🧰 Additional context used
🪛 YAMLlint (1.37.1)
validator/modules/onnx/environment.yml

[warning] 11-11: wrong indentation: expected 6 but found 4

(indentation)


[error] 12-12: no new line character at the end of file

(new-line-at-end-of-file)

🪛 Ruff (0.12.2)
validator/modules/onnx/train_svm.py

19-19: f-string without any placeholders

Remove extraneous f prefix

(F541)

test_onnx.py

1-1: numpy imported but unused

Remove unused import: numpy

(F401)


13-13: validator.modules.onnx.ONNXValidationModule imported but unused

Remove unused import

(F401)


14-14: validator.modules.onnx.ONNXConfig imported but unused

Remove unused import

(F401)


16-16: validator.modules.onnx.ONNXMetrics imported but unused

Remove unused import

(F401)

validator/modules/onnx/__init__.py

4-4: pydantic.BaseModel imported but unused

(F401)


14-14: os imported but unused

(F401)


99-99: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


153-153: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


277-277: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (4)
validator/modules/base.py (1)

5-6: LGTM! Good use of frozen models for immutability

Using frozen=True on the base classes ensures configuration and data integrity throughout the validation pipeline, preventing accidental mutations.

validator/validation_runner.py (3)

9-9: Public import surface looks good

Exposing only BaseValidationModule, BaseInputData, BaseMetrics simplifies dependency boundaries.


25-26: test_mode flag addition is appropriate

This enables non-exiting behavior during tests; good addition for testability.


72-72: Passing task_id into validate aligns with per-task module isolation

Looks correct with updated module API.

Comment on lines +1 to +5
{
    "githubPullRequests.ignoredPullRequestBranches": [
        "main"
    ]
}
(No newline at end of file)

💡 Verification agent

🧩 Analysis chain

Consider adding this file to .gitignore

IDE-specific configuration files like .vscode/settings.json are typically excluded from version control as they contain user-specific preferences. Consider adding .vscode/ to .gitignore to avoid committing personal workspace settings.


🏁 Script executed:

#!/bin/bash
# Description: Check if .vscode is already in .gitignore

# Check if .gitignore exists and if it contains .vscode
if [ -f .gitignore ]; then
    echo "=== Current .gitignore content related to .vscode ==="
    grep -E "^\.vscode|^\*\.vscode" .gitignore || echo "No .vscode entry found in .gitignore"
else
    echo ".gitignore file does not exist"
fi

# Check if there are other VS Code settings files in the repository
echo -e "\n=== Other VS Code configuration files in the repository ==="
fd -t f . .vscode --exec echo "{}"

Length of output: 462


Action: Add .vscode/ to .gitignore or remove committed workspace settings

Verified: .vscode/settings.json is tracked in the repo and .gitignore does not contain a .vscode entry. Workspace-specific VS Code settings are typically not committed.

Files to address:

  • .vscode/settings.json — decide whether this should be shared. If not, remove it from the repo.
  • .gitignore — add an entry to ignore VS Code workspace settings.

Suggested change (add to .gitignore):

+.vscode/

If you want to remove the tracked file after updating .gitignore:

  • git rm --cached .vscode/settings.json
  • git commit -m "Remove user VS Code settings and ignore .vscode/"
🤖 Prompt for AI Agents
.vscode/settings.json around lines 1 to 5: workspace-specific VS Code settings
are committed but should not be tracked; add ".vscode/" to .gitignore to ignore
workspace settings, then remove the tracked file from the index with git rm
--cached .vscode/settings.json and commit (suggest commit message: "Remove user
VS Code settings and ignore .vscode/"), or if these settings must be shared
intentionally, remove the .vscode/ entry from .gitignore and keep the file
committed.

Comment on lines +96 to +121
def test_onnx_validation_works(mock_requests, mock_fedledger):
    """Test that ONNX validation can complete successfully using real HuggingFace model"""

    test_csv = load_and_preprocess_demo_data()
    if test_csv is None:
        print("Failed to load demo data")
        return False

    # Mock API
    mock_api = MagicMock()
    mock_api.list_tasks.return_value = [
        {"id": 1, "task_type": "onnx", "title": "Test", "data": {}}
    ]
    mock_api.mark_assignment_as_failed = MagicMock()
    mock_fedledger.return_value = mock_api

    # Mock HTTP requests for CSV data (use real HuggingFace download for model)
    def mock_get_side_effect(url):
        response = MagicMock()
        response.raise_for_status.return_value = None
        response.text = test_csv  # CSV contains both features and target
        return response

    mock_requests.side_effect = mock_get_side_effect

    runner = ValidationRunner(

🛠️ Refactor suggestion

Make the test fully offline and deterministic by mocking model download/inference

The test currently performs a real HuggingFace download and actual ONNX inference, which is brittle in CI/offline environments. Mock hf_hub_download to a local test artifact and mock onnxruntime.InferenceSession.run to return plausible outputs.

Example (concise):

from unittest.mock import MagicMock, patch

@patch("validator.validation_runner.FedLedger")
@patch("requests.get")
@patch("validator.modules.onnx.hf_hub_download", return_value="dummy.onnx")
@patch("validator.modules.onnx.ort.InferenceSession")
def test_onnx_validation_works(mock_sess, mock_hf, mock_requests, mock_fedledger):
    # Configure InferenceSession mock to output a vector matching input rows
    sess = MagicMock()
    input_mock = MagicMock()
    input_mock.name = "input_0"  # set .name explicitly; MagicMock(name=...) names the mock itself, not its .name attribute
    sess.get_inputs.return_value = [input_mock]
    # Suppose test_features has shape (N, F); return (N,1) predictions
    def run_side_effect(_, feed):
        x = next(iter(feed.values()))
        return [x[:, :1]]
    sess.run.side_effect = run_side_effect
    mock_sess.return_value = sess
    ...

Also applies to: 129-145

    @abstractmethod
    def __init__(self, config: BaseConfig, **kwargs):
        """
        """.

⚠️ Potential issue

Fix incomplete docstring

The docstring is malformed: it is empty and its closing triple quote is followed by a stray period (""".).

-        """.
+        """
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:
        """.

After:
        """
🤖 Prompt for AI Agents
In validator/modules/base.py around line 25 the module/class/function docstring
is incomplete and currently contains only '""'. Replace it with a complete,
descriptive triple-quoted docstring that briefly explains the purpose and
responsibilities of the module/class/function, any important
parameters/attributes or return values if applicable, and include a short
example or notes if helpful; ensure proper formatting ("""...""") and
punctuation so linters and documentation tools pick it up.

Comment on lines +97 to +100
        except Exception as e:
            print(f"Error loading model from {model_repo_id}: {e}")
            raise ValueError(f"Failed to load ONNX model: {e}")


🛠️ Refactor suggestion

Preserve original exception context with raise ... from e

Exception chaining will significantly improve debuggability and aligns with Ruff B904.

@@
-        except Exception as e:
-            print(f"Error loading model from {model_repo_id}: {e}")
-            raise ValueError(f"Failed to load ONNX model: {e}")
+        except Exception as e:
+            print(f"Error loading model from {model_repo_id}: {e}")
+            raise ValueError(f"Failed to load ONNX model: {e}") from e
@@
-        except Exception as e:
-            print(f"Error during inference: {e}")
-            raise ValueError(f"ONNX inference failed: {e}")
+        except Exception as e:
+            print(f"Error during inference: {e}")
+            raise ValueError(f"ONNX inference failed: {e}") from e
@@
-        except Exception as e:
-            if isinstance(e, InvalidTimeSeriesDataException):
-                raise e
-            else:
-                raise ValueError(f"Validation failed: {str(e)}")
+        except Exception as e:
+            if isinstance(e, InvalidTimeSeriesDataException):
+                raise e
+            else:
+                raise ValueError(f"Validation failed: {str(e)}") from e

Also applies to: 151-154, 273-278

🧰 Tools
🪛 Ruff (0.12.2)

99-99: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In validator/modules/onnx/__init__.py around lines 97-100 (and likewise update
occurrences at 151-154 and 273-278), the code catches exceptions and re-raises a
new ValueError without preserving the original exception context; change the
re-raise to use exception chaining (raise ValueError(f"Failed to load ONNX model:
{e}") from e) so the original exception is attached to the new one, ensuring
full traceback and better debuggability.

Comment on lines +107 to +109
            response = requests.get(data_url)
            response.raise_for_status()


🛠️ Refactor suggestion

Add HTTP timeouts and raise recoverable exceptions on CSV load failures

Avoid hanging on network calls and surface data issues as recoverable to align with the module’s error semantics.

@@
-            response = requests.get(data_url)
+            response = requests.get(data_url, timeout=30)
             response.raise_for_status()
@@
-        except Exception as e:
-            print(f"Error loading CSV data: {e}")
-            return None, None
+        except requests.RequestException as e:
+            # Network-related issues: recoverable after retry
+            raise InvalidTimeSeriesDataException(f"Error fetching CSV data from {data_url}: {e}") from e
+        except pd.errors.ParserError as e:
+            # Bad CSV: recoverable if data source is fixed
+            raise InvalidTimeSeriesDataException(f"Failed to parse CSV data from {data_url}: {e}") from e
+        except Exception as e:
+            raise InvalidTimeSeriesDataException(f"Unexpected error loading CSV data: {e}") from e

Also applies to: 130-133

🤖 Prompt for AI Agents
In validator/modules/onnx/__init__.py around lines 107-109 and 130-133, network
calls to requests.get should include a timeout and CSV parsing failures should
be converted into the module's recoverable exception type: add a reasonable
timeout argument to requests.get (e.g. timeout=10) and wrap the call in
try/except catching requests.Timeout and requests.RequestException to raise the
module's RecoverableError (or the existing recoverable exception class used by
this module) with a descriptive message; likewise wrap CSV load/parsing code in
try/except catching parsing errors (e.g. csv.Error, pandas errors, ValueError)
and raise the same RecoverableError so data issues are surfaced as recoverable
rather than causing an uncaught crash.

Comment on lines +19 to +20
    print(f"First few rows:")
    print(df.head())

⚠️ Potential issue

Remove unnecessary f-string prefix

Line 19 uses an f-string without any placeholders.

-    print(f"First few rows:")
+    print("First few rows:")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:
    print(f"First few rows:")
    print(df.head())

After:
    print("First few rows:")
    print(df.head())
🧰 Tools
🪛 Ruff (0.12.2)

19-19: f-string without any placeholders

Remove extraneous f prefix

(F541)

🤖 Prompt for AI Agents
In validator/modules/onnx/train_svm.py around lines 19 to 20, the first print
uses an f-string without placeholders; remove the unnecessary f prefix so it
becomes a normal string literal (i.e., change print(f"First few rows:") to
print("First few rows:")) and keep the following print(df.head()) unchanged.

Comment on lines +40 to +46
    for window in [3, 7, 14]:
        df[f"number_sold_rolling_{window}"] = (
            df.groupby(["store", "product"])["number_sold"]
            .rolling(window=window)
            .mean()
            .values
        )

⚠️ Potential issue

Potential alignment issues with rolling window calculations

Using .values after a grouped rolling operation can cause misalignment if the groups have different sizes or if the DataFrame order changes. Consider using .reset_index(drop=True) or .transform() for safer assignment.

     for window in [3, 7, 14]:
-        df[f"number_sold_rolling_{window}"] = (
-            df.groupby(["store", "product"])["number_sold"]
-            .rolling(window=window)
-            .mean()
-            .values
-        )
+        df[f"number_sold_rolling_{window}"] = df.groupby(["store", "product"])["number_sold"].transform(
+            lambda x: x.rolling(window=window, min_periods=1).mean()
+        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:
    for window in [3, 7, 14]:
        df[f"number_sold_rolling_{window}"] = (
            df.groupby(["store", "product"])["number_sold"]
            .rolling(window=window)
            .mean()
            .values
        )

After:
    for window in [3, 7, 14]:
        df[f"number_sold_rolling_{window}"] = df.groupby(["store", "product"])["number_sold"].transform(
            lambda x: x.rolling(window=window, min_periods=1).mean()
        )
🤖 Prompt for AI Agents
In validator/modules/onnx/train_svm.py around lines 40 to 46, the code uses
.values after a grouped rolling mean which can misalign results across groups;
replace that pattern with a group-aligned transform such as
df[f"number_sold_rolling_{window}"] =
df.groupby(["store","product"])["number_sold"].transform(lambda x:
x.rolling(window=window).mean()) so the rolling means stay aligned to the
original index and you avoid using .values.

Comment on lines +84 to +89
    if X.shape[0] > 100000:
        subset_size = 100000
        indices = np.random.choice(X.shape[0], subset_size, replace=False)
        X = X.iloc[indices]
        y = y.iloc[indices]
        print(f"Using {subset_size} samples for training")

🛠️ Refactor suggestion

Set random seed for reproducibility

The random sampling for large datasets should use a fixed seed to ensure reproducible results across different training runs.

     if X.shape[0] > 100000:
         subset_size = 100000
+        np.random.seed(42)  # Set seed for reproducibility
         indices = np.random.choice(X.shape[0], subset_size, replace=False)
         X = X.iloc[indices]
         y = y.iloc[indices]
         print(f"Using {subset_size} samples for training")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:
    if X.shape[0] > 100000:
        subset_size = 100000
        indices = np.random.choice(X.shape[0], subset_size, replace=False)
        X = X.iloc[indices]
        y = y.iloc[indices]
        print(f"Using {subset_size} samples for training")

After:
    if X.shape[0] > 100000:
        subset_size = 100000
        np.random.seed(42)  # Set seed for reproducibility
        indices = np.random.choice(X.shape[0], subset_size, replace=False)
        X = X.iloc[indices]
        y = y.iloc[indices]
        print(f"Using {subset_size} samples for training")
🤖 Prompt for AI Agents
In validator/modules/onnx/train_svm.py around lines 84 to 89 the random sampling
uses np.random.choice without a fixed seed which makes training
non-reproducible; modify the sampling to use a deterministic RNG (for example
create rng = np.random.default_rng(SEED) with a chosen constant SEED and call
rng.choice(X.shape[0], subset_size, replace=False)) or set np.random.seed(SEED)
immediately before np.random.choice, and apply the returned indices to X and y
as before, then keep the same print message.
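
Following the default_rng suggestion in the prompt above, a hedged sketch (the SEED constant and helper name are illustrative):

import numpy as np

SEED = 42  # illustrative constant; pick and document one value for the project

def sample_subset(X, y, max_rows: int = 100_000, seed: int = SEED):
    """Deterministically subsample large training sets using a local Generator."""
    if X.shape[0] <= max_rows:
        return X, y
    rng = np.random.default_rng(seed)
    indices = rng.choice(X.shape[0], size=max_rows, replace=False)
    return X.iloc[indices], y.iloc[indices]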

Comment on lines +58 to 64
        # Map task_id to module instance
        self.task_id_to_module: dict[str, BaseValidationModule] = {}
        for task_id in self.task_ids:
            config = load_config_for_task(task_id, self.module, module_cls.config_schema)
            self.module_config_to_module.setdefault(config, module_cls(config=config))
            self.task_id_to_module[task_id] = self.module_config_to_module[config]
            # Create a new module instance for each task_id
            self.task_id_to_module[task_id] = module_cls(config=config)


⚠️ Potential issue

Critical: model_validate_json misused on dict in run(); should use model_validate

The call in run() uses model_validate_json on a Python dict, which will error. It should be model_validate(merged_data), or serialize to JSON string first.

Here’s the minimal fix (outside this hunk):

# In run(), line where input_data is constructed:
# Before:
input_data = module_obj.input_data_schema.model_validate_json(merged_data)
# After:
input_data = module_obj.input_data_schema.model_validate(merged_data)

Optionally, if the upstream API returns a JSON string, then pass that to model_validate_json instead.

🤖 Prompt for AI Agents
In validator/validation_runner.py around lines 58 to 64, the code calls
model_validate_json on a Python dict which will raise an error; replace that
call with model_validate(merged_data) so the dict is validated directly (or, if
you actually have a JSON string from upstream, pass that string to
model_validate_json instead). Ensure the change is made where input_data is
constructed in run().

🛠️ Refactor suggestion

Normalize task_id keys to strings to avoid dict misses and type drift

self.task_ids is annotated as list[str], but tests pass ints. Normalize keys on both store and lookup to prevent subtle bugs.

-        # Map task_id to module instance
-        self.task_id_to_module: dict[str, BaseValidationModule] = {}
-        for task_id in self.task_ids:
-            config = load_config_for_task(task_id, self.module, module_cls.config_schema)
-            # Create a new module instance for each task_id
-            self.task_id_to_module[task_id] = module_cls(config=config)
+        # Map task_id to module instance
+        self.task_id_to_module: dict[str, BaseValidationModule] = {}
+        for task_id in self.task_ids:
+            config = load_config_for_task(task_id, self.module, module_cls.config_schema)
+            # Create a new module instance for each task_id
+            self.task_id_to_module[str(task_id)] = module_cls(config=config)

Outside this hunk, update lookups accordingly:

# perform_validation(...)
module_obj = self.task_id_to_module[str(task_id)]
# run()
module_obj = self.task_id_to_module[str(task_id)]
🤖 Prompt for AI Agents
In validator/validation_runner.py around lines 58 to 64, task_id keys are stored
using the original task_id type which can be int in tests and string elsewhere,
causing lookup misses; normalize task_id to str when creating the mapping and
update all lookups to use str(task_id) (e.g., in perform_validation and run) so
the dict always uses string keys and lookups consistently convert incoming
task_id values to strings.

⚠️ Potential issue

Fix invalid import of Base from unittest.mock (module import will fail)

There is no Base in unittest.mock. This will raise ImportError on import, breaking the runner.

Outside this hunk, remove or correct the import:

# Remove this invalid line near the top of the file:
# from unittest.mock import Base
🤖 Prompt for AI Agents
In validator/validation_runner.py near the top of the file (around lines 1–20)
there is an invalid import `from unittest.mock import Base` which does not exist
and will raise ImportError; remove that import line, and if a symbol named Base
is actually required elsewhere, replace it with the correct source (or define it
locally) and update usages accordingly so the module imports succeed.

Comment on lines +77 to +85
                if not self.test_mode:
                    sys.exit(1)
                return None
            except (RuntimeError, ValueError) as e:
                logger.error(e)
                self.api.mark_assignment_as_failed(assignment_id)
                sys.exit(1)
                if not self.test_mode:
                    sys.exit(1)
                return None

🛠️ Refactor suggestion

Unify retry behavior and failure marking across exception types

Currently (RuntimeError, ValueError) mark the assignment as failed immediately (even on first attempt) and return, while generic Exception retries up to 3 times. Align this to only mark as failed on the final attempt.

-            except (RuntimeError, ValueError) as e:
-                logger.error(e)
-                self.api.mark_assignment_as_failed(assignment_id)
-                if not self.test_mode:
-                    sys.exit(1)
-                return None
+            except (RuntimeError, ValueError) as e:
+                logger.error(e)
+                if attempt == 2:
+                    self.api.mark_assignment_as_failed(assignment_id)
+                    if not self.test_mode:
+                        sys.exit(1)
+                    return None
+                # Try again on next loop iteration
+                continue
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:
                if not self.test_mode:
                    sys.exit(1)
                return None
            except (RuntimeError, ValueError) as e:
                logger.error(e)
                self.api.mark_assignment_as_failed(assignment_id)
                sys.exit(1)
                if not self.test_mode:
                    sys.exit(1)
                return None

After:
                if not self.test_mode:
                    sys.exit(1)
                return None
            except (RuntimeError, ValueError) as e:
                logger.error(e)
                if attempt == 2:
                    self.api.mark_assignment_as_failed(assignment_id)
                    if not self.test_mode:
                        sys.exit(1)
                    return None
                # Try again on next loop iteration
                continue
🤖 Prompt for AI Agents
In validator/validation_runner.py around lines 77 to 85, the current except
block for (RuntimeError, ValueError) immediately marks the assignment as failed
and returns on the first hit while generic Exception retries up to 3 times;
change this so RuntimeError/ValueError follow the same retry loop and only call
self.api.mark_assignment_as_failed when the final attempt is exhausted.
Concretely: remove the immediate mark_assignment_as_failed and return from that
except branch, let the loop continue (or re-raise to be handled by the shared
retry logic), and when attempts reach the max retry count (the same check used
for generic Exception) call self.api.mark_assignment_as_failed(assignment_id)
and then, if not self.test_mode, call sys.exit(1) before returning None to keep
behavior consistent.
