
feat: Add XGBoost model implementation#23

Open
tamilatiurina wants to merge 1 commit intomainfrom
feat/xgboost-logic

Conversation

@tamilatiurina (Collaborator)

  1. Added the dependencies the models need to pyproject.toml
  2. Added models/model_abstract and models/model_xgboost, where the XGBoost implementation follows the base abstract class's methods
  3. Added tests/models/test_models_xgboost covering the train, finetune, and predict methods of the XGBoost model

Copilot AI (Contributor) left a comment

Pull request overview

Adds an XGBoost-backed model implementation under alphapulse.models, wires up a new optional dependency extra (models), and introduces tests intended to validate train/finetune/predict behavior.

Changes:

  • Add ModelAbstract and ModelXgboost implementations under src/alphapulse/models/.
  • Add a new models optional dependency group (and lockfile updates) including xgboost.
  • Add pytest coverage for XGBoost training/finetuning/prediction.

Reviewed changes

Copilot reviewed 5 out of 7 changed files in this pull request and generated 11 comments.

Summary per file:

  • pyproject.toml: Adds models optional dependencies for model-related packages.
  • uv.lock: Locks models extra deps, including xgboost and transitive packages.
  • src/alphapulse/models/model_abstract.py: Introduces a base abstract model interface.
  • src/alphapulse/models/model_xgboost.py: Adds an XGBoost model wrapper implementing the abstract interface.
  • src/alphapulse/models/__init__.py: Exposes model classes from the alphapulse.models package.
  • tests/models/test_models_xgboost.py: Adds tests for training, finetuning, and prediction output.


Comment on lines +11 to +27
ROOT = Path(__file__).parent.parent.parent
TRAIN_DATA_PATH = ROOT / "data" / "v5.2" / "train.parquet"
FEATURES_JSON_PATH = ROOT / "data" / "v5.2" / "features.json"
TEST_DATA_PATH = ROOT / "data" / "v5.2" / "live.parquet"


@pytest.fixture
def test_data() -> tuple[pd.DataFrame, list[str]]:
    """Load Numerai data"""
    with open(FEATURES_JSON_PATH, encoding="utf-8") as f:
        feature_metadata = json.load(f)
    feature_cols = feature_metadata["feature_sets"]["small"]
    target_cols = feature_metadata["targets"]
    train = pd.read_parquet(
        TRAIN_DATA_PATH, columns=["era"] + feature_cols + target_cols
    )
    return train, feature_cols
Copilot AI Feb 8, 2026

These tests depend on local Numerai dataset files under data/v5.2/*, but the repository doesn’t include a data/ directory. As written, the test suite will fail on a clean checkout/CI. Consider replacing this with a small synthetic DataFrame fixture (or add minimal test fixtures to tests/ and load them from there).
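A minimal synthetic builder along the lines the comment suggests; the column names, era layout, and feature count are illustrative assumptions, not the real Numerai schema:

```python
import numpy as np
import pandas as pd


def make_synthetic_data(
    n_eras: int = 10, rows_per_era: int = 10, n_features: int = 5
) -> tuple[pd.DataFrame, list[str]]:
    """Build a small in-memory frame shaped like the Numerai parquet files."""
    rng = np.random.default_rng(seed=0)
    feature_cols = [f"feature_{i}" for i in range(n_features)]
    n_rows = n_eras * rows_per_era
    # Integer-binned features, as in the Numerai datasets.
    frame = pd.DataFrame(
        rng.integers(0, 5, size=(n_rows, n_features)), columns=feature_cols
    )
    frame["era"] = np.repeat([f"era{i:02d}" for i in range(n_eras)], rows_per_era)
    frame["target"] = rng.uniform(0.0, 1.0, size=n_rows)
    return frame, feature_cols
```

A pytest fixture could then simply return `make_synthetic_data()`, removing the dependency on local `data/v5.2/*` files.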

    "subsample": 0.8,
    "colsample_bytree": 0.8,
    "lambda": 1,
    "alpha": 0,
Copilot AI Feb 8, 2026

xgb_params does not set an objective, but the test asserts predictions are in [0, 1]. With XGBoost defaults (regression), predictions are not guaranteed to be bounded, so this assertion can be flaky/incorrect. Either set an objective that guarantees bounds (e.g., logistic) or relax the assertion to properties that always hold (shape, finite values, etc.).

Suggested change
"alpha": 0,
"alpha": 0,
"objective": "binary:logistic",
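If the objective is instead left at the regression default, the test could assert only properties that hold for any output; a sketch (the `check_predictions` helper name is hypothetical, not from the PR):

```python
import numpy as np


def check_predictions(preds: np.ndarray, n_rows: int) -> None:
    """Assertions that hold regardless of the XGBoost objective."""
    assert preds.shape == (n_rows,)   # one prediction per input row
    assert np.isfinite(preds).all()   # no NaN or infinite values
    # Only assert a [0, 1] range when the objective bounds outputs,
    # e.g. "binary:logistic" or "reg:logistic".
```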

Comment on lines +14 to +24
raise NotImplementedError("Train method needs to be overriden")

@abstractmethod
def finetune(self, *_args: Any, **_kwargs: Any) -> xgb.Booster:
"""Finetune the trained model"""
raise NotImplementedError("Finetune method needs to be overriden")

@abstractmethod
def predict(self, *_args: Any, **_kwargs: Any) -> pd.Series:
"""Predict the result of the trained model"""
raise NotImplementedError("Predict method needs to be overriden")
Copilot AI Feb 8, 2026

Typos in error text: “overriden” should be “overridden” in these NotImplementedError messages.

Suggested change

    raise NotImplementedError("Train method needs to be overriden")

    @abstractmethod
    def finetune(self, *_args: Any, **_kwargs: Any) -> xgb.Booster:
        """Finetune the trained model"""
        raise NotImplementedError("Finetune method needs to be overriden")

    @abstractmethod
    def predict(self, *_args: Any, **_kwargs: Any) -> pd.Series:
        """Predict the result of the trained model"""
        raise NotImplementedError("Predict method needs to be overriden")

    raise NotImplementedError("Train method needs to be overridden")

    @abstractmethod
    def finetune(self, *_args: Any, **_kwargs: Any) -> xgb.Booster:
        """Finetune the trained model"""
        raise NotImplementedError("Finetune method needs to be overridden")

    @abstractmethod
    def predict(self, *_args: Any, **_kwargs: Any) -> pd.Series:
        """Predict the result of the trained model"""
        raise NotImplementedError("Predict method needs to be overridden")

Comment on lines +65 to +69
models = [
"xgboost",
"pandas",
"numpy",
]
Copilot AI Feb 8, 2026

The standard dev workflow (uv sync --extra dev / uv run mypy / uv run pytest) will not install the new models extra, but the code and tests unconditionally import xgboost. To keep CI/local dev green, either add xgboost to the dev extra (or core deps), or make the xgboost import/tests conditional on the extra being installed.
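One way to keep the default dev environment green, sketched under the assumption that the tests keep importing xgboost: detect whether the optional dependency is importable, and skip the module otherwise.

```python
import importlib.util


def has_models_extra() -> bool:
    """True when the optional xgboost dependency is importable."""
    return importlib.util.find_spec("xgboost") is not None


# At the top of tests/models/test_models_xgboost.py one could then write:
#   import pytest
#   if not has_models_extra():
#       pytest.skip("requires the 'models' extra", allow_module_level=True)
```

`pytest.importorskip("xgboost")` achieves the same effect in a single call.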

    **kwargs: Any,
) -> xgb.Booster:
    if self.model is None:
        raise RuntimeError("Train initial model")
Copilot AI Feb 8, 2026

The runtime error message here is ambiguous (“Train initial model”). Consider making it actionable, e.g., explicitly telling callers to call train() before finetune(), and ideally include the class/method name in the message for easier debugging.

Suggested change

        raise RuntimeError("Train initial model")

        raise RuntimeError(
            "ModelXgboost.finetune() requires an initial model. Call "
            "ModelXgboost.train() before finetune()."
        )

Comment on lines +4 to +13
import pandas as pd
import xgboost as xgb


class ModelAbstract(ABC):
    """Abstract class for all models"""

    @abstractmethod
    def train(self, *_args: Any, **_kwargs: Any) -> xgb.Booster:
        """Initial training the model"""
Copilot AI Feb 8, 2026

ModelAbstract imports xgboost and returns xgb.Booster, which makes the abstract base class (and any import of alphapulse.models) require the optional models extra. This will break uv run mypy / uv run pytest in the default dev environment where xgboost isn’t installed. Consider removing the hard dependency from the abstract layer (e.g., return Any/a protocol, or use a TYPE_CHECKING import and a forward reference) so the package can be imported/type-checked without the optional extra.
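A sketch of the `TYPE_CHECKING` approach the comment describes; only `train` is shown, and this is one possible shape rather than the PR's code:

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Seen only by type checkers, so importing this module does not
    # require the optional `models` extra at runtime.
    import xgboost as xgb


class ModelAbstract(ABC):
    """Abstract class for all models (finetune/predict are analogous)."""

    @abstractmethod
    def train(self, *_args: Any, **_kwargs: Any) -> xgb.Booster:
        """Initial training of the model"""
        raise NotImplementedError("Train method needs to be overridden")
```

With `from __future__ import annotations`, the `xgb.Booster` annotation is never evaluated at runtime, so the module imports cleanly without xgboost installed.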

Comment on lines +1 to +4
from .model_abstract import ModelAbstract
from .model_xgboost import ModelXgboost

__all__ = ["ModelAbstract", "ModelXgboost"]
Copilot AI Feb 8, 2026

This package __init__ eagerly imports ModelXgboost, which will raise ModuleNotFoundError for users who install the base package without the models extra (because xgboost is optional). Consider making alphapulse.models safe to import without xgboost (lazy/conditional import, or avoid exporting ModelXgboost at package import time).

Comment on lines +67 to 72
"pandas",
"numpy",
]



Copilot AI Feb 8, 2026

The new models extra includes pandas, but pandas is already a core dependency. Keeping duplicates makes dependency intent unclear; consider limiting this extra to only what’s actually optional (likely just xgboost, and possibly numpy if not already required elsewhere).

Suggested change

    "pandas",
    "numpy",
]

    "numpy",
]

    """Abstract class for all models"""

    @abstractmethod
    def train(self, *_args: Any, **_kwargs: Any) -> xgb.Booster:
Copilot AI Feb 8, 2026

Overridden method signature does not match calls that pass arguments named 'params' and 'num_boost_round'. The overriding method ModelXgboost.train matches those calls.

Suggested change

    def train(self, *_args: Any, **_kwargs: Any) -> xgb.Booster:

    def train(
        self,
        *args: Any,
        params: Any = None,
        num_boost_round: Any = None,
        **kwargs: Any,
    ) -> xgb.Booster:

        raise NotImplementedError("Train method needs to be overriden")

    @abstractmethod
    def finetune(self, *_args: Any, **_kwargs: Any) -> xgb.Booster:
Copilot AI Feb 8, 2026

Overridden method signature does not match calls that pass arguments named 'params' and 'num_boost_round'. The overriding method ModelXgboost.finetune matches those calls.

Suggested change

    def finetune(self, *_args: Any, **_kwargs: Any) -> xgb.Booster:

    def finetune(
        self,
        params: Any = None,
        num_boost_round: int | None = None,
        *_args: Any,
        **_kwargs: Any,
    ) -> xgb.Booster:

