Conversation
Introduce a pluggable evaluation architecture that supports multiple
model backends for position assessment:
- Evaluator protocol (evaluation/base.py): Defines the interface any
evaluator must implement (evaluate + reset methods)
- ClassicalEvaluator (evaluation/classical.py): Wraps existing PeSTO
evaluation as an Evaluator implementation
- NNEvaluator (evaluation/nn.py): Framework for neural network models
supporting ONNX Runtime, PyTorch, and custom callables (e.g., LLMs)
- NNEngine (engines/nn_engine.py): Alpha-beta search engine with a
pluggable evaluator instead of hardcoded PeSTO
- Board encoding: 773-feature vector (12 bitboard planes + metadata)
Usage examples:
# ONNX model
moonfish --algorithm nn --nn-model-path model.onnx
# Custom LLM evaluator (Python API)
evaluator = NNEvaluator(eval_fn=my_llm_function)
engine = NNEngine(config, evaluator=evaluator)
# Subclass for custom logic
class MyEval(NNEvaluator):
def _raw_evaluate(self, board): ...
BenchmarksThe following benchmarks are available for this PR:
Post a comment with the command to trigger a benchmark run. |
Greptile SummaryThis PR introduces a pluggable evaluation architecture allowing neural networks, LLMs, or custom models to be used for position assessment alongside the existing PeSTO evaluation.
Confidence Score: 4/5
|
| Filename | Overview |
|---|---|
| moonfish/engines/nn_engine.py | New engine subclass overriding eval_board with pluggable evaluator. The Syzygy exception handling is broader than the parent class, which could mask real errors. |
| moonfish/evaluation/init.py | Clean package init exporting the three evaluation components. No issues. |
| moonfish/evaluation/base.py | Well-defined Evaluator protocol with evaluate() and reset() methods. Clean and follows existing project patterns. |
| moonfish/evaluation/classical.py | Thin wrapper around existing PeSTO evaluation. Correctly delegates to board_evaluation and clears the global cache on reset. |
| moonfish/evaluation/nn.py | Core NN evaluator with ONNX/PyTorch/callable backends. Module-level docstring mentions en passant and halfmove clock metadata but the implementation only encodes side-to-move and castling rights. |
| moonfish/helper.py | Factory function extended with nn algorithm support. Clean integration with lazy import of NNEvaluator. |
| moonfish/main.py | Added --nn-model-path CLI option and wired it through to Config. Straightforward addition. |
| moonfish/config.py | Added nn_model_path optional field with None default. No issues. |
Class Diagram
classDiagram
class Evaluator {
<<Protocol>>
+evaluate(board: Board) float
+reset() None
}
class ClassicalEvaluator {
+evaluate(board: Board) float
+reset() None
}
class NNEvaluator {
-_eval_fn: Callable
-_board_encoder: Callable
-_model
-_backend: str
+from_file(model_path: str) NNEvaluator
+evaluate(board: Board) float
+reset() None
-_raw_evaluate(board: Board) float
-_load_model(model_path: str) None
-_load_onnx(model_path: str) None
-_load_pytorch(model_path: str) None
}
class ChessEngine {
<<Protocol>>
+search_move(board: Board) Move
}
class AlphaBeta {
+eval_board(board: Board) float
+search_move(board: Board) Move
+negamax() tuple
+quiescence_search() float
}
class NNEngine {
+evaluator: Evaluator
+eval_board(board: Board) float
+search_move(board: Board) Move
}
Evaluator <|.. ClassicalEvaluator
Evaluator <|.. NNEvaluator
ChessEngine <|.. AlphaBeta
AlphaBeta <|-- NNEngine
NNEngine --> Evaluator : uses
Last reviewed commit: fb66718
| except Exception: | ||
| pass |
There was a problem hiding this comment.
Overly broad exception handler
The base AlphaBeta.eval_board catches (chess.syzygy.MissingTableError, KeyError) specifically, but here a bare except Exception is used. This could silently swallow real errors (I/O failures, corrupted tablebase data, etc.) and fall through to the NN evaluator without any indication that something went wrong. Consider matching the parent class's specific exception types.
| except Exception: | |
| pass | |
| except (chess.syzygy.MissingTableError, KeyError): | |
| pass |
| - Metadata: side to move, castling rights, en passant, halfmove clock | ||
| - Total input size: 12*64 + 5 = 773 floats |
There was a problem hiding this comment.
Module docstring doesn't match implementation
The module-level docstring states "Metadata: side to move, castling rights, en passant, halfmove clock" and "Total input size: 12*64 + 5 = 773 floats", but the actual board_to_tensor implementation only encodes side-to-move (1 feature) and castling rights (4 features) — no en passant square and no halfmove clock. The function-level docstring is correct, but this module header could mislead users who rely on it to understand the encoding format.
Summary
Introduce a pluggable evaluation architecture that makes it easy to use neural networks, LLMs, or any custom model for position assessment:
Evaluatorprotocol (evaluation/base.py): Defines the interface any evaluator must implementClassicalEvaluator(evaluation/classical.py): Wraps existing PeSTO evaluationNNEvaluator(evaluation/nn.py): Framework supporting ONNX Runtime, PyTorch, and custom callablesNNEngine(engines/nn_engine.py): Alpha-beta search with pluggable evaluationUsage examples
Architecture
Test plan
get_engine)