Add pawn structure, bishop pair, and rook file evaluation#49
Enhance the evaluation function beyond pure PST with:
- Passed pawn bonus (tapered, increasing with rank advancement)
- Isolated pawn penalty (no friendly pawns on adjacent files)
- Doubled pawn penalty (multiple pawns on same file)
- Bishop pair bonus (+30 MG / +50 EG)
- Rook on open file bonus (+20 MG / +10 EG)
- Rook on semi-open file bonus (+10 MG / +5 EG)

All bonuses use tapered evaluation (midgame/endgame weights).

Benchmark at depth 4: nodes reduced 20.6% (4,760,507 → 3,778,411) because more accurate evaluation improves pruning decisions. Total search time reduced 14% (210s → 181s).
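The three pawn terms listed above can be illustrated with a small standalone sketch. It is not the code from this PR: the function name `pawn_structure_flags`, the plain-integer square encoding (`square = file + 8 * rank`), and the passed-pawn rule (no enemy pawn ahead on the same or an adjacent file) are assumptions chosen for illustration.

```python
def pawn_structure_flags(own_pawns, enemy_pawns, white=True):
    """Classify each pawn as isolated, doubled, and/or passed.

    Hypothetical helper. Squares are integers 0..63 with
    square = file + 8 * rank (a1 = 0, h8 = 63).
    """
    own_files = {sq % 8 for sq in own_pawns}
    per_file = {}
    for sq in own_pawns:
        per_file[sq % 8] = per_file.get(sq % 8, 0) + 1

    flags = {}
    for sq in own_pawns:
        f, r = sq % 8, sq // 8
        # Neighbouring files that actually exist on the board.
        adjacent = {f - 1, f + 1} & set(range(8))
        isolated = not (adjacent & own_files)
        doubled = per_file[f] > 1
        # Assumed rule: a pawn is passed when no enemy pawn stands ahead
        # of it on its own file or on an adjacent file.
        span = adjacent | {f}
        if white:
            blockers = [e for e in enemy_pawns if e % 8 in span and e // 8 > r]
        else:
            blockers = [e for e in enemy_pawns if e % 8 in span and e // 8 < r]
        flags[sq] = {"isolated": isolated, "doubled": doubled, "passed": not blockers}
    return flags


# White pawns on a2, a3, e5; black pawns on h7, d6.
flags = pawn_structure_flags({8, 16, 36}, {55, 43})
```

In this example the a-pawns are both doubled and isolated, while the e5 pawn is isolated but not passed because the black pawn on d6 guards its path.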
Benchmarks
The following benchmarks are available for this PR. Post a comment with the command to trigger a benchmark run:
- /run-nps-benchmark
- /run-stockfish-benchmark
Greptile Summary
This PR extends the PeSTO-based board evaluation in
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| moonfish/psqt.py | Adds structural evaluation terms (pawn structure, bishop pair, rook files) with tapered MG/EG weights. Contains unused FILE_SQUARES constant and a doubled pawn penalty that applies per-pawn rather than per-extra-pawn. |
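As a rough illustration of the bishop pair and rook file terms summarized in the table, the sketch below applies the weights quoted in the PR description. The function name `piece_bonuses`, its arguments, and the constant names are hypothetical and not taken from moonfish/psqt.py. It assumes an open file has no pawns of either colour and a semi-open file has no friendly pawn but at least one enemy pawn.

```python
# (MG, EG) weights as quoted in the PR description.
BISHOP_PAIR = (30, 50)
ROOK_OPEN_FILE = (20, 10)
ROOK_SEMI_OPEN_FILE = (10, 5)


def piece_bonuses(bishop_count, rook_squares, own_pawn_files, enemy_pawn_files):
    """Return (mg, eg) bonuses for one side.

    Hypothetical helper. Squares are integers 0..63 with
    file = square % 8; the pawn file arguments are sets of file indices.
    """
    mg = eg = 0
    if bishop_count >= 2:
        mg += BISHOP_PAIR[0]
        eg += BISHOP_PAIR[1]
    for sq in rook_squares:
        f = sq % 8
        if f in own_pawn_files:
            continue  # closed file for this rook: no bonus
        # No friendly pawn: semi-open if an enemy pawn remains, else open.
        bonus = ROOK_SEMI_OPEN_FILE if f in enemy_pawn_files else ROOK_OPEN_FILE
        mg += bonus[0]
        eg += bonus[1]
    return mg, eg


# Two bishops, rooks on a1 and d1, own pawn on the e-file, enemy pawn on the a-file.
total = piece_bonuses(2, [0, 3], {4}, {0})
```

Here the a1 rook earns the semi-open bonus and the d1 rook the open-file bonus, on top of the bishop pair.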
Flowchart

    flowchart TD
        A[board_evaluation called] --> B[get_phase from piece counts]
        B --> C[Iterate piece_map]
        C --> D[Accumulate MG/EG PST + piece values]
        C --> E[Collect pawn/rook positions, bishop counts]
        E --> F[Build pawn file sets & per-file counts]
        F --> G[White Pawn Structure]
        F --> H[Black Pawn Structure]
        G --> G1[Doubled pawn penalty]
        G --> G2[Isolated pawn penalty]
        G --> G3[Passed pawn bonus by rank]
        H --> H1[Doubled pawn penalty]
        H --> H2[Isolated pawn penalty]
        H --> H3[Passed pawn bonus mirrored rank]
        G1 & G2 & G3 & H1 & H2 & H3 --> I[Bishop Pair Bonus]
        I --> J[Rook Open/Semi-Open File Bonus]
        J --> K[Tapered eval: blend MG & EG by phase]
        K --> L[Return score relative to side to move]
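The last two steps of the flowchart (blend MG and EG by phase, then return the score from the side to move's point of view) might look roughly like the sketch below. It assumes the usual PeSTO convention of a phase value between 0 (pure endgame) and 24 (full middlegame material). The function name `tapered_score` and its signature are hypothetical, not the PR's actual code.

```python
MAX_PHASE = 24  # assumption: PeSTO-style phase, 24 = all minor/major pieces on board


def tapered_score(mg_white, eg_white, mg_black, eg_black, phase, white_to_move):
    """Blend midgame and endgame scores by game phase (hypothetical helper)."""
    phase = max(0, min(phase, MAX_PHASE))  # clamp, e.g. after early promotions
    mg = mg_white - mg_black
    eg = eg_white - eg_black
    blended = mg * phase + eg * (MAX_PHASE - phase)
    # int() truncates toward zero, so the result is symmetric for both colours.
    score = int(blended / MAX_PHASE)
    return score if white_to_move else -score


# A lone bishop-pair bonus (+30 MG / +50 EG) for White at three phases.
opening = tapered_score(30, 50, 0, 0, 24, True)
middle = tapered_score(30, 50, 0, 0, 12, True)
ending = tapered_score(30, 50, 0, 0, 0, True)
```

With these inputs the bonus moves from the MG weight at full material to the EG weight once the phase reaches zero, and the sign flips when Black is to move.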
Last reviewed commit: 36475d9
    FILE_SQUARES: tuple[set[int], ...] = tuple(
        {file + rank * 8 for rank in range(8)} for file in range(8)
    )
Unused FILE_SQUARES constant
FILE_SQUARES is defined here but never referenced anywhere in the codebase. This appears to be dead code left over from development. Consider removing it to keep the module clean.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
    # Doubled pawn: more than one pawn on same file
    if white_pawns_per_file[f] > 1:
        mg_white += MG_DOUBLED_PAWN
        eg_white += EG_DOUBLED_PAWN
Doubled pawn penalty scales per-pawn
The penalty is applied to each pawn on the file. Two pawns on the same file incur 2× MG_DOUBLED_PAWN total, and three pawns incur 3× penalty. A more standard approach is to penalize only the extra pawns (i.e., count - 1 times per file), so doubled pawns receive 1× penalty and tripled pawns receive 2×. The current approach over-penalizes relative to the weight values, since the effective doubled-pawn penalty for a standard doubled pawn pair is -20 MG / -30 EG rather than the -10 / -15 the constants suggest.
If this is intentional, consider renaming the constants (e.g. MG_DOUBLED_PAWN_PER_PAWN) or adding a comment to clarify. Otherwise, the fix would be to apply the penalty (count - 1) times per file rather than per pawn:
    for f, count in white_pawns_per_file.items():
        if count > 1:
            mg_white += MG_DOUBLED_PAWN * (count - 1)
            eg_white += EG_DOUBLED_PAWN * (count - 1)
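To make the difference between the two counting schemes concrete, here is a small standalone comparison. It uses the -10 MG / -15 EG values mentioned in the comment above. The two function names are hypothetical, and the per-pawn variant is a simplified model of the behaviour described, not a copy of the PR's loop.

```python
MG_DOUBLED_PAWN = -10  # values quoted in the review comment
EG_DOUBLED_PAWN = -15


def doubled_penalty_per_pawn(pawns_per_file):
    """Model of the reviewed behaviour: every pawn on a crowded file is penalized."""
    mg = eg = 0
    for count in pawns_per_file.values():
        if count > 1:
            mg += MG_DOUBLED_PAWN * count
            eg += EG_DOUBLED_PAWN * count
    return mg, eg


def doubled_penalty_per_extra_pawn(pawns_per_file):
    """Suggested alternative: only the pawns beyond the first are penalized."""
    mg = eg = 0
    for count in pawns_per_file.values():
        if count > 1:
            mg += MG_DOUBLED_PAWN * (count - 1)
            eg += EG_DOUBLED_PAWN * (count - 1)
    return mg, eg


doubled = {2: 2}         # two pawns on the c-file
tripled = {2: 3, 5: 1}   # three pawns on the c-file, one on the f-file
```

For a single doubled pair, the per-pawn model yields twice the penalty of the per-extra-pawn model, which matches the -20 / -30 versus -10 / -15 figures in the comment.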
🔬 Stockfish Benchmark Results
vs Stockfish Skill Level 3
Non-checkmate endings:
vs Stockfish Skill Level 4
Non-checkmate endings:
vs Stockfish Skill Level 5
Non-checkmate endings:
Configuration
⚡ NPS Benchmark Results
Per-position breakdown
Summary
All terms use tapered evaluation (midgame/endgame weights) consistent with the existing PeSTO framework.
Benchmark results (depth 4)
Better evaluation improves pruning decisions, reducing total nodes despite slightly slower per-node evaluation.
Local Stockfish Benchmark
Settings: 20 games, Stockfish skill 3, 10s/move, no opening book.
Use /run-stockfish-benchmark for CI validation with opening book and longer time control.
Test plan
- /run-nps-benchmark
- /run-stockfish-benchmark