
Add pawn structure, bishop pair, and rook file evaluation #49

Open: luccabb wants to merge 1 commit into master from improve/evaluation-enhancements

Conversation

@luccabb (Owner) commented Feb 16, 2026

Summary

  • Add passed pawn bonus (tapered MG/EG, increasing with rank advancement)
  • Add isolated pawn penalty (no friendly pawns on adjacent files)
  • Add doubled pawn penalty (multiple pawns on same file)
  • Add bishop pair bonus (+30 MG / +50 EG)
  • Add rook on open file bonus (+20 MG / +10 EG)
  • Add rook on semi-open file bonus (+10 MG / +5 EG)

All terms use tapered evaluation (midgame/endgame weights) consistent with the existing PeSTO framework.
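As a rough illustration of what "tapered" means here, the sketch below blends a midgame and an endgame score by game phase. The `taper` helper and the 24-point phase scale are assumptions based on the usual PeSTO convention, not code taken from this PR; the bishop pair weights are the ones listed in the summary.

```python
MAX_PHASE = 24  # assumption: PeSTO-style phase scale, 24 = full set of pieces on the board


def taper(mg_score: int, eg_score: int, phase: int) -> int:
    """Blend midgame and endgame scores linearly by game phase (hypothetical helper)."""
    phase = max(0, min(phase, MAX_PHASE))  # clamp, e.g. after early promotions
    return (mg_score * phase + eg_score * (MAX_PHASE - phase)) // MAX_PHASE


BISHOP_PAIR = (30, 50)  # (MG, EG) weights from the PR summary

print(taper(*BISHOP_PAIR, phase=24))  # pure midgame weight: 30
print(taper(*BISHOP_PAIR, phase=12))  # halfway between the two: 40
print(taper(*BISHOP_PAIR, phase=0))   # pure endgame weight: 50
```

Accumulating separate MG and EG totals and blending once at the end keeps every new term a simple pair of additions.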

Benchmark results (depth 4)

| Metric      | Before    | After     | Change |
|-------------|-----------|-----------|--------|
| Total nodes | 4,760,507 | 3,778,411 | −20.6% |
| NPS         | 22,634    | 20,903    | −7.6%  |
| Total time  | 210.32s   | 180.75s   | −14.1% |

Better evaluation improves pruning decisions, reducing total nodes despite slightly slower per-node evaluation.

Local Stockfish Benchmark

Settings: 20 games, Stockfish skill 3, 10s/move, no opening book.

| Build             | W  | L | D | Win Rate |
|-------------------|----|---|---|----------|
| Master (baseline) | 19 | 1 | 0 | 95%      |
| This PR           | 17 | 3 | 0 | 85%      |
Use /run-stockfish-benchmark for CI validation with opening book and longer time control.

Test plan

  • All alpha_beta unit tests pass (16/16)
  • /run-nps-benchmark
  • /run-stockfish-benchmark

Enhance the evaluation function beyond pure PST with:
- Passed pawn bonus (tapered, increasing with rank advancement)
- Isolated pawn penalty (no friendly pawns on adjacent files)
- Doubled pawn penalty (multiple pawns on same file)
- Bishop pair bonus (+30 MG / +50 EG)
- Rook on open file bonus (+20 MG / +10 EG)
- Rook on semi-open file bonus (+10 MG / +5 EG)

All bonuses use tapered evaluation (midgame/endgame weights).

Benchmark at depth 4: nodes reduced 20.6% (4,760,507 → 3,778,411)
due to more accurate evaluation improving pruning decisions.
Total search time reduced 14% (210s → 181s).

@github-actions

Benchmarks

The following benchmarks are available for this PR:

| Command                  | Description                                  |
|--------------------------|----------------------------------------------|
| /run-nps-benchmark       | NPS speed benchmark (depth 5, 48 positions)  |
| /run-stockfish-benchmark | Stockfish strength benchmark (300 games)     |

Post a comment with the command to trigger a benchmark run.

@luccabb (Owner, Author) commented Feb 16, 2026

/run-nps-benchmark

@luccabb (Owner, Author) commented Feb 16, 2026

/run-stockfish-benchmark

@greptile-apps (bot) commented Feb 16, 2026

Greptile Summary

This PR extends the PeSTO-based board evaluation in moonfish/psqt.py with pawn structure analysis (passed, isolated, and doubled pawns), a bishop pair bonus, and rook open/semi-open file bonuses. All new terms use tapered midgame/endgame weights consistent with the existing evaluation framework. The changes are well-structured and localized to a single file.

  • Pawn structure: Passed pawn bonus (rank-dependent), isolated pawn penalty, and doubled pawn penalty added to both white and black evaluation
  • Bishop pair: +30 MG / +50 EG bonus when a side has 2+ bishops
  • Rook file evaluation: Open file (+20/+10) and semi-open file (+10/+5) bonuses
  • Dead code: FILE_SQUARES constant is defined but never used — can be removed
  • Doubled pawn accounting: Penalty is applied per-pawn rather than per-extra-pawn, so the effective penalty for a standard doubled pair is 2× the constant values
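The three pawn terms listed above can be sketched without any chess library. This is a minimal illustration, assuming squares are numbered 0..63 with `file = sq % 8` and `rank = sq // 8` (the same layout as the `file + rank * 8` expression in the review below); `pawn_structure` is a hypothetical helper, not the function in moonfish/psqt.py.

```python
def pawn_structure(own: set[int], enemy: set[int], white: bool = True) -> dict[str, list[int]]:
    """Classify each pawn in `own` as doubled, isolated and/or passed."""
    own_files = [sq % 8 for sq in own]
    result: dict[str, list[int]] = {"doubled": [], "isolated": [], "passed": []}
    for sq in sorted(own):
        f, r = sq % 8, sq // 8
        # Doubled: another friendly pawn shares this file (listed per pawn).
        if own_files.count(f) > 1:
            result["doubled"].append(sq)
        # Isolated: no friendly pawn on either adjacent file.
        if (f - 1) not in own_files and (f + 1) not in own_files:
            result["isolated"].append(sq)
        # Passed: no enemy pawn ahead of it on the same or an adjacent file.
        blocked = any(
            abs(e % 8 - f) <= 1 and ((e // 8 > r) if white else (e // 8 < r))
            for e in enemy
        )
        if not blocked:
            result["passed"].append(sq)
    return result


# White pawns on a2 (8), a3 (16), e5 (36); one black pawn on d7 (51).
report = pawn_structure({8, 16, 36}, {51})
print(report)
```

In this position both a-pawns come out doubled, isolated and passed, while the e5 pawn is isolated but not passed because the d7 pawn guards its path.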

Confidence Score: 4/5

  • This PR is safe to merge — no functional bugs, and the evaluation changes are well-integrated with the existing tapered eval framework.
  • The code is logically correct and consistent with the existing architecture. The two issues found are minor: dead code (FILE_SQUARES) and a doubled-pawn penalty accounting choice that may not match the author's intent. No runtime errors or security concerns.
  • No files require special attention — all changes are confined to moonfish/psqt.py and are straightforward evaluation additions.

Important Files Changed

| Filename         | Overview |
|------------------|----------|
| moonfish/psqt.py | Adds structural evaluation terms (pawn structure, bishop pair, rook files) with tapered MG/EG weights. Contains unused FILE_SQUARES constant and a doubled pawn penalty that applies per-pawn rather than per-extra-pawn. |

Flowchart

flowchart TD
    A[board_evaluation called] --> B[get_phase from piece counts]
    B --> C[Iterate piece_map]
    C --> D[Accumulate MG/EG PST + piece values]
    C --> E[Collect pawn/rook positions, bishop counts]
    E --> F[Build pawn file sets & per-file counts]
    F --> G[White Pawn Structure]
    F --> H[Black Pawn Structure]
    G --> G1[Doubled pawn penalty]
    G --> G2[Isolated pawn penalty]
    G --> G3[Passed pawn bonus by rank]
    H --> H1[Doubled pawn penalty]
    H --> H2[Isolated pawn penalty]
    H --> H3[Passed pawn bonus mirrored rank]
    G1 & G2 & G3 & H1 & H2 & H3 --> I[Bishop Pair Bonus]
    I --> J[Rook Open/Semi-Open File Bonus]
    J --> K[Tapered eval: blend MG & EG by phase]
    K --> L[Return score relative to side to move]
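The rook step in the flowchart above could look roughly like the sketch below. It assumes the usual definitions (open file: no pawns of either colour; semi-open: no friendly pawn but at least one enemy pawn) and uses the weights from the PR summary. `rook_file_bonus` and its arguments are hypothetical names, not the PR's code.

```python
ROOK_OPEN_FILE = (20, 10)      # (MG, EG) from the PR summary
ROOK_SEMI_OPEN_FILE = (10, 5)  # (MG, EG) from the PR summary


def rook_file_bonus(
    rook_sq: int, own_pawn_files: set[int], enemy_pawn_files: set[int]
) -> tuple[int, int]:
    """Return the (MG, EG) bonus for a rook on square 0..63 (hypothetical helper)."""
    f = rook_sq % 8
    if f in own_pawn_files:
        return (0, 0)                # blocked by a friendly pawn: no bonus
    if f in enemy_pawn_files:
        return ROOK_SEMI_OPEN_FILE   # only enemy pawns on the file
    return ROOK_OPEN_FILE            # no pawns at all on the file


# Rook on e1 (square 4), own pawns on the a- and b-files.
print(rook_file_bonus(4, {0, 1}, {4}))    # semi-open: (10, 5)
print(rook_file_bonus(4, {0, 1}, set()))  # open: (20, 10)
```

Passing precomputed file sets keeps the check O(1) per rook, which matches the "build pawn file sets" step earlier in the flowchart.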

Last reviewed commit: 36475d9

@greptile-apps greptile-apps bot left a comment

1 file reviewed, 2 comments

Comment on lines +191 to +193
    FILE_SQUARES: tuple[set[int], ...] = tuple(
        {file + rank * 8 for rank in range(8)} for file in range(8)
    )

Unused FILE_SQUARES constant
FILE_SQUARES is defined here but never referenced anywhere in the codebase. This appears to be dead code left over from development. Consider removing it to keep the module clean.

Suggested change (remove these lines):

    FILE_SQUARES: tuple[set[int], ...] = tuple(
        {file + rank * 8 for rank in range(8)} for file in range(8)
    )

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +377 to +380
    # Doubled pawn: more than one pawn on same file
    if white_pawns_per_file[f] > 1:
        mg_white += MG_DOUBLED_PAWN
        eg_white += EG_DOUBLED_PAWN

Doubled pawn penalty scales per-pawn
The penalty is applied to each pawn on the file. Two pawns on the same file incur 2× MG_DOUBLED_PAWN total, and three pawns incur 3× penalty. A more standard approach is to penalize only the extra pawns (i.e., count - 1 times per file), so doubled pawns receive 1× penalty and tripled pawns receive 2×. The current approach over-penalizes relative to the weight values, since the effective doubled-pawn penalty for a standard doubled pawn pair is -20 MG / -30 EG rather than the -10 / -15 the constants suggest.

If this is intentional, consider renaming the constants (e.g. MG_DOUBLED_PAWN_PER_PAWN) or adding a comment to clarify. Otherwise, the fix would be to apply the penalty (count - 1) times per file rather than per pawn:

for f, count in white_pawns_per_file.items():
    if count > 1:
        mg_white += MG_DOUBLED_PAWN * (count - 1)
        eg_white += EG_DOUBLED_PAWN * (count - 1)
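To make the difference concrete, here is a small standalone comparison of the two accounting schemes. It uses the -10 / -15 constants quoted in this comment; `doubled_penalty` and its `per_extra` flag are hypothetical names for illustration only.

```python
from collections import Counter

MG_DOUBLED_PAWN, EG_DOUBLED_PAWN = -10, -15  # values quoted in the review comment


def doubled_penalty(pawn_files: list[int], per_extra: bool) -> tuple[int, int]:
    """Total (MG, EG) doubled-pawn penalty for one side's pawn files."""
    mg = eg = 0
    for count in Counter(pawn_files).values():
        if count > 1:
            # per_extra=True charges only the surplus pawns on the file;
            # per_extra=False charges every pawn on the file (current PR behaviour).
            times = count - 1 if per_extra else count
            mg += MG_DOUBLED_PAWN * times
            eg += EG_DOUBLED_PAWN * times
    return mg, eg


print(doubled_penalty([0, 0], per_extra=False))    # (-20, -30)
print(doubled_penalty([0, 0], per_extra=True))     # (-10, -15)
print(doubled_penalty([2, 2, 2], per_extra=True))  # (-20, -30)
```

Either scheme can be tuned to play the same way by halving or doubling the constants; the point is only that the constant names should match what is actually charged.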

@github-actions

🔬 Stockfish Benchmark Results

vs Stockfish Skill Level 3

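| Metric   | Wins | Losses | Draws | Total | Win % |
|----------|------|--------|-------|-------|-------|
| Overall  | 23   | 71     | 6     | 100   | 23.0% |
| As White | 12   | 34     | 4     | 50    | 24.0% |
| As Black | 11   | 37     | 2     | 50    | 22.0% |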

Non-checkmate endings:

  • Draw by 3-fold repetition: 4

vs Stockfish Skill Level 4

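| Metric   | Wins | Losses | Draws | Total | Win % |
|----------|------|--------|-------|-------|-------|
| Overall  | 22   | 72     | 6     | 100   | 22.0% |
| As White | 10   | 38     | 2     | 50    | 20.0% |
| As Black | 12   | 34     | 4     | 50    | 24.0% |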

Non-checkmate endings:

  • Draw by 3-fold repetition: 5

vs Stockfish Skill Level 5

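| Metric   | Wins | Losses | Draws | Total | Win % |
|----------|------|--------|-------|-------|-------|
| Overall  | 12   | 80     | 8     | 100   | 12.0% |
| As White | 7    | 37     | 6     | 50    | 14.0% |
| As Black | 5    | 43     | 2     | 50    | 10.0% |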

Non-checkmate endings:

  • Draw by 3-fold repetition: 6

Configuration

  • 5 chunks × 20 rounds × 3 skill levels = 300 total games
  • Each opening played with colors reversed (-repeat) for fairness
  • Moonfish: 60s per move
  • Stockfish: 60+5 time control
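The Win % column counts wins only. For a rough strength reading, the results above can be converted to a score fraction (draws count half) and then to an Elo difference with the standard logistic formula. This is a back-of-the-envelope sketch, not part of the CI workflow; the helper names are hypothetical, and with 100 games per level the uncertainty is large.

```python
import math


def score_fraction(wins: int, losses: int, draws: int) -> float:
    """Match score in [0, 1], counting each draw as half a point."""
    return (wins + 0.5 * draws) / (wins + losses + draws)


def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction (logistic model, 0 < score < 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


# (wins, losses, draws) per Stockfish skill level, from the tables above.
results = {3: (23, 71, 6), 4: (22, 72, 6), 5: (12, 80, 8)}

for skill, (w, l, d) in results.items():
    s = score_fraction(w, l, d)
    print(f"skill {skill}: score={s:.2f}, approx. Elo diff={elo_diff(s):+.0f}")
```

These figures are relative to Stockfish at each skill level; comparing this PR against master would need the same run on the baseline.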

@github-actions

⚡ NPS Benchmark Results

| Metric       | Value    |
|--------------|----------|
| Depth        | 5        |
| Positions    | 48       |
| Total nodes  | 20397618 |
| Total time   | 2655.15s |
| Nodes/second | 7682     |

Node count is the primary signal — it's deterministic and catches search behavior changes. If the node count changes, the PR changed search behavior. NPS is informational only (CI runner performance varies).

Per-position breakdown
Position  1/48: nodes=188157     time=19.07s  nps=9864
Position  2/48: nodes=643670     time=81.35s  nps=7912
Position  3/48: nodes=17501      time=1.32s  nps=13279
Position  4/48: nodes=782855     time=103.92s  nps=7533
Position  5/48: nodes=69860      time=10.06s  nps=6942
Position  6/48: nodes=590170     time=79.48s  nps=7425
Position  7/48: nodes=409629     time=59.50s  nps=6884
Position  8/48: nodes=287783     time=28.81s  nps=9988
Position  9/48: nodes=1240798    time=136.53s  nps=9088
Position 10/48: nodes=451359     time=45.09s  nps=10009
Position 11/48: nodes=537456     time=81.16s  nps=6622
Position 12/48: nodes=1004042    time=155.88s  nps=6441
Position 13/48: nodes=685196     time=94.31s  nps=7265
Position 14/48: nodes=855349     time=109.05s  nps=7843
Position 15/48: nodes=498231     time=61.59s  nps=8089
Position 16/48: nodes=335937     time=35.82s  nps=9378
Position 17/48: nodes=11148      time=0.87s  nps=12762
Position 18/48: nodes=14825      time=0.83s  nps=17770
Position 19/48: nodes=38765      time=3.58s  nps=10815
Position 20/48: nodes=101407     time=7.78s  nps=13035
Position 21/48: nodes=5916       time=0.32s  nps=18329
Position 22/48: nodes=876        time=0.05s  nps=18181
Position 23/48: nodes=11779      time=0.69s  nps=17095
Position 24/48: nodes=38093      time=3.99s  nps=9554
Position 25/48: nodes=7990       time=0.52s  nps=15238
Position 26/48: nodes=65456      time=5.75s  nps=11382
Position 27/48: nodes=110248     time=10.38s  nps=10618
Position 28/48: nodes=334154     time=36.17s  nps=9239
Position 29/48: nodes=337430     time=44.18s  nps=7637
Position 30/48: nodes=3870       time=0.32s  nps=12119
Position 31/48: nodes=2053670    time=257.51s  nps=7974
Position 32/48: nodes=872695     time=100.58s  nps=8676
Position 33/48: nodes=2018895    time=391.20s  nps=5160
Position 34/48: nodes=1256537    time=210.66s  nps=5964
Position 35/48: nodes=437908     time=45.76s  nps=9568
Position 36/48: nodes=2178556    time=246.41s  nps=8841
Position 37/48: nodes=1224098    time=125.36s  nps=9764
Position 38/48: nodes=14373      time=0.66s  nps=21742
Position 39/48: nodes=8191       time=0.32s  nps=25364
Position 40/48: nodes=22879      time=0.39s  nps=59181
Position 41/48: nodes=189439     time=14.96s  nps=12660
Position 42/48: nodes=66708      time=6.40s  nps=10426
Position 43/48: nodes=41432      time=2.46s  nps=16845
Position 44/48: nodes=158011     time=14.63s  nps=10803
Position 45/48: nodes=99717      time=11.70s  nps=8524
Position 46/48: nodes=74559      time=7.77s  nps=9593
Position 47/48: nodes=0          time=0.00s  nps=0  (terminal)
Position 48/48: nodes=0          time=0.00s  nps=0  (terminal)
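Since node count is the signal to compare between runs, a small parser for the per-position lines can help diff two benchmark comments. The sketch below assumes the line format shown above; `summarize` is a hypothetical helper, and the sample is three lines copied from this run.

```python
import re

# Matches lines such as "Position 21/48: nodes=5916       time=0.32s  nps=18329".
LINE_RE = re.compile(
    r"Position\s+(\d+)/(\d+):\s+nodes=(\d+)\s+time=([\d.]+)s\s+nps=(\d+)"
)


def summarize(log: str) -> tuple[int, float, int]:
    """Return (total nodes, total seconds, overall NPS) for a benchmark log."""
    nodes = 0
    seconds = 0.0
    for match in LINE_RE.finditer(log):
        nodes += int(match.group(3))
        seconds += float(match.group(4))
    nps = round(nodes / seconds) if seconds else 0  # terminal-only logs have zero time
    return nodes, seconds, nps


sample = """\
Position 21/48: nodes=5916       time=0.32s  nps=18329
Position 22/48: nodes=876        time=0.05s  nps=18181
Position 47/48: nodes=0          time=0.00s  nps=0  (terminal)
"""

total_nodes, total_seconds, overall_nps = summarize(sample)
print(total_nodes, f"{total_seconds:.2f}", overall_nps)
```

Overall NPS is computed from the summed nodes and time rather than by averaging the per-position NPS values, which would overweight the tiny positions.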
