psychlingbench is a lightweight, extensible benchmarking toolkit designed to evaluate how well language models align with human word-by-word processing behavior, particularly during natural reading.
Rather than scoring language models by perplexity, BLEU, or task accuracy, psychlingbench evaluates them directly against behavioral measures of human time-locked processing:
- First-pass fixation durations (eye-tracking)
- Reaction times
- EEG/ERP responses (future versions)
The benchmark scores models with a noise-normalized Pearson R² between model predictions and human data (the squared correlation, scaled by an estimate of the explainable variance given measurement noise), making scores directly comparable across datasets and measurement types.
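As an illustration of the metric, the sketch below computes a squared Pearson correlation and divides it by a noise ceiling. The ceiling value and the function name are placeholders for this example, not the toolkit's actual API or estimation procedure:

```python
import numpy as np

def noise_normalized_r2(predictions, human_means, noise_ceiling_r2):
    """Squared Pearson correlation, normalized by a noise ceiling.

    `noise_ceiling_r2` is the maximum explainable variance given
    measurement noise (e.g., estimated from split-half reliability).
    """
    r = np.corrcoef(predictions, human_means)[0, 1]
    return (r ** 2) / noise_ceiling_r2

# Hypothetical per-word first-pass fixation durations (ms)
preds = np.array([210.0, 250.0, 190.0, 300.0, 220.0])
human = np.array([205.0, 260.0, 200.0, 310.0, 215.0])
score = noise_normalized_r2(preds, human, noise_ceiling_r2=0.8)
```

Because the score is normalized by the ceiling rather than by 1.0, a model can in principle exceed 1 if it fits the sample data better than the noise estimate allows.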
```bash
# Basic installation
pip install psychlingbench

# With baseline model support (requires PyTorch)
pip install "psychlingbench[baselines]"

# Development installation
pip install -e ".[dev,baselines]"
```

```python
from psychlingbench.benchmarks.fix10k import Fix10kBenchmark
from psychlingbench.baselines.gpt2_surprisal import predict_fix_ms

# Initialize the benchmark
benchmark = Fix10kBenchmark()

# Evaluate a model
results = benchmark.evaluate(
    model=None,  # the model is encapsulated in the predict function
    predict_fixation_function=predict_fix_ms,
    model_name="gpt2_surprisal",
)

print(f"Score: {results['metrics']['noise_normalized']:.3f}")
```

```bash
# Download benchmark data
psychlingbench download fix10k

# Run evaluation with the GPT-2 surprisal baseline
psychlingbench eval gpt2_surprisal fix10k

# Launch visualization dashboard
psychlingbench view
```

- fix10k: First-pass fixation durations from eye-tracking data (v0.1)
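Custom models plug in through the predict function. Its exact signature isn't documented above, so the sketch below assumes it maps a sequence of words to per-word predicted first-pass fixation durations in milliseconds; the word-frequency baseline and its names are hypothetical, not part of the package:

```python
# Hypothetical log-frequency table; a real baseline would use corpus counts.
WORD_FREQ = {"the": 6.0, "cat": 3.5, "sat": 3.0, "mat": 2.8}

def predict_fix_ms_freq(words):
    """Toy baseline: rarer words get longer predicted fixations."""
    default_logfreq = 2.0  # assumed floor for out-of-vocabulary words
    return [
        180.0 + 25.0 * (7.0 - WORD_FREQ.get(w.lower(), default_logfreq))
        for w in words
    ]

preds = predict_fix_ms_freq(["The", "cat", "sat"])
```

Under that assumed interface, such a function would be passed as `predict_fixation_function=predict_fix_ms_freq` to `benchmark.evaluate`.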
| Version | Adds | Purpose |
|---|---|---|
| v0.1 | fix10k (first-pass fixations) | Core reading behavior test |
| v0.2 | RT benchmark (e.g., ELP) | Tests reaction time prediction |
| v0.3 | EEG/ERP signals (e.g., N400) | Adds neural-behavioral links |
| v0.4 | Energy probe support | Enables fit-vs-cost analysis |
| v1.0 | Composite PsychScore | Unified score across domains |
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License; see the LICENSE file for details.