This repository contains a PyTorch implementation of a Convolutional Neural Network (CNN) designed to predict the correlation coefficient from scatter plot images, based on the gameplay of guessthecorrelation.com. The model takes an image of a scatter plot and outputs a continuous value representing the Pearson correlation coefficient ($r$).
The model demonstrates excellent performance in predicting the correlation coefficient on unseen test data:
| Metric | Value | Interpretation |
|---|---|---|
| Test Loss (MSE) | | The average squared difference between the actual and predicted correlations is very small. |
| Correlation ($r$) | | The model's predictions have a near-perfect linear relationship with the actual correlation values. |
The scatter plot below visually confirms the strong performance. The predicted values closely follow the red dashed line, which represents perfect prediction (where predicted equals actual).
[Scatter plot of predicted vs. actual correlation values]
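For reference, this predicted-vs-actual correlation can be computed with NumPy's `corrcoef`; the arrays below are synthetic stand-ins for the model's test predictions and labels, not the repository's actual results:

```python
import numpy as np

# Synthetic stand-ins: ground-truth correlations and near-perfect predictions.
targets = np.linspace(-1.0, 1.0, 100)
preds = targets + np.random.default_rng(0).normal(0.0, 0.02, 100)

# Pearson correlation between predictions and ground truth.
r = np.corrcoef(preds, targets)[0, 1]
print(f"r = {r:.4f}")
```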
- Dataset: The model is trained on a custom dataset of scatter plot images and their corresponding correlation coefficients, sourced from the game data (`responses.csv`).
- Data Split: The dataset is split into Training (80%), Validation (10%), and Test (10%) sets for robust evaluation:
  - Training Data: 120,000 samples
  - Validation Data: 15,000 samples
  - Test Data: 15,000 samples
- Custom Dataset Class (`ImageDataset`): Handles image loading and data pairing.
  - Images are loaded from the `input/images` directory.
  - Images are converted to single-channel black and white (`convert('1')`), since the scatter plots contain no color, reducing the input to 1 channel.
- Transforms:
  - `transforms.ToTensor()`: Converts the PIL image to a PyTorch tensor.
  - `transforms.Normalize((0.5,), (0.5,))`: Normalizes the single-channel image data.
The network is a standard CNN designed for image processing and tailored for this regression task.
[Image of a general Convolutional Neural Network architecture]
- Input Channel: $1$ (grayscale image).
- Layers:
  - Convolutional Layer 1 (`conv1`): $1$ input channel, $8$ output channels, $3\times3$ kernel.
  - Max Pooling (`pool`): $2\times2$ kernel.
  - 2D Dropout (`dropout2d`): $p=0.2$ for regularization.
  - Convolutional Layer 2 (`conv2`): $8$ input channels, $16$ output channels, $3\times3$ kernel.
  - Max Pooling (`pool`): $2\times2$ kernel.
  - Flatten: Prepares the feature maps for the fully connected layers ($16 \times 36 \times 36$ features).
  - Fully Connected Layer 1 (`fc1`): Maps $16 \times 36 \times 36$ features to $256$ outputs.
  - Dropout (`dropout`): $p=0.4$ for regularization.
  - Fully Connected Layer 2 (`fc2`): Maps $256$ inputs to $1$ output (the predicted correlation).
- Activation: $\text{Tanh}$ is applied to the final output, which conveniently scales the prediction to the correlation coefficient's range of $[-1, 1]$.
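Putting those specs together, the module might be sketched as below. Two details are assumptions not stated in the list: ReLU activations between layers, and $150\times150$ input images (the size that yields the $16\times36\times36$ flattened shape after two unpadded $3\times3$ convolutions and two $2\times2$ poolings):

```python
import torch
import torch.nn as nn


class CorrelationCNN(nn.Module):
    """Sketch of the regression CNN; layer names follow the list above."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 8, 3)        # 1 -> 8 channels, 3x3 kernel
        self.conv2 = nn.Conv2d(8, 16, 3)       # 8 -> 16 channels, 3x3 kernel
        self.pool = nn.MaxPool2d(2)            # 2x2 max pooling (reused)
        self.dropout2d = nn.Dropout2d(0.2)
        self.fc1 = nn.Linear(16 * 36 * 36, 256)
        self.dropout = nn.Dropout(0.4)
        self.fc2 = nn.Linear(256, 1)

    def forward(self, x):
        # Assumed ReLU non-linearities; 150x150 -> 74x74 -> 36x36 spatially.
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.dropout2d(x)
        x = self.pool(torch.relu(self.conv2(x)))
        x = torch.flatten(x, 1)                # (N, 16*36*36)
        x = torch.relu(self.fc1(x))
        x = self.dropout(x)
        # Tanh bounds the prediction to [-1, 1], the range of r.
        return torch.tanh(self.fc2(x))
```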
- Device: Configured to use MPS (Metal Performance Shaders) for Apple M1/M2 chips, falling back to CPU if unavailable.
- Hyperparameters:
  - Epochs (`n_epochs`): $3$
  - Batch Size (`batchsize`): $64$
  - Learning Rate (`learning_rate`): $0.001$
- Loss Function (`criterion`): Mean Squared Error (`nn.MSELoss`) is used for this regression problem.
- Optimizer: Adam (`torch.optim.Adam`) is used for optimization.
- Validation: The model is evaluated on the validation set every $200$ steps during training to monitor performance (both validation loss and the actual vs. predicted correlation).
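The training procedure above can be sketched as follows. The model and data here are small synthetic stand-ins so the snippet is self-contained; the repository's actual script trains the CNN on the scatter-plot loaders:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Device selection as described above: MPS on Apple silicon, CPU fallback.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Hypothetical stand-in model and synthetic data for illustration only.
model = nn.Sequential(nn.Linear(10, 1), nn.Tanh()).to(device)
train_ds = TensorDataset(torch.randn(512, 10), torch.rand(512, 1) * 2 - 1)
val_ds = TensorDataset(torch.randn(64, 10), torch.rand(64, 1) * 2 - 1)
train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=64)

n_epochs, learning_rate = 3, 0.001
criterion = nn.MSELoss()                                   # regression loss
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

step = 0
for epoch in range(n_epochs):
    for x, y in train_loader:
        model.train()
        optimizer.zero_grad()
        loss = criterion(model(x.to(device)), y.to(device))
        loss.backward()
        optimizer.step()
        step += 1
        if step % 200 == 0:                                # periodic validation
            model.eval()
            with torch.no_grad():
                val_loss = sum(
                    criterion(model(vx.to(device)), vy.to(device)).item()
                    for vx, vy in val_loader
                ) / len(val_loader)
            print(f"step {step}: val MSE = {val_loss:.4f}")
```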