
Re-Implementation of Paper #12

@snapo


Hi,
First of all, thanks a lot for the paper you released on arXiv.
I tried to implement it with a low threshold and much simpler data (just for testing):

import torch
import torch.nn as nn
import torch.optim as optim

# Define the Forward Forward model
def forward_forward_model(input_dim, hidden_dim, num_layers):
    layers = [nn.Linear(input_dim if i == 0 else hidden_dim, hidden_dim) for i in range(num_layers)]
    return nn.ModuleList(layers)

# Define the training loop
def train(model, inputs, labels, optimizer, criterion, thresholds, device):
    epoch_loss = 0
    for input_data, label in zip(inputs, labels):  # label is currently not used inside the loop
        input_data = input_data.unsqueeze(0).to(device)  # Add batch dimension and move to device
        layer_outputs = input_data
        layer_losses = []
        for layer, threshold in zip(model, thresholds):
            # "positive" pass: penalize the squared distance of the activations from the threshold
            pos_outputs = layer(layer_outputs)
            pos_loss = torch.pow(pos_outputs - threshold, 2).mean()
            # "negative" pass: same layer and same input, penalized in the same way
            neg_outputs = layer(layer_outputs)
            neg_loss = torch.pow(threshold - neg_outputs, 2).mean()
            layer_losses.append(pos_loss + neg_loss)
            layer_outputs = pos_outputs  # feed this layer's output into the next layer
        loss = sum(layer_losses)  # total loss over all layers for this sample
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    return epoch_loss / len(inputs)

# Generate random binary inputs and labels
def generate_data(num_samples, input_dim):
    inputs = torch.randint(0, 2, (num_samples, input_dim), dtype=torch.float)
    labels = (inputs.sum(dim=1) > input_dim // 2).float().unsqueeze(1)
    return inputs, labels

# Set hyperparameters
input_dim = 12
hidden_dim = 24
num_layers = 4
#thresholds = [0.1, 0.5, 1.0, 2.0] # Set threshold for each layer
thresholds = [0.005, 0.005, 0.005, 0.005] # Set threshold for each layer
#thresholds = list(reversed(thresholds))  # comment to not use reversed thresholds
learning_rate = 0.01
num_epochs = 12
num_samples = 2000

# Generate random data
inputs, labels = generate_data(num_samples, input_dim)

# Check if CUDA GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Initialize the model and optimizer
model = forward_forward_model(input_dim, hidden_dim, num_layers)
model.to(device)  # Move the model to the device
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
criterion = nn.MSELoss()  # passed to train() but currently unused there

# Train the model
for epoch in range(num_epochs):
    train_loss = train(model, inputs, labels, optimizer, criterion, thresholds, device)
    print(f'Epoch: {epoch+1:02}, Train Loss: {train_loss:.3f}')

The test data consists of lists of random 0s and 1s; a list that contains more 1s than 0s counts as good data.
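
For completeness, here is a quick way to eyeball the generated data (just a sanity-check sketch, reusing generate_data, input_dim and labels from the script above):

sample_inputs, sample_labels = generate_data(5, input_dim)
print(sample_inputs)          # rows of random 0s and 1s
print(sample_labels)          # 1.0 where a row has more 1s than 0s, else 0.0
print(labels.mean().item())   # fraction of "good" samples in the full dataset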

What I somehow don't get is the exact effect of the threshold, and also the spikes in the loss. Especially the spikes to a higher loss in the middle of training make no sense to me. My current reading of the threshold's role in the paper is sketched below.
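
For context, this is how I currently understand the threshold in the paper (only my own sketch of the goodness idea; it may not match the paper in every detail, and it is not what my code above does):

import torch
import torch.nn.functional as F

def layer_goodness_loss(pos_act, neg_act, threshold):
    # goodness = sum of squared activations of one layer
    g_pos = pos_act.pow(2).sum(dim=1)   # goodness on positive (real) samples
    g_neg = neg_act.pow(2).sum(dim=1)   # goodness on negative (fake) samples
    # push positive goodness above the threshold and negative goodness below it
    loss_pos = F.softplus(threshold - g_pos).mean()
    loss_neg = F.softplus(g_neg - threshold).mean()
    return loss_pos + loss_neg

If that reading is right, the threshold sets a target level for the summed squared activations of a layer, rather than a target value for each individual output.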

Here is a sample output:

Epoch: 01, Train Loss: 0.007
Epoch: 02, Train Loss: 0.002
Epoch: 03, Train Loss: 0.040
Epoch: 04, Train Loss: 0.001
Epoch: 05, Train Loss: 0.000
Epoch: 06, Train Loss: 0.001
Epoch: 07, Train Loss: 0.002
Epoch: 08, Train Loss: 0.029
Epoch: 09, Train Loss: 0.000
Epoch: 10, Train Loss: 0.000
Epoch: 11, Train Loss: 0.001
Epoch: 12, Train Loss: 0.003

As you can see, in epoch 3 it suddenly spikes very high, as if something unexpected happened. Do you have any input on what I might be doing wrong? Or is there something wrong with my code for re-evaluating the paper?
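
In case it helps with debugging, this is the extra per-epoch logging I plan to add to narrow down where the spike comes from (only a sketch that mirrors the loss terms in train(); per_layer_total and worst_sample_loss are names I made up for illustration):

with torch.no_grad():
    per_layer_total = [0.0 for _ in model]
    worst_sample_loss = 0.0
    for input_data in inputs:
        x = input_data.unsqueeze(0).to(device)
        sample_loss = 0.0
        for i, (layer, threshold) in enumerate(zip(model, thresholds)):
            out = layer(x)
            # same pos + neg terms as in train(), accumulated per layer and per sample
            layer_loss = (torch.pow(out - threshold, 2).mean()
                          + torch.pow(threshold - out, 2).mean()).item()
            per_layer_total[i] += layer_loss
            sample_loss += layer_loss
            x = out
        worst_sample_loss = max(worst_sample_loss, sample_loss)
    print("mean loss per layer:", [t / len(inputs) for t in per_layer_total])
    print("worst single-sample loss:", worst_sample_loss)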

The reason I chose different data to try it on is to be able to search much faster over hyperparameters and their effects.
