This project trains a character-level neural network to generate new names from a list of existing ones. The network learns to predict the next character in a sequence; sampling from those predictions one character at a time produces new, name-like words.
- Character Embedding: Converts characters into 10-dimensional embedding vectors.
- Multi-Layer Perceptron (MLP): Includes several hidden layers with batch normalization and activation functions (Tanh).
- Custom Dataset: Built from the names.txt dataset, where each name is tokenized into character sequences.
- Dynamic Name Generation: Generates names by sampling from the predicted character probabilities.
- Model Visualization: Plots the activation distributions of Tanh layers and tracks the gradients of model parameters.
The dataset (names.txt) consists of names, one per line. Each name is processed and extended with a '.' to signify the end of the name.
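For illustration, the preprocessing can look like the following minimal sketch. The `build_dataset` helper and the `block_size` context length are assumptions for this example, not necessarily the script's exact code:

```python
import torch

words = open('names.txt', 'r').read().splitlines()

# Vocabulary: 26 lowercase letters plus '.' as the end-of-name token (27 characters).
chars = sorted(set(''.join(words)))
stoi = {s: i + 1 for i, s in enumerate(chars)}
stoi['.'] = 0
itos = {i: s for s, i in stoi.items()}

def build_dataset(words, block_size=3):
    """Turn each name into (context, next-character) training pairs."""
    X, Y = [], []
    for w in words:
        context = [0] * block_size          # pad the start with '.' tokens
        for ch in w + '.':                  # '.' marks the end of the name
            ix = stoi[ch]
            X.append(context)
            Y.append(ix)
            context = context[1:] + [ix]    # slide the context window forward
    return torch.tensor(X), torch.tensor(Y)
```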
- Embedding Layer: A learnable embedding of size `n_embd = 10` for each character.
- Hidden Layers: 5 hidden layers with 100 neurons each, followed by batch normalization and Tanh activation.
- Output Layer: A softmax layer that predicts the next character from the vocabulary of 27 characters (including '.').
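Put together, an equivalent model can be expressed with standard `torch.nn` layers as in the sketch below. The actual script may use hand-rolled layer classes instead, and `block_size = 3` is an assumed context length:

```python
import torch.nn as nn

vocab_size, n_embd, n_hidden, block_size = 27, 10, 100, 3  # block_size is assumed

# One hidden block = Linear -> BatchNorm -> Tanh, repeated 5 times.
model = nn.Sequential(
    nn.Embedding(vocab_size, n_embd),
    nn.Flatten(),                                   # (B, block_size * n_embd)
    nn.Linear(block_size * n_embd, n_hidden), nn.BatchNorm1d(n_hidden), nn.Tanh(),
    nn.Linear(n_hidden, n_hidden), nn.BatchNorm1d(n_hidden), nn.Tanh(),
    nn.Linear(n_hidden, n_hidden), nn.BatchNorm1d(n_hidden), nn.Tanh(),
    nn.Linear(n_hidden, n_hidden), nn.BatchNorm1d(n_hidden), nn.Tanh(),
    nn.Linear(n_hidden, n_hidden), nn.BatchNorm1d(n_hidden), nn.Tanh(),
    nn.Linear(n_hidden, vocab_size),  # logits; softmax is applied inside cross-entropy
)
```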
- Learning Rate: The learning rate is decayed from 0.1 to 0.01 after 150,000 steps.
- Loss Function: Cross-entropy loss is used to measure how well the network predicts the next character.
- Optimizer: No off-the-shelf optimizer is used; after backpropagation, the parameters are updated manually with plain gradient descent.
- Steps: Training runs for a maximum of 200,000 steps with a batch size of 32.
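A minimal training loop matching these settings might look like this. It assumes the `model` from the architecture sketch above and the `Xtr`/`Ytr` training tensors from the split described below:

```python
import torch
import torch.nn.functional as F

max_steps, batch_size = 200_000, 32

for step in range(max_steps):
    # Sample a random minibatch of (context, target) pairs.
    ix = torch.randint(0, Xtr.shape[0], (batch_size,))
    logits = model(Xtr[ix])
    loss = F.cross_entropy(logits, Ytr[ix])   # cross-entropy on next-character logits

    # Manual parameter update: zero the grads, backprop, then a plain SGD step.
    for p in model.parameters():
        p.grad = None
    loss.backward()
    lr = 0.1 if step < 150_000 else 0.01      # decay the learning rate after 150k steps
    with torch.no_grad():
        for p in model.parameters():
            p -= lr * p.grad
```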
The dataset is split into training (80%), validation (10%), and test (10%) sets:
- Training set: 182,625 samples
- Validation set: 22,655 samples
- Test set: 22,866 samples
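With the `build_dataset` helper from the preprocessing sketch, a split along these proportions can be done per name (the shuffle seed here is an assumption):

```python
import random

random.seed(42)                            # assumed seed; the script may use another
random.shuffle(words)
n1, n2 = int(0.8 * len(words)), int(0.9 * len(words))

Xtr,  Ytr  = build_dataset(words[:n1])     # 80% of names -> training pairs
Xdev, Ydev = build_dataset(words[n1:n2])   # next 10% -> validation pairs
Xte,  Yte  = build_dataset(words[n2:])     # final 10% -> test pairs
```

Note that the sample counts above are (context, target) pairs after expansion, so they are only approximately 80/10/10 of the pair total: names are split 80/10/10, and each name contributes as many pairs as it has characters plus one.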
After training, the model is evaluated on the training, validation, and test sets, achieving the following losses:
- Training Loss: 2.00
- Validation Loss: 2.08
- Test Loss: 2.08
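Evaluating a whole split is a single forward pass; the one subtlety is that batch normalization must use its running statistics (eval mode) rather than per-batch statistics. A sketch, reusing the tensors from the split above:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def split_loss(name, X, Y):
    model.eval()                              # BatchNorm uses running statistics
    loss = F.cross_entropy(model(X), Y)
    model.train()
    print(f'{name} loss: {loss.item():.2f}')

split_loss('train', Xtr, Ytr)
split_loss('val',   Xdev, Ydev)
split_loss('test',  Xte,  Yte)
```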
Here are some sample names generated by the model:
- montaymyah.
- madhayla.
- ejdra.
- shivaelle.
- arliegh.
- xaviona.
- halisa.
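Names like these come from sampling the softmax distribution one character at a time until the end token `.` appears. A minimal sketch, reusing `model`, `itos`, and the assumed `block_size` from the sketches above:

```python
import torch
import torch.nn.functional as F

model.eval()
block_size = 3                                  # must match the training context length

for _ in range(5):
    out, context = [], [0] * block_size         # start from all-'.' padding
    while True:
        logits = model(torch.tensor([context]))
        probs = F.softmax(logits, dim=1)
        ix = torch.multinomial(probs, num_samples=1).item()  # sample, don't argmax
        context = context[1:] + [ix]
        if ix == 0:                             # '.' marks the end of the name
            break
        out.append(itos[ix])
    print(''.join(out) + '.')
```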
- Clone the repository:

  ```
  git clone https://github.com/suhass434/MakeMore.git
  ```

- Install the dependencies:

  ```
  pip install torch matplotlib
  ```

- Place your dataset (`names.txt`) in the root directory.

- To train and generate names, run the script:

  ```
  python name_generator.py
  ```
The model provides two types of visualizations:
- Activation Distribution: The activation distribution of each Tanh layer can be plotted to visualize how the neurons are activated throughout the network.
- Gradient Update Ratios: Tracks the gradient update ratios of the model's parameters throughout training to check that learning stays stable.
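With an `nn.Sequential` model like the sketch above, the Tanh activation histograms can be captured with forward hooks. This is one illustrative approach, not necessarily the script's; it assumes `model` and `Xtr` from the earlier sketches:

```python
import matplotlib.pyplot as plt
import torch
import torch.nn as nn

# Capture each Tanh layer's output via forward hooks.
acts = {}
def make_hook(name):
    def hook(module, inputs, output):
        acts[name] = output.detach()
    return hook

for i, layer in enumerate(model):
    if isinstance(layer, nn.Tanh):
        layer.register_forward_hook(make_hook(f'tanh_{i}'))

model.eval()
model(Xtr[:1000])                              # one forward pass fills `acts`

plt.figure(figsize=(10, 4))
for name, a in acts.items():
    hy, hx = torch.histogram(a.flatten(), density=True)
    sat = (a.abs() > 0.97).float().mean()      # fraction of saturated units
    plt.plot(hx[:-1].tolist(), hy.tolist(), label=f'{name} ({sat:.0%} saturated)')
plt.legend()
plt.title('Tanh activation distributions')
plt.show()
```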