MakeMore - Neural Network for Name Generation

This project trains a character-level neural network to generate new names from a list of existing names. The network learns to predict the next character in a sequence and generates novel names by sampling one character at a time.

Features

  • Character Embedding: Converts characters into 10-dimensional embedding vectors.
  • Multi-Layer Perceptron (MLP): Includes several hidden layers with batch normalization and activation functions (Tanh).
  • Custom Dataset: Built from the names.txt dataset, where each name is tokenized into character sequences.
  • Dynamic Name Generation: Generates names by sampling from the predicted character probabilities.
  • Model Visualization: Plots the activation distributions of Tanh layers and tracks the gradients of model parameters.

Dataset

The dataset (names.txt) consists of names, one per line. Each name is processed and extended with a '.' to signify the end of the name.
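The preprocessing step can be sketched as follows. This is an illustrative version only: `names` stands in for the contents of names.txt, and the context length (`block_size = 3`) is an assumption, not necessarily what name_generator.py uses.

```python
# Sketch: tokenizing names into fixed-size context windows.
names = ["emma", "olivia", "ava"]   # stand-in for names.txt

chars = sorted(set("".join(names)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}
stoi["."] = 0                       # '.' marks the end of a name
itos = {i: ch for ch, i in stoi.items()}

block_size = 3                      # context length fed to the model (assumed)
X, Y = [], []
for name in names:
    context = [0] * block_size      # pad the start of each name with '.'
    for ch in name + ".":
        X.append(context)           # current context window
        Y.append(stoi[ch])          # next character to predict
        context = context[1:] + [stoi[ch]]
```

Each name of length n contributes n + 1 (context, target) pairs, the last one targeting the terminating '.'.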

Model Architecture

  • Embedding Layer: A learnable embedding of size n_embd = 10 for each character.
  • Hidden Layers: 5 hidden layers of 100 neurons, each followed by batch normalization and a Tanh activation.
  • Output Layer: A linear layer that produces logits over the vocabulary of 27 characters (including '.'), normalized with softmax to give next-character probabilities.
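The stack above can be sketched in PyTorch as follows. Layer sizes come from this README (n_embd = 10, 5 hidden layers of 100 neurons, 27-character vocabulary); the context length `block_size = 3` is an assumption, and the actual module layout in name_generator.py may differ.

```python
import torch
import torch.nn as nn

n_embd, n_hidden, vocab_size, block_size = 10, 100, 27, 3  # block_size assumed

# Embedding -> flatten the context -> 5 x (Linear + BatchNorm + Tanh) -> logits
layers = [nn.Embedding(vocab_size, n_embd), nn.Flatten()]
in_features = n_embd * block_size
for _ in range(5):
    layers += [nn.Linear(in_features, n_hidden),
               nn.BatchNorm1d(n_hidden),
               nn.Tanh()]
    in_features = n_hidden
layers.append(nn.Linear(n_hidden, vocab_size))  # logits; softmax applied in the loss
model = nn.Sequential(*layers)

x = torch.randint(0, vocab_size, (32, block_size))  # a batch of contexts
logits = model(x)                                   # shape (32, 27)
```

The softmax itself is folded into the cross-entropy loss during training, which is numerically more stable than normalizing the logits explicitly.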

Training

  • Learning Rate: The learning rate is decayed from 0.1 to 0.01 after 150,000 steps.
  • Loss Function: Cross-entropy loss is used to measure how well the network predicts the next character.
  • Optimizer: Plain stochastic gradient descent, implemented by manually updating the parameters after backpropagation.
  • Steps: Training runs for a maximum of 200,000 steps with a batch size of 32.
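The training loop can be sketched as below. The data and the single-matrix model here are toy placeholders (the real script trains the MLP on the name dataset, for 200,000 steps rather than 1,000), but the manual SGD update and the learning-rate decay at step 150,000 follow the description above.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(42)
X = torch.randint(0, 27, (1000,))              # placeholder contexts (1 char for brevity)
Y = torch.randint(0, 27, (1000,))              # placeholder next-character targets
W = torch.randn((27, 27), requires_grad=True)  # toy single-layer model

max_steps, batch_size = 1000, 32               # the README trains for 200,000 steps
for step in range(max_steps):
    ix = torch.randint(0, X.shape[0], (batch_size,))   # sample a minibatch
    logits = F.one_hot(X[ix], 27).float() @ W
    loss = F.cross_entropy(logits, Y[ix])      # loss on the predicted next character
    W.grad = None
    loss.backward()
    lr = 0.1 if step < 150_000 else 0.01       # decay schedule from the README
    with torch.no_grad():
        W -= lr * W.grad                       # manual parameter update
```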

Dataset Splitting

The dataset is split into training (80%), validation (10%), and test (10%) sets:

  • Training set: 182,625 samples
  • Validation set: 22,655 samples
  • Test set: 22,866 samples
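A split along these lines can be sketched as follows, with `samples` standing in for the full list of (context, target) pairs. Note the set sizes above are not exact 80/10/10 fractions of the 228,146 total, which suggests the script likely splits at the level of whole names rather than individual samples; this sketch shows the simpler sample-level version.

```python
import random

samples = list(range(228_146))   # stand-in for the full (context, target) list
random.seed(0)
random.shuffle(samples)          # shuffle before splitting

n1 = int(0.8 * len(samples))     # 80% train
n2 = int(0.9 * len(samples))     # next 10% validation, last 10% test
train, val, test = samples[:n1], samples[n1:n2], samples[n2:]
```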

Results

After training, the model is evaluated on the training, validation, and test sets, achieving the following losses:

  • Training Loss: 2.00
  • Validation Loss: 2.08
  • Test Loss: 2.08

Name Samples

Here are some sample names generated by the model:

  • montaymyah.
  • madhayla.
  • ejdra.
  • shivaelle.
  • arliegh.
  • xaviona.
  • halisa.

Usage

  1. Clone the repository:

    git clone https://github.com/suhass434/MakeMore.git
  2. Install the dependencies:

    pip install torch matplotlib
  3. Place your dataset (names.txt) in the root directory.

  4. To train and generate names, run the script:

    python name_generator.py

Visualizations

The model provides two types of visualizations:

  1. Activation Distribution: The activation distribution of each Tanh layer can be plotted to visualize how the neurons are activated throughout the network.

  2. Gradient Update Ratios: Tracks the gradient update ratios of the model's parameters throughout the training process to ensure stable learning.
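The second diagnostic can be sketched as below: for each parameter, log the ratio of the update magnitude (lr x gradient) to the parameter magnitude at every step; ratios around 1e-3 on a log10 scale are commonly taken as a sign of a well-tuned learning rate. `params` here is a one-tensor placeholder for the model's parameter list.

```python
import torch

torch.manual_seed(0)
params = [torch.randn(10, 10, requires_grad=True)]  # stand-in for model params
loss = (params[0] ** 2).sum()                       # toy loss to get gradients
loss.backward()

lr = 0.1
# log10 of update-to-data ratio, one value per parameter tensor per step
ud = [((lr * p.grad).std() / p.data.std()).log10().item() for p in params]
```

In the real script this list would be appended to on every training step and plotted per parameter over time.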
