NANO GIPTY

The acronym GPT (Generative Pre-trained Transformer) is self-explanatory: it's a transformer that is pre-trained and generates something. But what is a transformer?

Roll out

This is the traditional picture of a transformer architecture. The left side of the model, the encoder, turns the input into a sequence of states. The right side, the decoder, takes the encoder's output and predicts the next appropriate token.

But this is not what GPT uses. GPT is a decoder-only model, so throw out the left side.

This means that GPT uses the input tokens directly to create an output token of the same type. That token can be appended to the input and fed back into the system to keep producing new tokens.
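The feedback loop can be sketched in a few lines of Python. This is only an illustration: `toy_next_token_probs` is a hypothetical stand-in for a trained model, and `block_size` (the maximum context length) and the other names are assumptions, not taken from `transformer.py`.

```python
import random

def toy_next_token_probs(context, vocab_size):
    """Hypothetical stand-in for a trained model.

    Returns one probability per token id. It simply favours the id that
    follows the last token in the context; a real model would compute
    this from learned weights.
    """
    weights = [1.0] * vocab_size
    weights[(context[-1] + 1) % vocab_size] = 5.0
    total = sum(weights)
    return [w / total for w in weights]

def generate(context, n_new, vocab_size, block_size, rng):
    """Sample n_new tokens, feeding each one back in as new context."""
    tokens = list(context)
    for _ in range(n_new):
        window = tokens[-block_size:]  # keep only the most recent tokens
        probs = toy_next_token_probs(window, vocab_size)
        next_token = rng.choices(range(vocab_size), weights=probs, k=1)[0]
        tokens.append(next_token)  # the output becomes part of the input
    return tokens

out = generate([0, 1], n_new=5, vocab_size=4, block_size=8, rng=random.Random(0))
print(out)
```

The output has the same type as the input (a list of token ids), which is what makes the loop possible.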

What the hell is a token?

A token is just a short fragment of some input. The most basic token is a binary 0 or 1, but tokens can be characters, whole words, or even sentences, depending on the system's resources. Larger tokens let a model pack more information into each position, which tends to help performance; however, they demand significant memory and computational resources.
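To make the trade-off concrete, here is a minimal sketch that splits the same text at three granularities. It is an illustration only and not the tokenizer this repository uses; the variable names are made up for the example.

```python
text = "hello world"

# Character-level: one token per character, tiny vocabulary, long sequences.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}  # character -> token id
itos = {i: ch for ch, i in stoi.items()}      # token id -> character
char_tokens = [stoi[ch] for ch in text]

# Byte-level: every token is one of 256 possible byte values.
byte_tokens = list(text.encode("utf-8"))

# Word-level: short sequences, but the vocabulary grows with every new word.
word_tokens = text.split()

# Encoding is reversible: the ids map back to the original text.
decoded = "".join(itos[i] for i in char_tokens)

print(len(char_tokens), len(byte_tokens), len(word_tokens))
```

Smaller tokens mean a smaller vocabulary but more positions per sentence; larger tokens mean the reverse.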

How does this create?

Given a set of previous tokens (aka the context), the transformer uses a method called 'self-attention' to create a probability distribution over the next token.
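A heavily simplified sketch of that idea follows. It assumes a single attention head, uses the token vectors directly as queries, keys, and values (a real GPT applies learned projections and stacks many such layers), and uses a made-up `unembed` matrix to turn the final vector into scores over a three-token vocabulary.

```python
import math

def softmax(xs):
    """Turn arbitrary scores into probabilities that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def causal_self_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    Position t may only look at positions 0..t, never at the future.
    """
    d = len(keys[0])
    outputs, all_weights = [], []
    for t, q in enumerate(queries):
        scores = [dot(q, keys[s]) / math.sqrt(d) for s in range(t + 1)]
        weights = softmax(scores)
        # Weighted average of the value vectors seen so far.
        out = [sum(w * values[s][j] for s, w in enumerate(weights))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Three context tokens, each already embedded as a 2-dimensional vector.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_self_attention(x, x, x)

# Hypothetical projection from the model dimension to a 3-token vocabulary.
unembed = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
logits = [dot(row, outputs[-1]) for row in unembed]
probs = softmax(logits)  # probability of each candidate next token
print(probs)
```

Only the last position's output is used here, because that is the position whose next token is being predicted.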

Transformer architecture as described by Andrej Karpathy: https://www.youtube.com/watch?v=kCc8FmEb1nY
