English | 简体中文
This is a minimalist introductory tutorial on the BERT model, proposed by Google AI Language in 2018, designed to help beginners understand BERT as quickly as possible.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper Link: https://arxiv.org/abs/1810.04805
- Ultra-simplified data: a dataset consisting of only two lines of text.
- Ultra-detailed comments: every line of core code comes with an explanation.
- Comprehensive tutorial documents: the data pipeline is introduced in detail in both Chinese and English.
- No redundant code: no GPU training, configuration loading, model saving, or other extraneous operations.
- Easy environment setup: only Python, PyTorch, and NumPy are needed to run.
Environment requirements: Python 3.x, PyTorch > 0.4, NumPy
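To confirm these requirements are met, a quick check like the following can be run (a minimal sketch; the thresholds simply mirror the line above):

```python
# Minimal environment sanity check; thresholds mirror the stated requirements.
import sys
import torch
import numpy

assert sys.version_info.major == 3, "Python 3.x is required"
print("Python :", sys.version.split()[0])
print("PyTorch:", torch.__version__)   # should be > 0.4
print("NumPy  :", numpy.__version__)
```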
The environment used for the development of this project is:

```bash
# Python 3.10.0
pip install torch==1.12.0 numpy==1.26.3
```

Run prepare_vocab.py with the default configuration to convert data/capus.txt to data/vocab (optional, as the vocab is already provided).
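prepare_vocab.py itself is not shown here, but a vocabulary-building step of this kind typically just collects the distinct tokens in the corpus and prepends BERT's special tokens. A minimal sketch under that assumption (the function name `build_vocab`, the whitespace tokenization, and the one-token-per-line file layout are illustrative guesses, not the repo's actual code):

```python
# Hypothetical sketch of a vocabulary-building step: collect every
# distinct token in the corpus and prepend BERT's special tokens.
# File paths match the README; all other details are assumptions.

SPECIAL_TOKENS = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"]

def build_vocab(corpus_path, vocab_path):
    tokens = set()
    with open(corpus_path, encoding="utf-8") as f:
        for line in f:
            tokens.update(line.split())      # whitespace tokenization (assumption)
    with open(vocab_path, "w", encoding="utf-8") as f:
        for tok in SPECIAL_TOKENS + sorted(tokens):
            f.write(tok + "\n")              # one token per line (assumption)

build_vocab("data/capus.txt", "data/vocab")
```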
Run train.py with the default configuration to start training!
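The heart of BERT pre-training is the masked-language-model objective described in the paper: 15% of input tokens are selected for prediction, and of those, 80% are replaced with [MASK], 10% with a random token, and 10% are left unchanged. Below is a minimal sketch of that 80/10/10 rule, illustrative only and not the repo's train.py; MASK_ID, VOCAB_SIZE, and the -100 ignore label are assumptions:

```python
import random

MASK_ID = 4          # assumed id of [MASK] in the vocab
VOCAB_SIZE = 30000   # placeholder vocabulary size

def mask_tokens(ids, mask_ratio=0.15):
    """Apply BERT's 80/10/10 masking rule; return masked ids and MLM labels."""
    ids = list(ids)
    labels = [-100] * len(ids)            # -100: position is not predicted (convention)
    for i in range(len(ids)):
        if random.random() < mask_ratio:
            labels[i] = ids[i]            # predict the original token at this position
            r = random.random()
            if r < 0.8:
                ids[i] = MASK_ID          # 80%: replace with [MASK]
            elif r < 0.9:
                ids[i] = random.randrange(VOCAB_SIZE)  # 10%: random token
            # remaining 10%: keep the original token unchanged
    return ids, labels
```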
For detailed explanations of the data and code, please refer to the Tutorial.
Some modules of this project draw on the following repos: