KobiGPT is a character-level GPT language model trained exclusively on the complete collection of Rabindranath Tagore’s Bengali poems.
It generates new verses that stylistically and linguistically resemble Tagore’s original poetic rhythm, word patterns, and emotional tone.
KobiGPT is based on the GPT (Generative Pre-trained Transformer) architecture: a decoder-only transformer that predicts the next character in a sequence given the previous ones. This autoregressive modeling lets it generate coherent text one character at a time.
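Concretely, autoregressive modeling factorizes the probability of a character sequence into a product of next-character conditionals:

$$
P(x_1, \dots, x_T) \;=\; \prod_{t=1}^{T} P\left(x_t \mid x_1, \dots, x_{t-1}\right)
$$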
- Type: Character-level GPT
- Architecture: Decoder-only Transformer (like GPT-2 but smaller)
- Framework: PyTorch
- Dataset: Complete collection of Rabindranath Tagore’s Bengali poems
- Tokenizer: Character-level (139 unique characters)
- Training Device: NVIDIA T4 GPU (Google Colab Free Tier)
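For reference, a character-level tokenizer like this is typically just a pair of lookup tables built from the corpus. The snippet below is a minimal sketch under that assumption; the file path and identifiers (`stoi`, `itos`, `encode`, `decode`) are illustrative, not necessarily the repository's actual names.

```python
# Minimal sketch of a character-level tokenizer (illustrative; not the repo's exact code).
with open("data/tagore_poems.txt", encoding="utf-8") as f:  # hypothetical corpus path
    text = f.read()

chars = sorted(set(text))                      # unique characters in the corpus (139 here)
stoi = {ch: i for i, ch in enumerate(chars)}   # character -> integer ID
itos = {i: ch for ch, i in stoi.items()}       # integer ID -> character

def encode(s: str) -> list[int]:
    return [stoi[ch] for ch in s]

def decode(ids: list[int]) -> str:
    return "".join(itos[i] for i in ids)
```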
| Parameter | Description / Value |
|---|---|
| Context Length (block_size) | 256 characters |
| Vocabulary Size (vocab_size) | 139 unique characters |
| Transformer Layers (n_layer) | 6 |
| Attention Heads (n_head) | 8 |
| Embedding Dimension (n_embd) | 512 |
| Dropout | 0.2 |
| Use Bias | ✅ True |
| Total Parameters | ≈ 19.18 million |
| Optimizer | AdamW |
| Initial Learning Rate | 3 × 10⁻⁴ |
| LR Schedule | Decayed by ×0.1 at steps 3000 and 5000 |
| Batch Size | 32 |
| Training Steps | 8000 |
| Training Duration | ~1.5 hours |
| Hardware | NVIDIA T4 GPU (Google Colab Free Tier) |
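The AdamW optimizer and the ×0.1 decay at steps 3000 and 5000 map directly onto PyTorch's built-in `MultiStepLR` scheduler. A minimal sketch, assuming `model` is the KobiGPT module and `get_batch` is a hypothetical batch loader:

```python
import torch

# Optimizer and LR schedule from the table above (illustrative sketch).
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[3000, 5000], gamma=0.1  # decay LR x0.1 at steps 3000 and 5000
)

for step in range(8000):                 # 8000 training steps
    xb, yb = get_batch("train")          # hypothetical loader returning a batch of 32
    _, loss = model(xb, yb)              # forward pass returns (logits, loss)
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()
    scheduler.step()                     # advance the step-based LR schedule
```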
Each character is mapped to a 512-dimensional vector via a learnable embedding matrix.
Positional embeddings are added to retain sequence order.
The model consists of 6 transformer layers, each with (see the sketch after this list):
- Masked multi-head self-attention (8 heads)
- Feed-forward projection (2-layer MLP)
- Layer normalization and residual connections
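As a reference for that block structure, here is a minimal pre-norm PyTorch sketch using the hyperparameters above (illustrative; not the repository's exact code):

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """One transformer block: masked multi-head self-attention plus a 2-layer MLP,
    each wrapped in layer normalization and a residual connection (sketch)."""

    def __init__(self, n_embd=512, n_head=8, block_size=256, dropout=0.2):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, dropout=dropout, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(                 # feed-forward projection (2-layer MLP)
            nn.Linear(n_embd, 4 * n_embd),
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
            nn.Dropout(dropout),
        )
        # Causal mask: True above the diagonal blocks attention to future positions.
        mask = torch.triu(torch.ones(block_size, block_size, dtype=torch.bool), diagonal=1)
        self.register_buffer("mask", mask)

    def forward(self, x):
        T = x.size(1)
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=self.mask[:T, :T], need_weights=False)
        x = x + a                                 # residual connection around attention
        x = x + self.mlp(self.ln2(x))             # residual connection around the MLP
        return x
```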
The final layer projects the hidden representation back into the vocabulary space, producing logits for each character.
- The input sequence (e.g., "\n") is encoded into character IDs.
- Each ID is converted into an embedding and combined with positional encodings.
- The transformer layers use self-attention to model dependencies between all characters.
- The model predicts a probability distribution over the next character.
- Loss is computed via cross-entropy (a forward-pass sketch follows this list).
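Putting those steps together, a hedged sketch of the full forward pass and loss (illustrative; `Block` is the module sketched above, and the attribute names are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KobiGPT(nn.Module):
    def __init__(self, vocab_size=139, block_size=256, n_embd=512, n_layer=6, n_head=8):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)   # character embeddings
        self.pos_emb = nn.Embedding(block_size, n_embd)   # learned positional embeddings
        self.blocks = nn.Sequential(
            *[Block(n_embd, n_head, block_size) for _ in range(n_layer)]
        )
        self.ln_f = nn.LayerNorm(n_embd)
        self.lm_head = nn.Linear(n_embd, vocab_size)      # project back to vocabulary space

    def forward(self, idx, targets=None):
        B, T = idx.shape
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)         # token + positional embeddings
        x = self.blocks(x)                                # stacked transformer blocks
        logits = self.lm_head(self.ln_f(x))               # (B, T, vocab_size) logits
        loss = None
        if targets is not None:
            # Cross-entropy over every position's next-character prediction.
            loss = F.cross_entropy(logits.view(B * T, -1), targets.view(B * T))
        return logits, loss
```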
| Metric | Value |
|---|---|
| Final Train Loss | 1.50 |
| Final Val Loss | 1.69 |
| Training Steps | 8000 |
| GPU Time | ~1.5 hours |
The final validation loss of ~1.69 nats (≈2.44 bits per character) indicates strong character-level modeling capability for a small GPT trained on poetic text.
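For reference, cross-entropy in nats converts to bits per character by dividing by $\ln 2$:

$$
\frac{1.69\ \text{nats}}{\ln 2} \approx \frac{1.69}{0.693} \approx 2.44\ \text{bits/char}
$$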
From a generation run without any prompt (starting from "\n"):
হাসি ছুটি তাে গান গেয়ে,
তারে এঁকে দেবে হেসে।
এ ভুবনচরণ তৃষা এসে বেদনা,
তরুণ হৃদয় উঠে শ্যামা মেঘের ফুল
ওই যেন আলো
ঝাউয়ের'পরে পড়ে আজ।
আকাশ ছিঁড়ে নাইল চোখে।
স্নেহমুখীর ছড়ায়ে নাই, চমকি
চূর্ণ ছড়িয়া চলে যায়,
তার কপোলে তার বালে দেখী,
বাঁধন কেঁদে উড়ে যায়।
হৃদয় কুসুম সাগরে
মিশে কে জানিল না কখনো,
চেয়ে না এসে না বেশি বেশি বেদে।
বসন্ত যবে কালবৈশাখে,
ডালি পাছে ঝরিয়া গাঁথিতে খুঁজে,
ফুল যে পাছে ফুটে ফুটে,
তারি সুর পথে পাতা পাতা তফুটে।
হাসিমুখে মেঘের কোণে
রয়েছে চিরদিন
ব্যাপিয়া মধু,
প্রভাতসংগীত
ভালোবাসিল আকাশ-কুমল।
With a custom prompt: "বিনয় ভাবছে আকাশ কেন নীল," ("Binoy wonders why the sky is blue,"):
বিনয় ভাবছে আকাশ কেন নীল,
ভবিষ্যৎ মাঞ্চিতের হানল;
আমাদের মন্ত্র দেখছি বিস্ময়ের ক্রন্দনে
আমাকে এসেছি আমার দলে, কেবল সিহাস করে।
বিশ্বাস-বীরে কল্পনা...
When writing a prompt, avoid characters outside the known vocabulary to prevent unknown-token issues; unknown characters are currently mapped to a space (' ').
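That fallback amounts to a one-line change to the encoder sketched earlier (illustrative, assuming the same `stoi` table):

```python
def encode(s: str) -> list[int]:
    # Characters missing from the training vocabulary fall back to space (' ').
    return [stoi.get(ch, stoi[" "]) for ch in s]
```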
You can generate text using:
```bash
python src/generate.py
```

You can adjust `max_new_tokens` and `temperature` in `src/generate.py` to control generation length and creativity.
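For context, a typical character-level sampling loop with those two knobs looks like this (a hedged sketch; the actual `src/generate.py` may differ):

```python
import torch

@torch.no_grad()
def generate(model, idx, max_new_tokens=500, temperature=1.0, block_size=256):
    """Autoregressively sample characters; higher temperature = more creative output."""
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]             # crop to the context window
        logits, _ = model(idx_cond)
        logits = logits[:, -1, :] / temperature     # rescale the next-character logits
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)      # append and continue
    return idx
```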
To train the model from scratch:

```bash
python src/train.py
```

Tweak hyperparameters in `src/train.py` as needed.
- Inspired by Andrej Karpathy's nanoGPT implementation.
- Trained and fine-tuned on Rabindranath Tagore's Bengali poem corpus.
- Dataset: public domain Bengali poems by Rabindranath Tagore, available on Kaggle.
- Developed and trained by Binoy Bhushan Barman Dipu as part of learning and experimentation with GPT architectures.
- Implement BPE or WordPiece tokenization for better semantic coherence.
- Experiment with larger context lengths (512+) and deeper models.
- Add Bangla rhyme and meter control for stylistic conditioning.
- Integrate a UI-based poem generator using Streamlit or Flask.
“KobiGPT is not just a model — it’s a poetic mirror where Tagore’s spirit of words finds a neural echo.”
