GujjuGPT is a Gujarati-language LLM. It takes Gujarati text as input, processes it, and returns its output in Gujarati.
It is still under development.
- Data collection - Done
- Data extraction - Done
- Data cleaning - Done
- Creating a high-quality instruction-style dataset - In progress
  - i. Fine-tuned Llama 1B on 500 Q/A pairs - Done
  - ii. Generating more questions - In progress
- Quantization of the model - Done
- Fine-tuning - In progress (v0 done)
- Inference - Not completed
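
To give a feel for why the quantization step matters on limited hardware, here is a rough sketch that estimates the memory needed just to hold the weights of a model of about 7 billion parameters at different precisions. The parameter count is a nominal figure and the bit widths are illustrative assumptions; this README does not state which quantization scheme the project uses. Activations, optimizer state and the KV cache are not counted.

```python
# Rough weight-only memory estimate at different precisions.
# NUM_PARAMS is a nominal "7B" figure (assumption), not an exact parameter count.
NUM_PARAMS = 7_000_000_000


def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Return the memory needed to store the weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# The bit widths below are illustrative, not the project's actual settings.
for label, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>5}: {weight_memory_gib(NUM_PARAMS, bits):.2f} GiB")
```

Going from 16-bit to 4-bit weights cuts the weight footprint to a quarter, which is often the difference between a model fitting on a single consumer GPU or not.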
Base model: Llama2-7b (changed from Mistral to Llama)
Fine-tuning technique: LoRA + PEFT
Dataset: Sangraha (Bhashini)
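
As a back-of-the-envelope illustration of why LoRA suits a compute-limited setup, the sketch below counts the trainable parameters LoRA adds to a Llama2-7b-sized model. LoRA replaces the update to a `d_out x d_in` weight matrix with two low-rank factors, so each adapted matrix gains `rank * (d_in + d_out)` trainable parameters. The hidden size (4096) and layer count (32) are those of Llama2-7b; the rank of 8 and the choice of adapting two attention projections per layer are assumptions for illustration, not this project's actual configuration.

```python
# Count the trainable parameters LoRA adds, versus fully training the same matrices.

def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


HIDDEN_SIZE = 4096       # Llama2-7b hidden size
NUM_LAYERS = 32          # Llama2-7b transformer layers
RANK = 8                 # assumption: illustrative LoRA rank
TARGETS_PER_LAYER = 2    # assumption: two attention projections adapted per layer

per_matrix = lora_trainable_params(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
lora_total = per_matrix * TARGETS_PER_LAYER * NUM_LAYERS
full_total = HIDDEN_SIZE * HIDDEN_SIZE * TARGETS_PER_LAYER * NUM_LAYERS

print(f"LoRA params per matrix: {per_matrix:,}")
print(f"LoRA params in total:   {lora_total:,}")
print(f"Share of the adapted matrices: {lora_total / full_total:.2%}")
```

Under these assumptions only a few million parameters are trained while the base weights stay frozen, which keeps gradient and optimizer memory small.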
- Can generate partially context-aware Gujarati text from a given prompt
- It was trained on a small dataset and is best suited to Gujarati-language and Gujarat-context information
- Does not have a GUI yet; it is a console-based GPT for now
- No inference
- Has not yet been evaluated on standard metrics
- It now has its own tokenizer, allowing it to embed input and understand context better
- Built a ChatGPT-like UI for inference
- Current stats:
  - Training loss: 0.352
  - Validation loss: 0.346
- Limited by compute, so it cannot yet be trained on a larger dataset
- Currently using small chunks of data with LoRA and PEFT to fine-tune the model
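
The training and validation losses listed above can be turned into perplexity, which is sometimes easier to read. The sketch below assumes the reported losses are mean per-token cross-entropy in nats (the usual default in common training loops); if they were computed differently, the conversion does not apply.

```python
import math

# Loss values as reported in this README.
TRAIN_LOSS = 0.352
VAL_LOSS = 0.346


def perplexity(mean_cross_entropy: float) -> float:
    """Perplexity is exp(loss) when loss is mean per-token cross-entropy in nats (assumption)."""
    return math.exp(mean_cross_entropy)


print(f"Train perplexity:      {perplexity(TRAIN_LOSS):.2f}")
print(f"Validation perplexity: {perplexity(VAL_LOSS):.2f}")
print(f"Validation - train loss gap: {VAL_LOSS - TRAIN_LOSS:+.3f}")
```

The two losses are very close, so the gap alone shows no sign of overfitting on the current data; since the dataset is small, this is not a substitute for evaluation on a held-out benchmark.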