Recommending Classes from the USPTO ID Manual Based on Text Input

The objective is to assign a user's query text to the correct class among the 46 classes listed in the USPTO ID Manual.

Training the model

For the training part of this model, refer to Train.ipynb.

I used the Hugging Face Transformers library to fine-tune a BERT model on the provided dataset. The model was trained on 90% of the total data, and the remaining 10% was reserved for testing.

After 4 epochs, the model reached an accuracy of about 90% on the training data.
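
Below is a minimal sketch of such a fine-tuning setup using the Transformers Trainer API. The checkpoint (bert-base-uncased), the data file name, and the column names are illustrative assumptions; Train.ipynb contains the actual code.

```python
# Sketch of fine-tuning BERT for 46-way classification with the
# Hugging Face Trainer. File, column, and checkpoint names are assumptions.
import pandas as pd
import torch
from sklearn.model_selection import train_test_split
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

NUM_CLASSES = 46  # classes in the USPTO ID Manual

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=NUM_CLASSES)

# Hypothetical dataset with a text column and integer labels in [0, 45].
df = pd.read_csv("uspto_id_manual.csv")
train_df, test_df = train_test_split(
    df, test_size=0.1, stratify=df["label"], random_state=42)

class TextDataset(torch.utils.data.Dataset):
    """Tokenizes the texts once and serves (input, label) pairs."""
    def __init__(self, texts, labels):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

train_ds = TextDataset(train_df["text"].tolist(), train_df["label"].tolist())

args = TrainingArguments(output_dir="model",
                         num_train_epochs=4,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=train_ds).train()

model.save_pretrained("model")      # reloaded later for evaluation
tokenizer.save_pretrained("model")  # and by the FastAPI service
```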

Evaluation

I saved and reloaded the trained model and evaluated it on the test data we reserved. I also calculated class-level accuracy, i.e. the proportion of samples in each class that are predicted correctly.

We achieved a test accuracy of about 87%.
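
A sketch of this evaluation loop is shown below, assuming the fine-tuned model was saved to a model directory and the held-out 10% was saved as test_split.csv (both names are illustrative; the real code is in Train.ipynb).

```python
# Sketch of test-set evaluation with overall and class-level accuracy.
# The model directory and test file name are illustrative assumptions.
from collections import defaultdict

import pandas as pd
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("model")
model = AutoModelForSequenceClassification.from_pretrained("model")
model.eval()

test_df = pd.read_csv("test_split.csv")  # hypothetical held-out 10%

correct, total = defaultdict(int), defaultdict(int)
with torch.no_grad():
    for text, label in zip(test_df["text"], test_df["label"]):
        inputs = tokenizer(text, truncation=True, return_tensors="pt")
        pred = model(**inputs).logits.argmax(dim=-1).item()
        total[label] += 1
        correct[label] += int(pred == label)

print(f"Test accuracy: {sum(correct.values()) / sum(total.values()):.2%}")
for label in sorted(total):  # class-level accuracy
    print(f"class {label}: {correct[label] / total[label]:.2%}")
```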

API

Please refer to main.py.

The API is built with FastAPI in Python. The user sends a POST request containing the query text, and the system recommends the class to which the text belongs with the highest probability.

As an additional feature, the system also returns the second most probable class. Please have a look at the example images provided.
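
The sketch below shows how such an endpoint can look in FastAPI, returning the top two classes with their probabilities. The /predict route and the response field names are assumptions; main.py has the actual implementation.

```python
# Sketch of the FastAPI service: POST a query, get back the two most
# probable classes. Route and field names are illustrative assumptions.
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

app = FastAPI()
tokenizer = AutoTokenizer.from_pretrained("model")
model = AutoModelForSequenceClassification.from_pretrained("model")
model.eval()

class Query(BaseModel):
    text: str

@app.post("/predict")
def predict(query: Query):
    inputs = tokenizer(query.text, truncation=True, return_tensors="pt")
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
    top = torch.topk(probs, k=2)  # best and second-best class
    return {
        "predicted_class": top.indices[0].item(),
        "probability": top.values[0].item(),
        "second_class": top.indices[1].item(),
        "second_probability": top.values[1].item(),
    }
```

If this code lives in main.py, the server can be started with `uvicorn main:app` and queried with a POST request whose JSON body looks like `{"text": "..."}`.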

Next Steps

To improve model accuracy, we can experiment with several approaches, such as hyperparameter tuning (e.g. a grid search along the lines of scikit-learn's GridSearchCV) or different architectures/models such as DistilBERT, RoBERTa, or LayoutLM.

Screenshot

[image1: example screenshot]
