This repository contains code for a text classification task using the Transformers library in PyTorch. The model is designed to classify text into three labels. It utilizes the Hugging Face Transformers library, which provides pre-trained transformer models for natural language processing tasks.
- Python 3.x
- PyTorch
- Transformers library (transformers)
Install the required dependencies using:
pip install -r requirements.txt
To train the text classification model, run the following command:
docker build -t your_image_name .
docker run --env-file .env your_image_name
If you have an ARM Machine, run the following command instead:
docker buildx build --platform linux/arm64 -t your_image_name .
Make sure to customize the training parameters, model architecture and HuggingFace Token on the .env file.
After training, you can use the trained model for text classification. You can execute commands within the running container using:
docker exec -it your-container-name python predict.py "Your input text goes here."
This will output the predicted label for the given text.
The model architecture used in this project is based on the Hugging Face Transformers library. You can find more information about the model on the Hugging Face Model Hub.
The training data used for this project is not included in this repository. Remember to upload the .csv file to the 'data' folder.
The metrics for the validation test are the following:
If you want to play with the classification model, excecute the following on your docker container and a tab will appear on your browser:
docker run -p <local_port>:<container_port> <your_image_name> streamlit run stream_app.py
If you would like to read about a different approach to the text classification problem, proposed by the author, please refer to the Notebook: Text_classifier_sentence_transformers.ipynb
Hugging Face Transformers: For providing state-of-the-art pre-trained transformer models. Neat tutorial for text NLP classification: https://www.youtube.com/watch?v=UNezAbTasdE&t=4s
Daniel Amaya