This repository contains a Flask web application for analyzing the 20 Newsgroups dataset. Using BERT, BART, and RoBERTa models, users can classify articles, generate summaries, and ask questions about an article's content.
- Article Classification with BERT: Efficiently categorize articles into predefined topics.
- Document Summarization with BART: Generate concise and coherent summaries of lengthy articles.
- Question Answering with RoBERTa: Extract specific information from articles by posing questions. Two RoBERTa variants are available for this task.
- Flask Framework: The application is built on Flask, offering a lightweight web server to interact with the models.
- BERT for Classification: BERT's bidirectional context encoding is used to classify articles into newsgroup topics.
- BART for Summarization: BART, a sequence-to-sequence model, is employed to condense articles into shorter summaries.
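
The pieces above could fit together roughly as follows. This is a minimal sketch, not the repository's actual `app.py`: the `/analyze` route, the `analyze` and `create_app` helpers, and the checkpoint names are all assumptions made for illustration. In particular, the classification checkpoint path is a hypothetical placeholder for a BERT model fine-tuned on 20 Newsgroups, and only one RoBERTa question-answering checkpoint is shown.

```python
from typing import Callable, Dict, Optional, Tuple

# (pipeline task, checkpoint) per feature. These checkpoints are assumptions;
# the repository may use different ones.
MODEL_SPECS: Dict[str, Tuple[str, str]] = {
    # Hypothetical path to a BERT model fine-tuned on 20 Newsgroups.
    "classify": ("text-classification", "path/to/bert-finetuned-20newsgroups"),
    "summarize": ("summarization", "facebook/bart-large-cnn"),
    "answer": ("question-answering", "deepset/roberta-base-squad2"),
}


def load_models() -> Dict[str, Callable]:
    """Build one Hugging Face pipeline per task (transformers imported lazily)."""
    from transformers import pipeline

    return {
        name: pipeline(task, model=checkpoint)
        for name, (task, checkpoint) in MODEL_SPECS.items()
    }


def analyze(models: Dict[str, Callable], article: str,
            question: Optional[str] = None) -> dict:
    """Classify and summarize an article; answer a question if one is given."""
    # truncation=True clips articles that exceed the model's input length.
    label = models["classify"](article, truncation=True)[0]["label"]
    summary = models["summarize"](article, truncation=True)[0]["summary_text"]
    result = {"label": label, "summary": summary}
    if question:
        answer = models["answer"](question=question, context=article)
        result["answer"] = answer["answer"]
    return result


def create_app(models: Dict[str, Callable]):
    """Wrap analyze() in a Flask app exposing a hypothetical /analyze route."""
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/analyze", methods=["POST"])
    def analyze_route():
        payload = request.get_json(force=True)
        return jsonify(
            analyze(models, payload["article"], payload.get("question"))
        )

    return app
```

Passing the models into `create_app` rather than loading them at import time keeps the route logic testable with lightweight stand-ins; a real run would call `create_app(load_models()).run()`.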
You can download the 20 Newsgroups dataset from this link.
- Python 3.x
- Flask
- PyTorch
- Transformers library
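
To confirm these prerequisites are in place before starting the server, a small check along these lines may help. It is only a sketch: the helper name `missing_packages` is made up for illustration, and the import names `flask`, `torch`, and `transformers` are the usual ones for the packages listed above.

```python
import importlib.util
import sys

# Import names for Flask, PyTorch, and the Transformers library.
REQUIRED = ["flask", "torch", "transformers"]


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_3(version_info=sys.version_info):
    """True when running under a Python 3.x interpreter."""
    return version_info[0] == 3
```

An empty list from `missing_packages()` means all three libraries are importable; any names it returns still need to be installed.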
Clone the repository:
git clone https://github.com/OthmanMohammad/20Newsgroups-QuestionAnswering-Summarization-BERT.git
Navigate to the project directory and install the required packages:
pip install -r requirements.txt
Start the Flask server:
python app.py
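
Once the server is running, a quick reachability check can confirm it is accepting connections. The sketch below assumes Flask's development-server default address, `http://127.0.0.1:5000/`; `app.py` may configure a different host or port, and the helper name `server_is_up` is hypothetical.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://127.0.0.1:5000/", timeout=2.0):
    """Return True if an HTTP server responds at `url`, False otherwise."""
    # Bypass any configured proxy, since the target is a local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status (e.g. 404 on "/").
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or otherwise unreachable.
        return False
```

Calling `server_is_up()` from a second terminal returns `True` as soon as the server responds at that address. The first start may take a while if model weights still need to be loaded.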