
NLU Project: Language Modeling & Natural Language Understanding

Author: Alejandro Enrique Barbi

This repository contains the coursework for the Natural Language Understanding (NLU) course at UniTN. The project is divided into two main sections: Language Modeling (LM) and Natural Language Understanding (NLU), each further split into two parts.

Project Structure

The repository is organized as follows:

  • LM (Language Modeling)

    • part_A: Implementation and analysis of RNN and LSTM-based language models. Includes experiments with Dropout and AdamW optimization.
    • part_B: Advanced LSTM regularization and optimization techniques, including Weight Tying, Variational Dropout, and Averaged SGD (ASGD).
    • LM_report.pdf: Detailed report of the Language Modeling experiments and results.
  • NLU (Natural Language Understanding)

    • part_A: Intent Classification and Slot Filling using Recurrent Neural Networks (RNN/LSTM) with various configurations (Base, Bidirectional, Dropout).
    • part_B: Intent Classification and Slot Filling using BERT-based models (bert-base-uncased).
    • NLU_report.pdf: Detailed report of the NLU experiments and results.
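Of the part_B techniques listed above, weight tying is the simplest to illustrate: the output projection shares its weight matrix with the input embedding, which requires the embedding and hidden dimensions to match. A minimal PyTorch sketch (an illustrative model, not the project's actual implementation):

```python
import torch
import torch.nn as nn

class TiedLM(nn.Module):
    """Toy LSTM language model with tied input/output embeddings."""

    def __init__(self, vocab_size, emb_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_size)
        # Hidden size must equal emb_size so the decoder weight
        # has the same shape as the embedding matrix.
        self.lstm = nn.LSTM(emb_size, emb_size, batch_first=True)
        self.decoder = nn.Linear(emb_size, vocab_size)
        # Weight tying: decoder and embedding share one parameter tensor.
        self.decoder.weight = self.embedding.weight

    def forward(self, tokens):
        emb = self.embedding(tokens)          # (batch, seq, emb_size)
        output, _ = self.lstm(emb)            # (batch, seq, emb_size)
        return self.decoder(output)           # (batch, seq, vocab_size)
```

Tying halves the number of embedding-related parameters and typically improves perplexity on word-level LM benchmarks.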

Prerequisites

The project requires Python 3.x and the following libraries:

  • pytorch (torch)
  • numpy
  • scikit-optimize (skopt)
  • tqdm
  • scikit-learn (sklearn)
  • transformers (specific to NLU Part B)

You can install the dependencies using pip:

pip install torch numpy scikit-optimize tqdm scikit-learn transformers

How to Run

Each part of the project has its own main.py, which runs training together with Bayesian hyperparameter optimization. To run the experiments, navigate to the root directory of the project and execute the corresponding script.

Language Modeling (LM)

Part A: Runs optimization for rnn, lstm, dropout, and dropout_adamw models.

python LM/part_A/main.py

Part B: Runs optimization for lstm, lstm_wt (Weight Tying), var_dropout (Variational Dropout), and avsgd (Averaged SGD).

python LM/part_B/main.py

Natural Language Understanding (NLU)

Part A: Runs optimization for base, bidirectional, and dropout RNN/LSTM models for Intent Classification and Slot Filling.

python NLU/part_A/main.py

Part B: Runs optimization for the BERT-based model (bert-base-uncased).

python NLU/part_B/main.py

Results

The training scripts use Bayesian Optimization (via skopt) to find the best hyperparameters. Results, including best hyperparameters and model checkpoints, are saved in the bin directory within each respective part folder (e.g., LM/part_A/bin/).

Reports

For a detailed explanation of the methodologies, model architectures, and experimental results, please refer to the PDF reports located in the respective directories:

  • LM/LM_report.pdf
  • NLU/NLU_report.pdf
