Author: Alejandro Enrique Barbi
This repository contains the coursework for the Natural Language Understanding (NLU) course at UniTN. The project is divided into two main sections: Language Modeling (LM) and Natural Language Understanding (NLU), each further split into two parts.
The repository is organized as follows:

- LM (Language Modeling)
  - part_A: Implementation and analysis of RNN- and LSTM-based language models. Includes experiments with Dropout and AdamW optimization.
  - part_B: Advanced LSTM architectures including Weight Tying, Variational Dropout, and Averaged SGD (ASGD).
  - LM_report.pdf: Detailed report of the Language Modeling experiments and results.
- NLU (Natural Language Understanding)
  - part_A: Intent Classification and Slot Filling using Recurrent Neural Networks (RNN/LSTM) with various configurations (Base, Bidirectional, Dropout).
  - part_B: Intent Classification and Slot Filling using BERT-based models (bert-base-uncased).
  - NLU_report.pdf: Detailed report of the NLU experiments and results.
The project requires Python 3.x and the following libraries:

- pytorch (torch)
- numpy
- scikit-optimize (skopt)
- tqdm
- scikit-learn (sklearn)
- transformers (specific to NLU Part B)
You can install the dependencies using pip:
```bash
pip install torch numpy scikit-optimize tqdm scikit-learn transformers
```

Each part of the project has its own main.py, which executes the training and Bayesian Optimization processes. To run the experiments, navigate to the root directory of the project and execute the corresponding script.
LM Part A:

Runs optimization for the rnn, lstm, dropout, and dropout_adamw models.

```bash
python LM/part_A/main.py
```

LM Part B:

Runs optimization for the lstm, lstm_wt (Weight Tying), var_dropout (Variational Dropout), and avsgd (Averaged SGD) models.

```bash
python LM/part_B/main.py
```

NLU Part A:

Runs optimization for the base, bidirectional, and dropout RNN/LSTM models for Intent Classification and Slot Filling.

```bash
python NLU/part_A/main.py
```

NLU Part B:

Runs optimization for the BERT-based model (bert-base-uncased).

```bash
python NLU/part_B/main.py
```

The training scripts use Bayesian Optimization (via skopt) to find the best hyperparameters.
Results, including best hyperparameters and model checkpoints, are saved in the bin directory within each respective part folder (e.g., LM/part_A/bin/).
For a detailed explanation of the methodologies, model architectures, and experimental results, please refer to the PDF reports located in the respective directories:
- LM/LM_report.pdf
- NLU/NLU_report.pdf