Code accompanying this project.
To use the system, install the following dependencies:
- PyTorch
- NumPy
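To install, clone the repository and set up a virtual environment by running the following commands in your terminal (the activation command shown is for Windows; on Linux or macOS, use source venv/bin/activate instead):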
git clone https://github.com/JGuymont/vae-anomaly-detector.git
cd vae-anomaly-detector
python -m venv venv
venv\Scripts\activate
python -m pip install --upgrade pip
Install PyTorch (see the PyTorch website for a GPU-supported installation):
pip install torch==1.2.0+cpu torchvision==0.4.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
Install other requirements:
pip install -r requirements.txt
You can download the dataset from Kaggle. The SMS Spam Collection is a set of tagged SMS messages collected for SMS spam research. It contains 5,574 SMS messages in English, each tagged as either ham (legitimate) or spam. The file spam.csv contains one message per line. Each line is composed of two columns: v1 contains the label (ham or spam) and v2 contains the raw text.
To reproduce the experiment without changing the configuration, you should save the file spam.csv in the data/ directory.
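As a quick sanity check on the file layout described above, the following minimal sketch reads rows with a v1 label column and a v2 text column and counts the labels. It is not part of the repository: the function name load_messages and the sample rows are hypothetical, and the sample is read from an in-memory string so that the snippet runs without the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows in the layout described above
# (v1 = label, v2 = raw text). These are NOT taken from the real dataset.
SAMPLE = """v1,v2
ham,"Ok, see you at lunch"
spam,WINNER! Claim your free prize now
ham,Running late
"""


def load_messages(handle):
    """Return a list of (label, text) pairs from a CSV with v1 and v2 columns."""
    reader = csv.DictReader(handle)
    return [(row["v1"], row["v2"]) for row in reader]


messages = load_messages(io.StringIO(SAMPLE))
label_counts = Counter(label for label, _ in messages)
print(label_counts)
```

To try it on the real file, pass an open file handle for data/spam.csv instead of the in-memory sample. The file's text encoding is not stated here; if the default encoding fails, passing an explicit encoding such as latin-1 to open() is one thing to try, though that is an assumption.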
Start by splitting the data into a training set and a test set.
python split_data.py --train_size 0.5
Running this command in the terminal will create train.csv and test.csv and save them in the data/ directory. Note that you only need to run this command once.
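To illustrate what a train_size of 0.5 means, here is a minimal sketch of a fractional train/test split. It does not reproduce split_data.py: the function split_rows, its seed parameter, and the shuffling step are assumptions made for illustration, and the script in the repository may split the data differently.

```python
import random


def split_rows(rows, train_size=0.5, seed=0):
    """Shuffle rows and split them into (train, test) by the given fraction.

    Hypothetical helper for illustration only; not the repository's code.
    """
    if not 0.0 < train_size < 1.0:
        raise ValueError("train_size must be strictly between 0 and 1")
    shuffled = list(rows)                   # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)   # seeded for a reproducible split
    cut = int(len(shuffled) * train_size)   # number of rows in the training set
    return shuffled[:cut], shuffled[cut:]


# Ten placeholder rows in (label, text) form.
rows = [("ham", f"message {i}") for i in range(10)]
train, test = split_rows(rows, train_size=0.5)
print(len(train), len(test))  # 5 5
```

With train_size 0.5, half of the rows go to the training set and the rest to the test set. Fixing the seed keeps the split identical across runs, which matters when comparing results between experiments.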
Train the model by running the following command:
python main.py --model boc