GraphLog: Execution Anomaly Detection for System Logs

An execution anomaly detection method based on a variable-order network representation. This README explains how to run GraphLog and the baseline methods and how to reproduce the evaluation results. It covers three parts: the datasets, GraphLog, and the baseline methods.

Note: This repo does not include log parsing. If you need it, please check logparser.

Dataset

The preprocessed datasets are provided for evaluation. We used two datasets:

  • OpenStackLog: We collected this dataset from an OpenStack deployment on CloudLab, a testbed for research and education in cloud computing. It contains 174,725 collected logs. After preprocessing, it contains 6,000 normal sessions, 500 abnormal sessions, and 36 event templates. Details of the dataset can be found in our paper.
  • HDFS: The HDFS dataset was collected by running Hadoop-based jobs on more than 200 Amazon EC2 nodes and was labeled by Hadoop domain experts. The dataset contains 11,175,629 logs, which were parsed into 558,223 normal sequences and 16,838 abnormal sequences (2.9%). Details of the dataset can be found here.
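The class balance of the two datasets follows directly from the session counts quoted above. As a minimal sketch (the helper name `abnormal_share` is hypothetical and not part of this repo), the percentages can be checked like this:

```python
def abnormal_share(normal, abnormal):
    """Return the percentage of abnormal sessions among all sessions."""
    total = normal + abnormal
    return 100.0 * abnormal / total


# Session counts quoted in the dataset descriptions (normal, abnormal).
DATASETS = {
    "OpenStackLog": (6_000, 500),
    "HDFS": (558_223, 16_838),
}

for name, (normal, abnormal) in DATASETS.items():
    print(f"{name}: {abnormal_share(normal, abnormal):.1f}% abnormal")
```

Both datasets are heavily imbalanced toward normal sessions, so metrics such as precision, recall, and F1 are usually more informative than plain accuracy.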

Requirements

  • python >= 3.6
  • pytorch >= 1.1.0
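Before running anything, it can help to confirm that the local environment meets these minimums. The following is a sketch only: the helpers `parse_version` and `meets_minimum` are hypothetical and not part of this repo, and the check assumes PyTorch is importable as `torch`.

```python
import re
import sys

MIN_PYTHON = (3, 6)
MIN_TORCH = (1, 1, 0)


def parse_version(text):
    """Turn a version string such as '1.13.1+cu117' into (1, 13, 1)."""
    parts = []
    # Drop any local build suffix after '+', then read the leading digits
    # of each dot-separated piece; stop at the first non-numeric piece.
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum):
    """Return True if the version string is at least the minimum tuple."""
    found = parse_version(version)
    # Pad short versions such as '1.1' with zeros before comparing.
    found += (0,) * (len(minimum) - len(found))
    return found >= minimum


if __name__ == "__main__":
    print("python ok:", sys.version_info[:2] >= MIN_PYTHON)
    try:
        import torch
        print("pytorch ok:", meets_minimum(str(torch.__version__), MIN_TORCH))
    except ImportError:
        print("pytorch is not installed")
```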

Quick start

git clone https://github.com/hniu1/GraphLog.git
cd GraphLog

GraphLog

This section shows how to run GraphLog.

cd GraphLog/
# Training
python AD_log_openstacklog.py train
# Testing
python AD_log_openstacklog.py predict

Set 'training_ratio' in the code to change the percentage of normal data used as training data.
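To illustrate what a training ratio means here, the sketch below keeps a fraction of the normal sessions for training and holds out the rest. It is not the repo's implementation: the function `split_normal_sessions`, the shuffling, the fixed seed, and the toy sessions are all assumptions made for illustration, and the actual scripts may split the data differently.

```python
import random


def split_normal_sessions(normal_sessions, training_ratio, seed=0):
    """Split normal sessions into a training part and a held-out part.

    Hypothetical helper: shuffles with a fixed seed (an assumption) and
    keeps the first `training_ratio` fraction for training.
    """
    if not 0.0 < training_ratio <= 1.0:
        raise ValueError("training_ratio must be in (0, 1]")
    shuffled = list(normal_sessions)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * training_ratio)
    return shuffled[:n_train], shuffled[n_train:]


# Toy example: each session is a sequence of event template IDs.
toy_sessions = [[1, 2, 3], [1, 2, 4], [2, 3, 5], [1, 3, 5], [4, 5, 6],
                [1, 2, 3], [2, 4, 6], [1, 5, 6], [3, 4, 5], [1, 2, 6]]
train, held_out = split_normal_sessions(toy_sessions, training_ratio=0.8)
print(len(train), "training sessions,", len(held_out), "held out")
```

Because only normal sessions are used for training in this sketch, a lower ratio leaves more normal sessions available for evaluation alongside the abnormal ones.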

Baseline methods

Four baseline methods are used: PCA, InvariantsMiner, LogCluster, and DeepLog.

We use two open-source machine learning-based log analysis toolkits, Loglizer and logdeep, for the baseline methods.

PCA

In PCA_OpenStackLog.py, first set training_ratio. Then,

cd baselines/
python PCA_OpenStackLog.py

InvariantsMiner

In InvariantsMiner_OpenStackLog.py, first set training_ratio. Then,

cd baselines/
python InvariantsMiner_OpenStackLog.py

LogCluster

In LogClustering_OpenStackLog.py, first set training_ratio. Then,

cd baselines/
python LogClustering_OpenStackLog.py

DeepLog

In deeplog_OpenStackLog.py, first set training_ratio. Then,

cd baselines/deeplog/
# training
python deeplog_OpenStackLog.py train
# testing
python deeplog_OpenStackLog.py predict
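The baseline commands above can also be collected into one driver script. This is a sketch under assumptions: the `run_baselines` helper is hypothetical, and the working directories are assumed to be relative to wherever `baselines/` lives in your checkout. By default it only lists the commands; pass `execute=True` to actually run them after setting training_ratio in each script.

```python
import subprocess

# (working directory, command) pairs taken from the sections above.
BASELINE_STEPS = [
    ("baselines", ["python", "PCA_OpenStackLog.py"]),
    ("baselines", ["python", "InvariantsMiner_OpenStackLog.py"]),
    ("baselines", ["python", "LogClustering_OpenStackLog.py"]),
    ("baselines/deeplog", ["python", "deeplog_OpenStackLog.py", "train"]),
    ("baselines/deeplog", ["python", "deeplog_OpenStackLog.py", "predict"]),
]


def run_baselines(execute=False):
    """List the baseline commands in order, running each if execute is True."""
    described = []
    for cwd, argv in BASELINE_STEPS:
        described.append("(cd {} && {})".format(cwd, " ".join(argv)))
        if execute:
            # check=True stops at the first failing baseline.
            subprocess.run(argv, cwd=cwd, check=True)
    return described


# Dry run: print the commands without executing anything.
print("\n".join(run_baselines()))
```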
