Fix Pattern-Aware Vulnerability Patch Generation via In-Context Learning

Overview

This repository contains the Python implementation of PailGen. As described in our paper, PailGen is a novel automatic vulnerability patch generation approach that integrates retrieval-augmented fix pattern mining with in-context learning.

Setting up the environment

Set up the environment with the following commands:

conda create -n PailGen python=3.9.7
conda activate PailGen
pip install transformers
pip install torch
pip install numpy
pip install tqdm
pip install pandas
pip install tokenizers
pip install datasets
pip install gdown
pip install tensorboard
pip install scikit-learn
pip install tree-sitter
pip install tree-sitter-c
pip install codebleu

Alternatively, we provide requirements.txt, which pins the package versions for reproducibility. Install from it with the following command:

pip install -r requirements.txt

Data preprocessing

python preprocess_data.py

Preprocessing the dataset produces two .csv files: train.csv and test.csv.
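Before moving on, it can help to confirm that both files were written and are readable. The sketch below is not part of the repository. It assumes the two files are in the current directory and have a header row, and it makes no assumption about the column names. The helper name summarize_csv is hypothetical.

```python
import csv
from pathlib import Path

# Source code stored in CSV cells can exceed the default field size limit,
# so raise it to a large value that is safe on all platforms.
csv.field_size_limit(2**31 - 1)


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        row_count = sum(1 for _ in reader)
    return header, row_count


if __name__ == "__main__":
    # Assumption: preprocess_data.py wrote its output to the current directory.
    for name in ("train.csv", "test.csv"):
        if Path(name).exists():
            columns, rows = summarize_csv(name)
            print(f"{name}: {rows} rows, columns = {columns}")
        else:
            print(f"{name}: not found (run preprocess_data.py first)")
```

The standard csv module is used here only to keep the check free of dependencies. pandas.read_csv works equally well once the environment is installed.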

Generate fix patterns

cd fix_patterns
python generate_patterns.py

The above command generates fix patterns from the retrieved vulnerability-fix cases. The file retrieved_results_bigvul_cvefixes_top50.json contains the output of our hybrid retriever: for each vulnerable code sample, it lists the 50 most relevant vulnerability-fix pairs. We follow DPR (Dense Passage Retrieval) to train and test our hybrid retriever.
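The layout of the retrieved-results file is not documented here, so a quick way to learn it is to print the shape of the parsed JSON. The sketch below is schema-agnostic: it reports container types, sizes, and the first few keys without assuming any field names. The function name describe and the file location (inside fix_patterns) are assumptions, not part of the repository.

```python
import json
from pathlib import Path


def describe(value, label="root", depth=0, max_depth=3):
    """Return lines that describe the shape of a parsed JSON value."""
    pad = "  " * depth
    if isinstance(value, dict):
        lines = [f"{pad}{label}: dict with {len(value)} keys"]
        if depth < max_depth:
            for key in list(value)[:5]:  # show at most five keys per level
                lines.extend(describe(value[key], repr(key), depth + 1, max_depth))
        return lines
    if isinstance(value, list):
        lines = [f"{pad}{label}: list with {len(value)} items"]
        if value and depth < max_depth:
            # Only the first item is inspected; the rest are assumed similar.
            lines.extend(describe(value[0], "[0]", depth + 1, max_depth))
        return lines
    return [f"{pad}{label}: {type(value).__name__}"]


if __name__ == "__main__":
    # Assumption: the file sits in the current directory (e.g. fix_patterns).
    path = Path("retrieved_results_bigvul_cvefixes_top50.json")
    if path.exists():
        with path.open(encoding="utf-8") as handle:
            print("\n".join(describe(json.load(handle))))
    else:
        print(f"{path} not found")
```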

cd ..
python process_prompt_data.py

Execute the above command to obtain all components of the LLM's prompt.
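To illustrate what such prompt components might look like once assembled, here is a minimal sketch of an in-context prompt builder. It is not the code in process_prompt_data.py. The function name build_prompt, the three inputs (fix patterns, demonstration pairs, target code), and the wording of the instructions are all assumptions made for illustration.

```python
def build_prompt(vulnerable_code, fix_patterns, examples):
    """Assemble a hypothetical in-context prompt.

    fix_patterns: list of short natural-language pattern descriptions.
    examples: list of (vulnerable, fixed) code pairs used as demonstrations.
    """
    parts = ["Rewrite the vulnerable function below so that the vulnerability is fixed."]
    if fix_patterns:
        numbered = [f"{i}. {p}" for i, p in enumerate(fix_patterns, start=1)]
        parts.append("Candidate fix patterns:\n" + "\n".join(numbered))
    for i, (vulnerable, fixed) in enumerate(examples, start=1):
        parts.append(f"Example {i} (vulnerable):\n{vulnerable}")
        parts.append(f"Example {i} (fixed):\n{fixed}")
    # The target comes last so the model continues directly with the patch.
    parts.append(f"Vulnerable code:\n{vulnerable_code}")
    parts.append("Fixed code:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(
        build_prompt(
            "char buf[8]; strcpy(buf, src);",
            ["Add a bounds check before copying"],
            [("gets(buf);", "fgets(buf, sizeof(buf), stdin);")],
        )
    )
```

Placing the demonstrations before the target code follows the usual in-context learning layout. The actual ordering used by PailGen may differ.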

Patch generation

python llm_api_call_augment.py

The above command will generate candidate repair patches.

Calculate metrics

python calculate_combined_metrics.py
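The metrics computed by this script are not listed here. The environment installs the codebleu package, which suggests CodeBLEU is among them. As a simpler, purely hypothetical illustration of how generated patches can be scored, the sketch below computes a whitespace-insensitive exact-match rate over several candidates per sample. The function names and the matching rule are assumptions, not the repository's implementation.

```python
def normalize(code):
    """Collapse all runs of whitespace so formatting differences are ignored."""
    return " ".join(code.split())


def exact_match_rate(candidates_per_sample, references):
    """Fraction of samples where at least one candidate equals the reference."""
    if len(candidates_per_sample) != len(references):
        raise ValueError("each reference needs its own list of candidates")
    if not references:
        return 0.0
    hits = 0
    for candidates, reference in zip(candidates_per_sample, references):
        target = normalize(reference)
        if any(normalize(candidate) == target for candidate in candidates):
            hits += 1
    return hits / len(references)


if __name__ == "__main__":
    candidates = [["int a = 0;", "int a = 1;"], ["return x;"]]
    references = ["int a   = 0;", "return y;"]
    print(exact_match_rate(candidates, references))
```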

Acknowledgements

Special thanks to the authors of VulMaster (Zhou et al.).
Special thanks to the authors of TypeFix (Peng et al.).
Special thanks to the dataset providers of CVEFixes (Bhandari et al.), Big-Vul (Fan et al.), and D2A (Zheng et al.).
