The dataset and code are for research purposes only.
An analysis of current function boundary detection methods for stripped binaries, evaluated on executables compiled from Rust source code.
The main components of this repository are:
- Ripkit: tool for cloning Rust crates and compiling them into binaries
- XDA: implementation of the function boundary detector from "XDA: Accurate, Robust Disassembly with Transfer Learning"
- ghidra_bench: tool for benchmarking Ghidra
When reporting results that use the dataset or code in this repository, please cite the paper below:
Ryan Evans, William Hawkins, and Boyang Wang, "RustBound: Function Boundary Detection over Rust Stripped Binaries," The 2nd EAI International Conference on Security and Privacy in Cyber-Physical Systems and Smart Vehicles (SmartSP 2024), New Orleans, LA, USA, Nov. 7 - Nov. 8, 2024.
Our datasets used in this study can be accessed through the link below (last modified: Nov. 2025):
Note: the above link needs to be updated every 6 months due to certain OneDrive settings. If you find the link has expired and you cannot access the data, please feel free to email us (Dr. Boyang Wang, boyang.wang@uc.edu). We will update the link as soon as we can (typically within 1~2 days). Thanks!
Ripkit can:
- Clone Rust crates and compile them for various targets
- Save the produced Rust binaries in a db
- Export Rust datasets
- Profile those datasets
- Use Ghidra or IDA to analyze function boundary detection on those datasets
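
Analyzing function boundary detection comes down to comparing the addresses a tool reports against ground truth from the unstripped binary. The following is a minimal sketch of one common way to score that comparison (precision, recall, and F1 over function start addresses). The function name `boundary_scores` and its inputs are hypothetical and chosen for illustration; this is not Ripkit's API, and Ripkit's own metrics may be computed differently.

```python
from typing import Dict, Iterable


def boundary_scores(predicted: Iterable[int], truth: Iterable[int]) -> Dict[str, float]:
    """Score predicted function start addresses against ground truth.

    Both arguments are collections of addresses (hypothetical inputs,
    e.g. parsed from a tool's output and from the unstripped symbol table).
    """
    pred, true = set(predicted), set(truth)
    tp = len(pred & true)   # starts the tool found correctly
    fp = len(pred - true)   # starts the tool reported that do not exist
    fn = len(true - pred)   # real starts the tool missed

    precision = tp / (tp + fp) if pred else 0.0
    recall = tp / (tp + fn) if true else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = boundary_scores(
        predicted=[0x1000, 0x1010, 0x1030],
        truth=[0x1000, 0x1010, 0x1020, 0x1040],
    )
    print(scores)
```

The same function can be applied to function end addresses, or to (start, end) pairs if a match should require both bounds to be correct.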
See the README.md file in the ripkit directory for ripkit installation and setup.
Input: The BiRNN requires ".npy" files for training. These can be generated using the command:

    cli.py gen-npzs

This will extract the .text section of the provided binaries and generate feature vectors for the model.
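
To make the idea of turning .text bytes into model input more concrete, here is a rough sketch that pairs each byte with a boundary label. It assumes a per-byte labeling scheme (start, end, or neither) and that function addresses and sizes are already known from the unstripped binary. The names `label_text_bytes`, `NOT_BOUNDARY`, `FUNC_START`, and `FUNC_END` are hypothetical; the actual feature format produced by `gen-npzs` may differ.

```python
from typing import Iterable, List, Tuple

# Hypothetical label values; the real pipeline may encode classes differently.
NOT_BOUNDARY, FUNC_START, FUNC_END = 0, 1, 2


def label_text_bytes(
    text: bytes,
    text_addr: int,
    functions: Iterable[Tuple[int, int]],
) -> List[Tuple[int, int]]:
    """Pair every byte of .text with a boundary label.

    `text` is the raw content of the .text section, `text_addr` is the
    address of its first byte, and `functions` holds
    (start_address, size_in_bytes) pairs.
    """
    labels = [NOT_BOUNDARY] * len(text)
    for start, size in functions:
        first = start - text_addr
        last = first + size - 1
        if size <= 0 or first < 0 or last >= len(text):
            continue  # skip functions that fall outside .text
        labels[last] = FUNC_END
        labels[first] = FUNC_START  # a one-byte function is labeled as a start
    return list(zip(text, labels))


if __name__ == "__main__":
    section = bytes([0x55, 0x48, 0xC3, 0x90, 0xC3])
    for byte, label in label_text_bytes(section, 0x1000, [(0x1000, 3), (0x1003, 2)]):
        print(f"{byte:#04x} {label}")
```

A list of (byte, label) pairs keeps the sketch dependency-free; in practice the two columns would likely be stored as arrays before being saved for training.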
Training: Use the command:

    cli.py train-on

Testing: Use the command:

    cli.py test-on-nongpu

There are some preprocessing steps that are best explained by XDA's GitHub repo.
Once those are done, the CLI tool ryan_cli.py is what I used to train, test, and log experiments.
Ripkit has just one function to support extracting function bounds from a file uploaded to IDA. See the ripkit 'ida' command:

    python ripkit/main.py ida

Similarly, Ghidra is supported as well:

    python ripkit/main.py ghidra

Ryan Evans evans2ra@mail.uc.edu
Boyang Wang boyang.wang@uc.edu