Buddy Benchmark is an extensible benchmark framework. We intend to provide a platform for performance comparison of various frameworks and optimizers. This project is based on Google Benchmark.
Clone the project:
$ git clone git@github.com:buddy-compiler/buddy-benchmark.git
$ git submodule update --init
$ cd buddy-benchmark/thirdparty/opencv
$ mkdir build && cd build
$ cmake -G Ninja .. -DCMAKE_BUILD_TYPE=Release
$ ninja
Currently, the image processing benchmark includes the following frameworks or optimizers:
- OpenCV (link)
NOTE: Please build OpenCV from source to achieve the best performance.
NOTE: Please make sure the buddy-opt tool from the buddy-mlir project works correctly.
Run the image processing benchmark:
| CMake Options | Default Value |
|---|---|
| -DBUDDY_OPT_STRIP_MINING | 256 |
| -DMLIR_LINALG_TILE | 2 |
| -DBUDDY_OPT_ATTR | avx512f |
| -DBUDDY_OPT_TRIPLE | x86_64-unknown-linux-gnu |
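The default value of -DBUDDY_OPT_ATTR, avx512f, only suits CPUs that report that feature. The following is a minimal sketch of one way to pick a value on Linux by reading /proc/cpuinfo. The helper name pick_buddy_opt_attr and the avx2 fallback are illustrative assumptions, not part of the project; whether a given attribute works depends on your buddy-mlir build.

```shell
#!/bin/sh
# Hypothetical helper: suggest a value for -DBUDDY_OPT_ATTR from CPU flags.
# Reads /proc/cpuinfo by default (Linux only); a different file can be
# passed as the first argument.
pick_buddy_opt_attr() {
  cpuinfo="${1:-/proc/cpuinfo}"
  if grep -qw avx512f "$cpuinfo" 2>/dev/null; then
    echo avx512f        # matches the documented default
  elif grep -qw avx2 "$cpuinfo" 2>/dev/null; then
    echo avx2           # assumed fallback; not confirmed by the project
  else
    echo ""             # unknown: leave the choice to the user
  fi
}

attr="$(pick_buddy_opt_attr)"
if [ -n "$attr" ]; then
  echo "Suggested option: -DBUDDY_OPT_ATTR=$attr"
else
  echo "No known SIMD flag detected; set -DBUDDY_OPT_ATTR manually."
fi
```

The script only prints a suggestion; pass the chosen option to cmake yourself.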
Note:
1. Please replace /PATH/TO/* with your local paths.
2. When running the executable:
i. Replace <image path> with the path of the image to use for benchmarking.
ii. Replace <kernel name> with the name of the kernel to use for benchmarking, as specified in include/ImageProcessing/Kernels.h.
iii. Replace <kernelmorph name> with the name of the unsigned int kernel to use for benchmarking, as specified in include/ImageProcessing/Kernels.h.
iv. Replace <Boundary Option> with CONSTANT_PADDING or REPLICATE_PADDING.
Example: ./image-processing-benchmark ../../benchmarks/ImageProcessing/Images/YuTu.png random3x3KernelAlign random3x3KernelAlignInt CONSTANT_PADDING
$ cd buddy-benchmark
$ mkdir build && cd build
$ cmake -G Ninja .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DIMAGE_PROCESSING_BENCHMARKS=ON \
-DOpenCV_DIR=/PATH/TO/OPENCV/BUILD/ \
-DEIGEN_DIR=/PATH/TO/EIGEN/SOURCE/CODE \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD/
$ ninja image-processing-benchmark
$ cd bin && ./image-processing-benchmark <image path> <kernel name> <kernelmorph name> <Boundary Option>
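To compare both boundary options with the same image and kernels, the run command can be generated in a loop. This is a minimal sketch: the helper name benchmark_cmd is hypothetical, the image and kernel names are taken from the example in this README, and the script only prints the commands so that you can run them from build/bin.

```shell
#!/bin/sh
# Build the command line for one benchmark run.
# Arguments: <image path> <kernel name> <kernelmorph name> <Boundary Option>
benchmark_cmd() {
  case "$4" in
    CONSTANT_PADDING|REPLICATE_PADDING) ;;
    *) echo "unsupported boundary option: $4" >&2; return 1 ;;
  esac
  echo "./image-processing-benchmark $1 $2 $3 $4"
}

# Values taken from the example in this README.
IMAGE=../../benchmarks/ImageProcessing/Images/YuTu.png
KERNEL=random3x3KernelAlign
KERNEL_MORPH=random3x3KernelAlignInt

# Print one command per supported boundary option (nothing is executed).
for boundary in CONSTANT_PADDING REPLICATE_PADDING; do
  cmd="$(benchmark_cmd "$IMAGE" "$KERNEL" "$KERNEL_MORPH" "$boundary")" || exit 1
  echo "$cmd"
done
```

Rejecting unknown boundary options before the run avoids starting a benchmark with an argument the executable does not document.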
| CMake Options | Default Value |
|---|---|
| -DBUDDY_OPT_ATTR | avx512f |
| -DBUDDY_OPT_TRIPLE | x86_64-unknown-linux-gnu |
Note: Please replace the /PATH/TO/* with your local path.
$ cd buddy-benchmark
$ git lfs pull
$ mkdir build && cd build
$ cmake -G Ninja .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DDEEP_LEARNING_BENCHMARKS=ON \
-DOpenCV_DIR=$PWD/../thirdparty/opencv/build/ \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD/
$ ninja
The deep learning benchmark includes the following e2e models and operations:
- MobileNet
We generated the model code with IREE and made appropriate modifications, and then compiled it with the MLIR tool chain.
Run the MobileNet benchmark:
$ cd <path to build>/bin && ./mobilenet-benchmark
- DepthwiseConv2DNhwcHwc Operation
Run the DepthwiseConv2DNhwcHwc operation benchmark:
$ cd <path to build>/bin && ./depthwise-conv-2d-nhwc-hwc-benchmark
Currently, the audio processing benchmark includes the following frameworks or optimizers:
- KFR (link)
Note: Please replace the /PATH/TO/* with your local path.
$ cd buddy-benchmark
$ mkdir build && cd build
$ cmake -G Ninja .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DAUDIO_PROCESSING_BENCHMARKS=ON \
-DCMAKE_CXX_COMPILER=clang++ \
-DKFR_DIR=/PATH/TO/KFR/SOURCE/CODE \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD/
$ ninja audio-processing-benchmark
$ cd bin
$ ./audio-processing-benchmark
To better demonstrate the result after processing, we provide a tool for figure plotting. To use this tool, you have to make sure that you are using python3 and that the numpy, matplotlib and scipy packages have been installed properly. Use the following command to install the required packages:
$ pip install matplotlib scipy
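Before building the plotting tool, it can help to confirm that all three packages are importable by the interpreter you intend to use. The sketch below is one way to do that; the function name check_python_modules and the PYTHON environment variable are assumptions made for illustration and are not part of the project.

```shell
#!/bin/sh
# Report which of the given Python modules cannot be imported.
# Uses the interpreter named in $PYTHON (hypothetical variable), else python3.
# Returns 0 if every module imports cleanly, 1 otherwise.
check_python_modules() {
  py="${PYTHON:-python3}"
  missing=0
  for mod in "$@"; do
    if ! "$py" -c "import $mod" >/dev/null 2>&1; then
      echo "missing: $mod"
      missing=1
    fi
  done
  return "$missing"
}

if check_python_modules numpy matplotlib scipy; then
  echo "all plotting dependencies found"
fi
```

If a module is reported as missing, install it with pip for that same interpreter.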
You can customize the python3 path by adding the option -DPYTHON_BINARY_DIR=/PATH/TO/PYTHON/BIN while building:
Note: Please replace the /PATH/TO/* with your local path.
$ cd build
$ cmake -G Ninja .. \
-DAUDIO_PROCESSING_BENCHMARKS=ON \
-DCMAKE_CXX_COMPILER=clang++ \
-DKFR_DIR=/PATH/TO/KFR/SOURCE/CODE \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD \
-DPYTHON_BINARY_DIR=/PATH/TO/PYTHON/BIN/
$ ninja audio-plot
Once the processing is done, you can use this tool to plot a comparison figure:
$ cd bin
$ ./audio-plot ../../benchmarks/AudioProcessing/Audios/NASA_Mars.wav ResultKFRIir.wav
The result is saved in bin/res.png. For more usage details, run audio-plot -h.
Some of the benchmarks are ported from gcc-loops(link) in the LLVM test suite and from linpackc(link).
Note: Please replace /PATH/TO/* with your local paths and XXX with a specific target name (e.g. gccloops, linpackc, matrix).
$ cd buddy-benchmark
$ mkdir build && cd build
$ cmake -G Ninja .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DVECTORIZATION_BENCHMARKS=ON \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD/
$ ninja vectorization-XXX-benchmark
$ cd bin
$ ./vectorization-XXX-benchmark
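To cover every target named in the note above without editing XXX by hand, the target names can be generated in a loop. This is a minimal sketch; the helper name vectorization_target is hypothetical, and the script only prints the commands, which are meant to be run from the build directory.

```shell
#!/bin/sh
# Map a short name (gccloops, linpackc, matrix) to the ninja target name,
# which is also the name of the executable in bin/.
vectorization_target() {
  echo "vectorization-$1-benchmark"
}

# Print the build and run commands for each target (nothing is executed).
for name in gccloops linpackc matrix; do
  target="$(vectorization_target "$name")"
  echo "ninja $target && ./bin/$target"
done
```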
Currently, we use the Spike simulator to run the Gemmini cases. The cycle-accurate benchmark cases are a work in progress. Before building the benchmark target, please check the following table and make sure you use the correct configuration.
| Cases | Hardware Configuration |
|---|---|
| Gemmini-ResNet-101 | defaultFpConfig (link) |
We assume you have already built all the components in the Gemmini README file. Now, let's build and run the cases.
$ source /path/to/chipyard/env.sh
$ cd buddy-benchmark
$ mkdir build && cd build
$ cmake -G Ninja .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD/ \
-DGEMMINI_BENCHMARKS=ON
$ ninja
$ cd bin
$ spike --extension=gemmini pk Gemmini-ResNet-101
Build and run MLIR operation optimization benchmark cases.
$ mkdir build && cd build
$ cmake -G Ninja .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DOP_OPTIMIZATION_BENCHMARKS=ON \
-DCMAKE_CXX_COMPILER=clang++ \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD/
$ ninja <your target operation benchmark>
// Supported operation benchmarks include:
// - conv2d-nchw-fchw-benchmark
// - matmul-benchmark
OpenMP and lld LTO are required by matmul-benchmark. To ensure version compatibility with the project, we recommend using the LLVM toolchain built within buddy-mlir. Follow the steps below:
- Build the LLVM toolchain with lld and OpenMP.
$ cd buddy-mlir/llvm/build
$ cmake -G Ninja ../llvm \
-DLLVM_ENABLE_PROJECTS="mlir;clang;lld;openmp" \
-DLLVM_TARGETS_TO_BUILD="host;RISCV" \
-DLLVM_ENABLE_ASSERTIONS=ON \
-DLLVM_ENABLE_RUNTIMES=all \
-DOPENMP_ENABLE_LIBOMPTARGET=OFF \
-DCMAKE_BUILD_TYPE=RELEASE
- Use the clang++ in buddy-mlir/llvm/build/bin.
$ mkdir build && cd build
$ cmake -G Ninja .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DOP_OPTIMIZATION_BENCHMARKS=ON \
-DCMAKE_CXX_COMPILER=/PATH/TO/BUDDY-MLIR/BUILD/bin/clang++ \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD/
$ ninja matmul-benchmark
matmul-benchmark needs to load the libomp.so in buddy-mlir/llvm/build/lib to execute. Here is a temporary way to do this without root access:
$ export LD_LIBRARY_PATH=/PATH/TO/BUDDY-MLIR/BUILD/lib/:$LD_LIBRARY_PATH
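A slightly more defensive variant of the export above checks that libomp.so is actually present and avoids adding the directory twice or leaving an empty entry when LD_LIBRARY_PATH was previously unset (the dynamic loader may treat an empty entry as the current directory). This is a sketch only; the helper name prepend_path is hypothetical, and LIB_DIR uses the placeholder path from this README.

```shell
#!/bin/sh
# Prepend a directory to a colon-separated path list unless it is already
# present. Prints the resulting list; does not modify the environment itself.
prepend_path() {
  dir="$1"
  list="$2"
  case ":$list:" in
    *":$dir:"*) echo "$list" ;;
    *) if [ -n "$list" ]; then echo "$dir:$list"; else echo "$dir"; fi ;;
  esac
}

LIB_DIR=/PATH/TO/BUDDY-MLIR/BUILD/lib   # placeholder; replace with your path
if [ -e "$LIB_DIR/libomp.so" ]; then
  LD_LIBRARY_PATH="$(prepend_path "$LIB_DIR" "${LD_LIBRARY_PATH:-}")"
  export LD_LIBRARY_PATH
else
  echo "libomp.so not found in $LIB_DIR; check the path" >&2
fi
```

Source the script (for example with `. ./script.sh`) rather than executing it, so that the exported variable is visible in the shell that runs matmul-benchmark.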
Run TVM operation optimization benchmark cases.
- Install TVM (steps).
- Enter your TVM (virtual) environment.
- Configure TVM path and Python path.
- Navigate to your target operation directory (e.g. buddy-benchmark/benchmarks/OpOptimization/MatMul/TVM).
- (Optional) Configure the main file to specify the target or size of the benchmark.
- Run the main python file.
(tvm)$ export TVM_HOME=/path/to/tvm
(tvm)$ export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}
(tvm)$ cd benchmarks/OpOptimization/<target operation>/TVM
(tvm)$ python main.py