Beating main-memory bandwidths in geospatial pipelines with fast in-memory compression
MRes Project
Author: Omar Tanner (omsst2)
Report: here
codecs/*: core codecs implementing the codecs/generic_codecs.h interface
Main programs:
- test_comp.cpp: test codecs
- bench_comp.cpp: benchmark codecs
- bench_pipeline.cpp: benchmark geospatial pipelines
Additional files:
- codec_collection.h: bundled codecs
- test_remappings.cpp: verifies Morton remappings
- util.h, transformations.h, remappings.h: C++ utilities
- py/*: Python utilities
- sh/*: Shell performance-monitoring utilities
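For readers unfamiliar with the Morton remappings mentioned above, here is a minimal sketch of the general idea: interleave the bits of a cell's x and y coordinates to get its position along a Z-order curve, then reorder a row-major tile by that position. The function names (spread_bits_16, morton_encode_2d, remap_row_major_to_morton) are hypothetical and are not taken from remappings.h. The choice of x in the even bits, and the restriction to square power-of-two tiles, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Spread the low 16 bits of v so that input bit i lands at output bit 2*i.
inline std::uint32_t spread_bits_16(std::uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Interleave x (even bits) and y (odd bits) into a 2D Morton code.
// Assumption: both coordinates fit in 16 bits.
inline std::uint32_t morton_encode_2d(std::uint32_t x, std::uint32_t y) {
    assert(x < 65536u && y < 65536u);
    return spread_bits_16(x) | (spread_bits_16(y) << 1);
}

// Reorder a row-major side*side tile into Morton order.
// Assumption: side is a power of two, so the codes cover [0, side*side) exactly.
std::vector<std::int32_t> remap_row_major_to_morton(
        const std::vector<std::int32_t>& tile, std::uint32_t side) {
    assert(side != 0 && (side & (side - 1)) == 0 && side <= 65536u);
    assert(tile.size() == static_cast<std::size_t>(side) * side);
    std::vector<std::int32_t> out(tile.size());
    for (std::uint32_t y = 0; y < side; ++y) {
        for (std::uint32_t x = 0; x < side; ++x) {
            out[morton_encode_2d(x, y)] =
                tile[static_cast<std::size_t>(y) * side + x];
        }
    }
    return out;
}
```

Cells that are close in 2D tend to end up close in the remapped 1D order, which is the usual motivation for applying such a remapping before compression.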
We assume a Linux environment.
- install these packages with apt-get (you might need more, debug appropriately): g++, g++-11, libgdal-dev, python3-gdal, liblz4-dev, libzstd-dev, zlib1g-dev, liblzma-dev
- obtain the submodules in external and build them
- make
- re-build external/FastPFor and external/simdcomp from these forks:
  - FastPFor: https://github.com/omarathon/FastPFor
  - simdcomp: https://github.com/omarathon/simdcomp
- replace codecs/custom_vec_logic_codecs.h with agg/custom_vec_logic_codecs.h (the new file contains the modification to the custom_rle_vecavx512 codec which fuses summing into decompression)
- replace bench_pipeline.cpp with agg/bench_pipeline.cpp
- make clean && make
- source hpc/modules.sh
- replace Makefile with hpc/Makefile (the new Makefile contains compiler modifications for the HPC)
- make
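The steps above mention a codec variant that fuses summing into decompression. As a rough illustration of what such fusion means for run-length encoding, the scalar sketch below computes the sum directly from the runs, without materialising the decompressed array. It is not the custom_rle_vecavx512 codec: the Run struct and the functions rle_encode, rle_decode and rle_sum_fused are hypothetical names, and the code uses no AVX-512.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One run: `length` consecutive copies of `value` (hypothetical layout).
struct Run {
    std::int32_t value;
    std::uint32_t length;
};

// Collapse consecutive equal values into runs.
std::vector<Run> rle_encode(const std::vector<std::int32_t>& in) {
    std::vector<Run> runs;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (!runs.empty() && runs.back().value == in[i]) {
            ++runs.back().length;
        } else {
            runs.push_back(Run{in[i], 1u});
        }
    }
    return runs;
}

// Plain decompression: expand every run back into the output array.
std::vector<std::int32_t> rle_decode(const std::vector<Run>& runs) {
    std::vector<std::int32_t> out;
    for (const Run& r : runs) {
        out.insert(out.end(), r.length, r.value);
    }
    return out;
}

// Fused decompress-and-sum: each run contributes value * length,
// so the decompressed data is never written to memory.
std::int64_t rle_sum_fused(const std::vector<Run>& runs) {
    std::int64_t sum = 0;
    for (const Run& r : runs) {
        sum += static_cast<std::int64_t>(r.value) * r.length;
    }
    return sum;
}
```

The fused version does work proportional to the number of runs rather than the number of values, and it skips the write of the decompressed array to memory.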
Licence:
- MIT for all files in codecs, except the TurboPFor wrapper (codecs/turbopfor_codecs.h), LZ4 wrapper (codecs/lz4_codecs.h) and 2ibench wrapper (codecs/2ibench_codecs.h), which are GPL.
- GPL for everything else.
Full repo/data/report on request