preprocessor

gereon edited this page Jan 31, 2024 · 3 revisions

Preprocessor

The main task of the preprocess utility is to prepare datasets so that the different systems can use them. To do so, it primarily takes parameters from the workload definition and applies them to the given datasets. Internally, the utility uses the chain-of-responsibility pattern to parse the datasets.
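To illustrate the chain-of-responsibility pattern mentioned above, here is a minimal sketch. It is not the utility's actual implementation: the class names (`Handler`, `VectorHandler`, `RasterHandler`), the file suffixes, and the returned strings are all hypothetical and only show how a dataset could be passed along a chain until one handler accepts it.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Optional


class Handler(ABC):
    """One link in the chain (hypothetical; not the utility's real class)."""

    def __init__(self, successor: Optional["Handler"] = None) -> None:
        self.successor = successor

    @abstractmethod
    def can_handle(self, path: Path) -> bool:
        """Return True if this handler is responsible for the file."""

    @abstractmethod
    def process(self, path: Path) -> str:
        """Process the file and return a short description of the result."""

    def handle(self, path: Path) -> str:
        # Process the file here if possible, otherwise pass it down the chain.
        if self.can_handle(path):
            return self.process(path)
        if self.successor is not None:
            return self.successor.handle(path)
        raise ValueError(f"no handler for {path.name}")


class VectorHandler(Handler):
    SUFFIXES = {".shp", ".geojson"}  # assumed vector formats, for illustration

    def can_handle(self, path: Path) -> bool:
        return path.suffix.lower() in self.SUFFIXES

    def process(self, path: Path) -> str:
        return f"vector:{path.name}"


class RasterHandler(Handler):
    SUFFIXES = {".tif", ".tiff"}  # assumed raster formats, for illustration

    def can_handle(self, path: Path) -> bool:
        return path.suffix.lower() in self.SUFFIXES

    def process(self, path: Path) -> str:
        return f"raster:{path.name}"


# Build the chain: vector files are tried first, then raster files.
chain = VectorHandler(RasterHandler())
```

Calling `chain.handle(Path("dem.tif"))` in this sketch skips the vector handler and is answered by the raster handler; a file that no handler accepts raises a `ValueError`.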

Using the Preprocess Utility

The preprocess utility can be called using the following command:

python preprocess.py --system {system} \
    --vector_path {vector_dir} \
    --vector_target_suffix {vector_target_format} \
    --vector_output_folder {vector_output_folder} \
    --vector_target_crs {vector_target_crs} \
    --vectorization_type {vectorize_type} \
    --raster_path {raster_dir} \
    --raster_target_suffix {raster_target_format} \
    --raster_output_folder {raster_output_folder} \
    --raster_target_crs {raster_target_crs}
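This page does not show the argument parser of preprocess.py. As a rough sketch of how the flags above could be declared with Python's `argparse`, assuming every value is a plain string and only `--system` is mandatory (both are assumptions, as are the help text and the placeholder values in the example call):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags of the command above."""
    parser = argparse.ArgumentParser(prog="preprocess.py")
    # Assumption: the target system must always be given.
    parser.add_argument("--system", required=True, help="target system")
    # The vector and raster options share the same four flag names.
    for kind in ("vector", "raster"):
        parser.add_argument(f"--{kind}_path")
        parser.add_argument(f"--{kind}_target_suffix")
        parser.add_argument(f"--{kind}_output_folder")
        parser.add_argument(f"--{kind}_target_crs")
    parser.add_argument("--vectorization_type")
    return parser


# Example call with placeholder values (not taken from a real workload).
args = build_parser().parse_args(
    [
        "--system", "example_system",
        "--vector_path", "data/vector",
        "--vector_target_crs", "EPSG:4326",
    ]
)
print(args.system, args.vector_path, args.vector_target_crs)
```

Flags that are omitted, such as `--raster_path` in this call, default to `None` in the sketch, so later code can tell whether a raster or vector dataset was supplied at all.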

Building the Preprocess Utility

The preprocess utility is packaged into a separate Docker container. It can be built using the following command:

docker build . --target=preprocess -t preprocess

Note: To use a new version of the preprocessor, update the reference to the container in each preprocess.sh script; these scripts can be found at hub/deployment/files/**/preprocess.sh. If RaVeN is used in its dockerized version, the version also has to be updated there.