The Iterative Query Processor is a Python framework for running iterative queries on large datasets with Dask DataFrames. It ships with several built-in query strategies and also lets you define custom ones.
Navigate to the project directory and install the required dependencies using pip:
cd iterative-query-processor
pip install -r requirements.txt
To use the Iterative Query Processor, import the IterativeQueryProcessor class and any strategy classes you need, then instantiate IterativeQueryProcessor with your data and the number of partitions.
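The workflow described above (data plus a partition count, then a strategy applied over several passes) might look roughly like the sketch below. This README does not show the project's real constructor signature or strategy interface, so everything here is an assumption: the `IterativeQueryProcessor` class is a dependency-free stand-in that partitions a plain list rather than a Dask DataFrame, and `RisingThresholdStrategy`, `apply`, `run`, and the `npartitions` and `iterations` parameter names are hypothetical. Check the project source for the actual API.

```python
from typing import Dict, List, Sequence

Row = Dict[str, float]


class RisingThresholdStrategy:
    """Hypothetical strategy: each pass keeps rows above a higher threshold."""

    def __init__(self, column: str, start: float, step: float) -> None:
        self.column = column
        self.start = start
        self.step = step

    def apply(self, partition: List[Row], iteration: int) -> List[Row]:
        # The threshold grows by `step` on every pass.
        threshold = self.start + self.step * iteration
        return [row for row in partition if row[self.column] > threshold]


class IterativeQueryProcessor:
    """Stand-in for the project's class (assumed shape, not the real API).

    The real class works on Dask DataFrames; this one splits a plain list
    into partitions so the example runs without extra dependencies.
    """

    def __init__(self, data: Sequence[Row], npartitions: int) -> None:
        if npartitions < 1:
            raise ValueError("npartitions must be at least 1")
        rows = list(data)
        # Ceiling division so every row lands in exactly one partition.
        size = max(1, -(-len(rows) // npartitions))
        self.partitions = [rows[i:i + size] for i in range(0, len(rows), size)]

    def run(self, strategy: RisingThresholdStrategy, iterations: int) -> List[Row]:
        partitions = self.partitions
        for iteration in range(iterations):
            # Apply the strategy to each partition independently.
            partitions = [strategy.apply(p, iteration) for p in partitions]
        # Flatten the surviving rows back into one list.
        return [row for p in partitions for row in p]


rows = [{"id": i, "score": i * 10} for i in range(6)]
processor = IterativeQueryProcessor(rows, npartitions=3)
strategy = RisingThresholdStrategy(column="score", start=5, step=15)
result = processor.run(strategy, iterations=3)
print([row["id"] for row in result])  # prints [4, 5]
```

Passing the iteration index to the strategy is one possible design choice: it lets each pass refine the previous result instead of repeating the same filter.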
Contributions to the Iterative Query Processor are welcome! To contribute, please fork the repository, make any desired changes, and submit a pull request. Please ensure that any changes are covered by appropriate tests.
This project is licensed under the MIT License. See the LICENSE file for details.