Scale up to Large Datasets

Research in the Pipeline

1. I/O Speed Analysis
2. Multi-Threading Functionality
3. Computation Bottlenecks
4. Memory/Processing Profiling
5. System-Specific Parameter Changes
6. Chunking Data and Batching Tasks

The goal of this research is to scale the pipeline effectively and efficiently to very large datasets, so that the analysis can be performed on the whole Clearly Defined Dataset.
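
As a rough illustration of items 2 and 6, the sketch below splits an input into fixed-size chunks and processes them on a thread pool while timing the run. It is not the pipeline's actual code: `iter_chunks`, `process_chunk`, `run_batched`, and the default `chunk_size` and `max_workers` values are all hypothetical names and settings chosen for the example, and the per-chunk work is a placeholder sum.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
import time


def iter_chunks(items, chunk_size):
    """Yield successive lists of at most chunk_size items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def process_chunk(chunk):
    """Hypothetical per-chunk task (placeholder): sum the values in the chunk."""
    return sum(chunk)


def run_batched(items, chunk_size=1000, max_workers=4):
    """Process items chunk by chunk on a thread pool and time the whole run.

    chunk_size and max_workers are assumed tuning knobs; suitable values
    would depend on the machine (item 5 in the list above).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the input chunks.
        results = list(pool.map(process_chunk, iter_chunks(items, chunk_size)))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_batched(range(10_000), chunk_size=1000, max_workers=4)
    print(f"processed {len(results)} chunks in {elapsed:.4f} s")
```

Two caveats apply to this sketch. Threads in CPython mainly help when the per-chunk work is I/O-bound; CPU-bound work would usually call for processes instead. Also, `Executor.map` submits every chunk up front, so a very large input would need a bound on the number of chunks in flight to keep memory use flat.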