This Jupyter Notebook documents the process of handling and imputing missing data from an ecological survey conducted on Rabbit Island, located in Lake Superior. The survey includes data on various environmental and biological variables collected during fieldwork.
Objectives:
- Assess the extent and patterns of missing data in the dataset.
- Apply appropriate imputation techniques to fill in the gaps.
- Ensure data integrity for subsequent ecological analysis and modeling.
Contents:
- Data Overview: Description and initial exploration of the dataset.
- Missing Data Analysis: Visualizations and summaries to identify missingness patterns.
- Imputation Methods: Implementation of statistical and machine learning-based imputation techniques.
- Results: Export of the cleaned dataset for further use.
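
The missing-data analysis and imputation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names and values are hypothetical, and the choice of median and k-nearest-neighbours imputation (scikit-learn's `SimpleImputer` and `KNNImputer`) is an assumption standing in for whichever statistical and machine learning-based methods the notebook applies.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer, SimpleImputer

# Hypothetical survey table; column names and values are illustrative only.
df = pd.DataFrame({
    "water_temp_c": [12.1, np.nan, 11.8, 12.4, np.nan, 12.0],
    "soil_ph": [6.5, 6.7, np.nan, 6.4, 6.6, 6.5],
    "species_count": [14.0, 15.0, 13.0, np.nan, 16.0, 14.0],
})

# Extent of missingness: count and percentage of missing values per column.
missing_summary = pd.DataFrame({
    "n_missing": df.isna().sum(),
    "pct_missing": df.isna().mean() * 100,
})

# Pattern of missingness: number of rows sharing each combination
# of missing (True) and observed (False) columns.
missing_patterns = df.isna().value_counts()

# Statistical baseline (assumed): fill each gap with the column median.
median_imputed = pd.DataFrame(
    SimpleImputer(strategy="median").fit_transform(df),
    columns=df.columns,
    index=df.index,
)

# Neighbour-based alternative (assumed): fill each gap with the mean of
# that column over the 2 most similar rows.
knn_imputed = pd.DataFrame(
    KNNImputer(n_neighbors=2).fit_transform(df),
    columns=df.columns,
    index=df.index,
)

print(missing_summary)
print(missing_patterns)
```

Comparing the two imputed tables column by column is one simple way to see how sensitive the filled values are to the method chosen. Observed values are left unchanged by both imputers.
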
Notes:
- The final imputed dataset is saved as df_complete_RESULT.csv in the Output/ folder.
- Ensure required packages (e.g., pandas, numpy, scikit-learn, matplotlib) are installed before running the notebook. Note that scikit-learn is installed under that name but imported as sklearn.
- The methods used aim to preserve ecological validity and minimize bias in downstream analyses.
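
As a rough sketch of the export step, the snippet below writes an imputed table to Output/df_complete_RESULT.csv and reads it back as a sanity check. The file and folder names come from the notes above. The DataFrame contents, the variable name `df_complete`, the check for remaining gaps, and the `index=False` setting are assumptions made for illustration.

```python
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for the fully imputed survey table.
df_complete = pd.DataFrame({
    "water_temp_c": [12.1, 12.05, 11.8],
    "soil_ph": [6.5, 6.7, 6.5],
})

# Create the Output/ folder if it does not exist yet.
output_dir = Path("Output")
output_dir.mkdir(parents=True, exist_ok=True)
output_path = output_dir / "df_complete_RESULT.csv"

# Refuse to export if any missing values survived imputation.
n_remaining = int(df_complete.isna().sum().sum())
if n_remaining > 0:
    raise ValueError(f"{n_remaining} missing values remain after imputation")

# Write without the row index (an assumption about the desired layout).
df_complete.to_csv(output_path, index=False)

# Round-trip check: the saved file should load with the same shape.
reloaded = pd.read_csv(output_path)
print(reloaded.shape)
```
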
Presentation: https://docs.google.com/presentation/d/13VBPduLzPGX4jGEQeVnM4EGal4DileWsvyR7WSXtUeY/edit?usp=sharing
Map: https://studio.foursquare.com/map/public/2d2ac9f6-9c7a-4115-b611-fd703ad4551d
