Releases · ankilab/HANCOCK_MultimodalDataset

Release Overview

The v1.0 release organizes the code into clear, logical folders:

Environment setup: A Conda environment.yml lists all dependencies, so the environment can be installed and reproduced from a single file.

Data loaders & explorers: Jupyter notebooks guide users through loading the HANCOCK dataset (demographics, pathology, blood, surgical reports, WSIs) and performing exploratory analyses using pandas and matplotlib.

Preprocessing & feature extraction: Scripts for cleaning clinical text, normalizing lab values, and extracting histopathological features from whole‐slide images via OpenSlide and custom pipelines.
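The release notes do not say how lab values are normalized. As an illustration only, the sketch below assumes per-analyte z-scoring, which is one common choice. The record layout (`patient_id`, `analyte`, `value`) and the function name `zscore_by_analyte` are hypothetical and not taken from the repository; the example uses only the Python standard library so it runs outside the project's Conda environment.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_by_analyte(rows):
    """Standardize each lab value against all values of the same analyte."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["analyte"]].append(row["value"])

    # Population mean and standard deviation per analyte.
    stats = {name: (mean(vals), pstdev(vals)) for name, vals in grouped.items()}

    normalized = []
    for row in rows:
        mu, sigma = stats[row["analyte"]]
        # A constant analyte has zero spread; map it to 0.0 instead of dividing by zero.
        z = 0.0 if sigma == 0 else (row["value"] - mu) / sigma
        normalized.append({**row, "z": z})
    return normalized


# Toy records with hypothetical field names and made-up values.
lab_rows = [
    {"patient_id": "001", "analyte": "hemoglobin", "value": 12.0},
    {"patient_id": "002", "analyte": "hemoglobin", "value": 14.0},
    {"patient_id": "003", "analyte": "hemoglobin", "value": 16.0},
    {"patient_id": "001", "analyte": "leukocytes", "value": 7.5},
]
normalized_rows = zscore_by_analyte(lab_rows)
for row in normalized_rows:
    print(row["patient_id"], row["analyte"], round(row["z"], 3))
```

In a real pipeline the mean and standard deviation would usually be estimated on the training cohort only and then reused for the test cohort, so that no test information leaks into preprocessing.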

Core Modules and Workflows

To facilitate rigorous machine-learning experimentation, the release includes:

Train/Test split generation using a genetic‐algorithm approach to ensure balanced cohorts across modalities.
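To make the idea of a genetic-algorithm split concrete, here is a minimal sketch. It is not the repository's implementation: it balances a single categorical label rather than several modalities, and the fitness function, operators, and parameters (`imbalance`, `evolve_split`, `pop_size`, `mutation_rate`) are assumptions chosen for illustration. Each candidate is a 0/1 mask marking which samples go to the test set.

```python
import random


def imbalance(mask, labels, test_fraction):
    """Lower is better: deviation of test size and class mix from the targets."""
    n = len(mask)
    n_test = sum(mask)
    if n_test == 0 or n_test == n:
        return float("inf")  # degenerate split, everything on one side
    score = abs(n_test / n - test_fraction)
    for cls in set(labels):
        overall = labels.count(cls) / n
        in_test = sum(m for m, y in zip(mask, labels) if y == cls) / n_test
        score += abs(in_test - overall)
    return score


def evolve_split(labels, test_fraction=0.2, pop_size=40, generations=60,
                 mutation_rate=0.02, seed=0):
    """Search for a test mask whose class proportions match the full cohort."""
    rng = random.Random(seed)
    n = len(labels)

    def fitness(mask):
        return imbalance(mask, labels, test_fraction)

    population = [[int(rng.random() < test_fraction) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        next_gen = population[:2]  # elitism: carry over the two best masks
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random candidates, twice.
            parent_a = min(rng.sample(population, 3), key=fitness)
            parent_b = min(rng.sample(population, 3), key=fitness)
            # Uniform crossover followed by bit-flip mutation.
            child = [rng.choice(pair) for pair in zip(parent_a, parent_b)]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    return min(population, key=fitness)


# Toy cohort: 30 samples of class "a" and 10 of class "b".
toy_labels = ["a"] * 30 + ["b"] * 10
best_mask = evolve_split(toy_labels)
print("test size:", sum(best_mask),
      "imbalance:", round(imbalance(best_mask, toy_labels, 0.2), 3))
```

Extending the fitness function with one penalty term per variable of interest (for example tumor site, sex, or availability of each modality) is one way such a search could balance several properties at once.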

Multimodal fusion pipelines that integrate tabular, imaging, and textual features into unified PyTorch datasets and DataLoaders.
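The fusion step can be pictured as a map-style dataset that returns one concatenated feature vector per patient. The sketch below is a plain-Python stand-in and makes several assumptions: the class name `FusedDataset`, the dictionary layout, and simple concatenation as the fusion strategy are all hypothetical. In PyTorch, a class like this would typically subclass `torch.utils.data.Dataset` and return tensors, and a `DataLoader` would handle batching; torch is left out here so the example runs without extra dependencies.

```python
class FusedDataset:
    """Map-style dataset: one concatenated feature vector per patient."""

    def __init__(self, modalities, labels):
        # modalities: {modality_name: {patient_id: [float, ...]}}
        # labels:     {patient_id: int}
        # Keep only patients that have a label and features in every modality.
        shared = set(labels)
        for features in modalities.values():
            shared &= set(features)
        self.patient_ids = sorted(shared)
        self.modalities = modalities
        self.labels = labels

    def __len__(self):
        return len(self.patient_ids)

    def __getitem__(self, index):
        pid = self.patient_ids[index]
        fused = []
        # Iterate modalities in a fixed (alphabetical) order so that every
        # sample has the same feature layout.
        for name in sorted(self.modalities):
            fused.extend(self.modalities[name][pid])
        return fused, self.labels[pid]


# Toy features with made-up values; patient "003" lacks two modalities.
toy_modalities = {
    "tabular": {"001": [0.1, 0.2], "002": [0.3, 0.4]},
    "imaging": {"001": [1.0], "002": [2.0], "003": [3.0]},
    "text": {"001": [5.0, 6.0], "002": [7.0, 8.0]},
}
toy_targets = {"001": 0, "002": 1, "003": 1}

fused_dataset = FusedDataset(toy_modalities, toy_targets)
for i in range(len(fused_dataset)):
    print(fused_dataset.patient_ids[i], fused_dataset[i])
```

Dropping patients with a missing modality is the simplest policy; imputing or masking missing modalities is an alternative when the complete-case cohort would be too small.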

Model training & evaluation notebooks showcasing baseline classifiers (e.g., random forests, XGBoost) and deep‐learning architectures, complete with hyperparameter tuning and performance metrics (AUC, calibration curves).
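For readers unfamiliar with the two metrics named above, the following sketch computes them from scratch. It is an illustration rather than the notebooks' code, which may well rely on a library such as scikit-learn; the function names `roc_auc` and `calibration_bins` and the equal-width binning are assumptions.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def calibration_bins(y_true, y_prob, n_bins=5):
    """Per non-empty bin: (mean predicted probability, observed positive rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[index].append((p, y))
    return [
        (sum(p for p, _ in b) / len(b), sum(y for _, y in b) / len(b), len(b))
        for b in bins if b
    ]


# Toy predictions with made-up values.
toy_truth = [0, 0, 1, 1]
toy_probs = [0.1, 0.4, 0.35, 0.8]

auc_value = roc_auc(toy_truth, toy_probs)
curve_points = calibration_bins(toy_truth, toy_probs, n_bins=2)
print("AUC:", auc_value)
print("calibration points:", curve_points)
```

Plotting the first two entries of each calibration point against each other gives a calibration curve; a well-calibrated model lies close to the diagonal.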

Documentation & Citation

Usage instructions and example workflows are documented in README.md and in the code comments. The README also links to the public dataset portal (www.hancock.research.fau.eu).
