forked from NUstat/2025-datafest
-
Notifications
You must be signed in to change notification settings - Fork 1
Expand file tree
/
Copy pathstep5_models.qmd
More file actions
30 lines (19 loc) · 1.78 KB
/
step5_models.qmd
File metadata and controls
30 lines (19 loc) · 1.78 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
---
title: "Step 5: Classification Model"
editor:
markdown:
wrap: 72
---
**Aim: Create a machine learning model to classify slices as low-grade, high-grade, or benign.**
# Summary
The classification model aims to categorize tissue slices as **low-grade, high-grade, or benign C-MIL**. Key considerations include incorporating **patient demographics** (age, gender, lesion location) to enhance accuracy, identifying **melanocyte distribution patterns** for improved classification, and prioritizing **Melan-A and Sox-10** stains over H&E, as they provide clearer visibility of melanocytes. An ensemble approach may be explored to refine predictions.
# Model Considerations
Here are some potential steps/directions to consider when creating classification models.
## Patient demographics
Using the information from this [file](https://nuwildcat-my.sharepoint.com/:x:/r/personal/akl0407_ads_northwestern_edu/_layouts/15/Doc.aspx?sourcedoc=%7B808B83F5-0AC5-4834-8518-E454850AB281%7D&file=clincal%20data%20sheet%20for%20scoring.xlsx&action=default&mobileredirect=true), add patient information including age, gender, ethnicity, laterality, and location of lesion to our ML model.
## Pattern recognition
- Ideally, our model needs to be able to identify the distribution and size/shape of melanocytes.
- Note that `low-grade CMIL's` have no-to-mild atypia, and `high-grade CMIL's`have moderate-to-severe atypia.
- Once we verify a list of discernible patterns, we can create separate models for specific patterns and then use an ensemble model to generate a final classification decision.
## Melan-A and Sox-10 vs. H&E
Since Melan-A and Sox-10 appear to display melanocyte distribution more clearly than H&E, consider weighing the former two stain types more than the latter when creating our model.