Hi all!
For a project, we are setting up a reader study combined with an AI challenge and want to use the iMRMC software for analysis and sizing of our reader study.
Within our project we want to compare the performance of 20+ radiologists for prostate cancer detection in MRI against the best AI algorithms (from the challenge).
We want to perform a power analysis based on simulated pilot data and came across several questions while using the iMRMC software. With this post I want to present these issues and kindly ask for your help, if you are willing and have the time.
Currently we are using pilot / simulated data to mimic a split-plot study design, in which a group of 20+ readers are divided among 2 or 3 groups and read 100 cases each (in total 200-300 cases depending on the number of readers, with 2/3 negative and 1/3 positive for cancer). We want to compare its performance with standalone CAD (formatted as an independent reader, reading under its own modality). We are able to make the software run the variance analysis and now want to use its variance estimations to run a power analysis.
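To make the intended design concrete, here is a minimal sketch of the allocation we have in mind. All counts are illustrative (21 readers in 3 groups of 7, 100 cases per split with roughly 1/3 positive), not taken from the actual pilot data:

```python
# Illustrative split-plot allocation (hypothetical numbers, not our pilot data):
# each reader group reads its own disjoint block of cases, while the AI reads
# every case under its own modality.
n_groups = 3
readers_per_group = 7
cases_per_group = 100          # ~2/3 negative, ~1/3 positive within each block

readings = []                  # (reader_id, case_id, modality, truth)
for g in range(n_groups):
    # cases in block g, with the first third positive for cancer
    cases = [(f"case{g}_{i}", 1 if i < cases_per_group // 3 else 0)
             for i in range(cases_per_group)]
    for r in range(readers_per_group):
        reader = f"reader{g}_{r}"
        for case_id, truth in cases:
            readings.append((reader, case_id, "radiologist", truth))

# The standalone AI "reader" reads all cases across every split.
for g in range(n_groups):
    for i in range(cases_per_group):
        truth = 1 if i < cases_per_group // 3 else 0
        readings.append(("AI", f"case{g}_{i}", "CAD", truth))
```

Under these assumed numbers, each radiologist contributes 100 reads within one split, while the AI contributes 300 reads spanning all splits, which is exactly the imbalance our questions below are about.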
This is where we run into some problems and uncertainties about how well the software's functionality matches our study design.
• Based on a prior post (#166), we ran the software without changing the study design (if I understand correctly, the variance analysis accounts for the stand-alone reads of the AI). Running this analysis gives us very low power (0.05).

• Running the power estimation with, e.g., 2 splits (as simulated in our pilot data) and “Paired Readers” set to “No” (radiologists and AI are independent readers), we see a more familiar power (0.68).

However, we are still uncertain whether this analysis is set up correctly.
A further question concerns the distribution of readers within our study design versus the distribution assumed by the power analysis. In our design, readers are not equally distributed among groups and modalities: 20+ readers in a split-plot are compared against a single reader (or multiple AI reruns/reads, to account for variance in training) that reads all cases.
Based on our initial tests, we do have some uncertainties about whether our ideas are implementable within the power analysis functionality of iMRMC.
In the attachment, I have added the .csv file with our simulated data. It contains roughly 30 simulated radiologists (based on radiologist consensus) reading in a 2-group split-plot, and roughly 14 AI reads (to account for variance from training) on the same dataset. Radiologist scores are binarized, so we are aware that the AUC analysis is not accurate or reliable, but for now it serves to test our study-design ideas.
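For reference, this sketch shows how we assemble the file: header counts, then four-column rows (readerID, caseID, modalityID, score), with truth rows flagged by a "truth" reader ID. This reflects our reading of the iMRMC input format and uses made-up case and score values; please correct us if we have the layout wrong:

```python
# Hypothetical toy data -- truth status per case, plus a few reads.
truth = {"case1": 1, "case2": 0, "case3": 1}
reads = [
    ("reader1", "case1", "radiologist", 1),     # binarized radiologist score
    ("reader1", "case2", "radiologist", 0),
    ("AI", "case1", "CAD", 0.91),               # continuous AI score
    ("AI", "case2", "CAD", 0.12),
    ("AI", "case3", "CAD", 0.77),
]

# Header counts, then "BEGIN DATA:", then truth rows and reading rows --
# our understanding of the expected file layout.
lines = [
    f"N0: {sum(1 for v in truth.values() if v == 0)}",   # normal cases
    f"N1: {sum(1 for v in truth.values() if v == 1)}",   # diseased cases
    f"NR: {len({r for r, *_ in reads})}",                # readers (incl. AI)
    "NM: 2",                                             # modalities
    "BEGIN DATA:",
]
lines += [f"truth,{c},truth,{t}" for c, t in truth.items()]
lines += [f"{r},{c},{m},{s}" for r, c, m, s in reads]
imrmc_text = "\n".join(lines)
```

In our actual file, the radiologist rows are restricted to each reader's own split while the AI rows cover every case, mirroring the split-plot design described above.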
Many thanks for developing the software; we hope to hear from you soon. Thanks in advance!
pilot_data_AI_r31_split_plot2.csv