DEIB-GECO/NCPA

NCPA

Novel nested conformal prediction analysis (NCPA) to unravel complexity in patient subtyping

Summary

This repository applies conformal prediction in a nested setting, which makes it possible to highlight intra-sample heterogeneity in small datasets. The workflow was applied to the TCGA-BRCA dataset, available at https://portal.gdc.cancer.gov/projects/TCGA-BRCA.
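
For readers new to conformal prediction, the following is a minimal sketch of plain split conformal prediction, showing how a single sample can receive more than one label. It is not the nested procedure implemented in this repository. The calibration values, class probabilities, label names, and the function names `conformal_quantile` and `prediction_set` are all hypothetical and chosen only for illustration.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    # Rank of the order statistic used by split conformal prediction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration samples for this alpha: every label is kept.
        return float("inf")
    return sorted(scores)[k - 1]

def prediction_set(probs, threshold):
    """Labels whose nonconformity score (1 - probability) is within the threshold."""
    return [label for label, p in probs.items() if 1.0 - p <= threshold]

# Hypothetical calibration data: probability the model assigned to the true label.
calibration_true_probs = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.55, 0.45, 0.40, 0.20]
calibration_scores = [1.0 - p for p in calibration_true_probs]
threshold = conformal_quantile(calibration_scores, alpha=0.2)

# Hypothetical test samples with class probabilities (illustrative label names).
confident = {"LumA": 0.90, "LumB": 0.06, "Basal": 0.04}
ambiguous = {"LumA": 0.48, "LumB": 0.45, "Basal": 0.07}

print(prediction_set(confident, threshold))  # single-label set
print(prediction_set(ambiguous, threshold))  # multilabel set
```

In this toy example the confident sample receives a single label, while the ambiguous one receives two. A multilabel set of this kind is how conformal prediction can expose heterogeneity within a sample.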

Code execution

  • 00_Branch1_PAM50 is a Jupyter notebook that runs the first branch of the pipeline; applying NCPA there assigns the samples to their multilabel counterparts.
  • 01_Data_Preparation.py and 02_Main_Machine_Learning.py are a first test for Branch2, checking that the pipelines run correctly before true NCPA is applied. The Main script takes as input parameters the machine learning model to run (one of "LogisticRegression", "KNN", "RandomForest", "SVC") and the output folder.
  • 03_Conformal_Analysis is a Jupyter notebook with the first conformal prediction analysis performed on these results.
  • 04_Branch2_Data_Preparation.py and 05_Branch2_Main_Machine_Learning.py are the true NCPA scripts. The Main script takes as input parameters the machine learning model to run (same options as for 02_Main_Machine_Learning.py), the output folder, and the input folder containing the data generated by the Data Preparation script.
  • 06_Conformal_Analysis_Multilabel is the final Jupyter notebook, which performs the cross analysis of Branch1 and Branch2.
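
The input parameters listed above can be pictured with the following sketch of command-line parsing. It is an assumption made for illustration, not the scripts' actual interface. The argument names, their order, the use of `argparse`, and the helper `build_parser` are all hypothetical; only the four model names come from this README.

```python
import argparse

# Model names listed in this README.
SUPPORTED_MODELS = ["LogisticRegression", "KNN", "RandomForest", "SVC"]

def build_parser(require_input_folder=False):
    """Hypothetical parser mirroring the parameters described for the Main scripts."""
    parser = argparse.ArgumentParser(description="Illustrative argument parsing")
    parser.add_argument("model", choices=SUPPORTED_MODELS,
                        help="machine learning model to run")
    parser.add_argument("output_folder", help="folder where results are written")
    if require_input_folder:
        # Only the Branch2 Main script is described as also needing an input folder.
        parser.add_argument("input_folder",
                            help="folder with the Data Preparation output")
    return parser

# Example with made-up folder names, parsed from an explicit argument list.
args = build_parser(require_input_folder=True).parse_args(
    ["RandomForest", "results/", "prepared_data/"]
)
print(args.model, args.output_folder, args.input_folder)
```

Check the scripts themselves for the real argument order and format before running them.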

About

Unraveling the complexity of breast cancer subtype identification through nested conformal prediction analysis
