Identify metastatic cancer using computational histopathology

ixig/Kaggle_Histopathology

Kaggle Histopathologic Cancer Detection

Tackling the Kaggle Histopathologic Cancer Detection Challenge to evaluate different machine-learning algorithms for identifying metastatic cancer in small image patches taken from larger digital pathology scans.

Goal

To understand how far traditional Computer-Vision techniques have evolved with the advent of Deep Learning, this project starts from the most basic algorithms and iteratively improves each model's performance one small step at a time. Not every step is guaranteed to improve performance, but trying them is necessary to build a working intuition of what might work.

I start off with hand-engineered CV features (Color-Space Transforms, LBP, Gabor, Scharr, Laplacian, Harris, etc.) that work well with Shallow-ML models, and compare their performance against the automatic feature-extraction of large DL models.
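As an illustration of what a hand-engineered texture feature looks like, here is a minimal sketch of a basic 8-neighbour Local Binary Pattern (LBP) histogram. It is not the code from the notebooks: the function names `lbp_codes` and `lbp_histogram`, the 256-bin layout, and the synthetic patch are assumptions made for this example, and it is written in plain Python so it runs without dependencies. A real pipeline would more likely use an array library and a rotation-invariant LBP variant.

```python
def lbp_codes(gray):
    """Return basic 8-neighbour LBP codes for the interior pixels of a 2-D grid.

    `gray` is a list of rows of intensities and must be at least 3x3.
    """
    # Neighbour offsets (row, col), clockwise starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(gray), len(gray[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = gray[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the centre.
                if gray[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(gray, bins=256):
    """Normalised histogram of LBP codes: one fixed-length feature vector per patch."""
    hist = [0] * bins
    for row in lbp_codes(gray):
        for code in row:
            hist[code] += 1
    total = sum(hist)
    return [count / total for count in hist]


# Tiny synthetic 4x4 "patch" (hypothetical values, not taken from the dataset).
patch = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
features = lbp_histogram(patch)
```

The resulting `features` list has a fixed length regardless of patch size, which is what lets a Shallow-ML classifier such as Logistic Regression or GBT consume it directly.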

Results

The baseline model started out at 53.2% validation accuracy. The best Shallow-ML model topped out at 87.2% using 60 hand-engineered features. The best CNN model topped out at 97.6%.

Journey


| Step | Notebook | Description |
|------|----------|-------------|
| 1 | Data_Exploration | Exploratory Data Analysis |
| 2 | Data_HDF5 | Generate Grayscale+HED HDF5 dataset volume |
| 3 | Data_1D | Generate Naïve-1D flattened .npz from HDF5 for Shallow-ML |
| 4 | LogReg | Baseline Naïve-1D with Logistic Regression |
| 5 | Create_LBP_Feat | Generate LBP features and Evaluate on GBT classifier |
| 6 | LBP_Euclidean_vs_KLD | LBP histogram Dissimilarity metrics: Euclidean vs KL-Divergence |
| 7 | LogReg | Baseline LBP features with Logistic Regression |
| 8 | Find_Landmarks | Develop/Test algorithm for finding set of 'Landmarks' |
| 9 | Generate_Landmarks | Generate Landmarks on Histopathology dataset LBP features |
| 10 | Create_LDist_Feat | Generate Distance-to-Landmarks (identified above) features |
| 11 | LogReg | Baseline Landmark features with Logistic Regression |
| 12 | PCA | Evaluate effect of PCA transformation on i. LBP and ii. Landmark features |
| 13 | SVM | Evaluate SVM model with i. LBP and ii. Landmark features |
| 14 | Create_5LBP_Feat | Generate 5-cell overlapping LBPs: 64x64px centered and 32x32px on four corners |
| 15 | GBT | Evaluate GBT model with 'Double-LBP' (scaling-pyramid: full-size 96x96px, half-size 48x48px) features |
| 16 | Create_COPOD_Feat | Classification using COPOD scores on LBP features |
| 17 | Create_2x2LBP_Feat | Add 2nd set of Rotation-Invariant LBP texture features |
| 18 | Create_Gabor_Feat | Add Gabor Filters (16x 2-D kernels) features |
| 19 | Create_Gabor_Scharr_Feat | Add Gabor+Scharr Gradient Filter features |
| 20 | Create_Laplacian_Feat | Add Laplacian Edge-Detection Filter features |
| 21 | Create_Harris_Feat | Add Harris Corner-Detection Filter features |
| 22 | GBT | Re-evaluate GBT model on aggregation of best Shallow-ML features |
| 23 | TPOT.ipynb | Evaluate TPOT Auto-ML on Shallow-ML features (Last of Shallow Models) |
| 24 | NN | Evaluate Neural Network with Shallow-ML (LBP, Gabor, Scharr) features |
| 25 | CNN_ModelA | Sequential CNN with Increasing # Conv2D filters |
| 26 | CNN_ModelB | Sequential CNN with Decreasing # Conv2D filters |
| 27 | CNN_ModelA-BD | CNN_ModelA on full 200k Train set |
| 28 | CNN_ModelD1-BD-AUG-N | Added Augmentations, Gaussian Noise, more Dropout |
| 29 | CNN_ModelF | Change last Conv2D from AvgPooling2D to Conv2D, Reduce Learning-Rate |
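
Steps 6 and 10 compare LBP histograms with a dissimilarity metric and then turn distances to a set of 'Landmarks' into features. The sketch below shows one way this could look. It is an assumption-laden illustration, not the notebooks' implementation: the landmark histograms are made up, the `eps` smoothing is a choice made here to avoid `log(0)`, and the helper names (`euclidean`, `kl_divergence`, `landmark_features`) are hypothetical. How the landmarks are actually selected (steps 8 and 9) is not covered.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kl_divergence(p, q, eps=1e-10):
    """KL divergence D(p || q) between two normalised histograms.

    `eps` is added to every bin so empty bins do not cause log(0) or
    division by zero. Note that KL divergence is not symmetric.
    """
    return sum((a + eps) * math.log((a + eps) / (b + eps)) for a, b in zip(p, q))


def landmark_features(hist, landmarks, metric=kl_divergence):
    """One feature per landmark: the dissimilarity from `hist` to that landmark."""
    return [metric(hist, landmark) for landmark in landmarks]


# Hypothetical 3-bin landmark histograms (real LBP histograms have many more bins).
landmarks = [
    [0.7, 0.2, 0.1],
    [0.1, 0.2, 0.7],
]
sample = [0.6, 0.3, 0.1]

kld_features = landmark_features(sample, landmarks)
euclid_features = landmark_features(sample, landmarks, metric=euclidean)
```

Under either metric the sample sits closer to the first landmark than to the second, so the feature vector has one entry per landmark and encodes where the patch falls relative to those reference textures. Passing the metric as a parameter makes it easy to swap Euclidean for KL-Divergence when comparing the two.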
