- Patients: upload your cervical images and get the prediction in a few minutes.
- Doctors: screen cervical images and save a lot of time.
- Labeling experts: automatically label unlabeled data for you.
- Machine learning engineers / data scientists:
We do
- How to find duplicated images
- How to find similar images
- How to find biased images
- How to pick the important images
- How to train on imbalanced images
We don't
- address bias by adjusting layers or hyper-parameters
- chase a high score by fine-tuning anything
- claim any innovation; we only leverage existing, proven technologies such as PCA, SimCLR, etc.
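To make the "duplicated images" and "similar images" items above concrete, here is a minimal sketch, assuming images are compared through PCA embeddings and cosine similarity. The function names (`pca_embed`, `similar_pairs`), the threshold, and the use of raw pixels as input are illustrative assumptions, not this project's actual code; in practice the embeddings might come from a model such as SimCLR instead.

```python
import numpy as np

def pca_embed(images, n_components=8):
    """Project flattened images onto their top principal components."""
    flat = images.reshape(len(images), -1).astype(np.float64)
    centered = flat - flat.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def similar_pairs(embeddings, threshold=0.99):
    """Return index pairs (i, j), i < j, with cosine similarity >= threshold."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    i, j = np.triu_indices(len(unit), k=1)  # upper triangle, no self-pairs
    keep = sim[i, j] >= threshold
    return list(zip(i[keep].tolist(), j[keep].tolist()))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = rng.random((6, 16, 16))  # hypothetical stand-in for real images
    images[4] = images[1]             # plant an exact duplicate
    print(similar_pairs(pca_embed(images)))
```

A threshold close to 1.0 flags near-exact duplicates; lowering it widens the search to merely similar images. The right value depends on the embedding and has to be tuned on real data.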
- Semi-/self-supervised learning: SimCLR (Google), SwAV (Facebook), DINO (Facebook)
- Why semi-/self-supervised learning: domain experts (doctors) are expensive, and labeling takes a very long time.
- Active learning: Detectron2 (Facebook)
- Why active learning: pixels mean something different to a machine than to a human being, so the machine can do a better job of choosing the samples it needs.
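The idea of letting the machine choose what it needs can be sketched with entropy-based uncertainty sampling, one common active learning strategy. This is a generic illustration, not necessarily the selection rule used in this project; `select_for_labeling` and its `budget` parameter are hypothetical names.

```python
import numpy as np

def select_for_labeling(probs, budget):
    """Pick the `budget` samples whose predicted class distribution
    has the highest entropy, i.e. where the model is least certain."""
    probs = np.asarray(probs, dtype=np.float64)
    # Clip inside the log so zero probabilities contribute 0 instead of NaN.
    entropy = -np.sum(probs * np.log(np.clip(probs, 1e-12, None)), axis=1)
    # argsort is ascending, so reverse it to get most-uncertain-first.
    return np.argsort(entropy)[::-1][:budget].tolist()
```

Given softmax outputs for the unlabeled pool, the returned indices are the images to send to the domain expert first, so that expensive labeling time goes to the samples the model finds hardest.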
- Dataset A: Google Drive
- Dataset B: Google Drive
- (original images: Google Drive)
The default data path is ./experiment/data. Create a new directory experiment/ under the root, extract the data into it, then rename the extracted directory to data. Alternatively, open nu_gan.py and change the default path.
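The directory setup described above can also be scripted. The following is a minimal sketch, assuming the dataset archive has already been extracted somewhere; `prepare_data_dir` and the example folder name are hypothetical and not part of the repository.

```python
from pathlib import Path
import shutil

def prepare_data_dir(extracted_dir, root="."):
    """Move an already-extracted dataset folder to <root>/experiment/data."""
    target = Path(root) / "experiment" / "data"
    if target.exists():
        # Refuse to overwrite an existing dataset.
        raise FileExistsError(f"{target} already exists")
    target.parent.mkdir(parents=True, exist_ok=True)  # creates experiment/
    shutil.move(str(extracted_dir), str(target))      # renames the folder to data
    return target
```

For example, `prepare_data_dir("downloaded_dataset")` run from the root would leave the files under ./experiment/data, where `downloaded_dataset` stands for whatever name the extracted folder actually has.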
Three tasks can be chosen using flags as follows.
- Unsupervised Cell-level Classification:
python nu_gan.py --task 'cell_representation'
- Unsupervised Image-level Classification:
python nu_gan.py --task 'image_classification'
- Nuclei Segmentation:
python nu_gan.py --task 'cell_segmentation'
For convenience, the training parameters are stored in nu_gan.py, where they can be changed easily.
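If you want to launch the tasks from another script rather than by hand, a thin wrapper along these lines may help. It is only a sketch: the `--task` flag and its three values are taken from the commands above, while `build_command` and `run_task` are hypothetical helper names.

```python
import subprocess
import sys

# Task names as listed in the commands above.
TASKS = ("cell_representation", "image_classification", "cell_segmentation")

def build_command(task, script="nu_gan.py"):
    """Build the argument list for one task, rejecting unknown task names."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {TASKS}")
    return [sys.executable, script, "--task", task]

def run_task(task):
    """Run nu_gan.py for the given task and raise if it exits with an error."""
    return subprocess.run(build_command(task), check=True)
```

Calling `run_task("cell_segmentation")` from the repository root is then equivalent to the segmentation command shown above, using the current Python interpreter.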