Text-document-classification

Python, ML concepts, NLP concepts

Text document classification is performed on the Reuters corpus, a widely used benchmark dataset for text categorization.

In step 1, multilabel classification is performed using several classification algorithms: Random Forest, Decision Tree, Neural Network, and SVM.
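A minimal sketch of this step, assuming scikit-learn and TF-IDF features. The toy documents and labels below are hypothetical stand-ins for the Reuters corpus, and the hyperparameters are illustrative rather than the ones used in this project:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data; each document may carry several topic labels.
docs = [
    "wheat and corn exports rose sharply this year",
    "crude oil prices fell as opec raised output",
    "grain shipments delayed by trade tariffs",
    "oil tanker trade routes disrupted",
    "corn harvest forecast lowered",
    "opec ministers discuss crude production",
]
labels = [["grain", "trade"], ["crude"], ["grain", "trade"],
          ["crude", "trade"], ["grain"], ["crude"]]

# Turn text into TF-IDF vectors and label lists into a binary indicator matrix.
X = TfidfVectorizer().fit_transform(docs)
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# Tree-based models and the MLP accept an indicator matrix directly;
# the linear SVM is wrapped in one-vs-rest, one binary classifier per label.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    "SVM": OneVsRestClassifier(LinearSVC()),
}

scores = {}
predictions = {}
for name, model in models.items():
    model.fit(X, Y)
    predictions[name] = model.predict(X)
    # Scored on the training data only to keep the sketch short;
    # a real evaluation would use a held-out test split.
    scores[name] = f1_score(Y, predictions[name], average="micro")

for name, score in scores.items():
    print(f"{name}: micro-F1 = {score:.2f}")
```

Micro-averaged F1 is used here because it is a common choice for multilabel text classification; other metrics such as Hamming loss would fit the same loop.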

The documents are then clustered, a different classification algorithm is applied to each cluster, and the quality of each cluster is evaluated.
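The clustering step could look roughly like the sketch below. It assumes k-means for clustering and the silhouette score as the quality measure, neither of which is stated in this README. The data, the number of clusters, and the assignment of classifiers to clusters are also hypothetical:

```python
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score, silhouette_score
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for Reuters documents.
docs = [
    "wheat and corn exports rose sharply this year",
    "crude oil prices fell as opec raised output",
    "grain shipments of wheat delayed by trade tariffs",
    "oil tanker trade routes disrupted",
    "corn and wheat harvest forecast lowered",
    "opec ministers discuss crude oil production",
    "wheat grain stocks fall after poor corn harvest",
    "crude oil output cut by opec members",
]
labels = [["grain", "trade"], ["crude"], ["grain", "trade"], ["crude", "trade"],
          ["grain"], ["crude"], ["grain"], ["crude"]]

X = TfidfVectorizer().fit_transform(docs).toarray()
Y = MultiLabelBinarizer().fit_transform(labels)

# Cluster the documents (k-means with 2 clusters is an assumption).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# One possible cluster-quality measure: the silhouette score, in [-1, 1].
quality = silhouette_score(X, cluster_ids)

# Assign a different classifier to each cluster (this mapping is illustrative).
classifiers = [
    DecisionTreeClassifier(random_state=0),
    RandomForestClassifier(n_estimators=50, random_state=0),
]

per_cluster_f1 = {}
for cid in sorted(set(cluster_ids)):
    mask = cluster_ids == cid
    clf = classifiers[cid % len(classifiers)]
    clf.fit(X[mask], Y[mask])
    pred = clf.predict(X[mask])
    # Training-set score only; zero_division avoids warnings for absent labels.
    per_cluster_f1[int(cid)] = f1_score(Y[mask], pred, average="micro", zero_division=0)

print("silhouette:", round(quality, 3))
print("per-cluster micro-F1:", per_cluster_f1)
```

Tree-based classifiers are used in this sketch because they still fit when a small cluster happens to contain only one label combination.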

Feature extraction is then enhanced by applying autoencoders, and the resulting performance is evaluated.
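As a rough illustration of autoencoder-based features, the sketch below trains a single-hidden-layer network to reconstruct its own TF-IDF input and then uses the hidden activations as a compressed representation. It uses scikit-learn's `MLPRegressor` as a stand-in autoencoder, which is an assumption; the project itself may have used a deep learning framework and a different architecture. The data and the bottleneck size are hypothetical:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for Reuters documents.
docs = [
    "wheat and corn exports rose sharply this year",
    "crude oil prices fell as opec raised output",
    "grain shipments delayed by trade tariffs",
    "oil tanker trade routes disrupted",
    "corn harvest forecast lowered",
    "opec ministers discuss crude production",
]
labels = [["grain", "trade"], ["crude"], ["grain", "trade"],
          ["crude", "trade"], ["grain"], ["crude"]]

X = TfidfVectorizer().fit_transform(docs).toarray()
Y = MultiLabelBinarizer().fit_transform(labels)

# Autoencoder stand-in: the network learns to map X back to X through
# a narrow hidden layer (4 units here, an illustrative choice).
bottleneck = 4
autoencoder = MLPRegressor(hidden_layer_sizes=(bottleneck,), activation="relu",
                           max_iter=2000, random_state=0)
autoencoder.fit(X, X)

# Encoder half: apply the first layer's weights and the ReLU by hand
# to obtain the hidden activations as the new feature vectors.
encoded = np.maximum(0.0, X @ autoencoder.coefs_[0] + autoencoder.intercepts_[0])

# Train a classifier on the encoded features and score it
# (training-set score only, to keep the sketch short).
clf = DecisionTreeClassifier(random_state=0).fit(encoded, Y)
encoded_f1 = f1_score(Y, clf.predict(encoded), average="micro", zero_division=0)

print("encoded shape:", encoded.shape)
print("micro-F1 on encoded features:", round(encoded_f1, 2))
```

Comparing this score with the same classifier trained on the raw TF-IDF vectors, on a held-out split, would show whether the encoded features help.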