Comparison of Nine Classification Models

This project compares nine techniques for text classification. The goal was to answer the question:

Does this text require simplification in order to make the text more easily understood?

The techniques include both supervised and unsupervised classification models. The training and test texts are included in the data folder and are also available from the Kaggle page, UMich SIADS 695 Fall21: Predicting text difficulty.

The models were examined in four stages:

Stage One used features derived from the texts, such as vocabulary level or difficulty as rated by the Flesch-Kincaid readability score.
Stage Two used the distribution of the vocabulary in the two classes.
Stage Three explored supervised and unsupervised techniques in a reduced feature space derived from the features in Stage One.
Stage Four explored a vector space model using cosine similarity with ideal class vectors representing each class.
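To make the Stage Four idea concrete, here is a minimal sketch of classifying a text by cosine similarity against one vector per class. It is not the code from Stage_Four.ipynb: the whitespace tokenizer, the summed bag-of-words class vectors, the class labels, and the toy sentences are all assumptions chosen for illustration, and the project may build its ideal class vectors differently.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lowercased, whitespace-separated tokens (an assumed tokenizer)."""
    return Counter(text.lower().split())


def class_vector(texts):
    """Sum the term counts of every text in a class into one vector."""
    total = Counter()
    for text in texts:
        total.update(bag_of_words(text))
    return total


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def classify(text, class_vectors):
    """Return the label whose class vector is most similar to the text."""
    vec = bag_of_words(text)
    return max(
        class_vectors,
        key=lambda label: cosine_similarity(vec, class_vectors[label]),
    )


# Hypothetical labels and made-up sentences, not taken from the project data.
training = {
    "needs_simplification": [
        "the aforementioned statute notwithstanding subsequent amendments",
        "notwithstanding the provisions heretofore enumerated",
    ],
    "simple": [
        "the cat sat on the mat",
        "the dog ran to the park",
    ],
}
class_vectors = {label: class_vector(texts) for label, texts in training.items()}

print(classify("the cat ran to the mat", class_vectors))
print(classify("notwithstanding the statute", class_vectors))
```

Because cosine similarity depends only on the direction of the vectors, summing the counts per class gives the same ranking as averaging them, which keeps the sketch short.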

A detailed discussion is available in the report.

