Solvve/ml_english_level_bert_classifier

English level BERT classifier

License | Python 3.7 | scikit-learn 0.23.2 | Solvve

Description

Text multilabel classification of English reading level using BERT, word2vec, and XGBoost.

The project follows these steps:

  1. EDA
  2. Data preprocessing
  3. Modeling with XGBoost on word2vec and TF-IDF features
  4. Modeling with a pretrained BERT model
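
Step 3 relies on TF-IDF features. The sketch below is not the repository's code. It only illustrates how such features can be computed, using the standard library alone. The function name `tfidf`, the whitespace tokenization, and the toy documents are assumptions made for the example. The smoothed IDF formula and the L2 normalization follow the defaults of scikit-learn's `TfidfVectorizer`.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, L2-normalized."""
    # Naive lowercase + whitespace tokenization (an assumption for this sketch).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents each term occurs in.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts in this document
        weights = {t: count * idf[t] for t, count in tf.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors


# Toy documents, invented for illustration only.
docs = [
    "the cat sat on the mat",
    "the government announced the new fiscal policy",
    "the cat chased the dog",
]
vectors = tfidf(docs)
```

Terms that occur in fewer documents receive a larger weight than equally frequent terms that occur in many, which is what lets a downstream classifier such as XGBoost pick up on vocabulary typical of one reading level.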

Dataset

1. https://huggingface.co/datasets/onestop_english

OneStopEnglish is a corpus of texts written at three reading levels. Its usefulness has been demonstrated through two applications: automatic readability assessment and automatic text simplification.
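
A minimal sketch of loading the corpus and checking how many texts fall into each reading level follows. It assumes the Hugging Face `datasets` package is installed, that the identifier `onestop_english` from the URL above still resolves, that the data lives in a `train` split, and that the `label` column is a class-label feature exposing `names`. The helper names `load_onestop_english`, `label_distribution`, and `summarize` are hypothetical.

```python
from collections import Counter


def load_onestop_english():
    """Download the corpus from the Hugging Face Hub (needs network access)."""
    # Imported lazily so the helpers below work without the `datasets` package.
    from datasets import load_dataset

    # The identifier comes from the dataset URL; the split name is an assumption.
    return load_dataset("onestop_english", split="train")


def label_distribution(labels, names):
    """Count how many texts carry each label id and report them by name."""
    counts = Counter(labels)
    return {name: counts.get(i, 0) for i, name in enumerate(names)}


def summarize(ds):
    """Return the per-level text counts for a loaded dataset."""
    # Assumes `label` is a ClassLabel feature, which exposes `.names`.
    names = ds.features["label"].names
    return label_distribution(ds["label"], names)
```

Calling `summarize(load_onestop_english())` returns a dictionary mapping each reading level to its number of texts, which is a quick first check during the EDA step.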
