NLP with various methodologies using consumer_complaints kaggle dataset

schellrw/consumer_complaints

consumer_complaints

NLP with various methodologies using the consumer_complaints Kaggle dataset (URL: https://www.kaggle.com/datasets/kaggle/us-consumer-finance-complaints).

00: EDA with pandas profiling report, data munging, and feature engineering.

00a: Data preparation for validation dataset.

00b: Data preparation for test dataset.
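One way to keep notebooks 00, 00a, and 00b consistent is to put the cleaning and feature engineering in a single function and apply it to each split. The sketch below is illustrative only: the column names `consumer_complaint_narrative` and `product` are assumptions about the CSV, and the two length features are hypothetical stand-ins for whatever features the notebooks actually build.

```python
import pandas as pd

# Assumed column names; check them against the downloaded CSV.
TEXT_COL = "consumer_complaint_narrative"
TARGET_COL = "product"


def prepare_split(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the same cleaning and illustrative features to any data split."""
    # Rows without a narrative or a product label cannot be used for training.
    out = df.dropna(subset=[TEXT_COL, TARGET_COL]).copy()
    out[TEXT_COL] = out[TEXT_COL].str.strip()
    # Hypothetical engineered features; the notebooks may use different ones.
    out["narrative_chars"] = out[TEXT_COL].str.len()
    out["narrative_words"] = out[TEXT_COL].str.split().str.len()
    return out.reset_index(drop=True)


# Toy rows standing in for the real dataset.
toy = pd.DataFrame(
    {
        TEXT_COL: ["  I was charged twice.  ", None, "Wrong balance reported"],
        TARGET_COL: ["Credit card", "Mortgage", "Debt collection"],
    }
)
prepared = prepare_split(toy)
print(prepared[["narrative_chars", "narrative_words"]])
```

Calling the same `prepare_split` on the training, validation, and test frames avoids the splits drifting apart in how they are cleaned.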

01: NLP with TfidfVectorizer from sklearn, product classification with several classification models (including feature-engineered variables from EDA), hyperparameter optimization with Optuna, and final ensemble model.
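A minimal sketch of the TF-IDF approach in notebook 01, assuming a plain scikit-learn pipeline with a single logistic regression classifier. The toy complaints and product labels are made up for illustration, and the Optuna search, the engineered features, and the final ensemble are deliberately left out.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data standing in for complaint narratives and their product labels.
texts = [
    "unauthorized charge on my credit card statement",
    "credit card company raised my interest rate",
    "mortgage servicer lost my escrow payment",
    "mortgage loan modification was denied",
    "debt collector keeps calling about a debt I do not owe",
    "debt collection agency reported wrong amount",
]
labels = [
    "Credit card",
    "Credit card",
    "Mortgage",
    "Mortgage",
    "Debt collection",
    "Debt collection",
]

# The vectorizer turns text into sparse TF-IDF features (unigrams and bigrams);
# the classifier then predicts the product from those features.
model = make_pipeline(
    TfidfVectorizer(stop_words="english", ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

predicted = model.predict(["my credit card was charged twice"])
print(predicted[0])
```

Wrapping both steps in one pipeline means the vectorizer is fitted only on training text, which keeps validation and test vocabulary from leaking into the model.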

02: NLP with Gensim's Doc2Vec embedded vectors and product classification with several classification models.

03: NLP with a Simple Transformers multi-class classification model and GPU acceleration. LLMs evaluated: bert-base-uncased and bert-large-uncased.
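A rough sketch of how notebook 03 might be set up with the `simpletransformers` package, which expects a DataFrame with a `text` column and an integer `labels` column. The helper names `build_label_map`, `to_rows`, and `train_bert_classifier` are hypothetical, the toy data is made up, and the training arguments are placeholders rather than the notebook's settings.

```python
def build_label_map(products):
    """Map each distinct product name to an integer id (sorted for stability)."""
    return {name: idx for idx, name in enumerate(sorted(set(products)))}


def to_rows(texts, products, label_map):
    """Pair each text with the integer label of its product."""
    return [[text, label_map[product]] for text, product in zip(texts, products)]


def train_bert_classifier(rows, num_labels, model_name="bert-base-uncased"):
    """Fine-tune a BERT classifier; uses the GPU when one is available."""
    # Heavy dependencies are imported lazily so the helpers above stay light.
    import pandas as pd
    import torch
    from simpletransformers.classification import (
        ClassificationArgs,
        ClassificationModel,
    )

    train_df = pd.DataFrame(rows, columns=["text", "labels"])
    # Placeholder settings; tune epochs and batch size for real runs.
    args = ClassificationArgs(num_train_epochs=1, overwrite_output_dir=True)
    model = ClassificationModel(
        "bert",
        model_name,
        num_labels=num_labels,
        args=args,
        use_cuda=torch.cuda.is_available(),
    )
    model.train_model(train_df)
    return model


# Toy data standing in for complaint narratives and their product labels.
texts = [
    "unauthorized charge on my credit card statement",
    "mortgage servicer lost my escrow payment",
    "debt collector keeps calling about a debt I do not owe",
]
products = ["Credit card", "Mortgage", "Debt collection"]

label_map = build_label_map(products)
rows = to_rows(texts, products, label_map)

# Training downloads model weights and is slow without a GPU:
# model = train_bert_classifier(rows, num_labels=len(label_map))
# predictions, raw_outputs = model.predict(["my credit card was charged twice"])
```

Passing `model_name="bert-large-uncased"` would switch to the larger of the two models listed above, at a higher memory cost.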
