- amazon_ml_preprocessing.ipynb : code to preprocess text
- amazon_ml_translation_csv.ipynb : code to translate non-English text
- amazon_ml_mode.ipynb : code to create a submission file from multiple submission files using the mode technique
- amazon_ml_training.ipynb : code to train on embeddings and predict on test embeddings
- amazon_ml_embeddings.ipynb : code to generate embeddings from the csv file
- submission_top-score.csv : submission file with the top score [Accuracy : 66.85]
- Competition : Multi-Class Text Classification
- Host : HackerEarth
- Metric : Accuracy
- Duration of Competition : 2 days 23 hrs 59 min
- Check out the competition here
- Key column – PRODUCT_ID
- Input features – TITLE, DESCRIPTION, BULLET_POINTS, BRAND
- Target column – BROWSE_NODE_ID
- Train dataset size – 2,903,024
- Number of classes in Train – 9,919
- Overall Test dataset size – 110,775
- re
- langdetect
- deep-translator
- Removed special characters and emojis using re.
- Translated non-English text to English using langdetect and deep-translator.
- Removed stop-words.
- Decontracted (expanded) some of the contracted words, e.g. "can't" → "can not".
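The cleaning steps above can be sketched in pure Python with `re`; the contraction map and stop-word set below are small illustrative subsets, not the full lists used in the notebooks, and the translation step (langdetect + deep-translator) is omitted because it requires network access.

```python
import re

# Illustrative subset of the contraction map (assumption, not the notebook's full list)
CONTRACTIONS = {"won't": "will not", "can't": "can not", "n't": " not",
                "'re": " are", "'s": " is", "'ll": " will", "'ve": " have"}

# Illustrative subset of the stop-word list
STOP_WORDS = {"a", "an", "the", "is", "and", "of", "to", "in"}

def clean_text(text: str) -> str:
    text = text.lower()
    # Expand contractions before stripping punctuation, so apostrophes are still present
    for pattern, replacement in CONTRACTIONS.items():
        text = text.replace(pattern, replacement)
    # Remove special characters and emojis: keep only letters, digits, whitespace
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Drop stop-words and collapse whitespace
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)
```

For example, `clean_text("Can't stop the Music! 🎵")` yields `"can not stop music"`.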
- sentence-transformers
- RAPIDS
- First the text is converted to embeddings using pre-trained models such as paraphrase-mpnet-base-v2, paraphrase-MiniLM-L6-v2, and paraphrase-MiniLM-L3-v2.
- Dimension of Embeddings : 384
- Embeddings of training data are fed into the KNNClassifier present in the cuML library.
- The trained KNNClassifier is then used to predict on test embeddings.
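A minimal sketch of the embed-then-classify step. scikit-learn's `KNeighborsClassifier` is used here as a CPU stand-in for cuML's GPU classifier (the `fit`/`predict` API is near-identical), and random vectors stand in for the real 384-dim sentence-transformer embeddings.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier  # stand-in for cuml.neighbors.KNeighborsClassifier

rng = np.random.default_rng(0)

# Stand-ins for the 384-dim embeddings of the training texts and toy BROWSE_NODE_ID labels
X_train = rng.normal(size=(1000, 384)).astype("float32")
y_train = rng.integers(0, 10, size=1000)

# Stand-in for test embeddings (a few training rows reused for illustration)
X_test = X_train[:5]

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
preds = knn.predict(X_test)  # one predicted class id per test row
```

On the real data the same two calls run on GPU via cuML, with `X_train`/`X_test` produced by `SentenceTransformer.encode` over the preprocessed text.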
- We also used the mode technique, i.e. taking the most frequently predicted label across different experiments.
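The mode technique can be sketched with pandas; the toy Series below stand in for BROWSE_NODE_ID predictions read from three different submission files.

```python
import pandas as pd

# Stand-ins for predictions from three experiments (same row order in each file)
subs = [
    pd.Series([1, 2, 3, 4]),
    pd.Series([1, 2, 5, 4]),
    pd.Series([1, 7, 3, 9]),
]

# One column per experiment, one row per test product
stacked = pd.concat(subs, axis=1)

# mode(axis=1) returns a DataFrame of per-row modes; column 0 is the first
# (smallest) mode, which also breaks ties deterministically
final = stacked.mode(axis=1)[0].astype(int)  # → [1, 2, 3, 4]
```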
- Cross-validation is also used to train the KNNClassifier.
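A minimal cross-validation sketch on toy data, again with scikit-learn's `KNeighborsClassifier` standing in for cuML's; `StratifiedKFold` keeps the class balance similar in every fold, which matters with many imbalanced classes.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 16)).astype("float32")  # toy embeddings
y = rng.integers(0, 4, size=200)                  # toy class labels

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, val_idx in skf.split(X, y):
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X[train_idx], y[train_idx])
    # Fold accuracy on the held-out split
    scores.append(knn.score(X[val_idx], y[val_idx]))
```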
- Used NearestNeighbour, SVM, and RandomForest classifiers, but the results were not better than the KNNClassifier's.
- Tried different embedding sizes (384, 768).
- Combined TITLE + DESCRIPTION as well as TITLE + DESCRIPTION + BULLET_POINTS as the input text.
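Combining the input columns into a single text field can be sketched with pandas (the rows below are made-up examples; `fillna("")` guards against the missing DESCRIPTION/BULLET_POINTS values in the real data):

```python
import pandas as pd

df = pd.DataFrame({
    "TITLE": ["Wireless Mouse", "Steel Bottle"],
    "DESCRIPTION": ["2.4 GHz optical mouse", "1 L insulated flask"],
    "BULLET_POINTS": ["ergonomic; silent clicks", "leak proof; BPA free"],
})

# TITLE + DESCRIPTION + BULLET_POINTS variant; drop the last term for the
# TITLE + DESCRIPTION variant
df["text"] = (df["TITLE"].fillna("") + " "
              + df["DESCRIPTION"].fillna("") + " "
              + df["BULLET_POINTS"].fillna("")).str.strip()
```

The combined `text` column is what gets cleaned and encoded into embeddings.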