
The goal is to perform sentiment analysis/classification on restaurant review texts using a bag-of-words model.


YaSuei88/NLP_Porject_Restaurant_reivew_semetic_analysis


NLP_Project_Restaurant_review_sentiment_classification - Project Overview

This is a learning project I built while taking the Machine Learning A-Z course on Udemy. The goal is to perform sentiment analysis/classification on restaurant review texts using a bag-of-words model.

1. Problem definition

To determine whether a given restaurant review text is negative or positive.

2. Data

The data is provided by the course material.

3. Code and Resource Used

Python version: 3.7

Packages: pandas, numpy, matplotlib.pyplot, re, nltk

4. EDA

Check what the data looks like:

image

5. Data Cleaning

Clean the data with the following steps:

  • replace every character that is not a letter with a space
  • convert everything to lower case
  • customize the stop word list by removing the word "not" from it, so that negations are kept
  • stem every word that is not in the stop word list
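
The cleaning steps above could be sketched roughly as follows. This is a minimal illustration, not necessarily the project's exact code: the function name `clean_review` is hypothetical, and the small fallback stop word set is an assumption that is only used when the NLTK stop word corpus has not been downloaded.

```python
import re

from nltk.stem.porter import PorterStemmer

try:
    # Needs the NLTK "stopwords" corpus, e.g. nltk.download("stopwords")
    from nltk.corpus import stopwords
    stop_words = set(stopwords.words("english"))
except LookupError:
    # Assumption: tiny illustrative fallback when the corpus is unavailable
    stop_words = {"a", "an", "the", "is", "was", "this", "it", "and", "not"}

# Customize the stop word list: keep "not" in the text so negation survives
stop_words.discard("not")

stemmer = PorterStemmer()


def clean_review(text):
    """Clean one review (hypothetical helper, for illustration only)."""
    # Replace anything that is not a letter with a space
    letters_only = re.sub("[^a-zA-Z]", " ", text)
    # Lower-case and split into words
    words = letters_only.lower().split()
    # Drop stop words, then stem the remaining words
    return " ".join(stemmer.stem(w) for w in words if w not in stop_words)


print(clean_review("Crust is not good."))  # crust not good
```

Keeping "not" matters here because a bag-of-words model ignores word order, so "not" is the only remaining signal that "not good" differs from "good".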

6. Modelling

Split the data into training and test sets. The model used for this project is Naive Bayes.
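
A rough sketch of this step, assuming scikit-learn is used (it is not in the package list above). The tiny corpus and its labels are made up for illustration, and the choice of `MultinomialNB`, `test_size=0.2` and `random_state=0` are assumptions, since the text only says "Naive Bayes" and does not give the split ratio.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical, already-cleaned reviews (1 = positive, 0 = negative)
corpus = [
    "wow love place", "crust not good", "not tasti textur nasti",
    "great food great servic", "servic slow not happi", "love fresh menu",
    "food bland not return", "amaz staff friendli", "wait long food cold",
    "best meal great price",
]
labels = [1, 0, 0, 1, 0, 1, 0, 1, 0, 1]

# Bag of words: one column per word, one row of word counts per review
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)

# Hold out part of the data for testing (ratio and seed are assumptions)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=0
)

# Fit a Naive Bayes classifier on the word counts and predict the test set
classifier = MultinomialNB()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
```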

7. Evaluation

confusion matrix:

image

accuracy: 0.67
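
For reference, both metrics can be computed along these lines, assuming scikit-learn is available. The label arrays are made-up examples and do not reproduce the result reported above.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true and predicted labels (1 = positive, 0 = negative)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes (order: 0, 1)
cm = confusion_matrix(y_true, y_pred)

# Accuracy is the share of correct predictions: the diagonal over the total
accuracy = accuracy_score(y_true, y_pred)

print(cm)
print(accuracy)  # 0.75
```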
