annapurna2003/Sentiment-analysis

Sentiment-analysis

This project approaches sentiment analysis at the level of individual words rather than whole sentences.

A few questions you might have while reading the code

Why are we using utf-8 while reading the file?

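UTF-8 is a character encoding, not a form of encryption. Text copied from a website or blog often contains characters outside plain ASCII, such as accented letters, curly quotes, or emoji. Passing utf-8 as the encoding when opening the file tells Python how to decode the file's bytes into characters, so the text is read correctly instead of raising a decoding error or showing garbled symbols. Without it, Python may fall back to a platform-dependent default encoding.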
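As a minimal sketch of the idea, the snippet below writes a small sample file and reads it back with an explicit encoding. The file name `read_demo.txt`, the helper `read_text`, and the sample sentence are assumptions made for illustration; they are not taken from the project's code.

```python
import tempfile
from pathlib import Path


def read_text(path):
    """Read a text file, decoding its bytes as UTF-8."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Hypothetical sample text containing non-ASCII characters.
sample = "Café review: naïve plot, but I loved it"

# Write the sample to a temporary file so the example is self-contained.
demo_path = Path(tempfile.gettempdir()) / "read_demo.txt"
demo_path.write_text(sample, encoding="utf-8")

text = read_text(demo_path)
print(text)
```

Stating the encoding explicitly keeps the result the same across operating systems, since the default encoding can differ from one platform to another.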

Why should we change all the letters or characters in the text file to lowercase?

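String comparison is case sensitive, so "Movie" and "movie" are treated as two different words. Converting all of the text to lowercase makes sure every occurrence of a word is matched and counted as the same word, which gives a more accurate picture of the sentiment in the text.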
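A short illustration of the effect, using a made-up sentence (the variable names here are assumptions, not the project's own):

```python
text = "I loved this Movie. The MOVIE was great!"

# Before lowercasing, neither "Movie" nor "MOVIE" matches "movie".
matches_before = text.count("movie")

# str.lower() returns a new string with every cased character in lowercase.
lower_case = text.lower()
matches_after = lower_case.count("movie")

print(matches_before, matches_after)
```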

What does tokenization mean?

Tokenization means breaking the text into individual words (tokens) and storing each word as an item in a list.
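One simple way to do this, sketched here with only the standard library, is to strip punctuation and split on whitespace. The helper name `tokenize` and the sample sentence are assumptions for illustration; the project may tokenize differently (for example, with a library tokenizer).

```python
import string


def tokenize(text):
    """Lowercase the text, remove ASCII punctuation, and split on whitespace."""
    # str.maketrans with a third argument maps those characters to None (deletes them).
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()


tokens = tokenize("I loved this movie, truly!")
print(tokens)
```

Splitting on whitespace is a deliberate simplification: it does not handle contractions or hyphenated words specially, which a dedicated tokenizer would.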

What are stop words?

Stop words are very common words (such as "the", "is", and "and") that carry little or no sentiment on their own. They can be removed from the list of tokens after tokenization so that only the meaningful words remain.
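The filtering step can be sketched as a list comprehension over the tokens. The small `STOP_WORDS` set below is a hand-written assumption for illustration only; a real project would normally use a fuller list, and the project's actual list is not shown here.

```python
# Hypothetical, deliberately tiny stop-word set for demonstration.
STOP_WORDS = {"i", "this", "the", "was", "a", "is", "and", "it", "to", "of"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not in the stop-word set, keeping their order."""
    return [word for word in tokens if word not in stop_words]


tokens = ["i", "loved", "this", "movie", "it", "was", "great"]
final_words = remove_stop_words(tokens)
print(final_words)
```

Using a set for the stop words keeps each membership check fast, which matters once the text grows beyond a few sentences.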
