In this project, NLP is applied at the level of individual words rather than whole sentences.
UTF-8 is used when reading files because text copied from a website or blog may contain non-ASCII characters. Opening the text file with UTF-8 encoding decodes its bytes into the correct characters, so the text reads as intended instead of showing garbled values. (This is decoding, not decryption: nothing in the file is encrypted.)

Case sensitivity: "Movie" != "movie". We therefore convert all text to lowercase so that the same word is treated as one word, which gives better insight into the sentiment.

Tokenization: breaking a sentence into words and saving each word in a list.

Stop words: words that add little meaning to a sentence and can therefore be dropped from the token list after tokenization.
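
The steps described above (UTF-8 reading, lowercasing, tokenization, stop-word removal) can be sketched roughly as follows. This is a minimal illustration and not necessarily the code used in this repository: the function names `read_text` and `preprocess`, the tiny `STOP_WORDS` set, and the punctuation-stripping step are all assumptions made for the example. A real project would more likely use a fuller stop-word list from an NLP library.

```python
import string

# Hypothetical, deliberately small stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "was", "and", "of", "to", "in", "it", "this", "i"}


def read_text(path):
    """Read a text file, decoding its bytes as UTF-8."""
    with open(path, encoding="utf-8") as f:
        return f.read()


def preprocess(text):
    """Lowercase the text, tokenize it, and drop stop words."""
    # "Movie" and "movie" should count as the same word.
    lowered = text.lower()
    # Assumption: strip ASCII punctuation so "great," becomes "great".
    cleaned = lowered.translate(str.maketrans("", "", string.punctuation))
    # Tokenization: split on whitespace and keep each word in a list.
    tokens = cleaned.split()
    # Stop-word removal: keep only words that are not in the stop-word set.
    return [word for word in tokens if word not in STOP_WORDS]


sample = "This Movie was great, and the movie is a classic!"
print(preprocess(sample))  # ['movie', 'great', 'movie', 'classic']
```

For a file on disk, `preprocess(read_text("some_file.txt"))` would apply the same steps, where `some_file.txt` is a placeholder path.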
annapurna2003/Sentiment-analysis