This repository contains a Sentiment Analysis project that uses TF-IDF (Term Frequency-Inverse Document Frequency), NLTK (Natural Language Toolkit), and Logistic Regression to analyze sentiment in textual data (tweets). It has been deployed using Streamlit.
Sentiment Analysis is the process of determining the sentiment or opinion expressed in text data. This project applies machine learning techniques to that task: TF-IDF vectorization to turn text into numerical features, the NLTK library for natural language preprocessing, and Logistic Regression as the classification algorithm.
- TF-IDF Vectorization: Utilizes TF-IDF to convert text data into numerical vectors, representing the importance of words in a document relative to a collection of documents.
- NLTK (Natural Language Toolkit): Employs NLTK for various natural language processing tasks such as tokenization, stemming, and stop-word removal.
- Logistic Regression: Implements Logistic Regression, a popular classification algorithm, to predict sentiment based on the features extracted using TF-IDF.
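To make the TF-IDF idea above concrete, here is a minimal sketch in plain Python. It uses the textbook form (term frequency times the log of inverse document frequency) with whitespace tokenization; scikit-learn's `TfidfVectorizer` applies smoothing and normalization by default, so its numbers will differ. The function name `tf_idf` and the sample tweets are illustrative assumptions, not code from this repository.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (textbook TF-IDF)."""
    # Naive tokenization for illustration: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            # Term frequency in this document times inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample tweets.
tweets = ["great movie", "terrible movie", "great fun movie"]
scores = tf_idf(tweets)
print(scores[1])
```

In this sample, "movie" occurs in every tweet, so its weight is zero, while "terrible" occurs in only one tweet and gets the largest weight in that tweet. That is the sense in which TF-IDF measures a word's importance relative to the whole collection.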
Installation:
- Clone the repository:
git clone https://github.com/vishal91-hub/Sentiment-Analysis.git
- Install dependencies:
pip install -r requirements.txt
Training and Evaluation:
- Train the model using provided dataset:
python train.py
- Evaluate the model:
python evaluate.py
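The contents of `train.py` and `evaluate.py` are not reproduced here. As a rough sketch of what a TF-IDF plus Logistic Regression training and evaluation step can look like with scikit-learn, consider the following. The toy sentences, labels, and variable names are assumptions made for illustration and are not the repository's dataset or code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

# Toy, hand-written training data (illustrative only).
train_texts = [
    "love this phone", "hate this phone",
    "great service today", "awful service today",
    "happy with the result", "sad with the result",
]
train_labels = ["positive", "negative"] * 3

# Chain the vectorizer and the classifier so raw text goes in
# and predicted labels come out.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

# Held-out examples, also hand-written for illustration.
test_texts = ["love the great service", "awful phone, hate it"]
test_labels = ["positive", "negative"]

predictions = model.predict(test_texts)
accuracy = accuracy_score(test_labels, predictions)
print(list(predictions), accuracy)
```

Wrapping both steps in one pipeline keeps the vocabulary learned during training attached to the classifier, so new text is vectorized the same way at prediction time.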
Usage:
- Use the trained model for sentiment analysis on new text data:
    # Example code snippet
    from sentiment_analyzer import SentimentAnalyzer

    sa = SentimentAnalyzer(model_path='path/to/saved/model')
    result = sa.analyze_sentiment("Your text here")
    print(result)
Requirements:
- Python 3.x
- NLTK
- scikit-learn
- Other necessary libraries (specified in requirements.txt)
This project is licensed under the MIT License - see the LICENSE file for details.