We built a machine learning program with TF-IDF preprocessing to determine whether a news article is fake, using data science to guide our conclusions and explore our dataset (Fall 2020). We are currently pursuing more advanced models using PyTorch (2021). Group members: Hirish Chandrasekaran, Isha Gokhale, Katie Huynh, Kevin Zhang, Mateo Wang, Priyasha Agarwal. Kennard Peters helped the group understand the implementation and theory behind the scikit-learn models.
Video: https://drive.google.com/file/d/1ezW-NzZMqaTOB-a-nXOfkvTgH7eWlAiB/view
Use a dataset provided by DataFlair (https://data-flair.training/blogs/advanced-python-project-detecting-fake-news/) as a starting point for our models. Experiment with different models, starting with a Passive Aggressive classification algorithm in Fall. Explore more advanced models using PyTorch in Winter and Spring.
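The Fall pipeline described above can be sketched as TF-IDF vectorization feeding a Passive Aggressive classifier. This is a minimal illustration, not our actual code: the toy headlines and labels below are made up, and the real project loads the DataFlair CSV instead.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.pipeline import make_pipeline

# Toy labeled headlines standing in for the DataFlair dataset.
texts = ["aliens endorse candidate", "senate passes budget bill",
         "miracle cure hidden by doctors", "court upholds state ruling"]
labels = ["FAKE", "REAL", "FAKE", "REAL"]

# TF-IDF turns each article into a sparse weight vector; the
# Passive Aggressive classifier learns a linear boundary online,
# updating only when a prediction is wrong or inside the margin.
model = make_pipeline(
    TfidfVectorizer(stop_words="english", max_df=0.7),
    PassiveAggressiveClassifier(max_iter=50, random_state=0),
)
model.fit(texts, labels)
print(model.predict(["doctors hide miracle cure"]))
```

On the real dataset, the same pipeline is fit on a train split and scored with `accuracy_score` on a held-out test split.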
Explore project group ideas and look at different datasets; compare ideas.
Finalize project group members, finalize data set, finalize theme and scope of project as it relates to sentiment analysis.
Start unpacking data, analyzing with pandas/numpy.
Get acquainted with scikit-learn and divide the group in two: group (1) (Isha, Priyasha) pursued a naive Bayes classifier approach using scikit-learn; group (2) (Hirish, Katie, Kevin) pursued a support vector classifier. Completed a working model, tuned parameters, pickled the SVC model, and committed both models to the repository in the proper branch. At the end of each meeting, each group explained its model and implementation to the other.
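The two scikit-learn approaches from that week, plus the pickling step, might look like the sketch below. It is an assumption-laden stand-in: the tiny texts are invented, and the real models were tuned on the full dataset.

```python
import pickle
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["shocking secret they won't tell you", "parliament approves trade deal",
         "celebrity spotted with bigfoot", "central bank raises interest rates"]
labels = ["FAKE", "REAL", "FAKE", "REAL"]

# Group (1): naive Bayes on TF-IDF features.
nb = make_pipeline(TfidfVectorizer(), MultinomialNB()).fit(texts, labels)
# Group (2): linear support vector classifier on the same features.
svc = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(texts, labels)

# Pickle the fitted SVC pipeline so it can be committed and reloaded later.
blob = pickle.dumps(svc)
restored = pickle.loads(blob)
print(restored.predict(["central bank raises interest rates"]))
```

Pickling the whole pipeline (vectorizer plus classifier) matters: the TF-IDF vocabulary learned at fit time must travel with the model.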
Begin learning PyTorch. Tutorials: https://pytorch.org/tutorials/; neural networks and backpropagation: https://www.deeplearningbook.org/; in-depth explanation of PyTorch functions: https://www.deeplearningwizard.com/deep_learning/boosting_models_pytorch/forwardpropagation_backpropagation_gradientdescent/; RNNs and CNNs on text data using TorchText and PyTorch: https://github.com/bentrevett/pytorch-sentiment-analysis.
We continued to learn PyTorch, specifically gradient descent and loss functions, and discussed simple feedforward networks and backpropagation. Tensors: https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#sphx-glr-beginner-blitz-tensor-tutorial-py. Tensors and autograd: https://pytorch.org/tutorials/beginner/pytorch_with_examples.html.
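The gradient descent and autograd material from that week can be condensed into one toy example: fitting y = 2x with a single learnable weight. This is a learning sketch, not project code; the learning rate and step count are arbitrary choices.

```python
import torch

# One-feature linear fit: learn w so that w * x matches y = 2x.
x = torch.linspace(0, 1, 20).unsqueeze(1)
y = 2.0 * x

w = torch.zeros(1, requires_grad=True)
loss_fn = torch.nn.MSELoss()

for _ in range(200):
    loss = loss_fn(x * w, y)   # forward pass: mean squared error
    loss.backward()            # autograd computes d(loss)/dw
    with torch.no_grad():
        w -= 0.5 * w.grad      # plain gradient-descent step
        w.grad.zero_()         # clear the accumulated gradient

print(w.item())  # converges to ~2.0
```

The `no_grad` block and `grad.zero_()` are the two details the tutorials stress: updates must not be tracked by autograd, and gradients accumulate unless cleared.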
Introduction to RNNs. Kevin/Katie attempted to process data with TorchText; Isha/Priyasha/Hirish focused on learning RNNs in PyTorch.
Kevin/Katie implement an RNN on our DataFlair dataset and get an accuracy score. Isha/Hirish continue to work on the CNN and also get an accuracy score.
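A minimal version of the RNN classifier, with assumed hyperparameters (vocabulary size, embedding and hidden dimensions are placeholders, not the values the group used): embed token ids, run them through an RNN, and classify from the final hidden state.

```python
import torch
import torch.nn as nn

class RNNClassifier(nn.Module):
    def __init__(self, vocab_size=100, embed_dim=16, hidden_dim=32, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):          # (batch, seq_len) of token ids
        embedded = self.embed(token_ids)   # (batch, seq_len, embed_dim)
        _, hidden = self.rnn(embedded)     # hidden: (1, batch, hidden_dim)
        return self.fc(hidden.squeeze(0))  # (batch, num_classes) logits

model = RNNClassifier()
batch = torch.randint(0, 100, (4, 12))  # 4 dummy "articles" of 12 token ids
logits = model(batch)
print(logits.shape)  # torch.Size([4, 2])
```

In practice the token ids come from a TorchText vocabulary built over the dataset, and the logits feed a cross-entropy loss during training.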
Kevin/Katie/Priyasha continue to improve the RNN. Isha/Hirish get an accuracy score for the CNN.
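The CNN branch can be sketched similarly, following the common 1-D convolution-over-text pattern from the tutorials linked above; the filter counts and kernel sizes here are illustrative assumptions, not the group's settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNClassifier(nn.Module):
    def __init__(self, vocab_size=100, embed_dim=16, num_filters=8,
                 kernel_sizes=(2, 3, 4), num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One Conv1d per kernel size, sliding over the token dimension.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        # Max-pool each feature map over time, then concatenate.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))   # (batch, num_classes) logits

model = CNNClassifier()
logits = model(torch.randint(0, 100, (4, 12)))
print(logits.shape)  # torch.Size([4, 2])
```

Max-pooling over time is what lets the model handle variable-length articles: only the strongest activation of each filter survives.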
The RNN is updated after running into bugs, and the CNN code is explained with comments. Done with our project!
DataFlair, Kaggle
Python (pandas, scikit-learn, matplotlib) for the algorithms and for loading/manipulating data; PyTorch and TorchText for more customizable models.