Merge pull request #611 from prateek-1803/main
Added topic modelling
Showing 5 changed files with 93,603 additions and 0 deletions.
30 changes: 30 additions & 0 deletions
Natural Language Processing/Topic modeling on college reviews/ReadME.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
# PROJECT TITLE: Topic Modeling on College Reviews | ||
|
||
## 🎯 Goal | ||
|
||
The main goal of this project is to develop a model that extracts the major topics from a large number of college reviews. | ||
## 🧵 Dataset | ||
|
||
The dataset used for this project was scraped from CollegeDunia.com. It contains the following attributes: | ||
1. Name of the user who wrote the review (Name) | ||
2. College name (college) | ||
3. Review text (review) | ||
4. Rating given by the user (rating) | ||
|
||
## 🧾 Description | ||
|
||
This project builds a machine learning model that identifies the major topics in college reviews. The model is trained on a dataset of reviews scraped from CollegeDunia.com. | ||
|
||
## 🚀 Models Implemented | ||
|
||
Latent Dirichlet Allocation (LDA) model: LDA is a generative Bayesian model that represents each document as a mixture of topics and each topic as a distribution over words. Inference assigns every word occurrence in a document to a topic; the document's topic proportions then follow from those assignments. | ||
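The word-to-topic assignment described above can be illustrated with a small sketch. The README does not say which library or inference method the project uses, so the following is only an assumption-laden illustration: a pure-Python collapsed Gibbs sampler, with a hypothetical function name (`lda_gibbs`), assumed hyperparameters (`alpha`, `beta`), and a handful of invented toy reviews standing in for the real `review` column.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]          # words in doc d assigned to topic k
    topic_word = [Counter() for _ in range(num_topics)]   # times word w is assigned to topic k
    topic_total = [0] * num_topics                        # total words assigned to topic k
    assignments = []

    # Start from a random topic for every word occurrence.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the word's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample: weight is proportional to
                # (doc-topic count + alpha) * (topic-word count + beta) / (topic size + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed topic proportions per document.
    doc_mix = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    # Most frequent words per topic, skipping words whose count dropped to zero.
    top_words = [
        [w for w, c in topic_word[t].most_common() if c > 0][:3]
        for t in range(num_topics)
    ]
    return doc_mix, top_words


# Invented toy reviews, for illustration only (not taken from the dataset).
reviews = [
    "hostel food mess hostel rooms food",
    "placement company package placement salary",
    "mess food hostel rooms clean",
    "company salary placement internship package",
]
docs = [r.split() for r in reviews]
doc_mix, top_words = lda_gibbs(docs, num_topics=2)

for t, words in enumerate(top_words):
    print(f"topic {t}: {words}")
for d, mix in enumerate(doc_mix):
    print(f"review {d}: {[round(p, 2) for p in mix]}")
```

On real data, a library implementation with proper tokenization and stop-word removal would likely be a better fit; the sketch only makes the assignment step concrete.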
|
||
|
||
## 📢 Conclusion | ||
|
||
The Latent Dirichlet Allocation (LDA) model identified the major topics discussed in the user reviews. | ||
|
||
## ✒️ Your Signature | ||
|
||
Prateek Khandelwal | ||
|