Google Drive Link: https://drive.google.com/file/d/19XMHx8qVp6gJrUr5h6wEyDXxZLI7I0kE/view?usp=sharing
This is a movie recommendation website built with Streamlit. The project uses Natural Language Processing (NLP) to recommend movies based on user input. Recommendations are generated by computing the cosine similarity between movies: the higher the similarity score between two movies, the more relevant the recommendation.
- User-friendly Interface: Powered by Streamlit for a smooth and interactive experience.
- NLP-based Recommendations: Compares movie overviews using text processing techniques such as tokenization and vectorization.
- Cosine Similarity: Measures the similarity between movies to recommend the most relevant ones.
- TMDB 5000 Dataset: Recommendations are based on the comprehensive TMDB 5000 movies dataset.
- Dataset: The TMDB 5000 dataset is used, which includes information about movies such as title, overview, genres, and more.
- Text Preprocessing: The movie overviews are processed using NLP techniques (e.g., tokenization, vectorization).
- Similarity Calculation:
- Cosine similarity is computed between the input movie and all other movies in the dataset.
- Movies with the highest similarity scores are recommended.
- Interactive Search:
- Users can type the name of a movie.
- The system will display a list of similar movies.
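The steps above can be sketched in a few lines. This is a minimal, dependency-free illustration that uses only the Python standard library; the project itself lists Scikit-learn for the similarity computation, so treat this as a sketch of the idea rather than the app's actual code. The function names (`vectorize`, `cosine_similarity`, `recommend`) and the toy overviews are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words vector (token -> count)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty overview is similar to nothing
    return dot / (norm_a * norm_b)


def recommend(title, overviews, top_n=5):
    """Return the top_n titles whose overviews are most similar to `title`."""
    vectors = {name: vectorize(text) for name, text in overviews.items()}
    query = vectors[title]
    scores = [
        (cosine_similarity(query, vec), name)
        for name, vec in vectors.items()
        if name != title  # never recommend the input movie itself
    ]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scores[:top_n]]


# Hypothetical toy data; the real app would read overviews from the dataset.
sample_overviews = {
    "Space Quest": "astronauts travel through space to save the earth",
    "Star Voyage": "a crew of astronauts travel across space",
    "Love in Paris": "two strangers fall in love in paris",
}
print(recommend("Space Quest", sample_overviews, top_n=2))
```

Raw word counts treat common words like "the" the same as distinctive ones, so a real system would likely weight or filter tokens (for example with TF-IDF or stop-word removal) before comparing vectors.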
- Clone the repository:
  git clone https://github.com/yourusername/movie-recommendation-website.git
  cd movie-recommendation-website
- Install Dependencies: Make sure you have Python installed (version 3.9 or later), then install the required libraries:
  pip install -r requirements.txt
- Run the Application: Start the Streamlit server:
  streamlit run app.py
- Access the Website: Open your browser and navigate to http://localhost:8501 to use the application.
The movie recommendation system uses the TMDB 5000 movies dataset, which provides:
- Movie titles
- Overviews
- Genres
- Popularity and more.
You can find the dataset here.
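As a rough sketch of how these fields might be read, the example below parses a CSV with `title`, `overview`, and `genres` columns using only the standard library (the project itself lists Pandas for this step). It assumes that `genres` is stored as a JSON-encoded list of objects with a `name` key; that layout, the `load_movies` helper, and the two-row sample are assumptions for illustration, so check them against the actual file.

```python
import csv
import io
import json


def load_movies(csv_file):
    """Read movie rows and decode the JSON-encoded genres column."""
    movies = []
    for row in csv.DictReader(csv_file):
        # Assumed format: '[{"id": 878, "name": "Science Fiction"}, ...]'
        genres = [g["name"] for g in json.loads(row["genres"] or "[]")]
        movies.append(
            {
                "title": row["title"],
                "overview": row["overview"] or "",  # tolerate missing overviews
                "genres": genres,
            }
        )
    return movies


# Hypothetical two-row sample standing in for the real file.
sample_csv = io.StringIO(
    'title,overview,genres\n'
    'Space Quest,Astronauts save the earth.,"[{""id"": 878, ""name"": ""Science Fiction""}]"\n'
    'Untitled,,[]\n'
)
movies = load_movies(sample_csv)
print(movies[0]["genres"])
```

To read a real file instead of the in-memory sample, pass an object opened with `open(path, newline="", encoding="utf-8")` to `load_movies`.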
- Streamlit: For building the web interface.
- Pandas: For data manipulation and preprocessing.
- Scikit-learn: For computing cosine similarity.
- NumPy: For numerical operations.
- Add user ratings to enhance the recommendations.
- Include collaborative filtering for personalized recommendations.
- Incorporate more metadata like cast, crew, and release year.
- TMDB Dataset: TMDB 5000 Movie Dataset
- Streamlit: Streamlit Documentation
This project is licensed under the MIT License. See the LICENSE file for more details.