Welcome to the IMDb Movie Review Scraper project! This guide helps beginners get started with the project. Follow the steps below to set up the project, run the scraper, and use the data for analysis. 📈
Before you begin, make sure you have the following software installed on your computer:
- Python 3.x: Download and install Python
- Git: Download and install Git
You will also need to install some Python libraries, which we will cover in the installation steps.
Fork the Semi-supervised-sequence-learning-Project repository to your own GitHub account. This creates a copy of the repository under your account, which you can modify without affecting the original project.
Follow these instructions to fork a repository: GitHub Forking Guide
After forking the repository, clone it to your local machine using either SSH or HTTPS.
Using SSH:
git clone git@github.com:your-username/Semi-supervised-sequence-learning-Project.git
Using HTTPS:
git clone https://github.com/your-username/Semi-supervised-sequence-learning-Project.git
Change into the project directory using the cd command:
cd Semi-supervised-sequence-learning-Project
The project requires some Python libraries. Install them using pip:
pip install beautifulsoup4 pandas
If you encounter any issues, make sure you have pip installed and are using the correct version of Python.
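If the install step seems to have succeeded but imports still fail, a quick check from Python can help narrow things down. The snippet below is a minimal sketch, not part of the project: the helper name `missing_packages` is made up for illustration, and it only checks the two libraries named in the pip command above.

```python
import importlib.util
import sys

# Import names can differ from pip package names:
# the beautifulsoup4 package is imported as "bs4".
REQUIRED_MODULES = {"bs4": "beautifulsoup4", "pandas": "pandas"}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for module_name, pip_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


# Show which interpreter is running, since pip may have installed
# the libraries for a different Python than the one you are using.
print("Python version:", sys.version.split()[0])
print("Interpreter:", sys.executable)

missing = missing_packages()
if missing:
    print("Missing packages, try: pip install " + " ".join(missing))
else:
    print("All required packages were found.")
```

If the interpreter path printed here is not the one you expected, running `python -m pip install beautifulsoup4 pandas` installs the libraries for exactly that interpreter.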
The script Movie_review_imdb_scrapping.ipynb is used to scrape movie reviews from IMDb.
- Open the Jupyter Notebook file Movie_review_imdb_scrapping.ipynb.
- Follow the instructions in the notebook to scrape movie reviews. The script uses BeautifulSoup to extract data from IMDb's web pages.
- Customize the scraper to target specific time periods, ratings, or other parameters as needed.
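To give a feel for how BeautifulSoup extracts data, here is a minimal sketch that parses a small, made-up HTML snippet. It is not the notebook's code: the tag names and class names (`review`, `rating`, `date`, `text`) and the function `parse_reviews` are assumptions for illustration, and IMDb's real markup is different and changes over time.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded review page.
SAMPLE_HTML = """
<div class="review">
  <span class="rating">8</span>
  <span class="date">12 March 2020</span>
  <p class="text">A tense, well-acted thriller.</p>
</div>
<div class="review">
  <span class="rating">3</span>
  <span class="date">4 July 2021</span>
  <p class="text">Slow and predictable.</p>
</div>
"""


def parse_reviews(html):
    """Turn review blocks in the HTML into a list of dictionaries."""
    soup = BeautifulSoup(html, "html.parser")
    reviews = []
    for block in soup.find_all("div", class_="review"):
        rating = block.find("span", class_="rating")
        date = block.find("span", class_="date")
        text = block.find("p", class_="text")
        reviews.append({
            # Some reviews may lack a rating or date, so guard against None.
            "rating": int(rating.get_text(strip=True)) if rating else None,
            "date": date.get_text(strip=True) if date else None,
            "text": text.get_text(strip=True) if text else "",
        })
    return reviews


reviews = parse_reviews(SAMPLE_HTML)
for review in reviews:
    print(review)
```

A list of dictionaries like this can be passed straight to `pandas.DataFrame(reviews)` and saved with `to_csv` for later analysis.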
The project includes a Streamlit app for a more interactive experience.
- Navigate to the Web_app directory:
cd Web_app
- Install the requirements:
pip install -r requirements.txt
- Run the Streamlit app:
streamlit run streamlit_app.py
The Streamlit app allows you to upload a CSV file containing the reviews for analysis.
- When prompted by the app, upload your CSV file.
- The app will process the file and display the results.
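Before uploading, it can help to confirm that the CSV loads cleanly and contains review text. The sketch below assumes a text column named `review`. That name is a guess, so check which columns the app actually expects. The function `check_reviews_csv` is hypothetical and not part of the project.

```python
import io

import pandas as pd


def check_reviews_csv(source, text_column="review"):
    """Load a CSV and drop rows whose review text is missing or blank."""
    df = pd.read_csv(source)
    if text_column not in df.columns:
        raise ValueError(
            f"Column '{text_column}' not found; available: {list(df.columns)}"
        )
    # Remove missing values first, then whitespace-only entries.
    df = df.dropna(subset=[text_column])
    df = df[df[text_column].astype(str).str.strip() != ""]
    return df.reset_index(drop=True)


# In-memory sample data; in practice pass a file path instead.
sample_csv = io.StringIO(
    "review,rating\n"
    "Great film,9\n"
    ",5\n"
    "   ,4\n"
    "Too long,2\n"
)
cleaned = check_reviews_csv(sample_csv)
print(cleaned)
```

To check a real file, call `check_reviews_csv("your_reviews.csv")` and save the result with `cleaned.to_csv("cleaned_reviews.csv", index=False)` before uploading it.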
You can customize the scraper to target different movies, time periods, or review ratings. Edit the script in Movie_review_imdb_scrapping.ipynb to suit your needs.
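One simple way to narrow results, sketched here under assumptions, is to filter the reviews after scraping instead of changing the requests themselves. The column names `rating` and `date` and the function `filter_reviews` are hypothetical, so adapt them to whatever the notebook produces.

```python
import pandas as pd


def filter_reviews(df, min_rating=None, start=None, end=None):
    """Keep reviews at or above min_rating and within [start, end]."""
    result = df.copy()
    result["date"] = pd.to_datetime(result["date"])
    if min_rating is not None:
        result = result[result["rating"] >= min_rating]
    if start is not None:
        result = result[result["date"] >= pd.Timestamp(start)]
    if end is not None:
        result = result[result["date"] <= pd.Timestamp(end)]
    return result.reset_index(drop=True)


# Made-up sample rows for illustration only.
sample = pd.DataFrame({
    "rating": [9, 4, 7, 8],
    "date": ["2019-05-01", "2020-02-10", "2020-08-15", "2021-01-20"],
    "text": ["Loved it", "Not for me", "Pretty good", "Great cast"],
})

# Reviews from 2020 with a rating of 7 or higher.
selected = filter_reviews(sample, min_rating=7, start="2020-01-01", end="2020-12-31")
print(selected)
```

Filtering afterwards keeps the scraping logic untouched. To reduce the number of pages downloaded in the first place, the notebook itself would need to be edited.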
If you encounter any issues, feel free to open an issue on GitHub. We are happy to assist with any problems or inquiries you may have.
🎉 Contributions are welcome! If you have any suggestions for improvements or new features, feel free to submit a pull request on GitHub. Your contributions help make this project better for everyone.
The final dataset containing the scraped IMDb movie reviews can be accessed from the provided Drive link. This dataset can be used for various analysis and research purposes.
Thank you for using the IMDb Movie Review Scraper project. We hope this guide helps you get started and successfully scrape and analyze movie reviews. Happy coding!
If you find this project useful, please leave a star ⭐️