Scraping Tested episodes on WebToons using Python and Selenium

Elhamyali/webtoons-comments-in-python

Scraping Tested Comic Episodes on Webtoons using Python and Selenium

The script's output contains the following columns:

  • Episode Name
  • Date
  • Loves
  • Episode Number
  • Comments Count
  • Comment Username
  • Comment Description
  • Comment Likes
  • Comment Dislikes
  • Reply Username
  • Reply Description
  • Reply Likes
  • Reply Dislikes
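
As a rough illustration of this layout, the columns listed above could serve as a CSV header. The following is a minimal sketch and not the repository's code: the `COLUMNS` constant, the `rows_to_csv` helper, and the sample row are hypothetical, and the sample values are made up.

```python
import csv
import io

# Column names as listed above; the constant name itself is hypothetical.
COLUMNS = [
    "Episode Name", "Date", "Loves", "Episode Number", "Comments Count",
    "Comment Username", "Comment Description", "Comment Likes", "Comment Dislikes",
    "Reply Username", "Reply Description", "Reply Likes", "Reply Dislikes",
]

def rows_to_csv(rows):
    """Serialize a list of dicts keyed by COLUMNS into CSV text."""
    buffer = io.StringIO()
    # restval="" leaves any column missing from a row empty.
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up sample row: a comment without replies leaves the reply columns empty.
sample = {
    "Episode Name": "Example episode",
    "Episode Number": 1,
    "Comment Username": "example_user",
}
print(rows_to_csv([sample]))
```

A comment with no replies simply leaves the four reply columns blank in this sketch; the actual script may lay out its rows differently.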

How to Run the Script on Windows

Download the repository to your system as a ZIP file

Click the arrow on the folder and click "Show in folder"

Right click the ZIP file and click "Extract All..."

Enter the directory where you want to save the extracted folder (you will need this path later)

Click "Extract"

Install Python

Download the Python 3.8 installer (the script requires Python 3.8 or lower)

Click to open the installer

Check the "Add Python 3.8 to PATH" box then click "Install Now"

Once the installation is complete, press the "Windows" key and search for Command Prompt by typing "CMD"

Click "Open" to open the Command Prompt

Type "python --version" and press Enter to verify that Python is installed and on the PATH
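
Because the script requires Python 3.8 or lower, the same check can also be done from Python itself. This is a minimal sketch under that stated requirement; the `MAX_SUPPORTED` constant and the `is_supported` helper are hypothetical names and are not part of the repository.

```python
import sys

# Highest version this README says is supported (hypothetical constant name).
MAX_SUPPORTED = (3, 8)

def is_supported(version_info=sys.version_info):
    """Return True for a Python 3 interpreter at version 3.8 or lower."""
    major_minor = tuple(version_info[:2])
    return (3, 0) <= major_minor <= MAX_SUPPORTED

if __name__ == "__main__":
    print("Running Python", sys.version.split()[0])
    if not is_supported():
        print("Warning: this README asks for Python 3.8 or lower.")
```

Saving this as a small file and running it with "python" shows which interpreter the command resolves to, which helps when several Python versions are installed.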

Navigate to the extracted folder using the "cd" command: Type "cd C:\YOUR\DIRECTORY\HERE\webtoons-comments-in-python-main" and press enter

Use the "dir" command to verify you're in the correct folder

Run the command "py -m pip install -r requirements.txt" to install all of the required dependencies
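
To confirm that the dependencies are importable after installation, a quick check such as the one below may help. It is only a sketch: the contents of requirements.txt are not listed here, so `selenium` is the only package name taken from this README, and the `missing_modules` helper is a hypothetical name.

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names from names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    # Selenium is named in this README; add other package names from
    # requirements.txt as needed (they are not assumed here).
    missing = missing_modules(["selenium"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All checked modules are importable.")
```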

Wait for the installation to complete, then run the command "python webtoons_scraping.py" to execute the script

The script prints the data as it is scraped and ends with the message "EXECUTION COMPLETE"

After execution, two output files will appear in the directory: one in CSV format and one in XLSX format
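
The output file names are not stated here, so one way to locate them is to list the CSV and XLSX files in the folder. This is a minimal sketch that assumes the files are written directly into the directory the script was run from; the `find_outputs` helper is a hypothetical name.

```python
from pathlib import Path

def find_outputs(directory="."):
    """Return sorted CSV and XLSX file paths found directly inside directory."""
    folder = Path(directory)
    csv_files = sorted(folder.glob("*.csv"))
    xlsx_files = sorted(folder.glob("*.xlsx"))
    return csv_files, xlsx_files

if __name__ == "__main__":
    csv_files, xlsx_files = find_outputs(".")
    print("CSV files:", [path.name for path in csv_files])
    print("XLSX files:", [path.name for path in xlsx_files])
```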

Disclaimer

We checked the robots.txt file for the URL https://www.webtoons.com/en/challenge/tested/list?title_no=231173&page=1 and found that scraping comic data is allowed.
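
A robots.txt check of this kind can be sketched with Python's standard `urllib.robotparser` module. The rules in `EXAMPLE_RULES` below are hypothetical and are not the real Webtoons robots.txt; the `is_allowed` helper and the example.com URLs are also assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Hypothetical rules for illustration only; NOT the real Webtoons robots.txt.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(EXAMPLE_RULES, "https://example.com/en/list?page=1"))
print(is_allowed(EXAMPLE_RULES, "https://example.com/private/data"))
```

To check a live site instead, the text of its robots.txt file would be passed in place of `EXAMPLE_RULES`. Site rules can change over time, so the result is only valid for the file as it was when checked.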
