CarsonBytes/python_scraping

This is one of my previous projects. It scrapes website content with Python and headless Selenium ChromeDriver.
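For orientation, a headless Chrome session in Selenium 4 can be started roughly as follows. This is a minimal sketch, not the repository's actual setup: the helper names `build_chrome_args` and `make_driver` and the chosen window size are assumptions made for illustration.

```python
from typing import List


def build_chrome_args(headless: bool = True) -> List[str]:
    """Return the Chrome command-line flags for a scraping session.

    Assumption: these flags are illustrative, not the project's own.
    """
    args = ["--window-size=1920,1080", "--disable-gpu"]
    if headless:
        # Run Chrome without a visible window.
        args.insert(0, "--headless=new")
    return args


def make_driver(headless: bool = True):
    """Create a Chrome WebDriver configured with the flags above."""
    # Imported lazily so build_chrome_args works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in build_chrome_args(headless):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

Calling `make_driver()` requires Selenium and a local Chrome installation. Passing `headless=False` can help when debugging a page visually.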

Developing the scripts for the target sites involved some difficulties, such as AJAX pagination and image loading, random waiting times, and redirection to a Cloudflare page. With patient observation and targeted refinement of specific functions, these problems could be solved effectively.
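One way to handle random waiting times and an occasional Cloudflare redirect is to retry the page load after a randomized pause. The sketch below is a hypothetical illustration rather than the code used in this project: the function names, the marker strings in `CHALLENGE_MARKERS`, and the default wait bounds are all assumptions, and real challenge pages may look different.

```python
import random
import time
from typing import Callable, Optional

# Assumption: illustrative phrases only; actual challenge pages vary.
CHALLENGE_MARKERS = ("just a moment", "checking your browser")


def looks_like_challenge(html: str) -> bool:
    """Heuristically detect an interstitial page instead of real content."""
    lowered = html.lower()
    return any(marker in lowered for marker in CHALLENGE_MARKERS)


def fetch_with_random_wait(
    fetch: Callable[[], str],
    max_attempts: int = 3,
    min_wait: float = 1.0,
    max_wait: float = 4.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch() until it returns a non-challenge page.

    Waits a random number of seconds between attempts and returns None
    if every attempt still looks like a challenge page.
    """
    for attempt in range(max_attempts):
        html = fetch()
        if not looks_like_challenge(html):
            return html
        if attempt < max_attempts - 1:
            # Randomized pause so requests are not evenly spaced.
            sleep(random.uniform(min_wait, max_wait))
    return None
```

With Selenium, `fetch` could be a small function that calls `driver.get(url)` and returns `driver.page_source`. The `sleep` parameter exists only so the waiting can be replaced in tests.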

There are also cases where scraping is not really feasible, such as when a CAPTCHA is required, or when a site limits the number of pages that can be viewed or the number of images that can be downloaded per registered user account.

The scraped website content is stored in JSON format, together with the downloaded images. Please check the json folder for reference.
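As a rough illustration of this storage layout, the sketch below writes one scraped record as a JSON file and its image as a separate file. Only the json folder name comes from this README; the `save_record` function, the `images` folder, the `.jpg` extension, and the record fields are assumptions and may not match the files in the repository.

```python
import json
from pathlib import Path


def save_record(record: dict, image_bytes: bytes, out_dir: Path, name: str) -> Path:
    """Write the image and a JSON file that points to it; return the JSON path."""
    json_dir = out_dir / "json"
    image_dir = out_dir / "images"  # assumption: hypothetical folder name
    json_dir.mkdir(parents=True, exist_ok=True)
    image_dir.mkdir(parents=True, exist_ok=True)

    image_path = image_dir / f"{name}.jpg"
    image_path.write_bytes(image_bytes)

    # Store a relative, forward-slash path so the JSON stays portable.
    stored = dict(record, image=image_path.relative_to(out_dir).as_posix())

    json_path = json_dir / f"{name}.json"
    json_path.write_text(
        json.dumps(stored, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return json_path
```

Using `ensure_ascii=False` keeps non-English text readable in the JSON files instead of escaping it.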
