This is one of my previous projects. It scrapes website content using Python and Selenium with headless ChromeDriver.
Developing the scripts for the target sites raised several difficulties, such as AJAX pagination and image loading, random waiting times, and redirection to a Cloudflare page. With patient observation and targeted refinements to specific functions, these problems could be solved effectively.
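As an illustration of the kind of refinements mentioned above, here is a minimal sketch in Python. It is not the project's actual code: the function names (`polite_pause`, `wait_for_more_items`, `looks_like_challenge_page`), the default delays, and the page-title markers are all assumptions chosen for the example. The helpers are kept independent of Selenium so they can be tried on their own.

```python
import random
import time


def polite_pause(min_s=1.0, max_s=3.0, sleep=time.sleep):
    """Sleep for a random duration in [min_s, max_s] and return it.

    The bounds are illustrative defaults, not values taken from the project.
    """
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay


def wait_for_more_items(count_items, previous_count, timeout=10.0, poll=0.5,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until AJAX pagination has added items, or until the timeout.

    count_items is any zero-argument callable returning the current number
    of items on the page. Returns the new count, or previous_count if
    nothing new appeared in time.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        current = count_items()
        if current > previous_count:
            return current
        sleep(poll)
    return previous_count


def looks_like_challenge_page(title,
                              markers=("just a moment", "attention required")):
    """Heuristic check for an interstitial page based on its title.

    The marker strings are assumptions and may need adjusting per site.
    """
    lowered = title.lower()
    return any(marker in lowered for marker in markers)
```

With a Selenium driver, `count_items` could be something like `lambda: len(driver.find_elements(By.CSS_SELECTOR, ".item"))`, where `.item` is a hypothetical selector, and `looks_like_challenge_page(driver.title)` could decide whether to pause and retry before continuing.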
There are also cases where scraping is not really feasible, such as when a CAPTCHA is required, or when there is a limit on the number of pages that can be viewed or the number of images that can be downloaded per registered user account.
The scraped website content is stored as JSON files and as image files. Please check the json folder for reference.
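For readers who want a rough picture of this storage step, the sketch below writes one scraped record to the json folder and its images to a separate folder. It is only an assumption of how this could look: the function name `save_page`, the `images` folder name, and the record fields (`id`, `title`, `images`) are hypothetical and may not match the files in this project.

```python
import json
from pathlib import Path


def save_page(record, image_bytes_by_name, out_root="."):
    """Write a record as JSON and its images as binary files.

    record must contain an "id" key (hypothetical field) used as the
    JSON file name. image_bytes_by_name maps file names to raw bytes.
    Returns the path of the JSON file that was written.
    """
    root = Path(out_root)
    json_dir = root / "json"       # folder name mentioned in this README
    image_dir = root / "images"    # hypothetical folder name
    json_dir.mkdir(parents=True, exist_ok=True)
    image_dir.mkdir(parents=True, exist_ok=True)

    # Save each image exactly as downloaded.
    for name, data in image_bytes_by_name.items():
        (image_dir / name).write_bytes(data)

    # Save the text content; ensure_ascii=False keeps non-ASCII text readable.
    json_path = json_dir / f"{record['id']}.json"
    with json_path.open("w", encoding="utf-8") as fh:
        json.dump(record, fh, ensure_ascii=False, indent=2)
    return json_path
```

A call such as `save_page({"id": "demo", "title": "Example", "images": ["a.png"]}, {"a.png": image_bytes})` would then produce `json/demo.json` and `images/a.png` under the current directory.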