Booking.com Scraper

This scraper uses scrapfly.io and Python to scrape hotel data from Booking.com.

Booking.com can be difficult to scrape because it blocks scrapers, so this scraper uses Scrapfly's Anti Scraping Protection Bypass feature.
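As a rough illustration of what enabling that feature looks like with the scrapfly-sdk package, here is a minimal sketch. It is not the code from bookingcom.py: the helper names `build_scrape_options` and `fetch_page` are hypothetical, and the `country` value is only an assumed example setting. The `asp=True` option is the SDK flag that turns on Anti Scraping Protection bypass.

```python
import os


def build_scrape_options(url: str, country: str = "US") -> dict:
    """Collect keyword arguments for scrapfly's ScrapeConfig (hypothetical helper)."""
    return {
        "url": url,
        "asp": True,  # enable Anti Scraping Protection bypass
        "country": country,  # assumed example value, adjust or drop as needed
    }


def fetch_page(url: str) -> str:
    """Fetch a page through Scrapfly and return its HTML (hypothetical helper)."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from scrapfly import ScrapeConfig, ScrapflyClient

    client = ScrapflyClient(key=os.environ["SCRAPFLY_KEY"])
    result = client.scrape(ScrapeConfig(**build_scrape_options(url)))
    return result.content
```

Calling `fetch_page("https://www.booking.com/")` would need a valid SCRAPFLY_KEY and consumes API credits, so treat it as a starting point rather than a drop-in replacement for the documented scraper.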

Full tutorial: https://scrapfly.io/blog/how-to-scrape-bookingcom/

The scraping code is located in the bookingcom.py file. It is fully documented and simplified for educational purposes. Example code for running the scraper can be found in the run.py file.

This scraper scrapes:

  • Booking.com search: finding hotels that match a search query.
  • Booking.com hotel listing details:
    • hotel info: description, rating, features etc.
    • prices

For output examples see the ./results directory.

Fair Use Disclaimer

Note that this code is provided free of charge as is, and Scrapfly does not provide free web scraping support or consultation. To report bugs, see the issue tracker.

Setup and Use

This Booking.com scraper uses Python 3.10 with the scrapfly-sdk package, which is used to scrape and parse Booking.com's data.

  1. Ensure you have Python 3.10 and the Poetry package manager installed on your system.
  2. Retrieve your Scrapfly API key from https://scrapfly.io/dashboard and set the SCRAPFLY_KEY environment variable:
    $ export SCRAPFLY_KEY="YOUR SCRAPFLY KEY"
  3. Clone and install Python environment:
    $ git clone https://github.com/scrapfly/scrapfly-scrapers.git
    $ cd scrapfly-scrapers/bookingcom-scraper
    $ poetry install
  4. Run example scrape:
    $ poetry run python run.py
  5. Run tests:
    $ poetry install --with dev
    $ poetry run pytest test.py
    # or specific scraping areas
    $ poetry run pytest test.py -k test_hotel_scraping
    $ poetry run pytest test.py -k test_search_scraping
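
If a run fails right away, a missing SCRAPFLY_KEY from step 2 is a likely cause. The sketch below shows one way to check for the variable up front and fail with a readable message. The function name `get_scrapfly_key` is hypothetical and is not part of this repository or the scrapfly-sdk package.

```python
import os
from typing import Mapping


def get_scrapfly_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Scrapfly API key from the environment (hypothetical helper).

    Raises RuntimeError with a hint when the variable is unset or blank.
    """
    key = env.get("SCRAPFLY_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SCRAPFLY_KEY is not set; export it before running, e.g. "
            'export SCRAPFLY_KEY="YOUR SCRAPFLY KEY"'
        )
    return key
```

Accepting the environment as a parameter keeps the check easy to exercise in tests without touching the real process environment.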