FoodGrab_Scrapping


About   |   Approach   |   Technologies   |   Requirements   |   Starting   |   License   |   Author


🎯 About

Web scraping of the GrabFood website (food.grab.com) by capturing its XHR requests.

✨ Approach

Approach - 1

{"searchResult":{"searchID":"f621f57c6c324a03a9f28eb4231c8395"

where 2-CYKCVZNZJTDFLE is the restaurant_id

0: {id: "SGDD00739", address: {name: "Lucky Saigon - North Canal Road"},…}
address: {name: "Lucky Saigon - North Canal Road"}
businessType: "FOOD"
chainID: "729_Lucky_Saigon"
chainName: "Lucky Saigon"
estimatedDeliveryFee: {currency: {code: "SGD", symbol: "SGD", exponent: 2}, price: 300, priceDisplay: "S$3.00",…}
estimatedDeliveryTime: 30
id: "SGDD00739"
latlng: {latitude: 1.2862877, longitude: 103.84841596}
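A record shaped like the one above can be reduced to just the restaurant name and coordinates. The following is a minimal sketch, assuming each record is a Python dict with the `address.name` and `latlng` fields shown; the function name `extract_name_and_location` is hypothetical, and where the list of records sits inside the full response is not shown here, so the function simply takes the list as input.

```python
def extract_name_and_location(records):
    """Keep only the name and coordinates of each restaurant record."""
    results = []
    for record in records:
        # Missing fields become None rather than raising KeyError.
        address = record.get("address") or {}
        latlng = record.get("latlng") or {}
        results.append({
            "name": address.get("name"),
            "latitude": latlng.get("latitude"),
            "longitude": latlng.get("longitude"),
        })
    return results


# Sample record copied from the fields listed above.
sample = [{
    "id": "SGDD00739",
    "address": {"name": "Lucky Saigon - North Canal Road"},
    "businessType": "FOOD",
    "chainName": "Lucky Saigon",
    "estimatedDeliveryTime": 30,
    "latlng": {"latitude": 1.2862877, "longitude": 103.84841596},
}]
restaurants = extract_name_and_location(sample)
```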

However, since the use of Selenium was requested, I tried a different approach.

Approach - 2

  • Since I need to capture an XHR (XMLHttpRequest) request, I used Selenium Wire with ChromeDriver to intercept it.
  • Solution Design
1. Load the required Python libraries.
2. def load_more - Load the food.grab.com page and keep clicking the "Load More" button until the page lists all the restaurants in the Singapore area.
3. def capture_post_response - Use the driver to capture the POST request made to "grab_internal_post_api", then decode the response data and store it as JSON in post_data.
4. def get_restaurant_latlng - Drop the extra fields and keep only each restaurant's name and location, stored as a list of dictionaries.
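The "Load More" step could be sketched roughly as follows. This is not the project's actual code: the XPath used to find the button is a guess at the page markup, and `max_clicks` and `pause_s` are hypothetical safety parameters. The function only relies on the driver's `find_elements(by, value)` method and the element's `click()` method, so any Selenium or Selenium Wire driver should work.

```python
import time

# Hypothetical locator: the real button markup on food.grab.com may differ.
LOAD_MORE_XPATH = "//button[contains(., 'Load More')]"


def load_more(driver, max_clicks=200, pause_s=1.0):
    """Click the "Load More" button until it disappears; return the click count."""
    clicks = 0
    while clicks < max_clicks:
        # find_elements returns an empty list when nothing matches,
        # which is the signal that every restaurant has been loaded.
        buttons = driver.find_elements("xpath", LOAD_MORE_XPATH)
        if not buttons:
            break
        buttons[0].click()
        clicks += 1
        time.sleep(pause_s)  # give the page time to request the next batch
    return clicks
```

A fixed pause is the simplest option; an explicit wait on the button becoming clickable would likely be more robust on a slow connection.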
  • Given a base_url, capture the latitude and longitude of all restaurants (based on the user's submitted location, e.g., sg) by intercepting GrabFood's internal POST request. self.grab_internal_post_api was found by manually inspecting all the XHR requests made by GrabFood, using Chrome DevTools.

  • I think Approach 1 would be easier, but it would have to pass a reCAPTCHA test, and I haven't worked out how to handle that yet.

  • I drew on various resources for this, since it required using Selenium Wire and extracting data from XHR requests, neither of which I had done before.
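The interception step in Approach 2 could look something like the sketch below. It assumes request objects shaped like those Selenium Wire exposes through `driver.requests` (with `method`, `url`, and a `response` carrying `headers` and `body`). The names `collect_post_payloads` and `decode_body` are hypothetical, `API_URL` is a placeholder rather than the real internal endpoint, and only gzip and uncompressed bodies are handled. Stand-in objects are used in place of a live browser so the logic can be run on its own.

```python
import gzip
import json
from types import SimpleNamespace

# Placeholder only: the real internal endpoint is not reproduced here.
API_URL = "https://example.invalid/internal/search"


def decode_body(body, content_encoding="identity"):
    """Return the response body as text, undoing gzip compression if present."""
    if content_encoding.lower() == "gzip":
        body = gzip.decompress(body)
    return body.decode("utf-8")


def collect_post_payloads(requests, api_url):
    """Return the parsed JSON body of every answered POST request to api_url."""
    payloads = []
    for request in requests:
        if request.method != "POST" or not request.url.startswith(api_url):
            continue
        if request.response is None:  # request was sent but never answered
            continue
        encoding = request.response.headers.get("Content-Encoding", "identity")
        payloads.append(json.loads(decode_body(request.response.body, encoding)))
    return payloads


# Stand-ins for captured requests: one unrelated GET and one gzip-encoded POST.
fake_requests = [
    SimpleNamespace(method="GET", url="https://example.invalid/page", response=None),
    SimpleNamespace(
        method="POST",
        url=API_URL,
        response=SimpleNamespace(
            headers={"Content-Encoding": "gzip"},
            body=gzip.compress(
                json.dumps({"searchResult": {"searchID": "demo"}}).encode("utf-8")
            ),
        ),
    ),
]
post_data = collect_post_payloads(fake_requests, API_URL)
```

With a real Selenium Wire driver, `driver.requests` would be passed in place of `fake_requests` once the page has finished loading. Each "Load More" click is expected to trigger its own POST request, which is why the sketch returns a list of payloads rather than a single one.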

🚀 Technologies

The following tools were used in this project:

✅ Requirements

Before starting 🏁, you need to have Git and Python installed.

🏁 Starting

# Clone this project
$ git clone https://github.com/{{YOUR_GITHUB_USERNAME}}/foodgrab_scrapping

# Access
$ cd foodgrab_scrapping

# Set up and activate a virtual environment (on Windows: venv\Scripts\activate)
$ python3 -m venv venv
$ source venv/bin/activate

# Install dependencies
$ pip install -r requirements.txt

# Run the project
$ python3 XHR.py

📝 License

This project is licensed under the MIT License. For more details, see the LICENSE file.

Made with ❤️ by Paritosh Tripathi

 

Back to top