TharinduMadhusanka/sri-lanka-addresses-web-scraping

sri-lanka-addresses-web-scraping

A large dataset of Sri Lankan addresses

Web Scraping with Python

This repository contains Python code for web scraping using the requests and BeautifulSoup libraries. The code scrapes data from the "Rainbow Pages" website and saves it to a file named data.txt.
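The fetch, parse, and save flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the URL, the CSS selector, and the function names are hypothetical placeholders, since the actual page structure the script relies on is not documented here.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical values: the real URL and markup targeted by main.py may differ.
LISTING_URL = "https://example.com/listings"
ADDRESS_SELECTOR = "div.listing .address"


def fetch_html(url: str) -> str:
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def extract_addresses(html: str, selector: str = ADDRESS_SELECTOR) -> list[str]:
    """Return the text of every element matching the (assumed) selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(" ", strip=True) for tag in soup.select(selector)]


def save_addresses(addresses: list[str], path: str = "data.txt") -> None:
    """Append one address per line to the output file."""
    with open(path, "a", encoding="utf-8") as handle:
        for address in addresses:
            handle.write(address + "\n")
```

Under these assumptions, a single page would be processed with `save_addresses(extract_addresses(fetch_html(LISTING_URL)))`. Keeping the parsing step separate from the download makes it possible to test the selector against a saved HTML sample without touching the network.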

Prerequisites

Before running the code, make sure you have Python installed on your system. You can download Python from the official website: https://www.python.org/downloads/

Installation

  1. Clone this repository to your local machine using the following command:

git clone https://github.com/TharinduMadhusanka/sri-lanka-addresses-web-scraping.git

  2. Change the directory to the project folder: cd sri-lanka-addresses-web-scraping

  3. Install the required Python packages by running: pip install requests beautifulsoup4

Usage

To run the web scraping script, use the following command:

python main.py

The script scrapes data from the "Rainbow Pages" website and saves it to the data.txt file in the project folder. Web scraping may be subject to the website's terms of service, so respect the site's policies and space out your requests to avoid being blocked.
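One simple way to space out requests is to pause for a fixed interval between them. The helper below is a sketch of that idea only; `fetch_politely`, its parameters, and the two-second default are hypothetical and are not taken from main.py.

```python
import time
from typing import Callable, Iterable, Iterator


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[tuple[str, str]]:
    """Yield (url, body) pairs, pausing between consecutive requests."""
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        yield url, fetch(url)
```

With requests, the `fetch` argument could be something like `lambda url: requests.get(url, timeout=10).text`. Passing the fetch and sleep functions in as arguments keeps the helper easy to try out without hitting the live site.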

Here is my blog post. Check it out. 😊

Acknowledgments

  • The code in this repository is for educational purposes and may require additional modifications for other use cases.
  • Special thanks to the contributors of the requests and BeautifulSoup libraries for providing powerful tools for web scraping in Python.
