Forbes Billionaires List 2021-2024

About the Project

This project is a learning exercise in web scraping, MySQL database management, and data analysis, using the Forbes billionaires list as the dataset. The data consists of yearly snapshots for 2021 through 2023 and monthly snapshots for 2024. Last updated: July 2024.

The project began with monthly scraping of the Forbes list.

Initial steps involved:

  • Data scraping using R or an appropriate tool.
  • Data cleaning and organization into CSV files.
  • Importing the CSV files into a MySQL database.
  • Querying the database and saving the output as CSV files.
  • Exploratory data analysis.
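
The import, query, and export steps above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: it uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a server, the table name `billionaires` is hypothetical, and the rows are placeholder values rather than real Forbes data. Only the column names are taken from the field list in this README.

```python
import csv
import io
import sqlite3

# Placeholder rows for illustration only; these are NOT real Forbes data.
SAMPLE_CSV = """Person,net_worth_inBillionUSD,Industry,Date
Person A,10.5,Technology,2024-07-01
Person B,7.25,Retail,2024-07-01
Person C,3.0,Technology,2024-07-01
"""


def load_rows(csv_text):
    """Parse CSV text into (person, net_worth, industry, date) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (r["Person"], float(r["net_worth_inBillionUSD"]), r["Industry"], r["Date"])
        for r in reader
    ]


def import_and_query(csv_text):
    """Import rows into an in-memory table and total net worth per industry."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of MySQL
    conn.execute(
        "CREATE TABLE billionaires ("  # hypothetical table name
        "Person TEXT, net_worth_inBillionUSD REAL, Industry TEXT, Date TEXT)"
    )
    conn.executemany(
        "INSERT INTO billionaires VALUES (?, ?, ?, ?)", load_rows(csv_text)
    )
    result = conn.execute(
        "SELECT Industry, SUM(net_worth_inBillionUSD) AS total "
        "FROM billionaires GROUP BY Industry ORDER BY total DESC"
    ).fetchall()
    conn.close()
    return result


def to_csv(rows, header):
    """Serialize query output back to CSV text (the 'save as CSV' step)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    totals = import_and_query(SAMPLE_CSV)
    print(to_csv(totals, ["Industry", "total_net_worth_inBillionUSD"]))
```

With a real MySQL server the same flow would go through a MySQL client library or the `LOAD DATA` statement; the SQL query itself would likely carry over largely unchanged.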

The repository contains the following folders:

The CSV files contain the following fields:

  • ID
  • User_ID
  • Table_rank
  • Person
  • net_worth_inBillionUSD
  • Age_of_Person
  • Date
  • Business
  • Industry
  • Country_of_Residence
  • Continent
  • Citzenship
  • Gender
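
As a rough sketch of how a file with these fields might be read and summarized, the example below checks that all expected columns are present, converts the numeric ones, and averages net worth by a grouping column. The column names are copied from the list above exactly as written; the column types (integer rank and age, decimal net worth, possibly blank age) and all sample values are assumptions made for illustration, not details confirmed by the dataset.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Column names as listed in this README (spelling preserved).
FIELDS = [
    "ID", "User_ID", "Table_rank", "Person", "net_worth_inBillionUSD",
    "Age_of_Person", "Date", "Business", "Industry",
    "Country_of_Residence", "Continent", "Citzenship", "Gender",
]

# Placeholder rows for illustration only; these are NOT real Forbes data.
SAMPLE_CSV = "\n".join([
    ",".join(FIELDS),
    "1,u1,1,Person A,10.0,60,2024-07-01,Company A,Technology,Country X,Continent P,Country X,F",
    "2,u2,2,Person B,4.0,,2024-07-01,Company B,Retail,Country Y,Continent P,Country Y,M",
    "3,u3,3,Person C,2.5,45,2024-07-01,Company C,Energy,Country Z,Continent Q,Country Z,M",
]) + "\n"


def read_records(csv_text):
    """Read CSV text into dicts, converting the columns assumed to be numeric."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [f for f in FIELDS if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    records = []
    for row in reader:
        row["Table_rank"] = int(row["Table_rank"])
        row["net_worth_inBillionUSD"] = float(row["net_worth_inBillionUSD"])
        # Assumption: age may be blank for some entries, so map "" to None.
        age = row["Age_of_Person"]
        row["Age_of_Person"] = int(age) if age else None
        records.append(row)
    return records


def mean_net_worth_by(records, key):
    """Average net worth (billion USD) for each distinct value of `key`."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["net_worth_inBillionUSD"])
    return {name: mean(values) for name, values in groups.items()}


if __name__ == "__main__":
    recs = read_records(SAMPLE_CSV)
    print(mean_net_worth_by(recs, "Continent"))
```

The `Date` column is left as a string here because the README does not state its format.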

Technologies Used

  • Primarily: R (for data analysis and manipulation). Refer to the forbes_analysis files for information about necessary packages and libraries (work in progress).
  • (Optional) Web scraping tool (depending on the scraping method)
  • (Optional) MySQL for querying data

Usage

This repository is intended for learning R and MySQL and for experimenting with data analytics.

Contributors

[Add contributors' names here]

License

This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.

Badges

GitHub stars GitHub forks GitHub issues GitHub license

GitHub Repository

Link to the GitHub repository