This project is a learning exercise in web scraping, MySQL database management, and data analysis, using the Forbes billionaires list. The dataset contains yearly data for 2021 through 2023 and monthly data for 2024. Last updated: July 2024.
The project began with monthly data scraping from Forbes. The workflow is:
- Data scraping using R or an appropriate tool.
- Data cleaning and organization into CSV files.
- Importing the CSV files into a MySQL database.
- Querying the database and saving the output as CSV files.
- Exploratory data analysis.
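
The cleaning and exploratory steps above can be sketched in base R as follows. This is a minimal illustration, not the project's actual script: the sample rows are placeholders rather than real Forbes data, the raw net worth format (`$200.5 B`) and the output file name are assumptions, and only a subset of the columns is used. The MySQL import and query steps are not shown.

```r
# Hypothetical sample rows standing in for one scraped snapshot;
# names and values are placeholders, not real Forbes data.
raw_csv <- "Table_rank,Person,net_worth_inBillionUSD,Industry,Date
1,Person A,$200.5 B,Technology,2024-07-01
2,Person B,$150 B,Fashion & Retail,2024-07-01
3,Person C,$99.9 B,Technology,2024-07-01"

# Load the scraped text (read.csv also accepts a file path).
snapshot <- read.csv(text = raw_csv, stringsAsFactors = FALSE)

# Clean: strip the currency symbol and unit (assumed format), then convert types.
snapshot$net_worth_inBillionUSD <- as.numeric(
  gsub("[^0-9.]", "", snapshot$net_worth_inBillionUSD)
)
snapshot$Date <- as.Date(snapshot$Date)

# Save the cleaned table as CSV, ready for import into MySQL.
# The file name is an assumption made for this example.
out_file <- file.path(tempdir(), "forbes_2024_07_clean.csv")
write.csv(snapshot, out_file, row.names = FALSE)

# A first exploratory summary: total net worth per industry.
by_industry <- aggregate(
  net_worth_inBillionUSD ~ Industry,
  data = snapshot,
  FUN = sum
)
print(by_industry)
```

Writing to `tempdir()` keeps the example from touching the project folder; in practice the cleaned file would go wherever the MySQL import expects it.
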

The dataset includes the following columns:

- ID
- User_ID
- Table_rank
- Person
- net_worth_inBillionUSD
- Age_of_Person
- Date
- Business
- Industry
- Country_of_Residence
- Continent
- Citzenship
- Gender
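
Before importing a CSV file into MySQL, it can help to confirm that its header matches the column names listed above. The sketch below uses base R only; `check_columns` is a hypothetical helper written for this example, not part of the project, and the column names are copied exactly as listed, spelling included.

```r
# Column names copied from the list above (spelling kept as listed).
expected_cols <- c(
  "ID", "User_ID", "Table_rank", "Person", "net_worth_inBillionUSD",
  "Age_of_Person", "Date", "Business", "Industry",
  "Country_of_Residence", "Continent", "Citzenship", "Gender"
)

# Hypothetical helper: report which expected columns are missing from a
# data frame and which of its columns are not in the expected list.
check_columns <- function(df, expected = expected_cols) {
  list(
    missing = setdiff(expected, names(df)),
    unexpected = setdiff(names(df), expected)
  )
}

# Demo: an empty data frame whose header spells one column differently.
demo_names <- sub("Citzenship", "Citizenship", expected_cols)
demo <- setNames(
  data.frame(matrix(NA, nrow = 0, ncol = length(demo_names))),
  demo_names
)

result <- check_columns(demo)
print(result)
```

In this demo the helper reports `Citzenship` as missing and `Citizenship` as unexpected, which is the kind of header mismatch that would otherwise only surface as a failed or misaligned import.
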

The project uses the following tools:

- R (primary tool, for data analysis and manipulation). See the forbes_analysis files for the required packages and libraries (work in progress).
- (Optional) Web scraping tool (depending on the scraping method)
- (Optional) MySQL for querying data

The goal of the project is to learn R and MySQL and to experiment with data analytics.

[Add contributors' names here]

This project is licensed under the GNU General Public License v3.0; see the LICENSE file for details.