Airbnb Data Analysis

Since 2008, guests and hosts have used Airbnb to expand their travel options and to experience the world in a more unique, personalized way. Today, Airbnb is a one-of-a-kind service that is used and recognized around the world. Its millions of listings generate a large amount of data, and analyzing that data is crucial for the company. It can inform security, business decisions, the understanding of guest and host behavior and performance on the platform, marketing initiatives, the introduction of new services, and much more.

Project Overview

The dataset contains around 49,000 observations and 16 columns, with a mix of categorical and numeric values. The analysis explores how hosts and areas differ; what can be predicted about locations, prices, and reviews; which hosts are the busiest; and how traffic varies between areas, along with possible reasons for the variation.
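A first look at a dataset like this usually starts with its shape and with which columns are numeric and which are categorical. The sketch below uses a tiny hand-made frame in place of the real file. The column names (`neighbourhood_group`, `room_type`, `price`, `minimum_nights`) and all rows are assumptions for illustration, not taken from the project notebook.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset: assumed column names, made-up rows.
listings = pd.DataFrame(
    {
        "neighbourhood_group": ["Manhattan", "Brooklyn", "Manhattan", "Queens"],
        "room_type": ["Entire home/apt", "Private room", "Private room", "Entire home/apt"],
        "price": [225, 89, 150, 120],
        "minimum_nights": [3, 1, 2, 5],
    }
)

# Number of observations (rows) and columns.
n_rows, n_cols = listings.shape

# Split columns by kind: numeric vs. everything else (treated as categorical here).
numeric_cols = listings.select_dtypes(include="number").columns.tolist()
categorical_cols = listings.select_dtypes(exclude="number").columns.tolist()

# Count missing values per column before any cleaning.
missing_per_column = listings.isna().sum()

print(n_rows, n_cols)
print("numeric:", numeric_cols)
print("categorical:", categorical_cols)
print(missing_per_column)
```

On the real data, the same calls would be run on the frame returned by `pd.read_csv` for the project's CSV file.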

Python, Pandas, Matplotlib, Seaborn, Scikit-learn, NumPy, SciPy, Feature Engineering, Data Cleaning, Data Preprocessing

Jupyter Notebook, Google Colab, GitHub

Key Findings

  • Guests who book entire homes or apartments tend to stay longer in a given neighborhood.
  • Guests who book private rooms generally stay for shorter periods than those who book entire homes or apartments.
  • Most guests prefer lower-priced accommodations.
  • Neighborhoods with higher numbers of reviews are likely tourist hotspots.
  • Single-night stays are indicative of travelers.
  • "Williamsburg" stands out as the neighborhood with the highest number of listings.
  • The top host, "Sonder" (host ID 219517861), has 327 listings.
  • No strong correlation was observed between price, reviews, and location.
  • Manhattan is the most crowded borough, with 44.3% of all listings.
  • The busiest hosts are Dona, Jj, Maya, Carol, and Danielle.
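Findings of this kind typically come from a handful of counting and group-by operations. The following is a minimal sketch on made-up rows, not the project's notebook code. The column names (`host_id`, `host_name`, `neighbourhood_group`, `neighbourhood`, `room_type`, `price`, `minimum_nights`, `number_of_reviews`) are assumed, and the numbers it produces are not the figures reported above.

```python
import pandas as pd

# Hypothetical rows with assumed column names; values are invented for illustration.
listings = pd.DataFrame(
    {
        "host_id": [1, 1, 2, 3, 3, 3],
        "host_name": ["Ana", "Ana", "Ben", "Cy", "Cy", "Cy"],
        "neighbourhood_group": ["Manhattan", "Manhattan", "Brooklyn",
                                "Brooklyn", "Brooklyn", "Queens"],
        "neighbourhood": ["Harlem", "Midtown", "Williamsburg",
                          "Williamsburg", "Williamsburg", "Astoria"],
        "room_type": ["Entire home/apt", "Private room", "Entire home/apt",
                      "Private room", "Entire home/apt", "Private room"],
        "price": [200, 90, 180, 70, 220, 60],
        "minimum_nights": [4, 1, 5, 2, 3, 1],
        "number_of_reviews": [10, 40, 5, 25, 8, 30],
    }
)

# Share of listings per borough, as a percentage.
borough_share = listings["neighbourhood_group"].value_counts(normalize=True) * 100

# Neighbourhood with the most listings.
top_neighbourhood = listings["neighbourhood"].value_counts().idxmax()

# Host with the most listings, identified by (host_id, host_name).
listings_per_host = listings.groupby(["host_id", "host_name"]).size()
top_host = listings_per_host.idxmax()

# Typical minimum stay per room type (median is robust to extreme values).
median_stay = listings.groupby("room_type")["minimum_nights"].median()

# Pearson correlation between price and review count.
# On toy data this value is meaningless; it only shows the call.
price_review_corr = listings["price"].corr(listings["number_of_reviews"])

print(borough_share.round(1))
print(top_neighbourhood, top_host)
print(median_stay)
```

Using `value_counts(normalize=True)` gives shares directly, which is one way a figure such as a borough's percentage of all listings could be derived.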

Tools and Skills

  • Python: The language used for the entire analysis.
  • Pandas: Used for data manipulation and analysis.
  • Matplotlib and Seaborn: Used to visualize the data in plots and graphs.
  • Google Colab: Used as the primary environment for conducting the analysis and documenting the process.

Takeaways

  • Insights into guest preferences and behavior can inform marketing strategies and improve the user experience.
  • Popular neighborhoods and hosts can be identified for targeted promotions and partnerships.
  • Pricing strategies can be refined based on guest preferences and market trends.
  • Resources and services can be optimized to meet traveler needs effectively.

Acknowledgments

This project was completed as part of the Data Science Trainee program at AlmaBetter.
