Passengers Satisfaction Data Analysis

A Big Data Analytics project from the senior year at the Computer Engineering Department of Cairo University.

Brief Problem Description

Our project tackles a business problem. Airline companies collect a lot of data about their passengers: after each trip, passengers are asked to rate their overall satisfaction as well as various services. Companies want to use this data to improve their services and maximize satisfaction. This is not a straightforward task; a first glance at the data may lead to entirely inaccurate decisions because hidden correlations can be at play. Our project addresses this task.

Project Pipeline

  • Data Visualization
  • Data Preprocessing
  • Data Splitting - Split the data into training and testing sets (70:30)
  • Training 6 models on the relationship between satisfaction level and the most correlated features:
    • Naïve Bayes
    • Random Forest
    • Decision Tree
    • K-Nearest Neighbours
    • Logistic Regression
    • Gradient Boosting
  • Comparing the 6 models using 10-fold cross-validation
  • Association Rule Mining
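
The splitting, training, and comparison steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn, uses a synthetic dataset from `make_classification` as a stand-in for the preprocessed passenger data, picks `GaussianNB` as the Naïve Bayes variant, and runs cross-validation on the training portion only. All of these choices, along with the hyperparameters, are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed passenger data (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# 70:30 train/test split, as listed in the pipeline.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y
)

# The six model families from the pipeline; hyperparameters are illustrative.
models = {
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbours": KNeighborsClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

# Mean accuracy over 10-fold cross-validation on the training set.
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=10).mean()
    for name, model in models.items()
}

# Print the models from best to worst mean accuracy.
for name, score in sorted(cv_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

With real data, `X` and `y` would be replaced by the selected features and the satisfaction label, and the held-out 30% would be used for a final evaluation of the chosen model.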

We use R for data visualization and Python for the rest of the project.
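
To give a flavour of the association rule mining step in Python, here is a small sketch that computes support and confidence for one-item rules. It is a simplified illustration rather than a full Apriori implementation; the transactions, item names, and thresholds are hypothetical and do not come from the project's dataset.

```python
from itertools import permutations

# Hypothetical transactions: each set holds the attributes observed for one passenger.
transactions = [
    {"wifi=good", "seat=good", "satisfied"},
    {"wifi=good", "seat=poor", "satisfied"},
    {"wifi=poor", "seat=good", "dissatisfied"},
    {"wifi=poor", "seat=poor", "dissatisfied"},
    {"wifi=good", "seat=good", "satisfied"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(min_support=0.4, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) for one-item rules."""
    items = sorted(set().union(*transactions))
    rules = []
    for antecedent, consequent in permutations(items, 2):
        pair_support = support({antecedent, consequent})
        if pair_support < min_support:
            continue
        # Confidence: how often the consequent appears when the antecedent does.
        confidence = pair_support / support({antecedent})
        if confidence >= min_confidence:
            rules.append((antecedent, consequent, pair_support, confidence))
    return rules


rules = mine_rules()
for antecedent, consequent, s, c in rules:
    print(f"{antecedent} -> {consequent} (support={s:.2f}, confidence={c:.2f})")
```

In this toy data, rules such as `wifi=good -> satisfied` pass both thresholds, while `seat=good -> satisfied` does not, which is the kind of distinction rule mining is meant to surface.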

See the Project Documentation for further details.