Absence from work is common and happens for many reasons. Absenteeism reduces a person's accountability and availability, and it has negative consequences for a team or company, such as a decline in work reputation and weaker coordination across departments. This report analyzes an existing Kaggle dataset of absenteeism records collected from July 2007 to July 2010 at a courier company in Brazil. The dataset contains 740 observations and 21 variables, including Distance from Residence to Work and Work load Average/day, which may be important factors in explaining why a worker is absent. The central question of this report is: what is the main determinant of whether someone is absent from work?
To read the analysis report, open 390RProject_YongyeTan.pdf.
To view the R code, open Absent.R or AbsenteeismATWork.Rmd.
This is my first independent data science project using R and statistical analysis.