Course Project for Getting & Cleaning Data, based on the Human Activity Recognition Using Smartphones Dataset
This README explains which files are included in the repo and what each one does.
The data for the analysis is downloaded from the URL below:
https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip
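As a minimal sketch of fetching the data by hand, the snippet below downloads and unzips the archive. The helper name `fetch_dataset` and the local zip file name are assumptions made for illustration; run_analysis.R may do this differently.

```r
# URL taken from this README
dataset_url <- "https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip"

# Hypothetical helper: downloads the archive once, then unzips it
# into the current working directory. The zip file name is an assumption.
fetch_dataset <- function(url = dataset_url, zip_path = "UCI_HAR_Dataset.zip") {
  if (!file.exists(zip_path)) {
    # mode = "wb" keeps the binary zip intact on Windows
    download.file(url, destfile = zip_path, mode = "wb")
  }
  unzip(zip_path)
  invisible(zip_path)
}

# Uncomment to actually download (requires network access):
# fetch_dataset()
```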
This repo includes the following files:
run_analysis.R
CodeBook.md
avgSujectActivities.txt
run_analysis.R is the script that performs the analysis on the raw data and creates a tidy data file called avgSujectActivities.txt. It does the following:
- Downloads the dataset from the URL mentioned above and unzips it to create the UCI HAR Dataset folder
- Imports the test and train datasets, creates data frames from them, and merges the training and test sets into one data frame
- Extracts only the measurements on the mean, mean(), and standard deviation, std(), of each measurement. Measurements such as meanFreq()-X and angle measurements whose names contain the term mean are excluded, leaving 66 measurement variables
- Updates the variable names of the data frame to improve readability
- Labels the data set with descriptive activity names in place of activity IDs
- Reshapes the dataset into a data frame holding the average of each measurement variable for each activity and each subject
- Writes the new tidy data frame to a text file, producing the required tidy data set of 180 observations and 68 columns (2 columns for activityName and subjectID and 66 columns for measurement variables)
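
The selection and averaging steps can be illustrated on toy data. This is only a sketch under stated assumptions: the feature names are a small illustrative subset in the style of the dataset, the values are made up, and the regular expression and the use of `aggregate` are one possible approach, not necessarily what run_analysis.R does.

```r
# Illustrative feature names in the style of the dataset (not the full list)
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X",
              "tBodyAcc-meanFreq()-X", "angle(tBodyAccMean,gravity)")

# Keep names containing mean() or std(); meanFreq() and angle(...) do not match
keep <- grepl("-(mean|std)\\(\\)", features)
selected <- features[keep]

# Made-up merged data: 2 subjects x 2 activities, 2 rows per combination
toy <- data.frame(
  subjectID    = rep(c(1, 2), each = 4),
  activityName = rep(c("WALKING", "SITTING"), times = 4),
  stringsAsFactors = FALSE
)
toy[["tBodyAcc-mean()-X"]]     <- 1:8
toy[["tBodyAcc-std()-X"]]      <- 10 * (1:8)
toy[["tBodyAcc-meanFreq()-X"]] <- 0

# Subset to the identifier columns plus the selected measurements
merged <- toy[, c("subjectID", "activityName", selected)]

# Average every selected measurement per subject and activity
tidy <- aggregate(
  merged[selected],
  by  = list(subjectID = merged$subjectID, activityName = merged$activityName),
  FUN = mean
)
print(tidy)
```

The toy result has one row per subject and activity combination, which is the same shape rule that gives the real tidy file its one row per combination.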
To run the script, download it and source it from your working directory in R:
source("run_analysis.R")
CodeBook.md, the code book file, describes the variables, the data, and any transformations and work performed to clean up the data.
avgSujectActivities.txt is the tidy data file created by running the run_analysis.R script on the original data downloaded from the URL above.
- 180 observations and 68 columns (2 columns for activityName and subjectID and 66 columns for measurement variables)
- Each measurement variable column is the average value for each combination of subjectID and activityName
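
A quick way to sanity-check the tidy file after running the script is sketched below. The helper `check_tidy_shape` is hypothetical, and the snippet assumes the file was written with a header row and that the identifier columns are named activityName and subjectID as described above.

```r
# Hypothetical helper: TRUE when a data frame has the shape described above
check_tidy_shape <- function(df) {
  nrow(df) == 180 && ncol(df) == 68 &&
    all(c("activityName", "subjectID") %in% names(df))
}

path <- "avgSujectActivities.txt"
if (file.exists(path)) {
  # Assumes a header row; check.names = FALSE keeps column names unchanged
  tidy <- read.table(path, header = TRUE, check.names = FALSE)
  print(check_tidy_shape(tidy))
}
```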