From c091199fb6e8b515021287c26c39ed372b8456d5 Mon Sep 17 00:00:00 2001
From: Qi Yang
Date: Sat, 29 Feb 2020 18:54:59 -0800
Subject: [PATCH 1/3] fixing bugs

---
 docs/milestone1.Rmd  | 2 +-
 docs/milestone1.html | 4 ++--
 docs/milestone1.md   | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/milestone1.Rmd b/docs/milestone1.Rmd
index 9f8c8ea..b1535ac 100644
--- a/docs/milestone1.Rmd
+++ b/docs/milestone1.Rmd
@@ -50,7 +50,7 @@ df<-read.csv("https://archive.ics.uci.edu/ml/machine-learning-databases/00381/PR
 ```
 ```{r}
-sum(is.na(df$PM2.5))/length(df$PM2.5)
+sum(is.na(df$pm2.5))/length(df$pm2.5)
 ```
 So there are 4.73% missing values in the `PM2.5` variable, which shows the data quality is reasonably good. We generated a new dataset for some plots by omitting the missing values.
diff --git a/docs/milestone1.html b/docs/milestone1.html
index 362e320..990fff8 100644
--- a/docs/milestone1.html
+++ b/docs/milestone1.html
@@ -468,8 +468,8 @@

Data Description

Dataset loading

df<-read.csv("https://archive.ics.uci.edu/ml/machine-learning-databases/00381/PRSA_data_2010.1.1-2014.12.31.csv")
-sum(is.na(df$PM2.5))/length(df$PM2.5)
-## [1] NaN
+sum(is.na(df$pm2.5))/length(df$pm2.5)
+## [1] 0.04716594

So there are 4.73% missing values in the PM2.5 variable, which shows the data quality is reasonably good. We generated a new dataset for some plots by omitting the missing values.

df_clean<- na.omit(df)
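
Reviewer note: the following is a minimal sketch of why the old expression printed `NaN`, assuming a hypothetical toy data frame (`toy`, with made-up values) in place of the real PRSA file. It also shows `mean(is.na(...))` as one possible shorter way to get the same proportion, and what `na.omit()` removes. The names `toy`, `wrong`, `missing_share`, and `toy_clean` are illustrative only and do not appear in the project.

```r
# Hypothetical toy data standing in for the real PRSA file (values are made up).
toy <- data.frame(
  pm2.5 = c(129, NA, 148, 159, NA, 181, 138, 109, 105, 124),
  TEMP  = c(-4, -4, -5, -5, -5, -5, -6, -6, -6, -7)
)

# Column names are case-sensitive: toy$PM2.5 does not exist and yields NULL,
# so the ratio becomes 0 / 0, which is NaN.
wrong <- sum(is.na(toy$PM2.5)) / length(toy$PM2.5)

# The mean of a logical vector is the proportion of TRUE values.
missing_share <- mean(is.na(toy$pm2.5))

# na.omit() drops every row that has an NA in any column, not only in pm2.5.
toy_clean <- na.omit(toy)
```

On the real data, the corrected expression in this patch returns 0.04716594, which is about 4.72%.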
diff --git a/docs/milestone1.md b/docs/milestone1.md
index 472efef..3579e75 100644
--- a/docs/milestone1.md
+++ b/docs/milestone1.md
@@ -45,11 +45,11 @@ df<-read.csv("https://archive.ics.uci.edu/ml/machine-learning-databases/00381/PR
 ```r
-sum(is.na(df$PM2.5))/length(df$PM2.5)
+sum(is.na(df$pm2.5))/length(df$pm2.5)
 ```
 ```
-## [1] NaN
+## [1] 0.04716594
 ```
 So there are 4.73% missing values in the `PM2.5` variable, which shows the data quality is reasonably good. We generated a new dataset for some plots by omitting the missing values.

From 6da4b75889e37336dbe2ce7260cd45cd7d725569 Mon Sep 17 00:00:00 2001
From: Qi Yang
Date: Sat, 29 Feb 2020 19:00:11 -0800
Subject: [PATCH 2/3] edit readme

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index ce1115b..dc72b0c 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@ This is the project repository for Group 12 in STAT547M in University of British
 Margot Chen, Qi Yang
 ## Links to milestones
-The links below will update through the course. A release will be created when a milestone is completed.
+The links below will update through the course. A release will be tagged when a milestone is completed.
 __Milestone 1:__ The HTML version of the project proposal can be found [here](https://stat547-ubc-2019-20.github.io/group_12_qiyangqd_xiaoyuanf/docs/milestone1.html)
 __Milestone 2:__

From e198de48eaabf519459df1f11294c9cd3293e6f0 Mon Sep 17 00:00:00 2001
From: Qi Yang
Date: Sat, 29 Feb 2020 19:08:37 -0800
Subject: [PATCH 3/3] Edit a line in .rmd and knitted to see why the comparison request didn't appear on Github

---
 docs/milestone1.Rmd  | 2 +-
 docs/milestone1.html | 2 +-
 docs/milestone1.md   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/milestone1.Rmd b/docs/milestone1.Rmd
index b1535ac..c594123 100644
--- a/docs/milestone1.Rmd
+++ b/docs/milestone1.Rmd
@@ -24,7 +24,7 @@ Beijing, the capital city of China, is fighting against `PM2.5` pollution in rec
 Previous studies showed that __meteorological conditions__, such as wind and humidity, could contribute to the formation of `PM2.5`. Therefore, we speculate that there could be correlations between Beijing’s `PM2.5` concentration and the meteorological conditions in a sufficient period of time. If so, knowing the meteorological conditions can support the assessment and even prediction of air quality in Beijing.
 ### Data Description
-The [dataset](https://archive.ics.uci.edu/ml/datasets/Beijing+PM2.5+Data#) used in our project was obtained from University of California Irvine Machine learning Repository. It was originally uploaded by Songxi Chen in Peking University, China. It is an hourly dataset containing 1) the `PM2.5` of US Embassy in Beijing and 2) __meteorological statistics__ from Beijing Capital International Airport. The data was collected from Jan 1st, 2010 to Dec 31st, 2014. The original purpose of the dataset was to assess the effect of Chinese government’s pollution reduction plan which started from 2012. The dataset can be downloaded [here](https://archive.ics.uci.edu/ml/machine-learning-databases/00381/PRSA_data_2010.1.1-2014.12.31.csv).
+The [dataset](https://archive.ics.uci.edu/ml/datasets/Beijing+PM2.5+Data#) used in our project was obtained from University of California Irvine Machine learning Repository. It was originally uploaded by Songxi Chen in Peking University, China. This is an hourly dataset containing 1) the `PM2.5` of US Embassy in Beijing and 2) __meteorological statistics__ from Beijing Capital International Airport. The data was collected from Jan 1st, 2010 to Dec 31st, 2014. The original purpose of the dataset was to assess the effect of Chinese government’s pollution reduction plan which started from 2012. The dataset can be downloaded [here](https://archive.ics.uci.edu/ml/machine-learning-databases/00381/PRSA_data_2010.1.1-2014.12.31.csv).
 Below are the variables in the dataset:
diff --git a/docs/milestone1.html b/docs/milestone1.html
index 990fff8..7e750ba 100644
--- a/docs/milestone1.html
+++ b/docs/milestone1.html
@@ -391,7 +391,7 @@

Introduction

Data Description

-The dataset used in our project was obtained from University of California Irvine Machine learning Repository. It was originally uploaded by Songxi Chen in Peking University, China. It is an hourly dataset containing 1) the PM2.5 of US Embassy in Beijing and 2) meteorological statistics from Beijing Capital International Airport. The data was collected from Jan 1st, 2010 to Dec 31st, 2014. The original purpose of the dataset was to assess the effect of Chinese government’s pollution reduction plan which started from 2012. The dataset can be downloaded here.
+The dataset used in our project was obtained from University of California Irvine Machine learning Repository. It was originally uploaded by Songxi Chen in Peking University, China. This is an hourly dataset containing 1) the PM2.5 of US Embassy in Beijing and 2) meteorological statistics from Beijing Capital International Airport. The data was collected from Jan 1st, 2010 to Dec 31st, 2014. The original purpose of the dataset was to assess the effect of Chinese government’s pollution reduction plan which started from 2012. The dataset can be downloaded here.

Below are the variables in the dataset:

diff --git a/docs/milestone1.md b/docs/milestone1.md
index 3579e75..b738d76 100644
--- a/docs/milestone1.md
+++ b/docs/milestone1.md
@@ -17,7 +17,7 @@ Beijing, the capital city of China, is fighting against `PM2.5` pollution in rec
 Previous studies showed that __meteorological conditions__, such as wind and humidity, could contribute to the formation of `PM2.5`. Therefore, we speculate that there could be correlations between Beijing’s `PM2.5` concentration and the meteorological conditions in a sufficient period of time. If so, knowing the meteorological conditions can support the assessment and even prediction of air quality in Beijing.
 ### Data Description
-The [dataset](https://archive.ics.uci.edu/ml/datasets/Beijing+PM2.5+Data#) used in our project was obtained from University of California Irvine Machine learning Repository. It was originally uploaded by Songxi Chen in Peking University, China. It is an hourly dataset containing 1) the `PM2.5` of US Embassy in Beijing and 2) __meteorological statistics__ from Beijing Capital International Airport. The data was collected from Jan 1st, 2010 to Dec 31st, 2014. The original purpose of the dataset was to assess the effect of Chinese government’s pollution reduction plan which started from 2012. The dataset can be downloaded [here](https://archive.ics.uci.edu/ml/machine-learning-databases/00381/PRSA_data_2010.1.1-2014.12.31.csv).
+The [dataset](https://archive.ics.uci.edu/ml/datasets/Beijing+PM2.5+Data#) used in our project was obtained from University of California Irvine Machine learning Repository. It was originally uploaded by Songxi Chen in Peking University, China. This is an hourly dataset containing 1) the `PM2.5` of US Embassy in Beijing and 2) __meteorological statistics__ from Beijing Capital International Airport. The data was collected from Jan 1st, 2010 to Dec 31st, 2014. The original purpose of the dataset was to assess the effect of Chinese government’s pollution reduction plan which started from 2012. The dataset can be downloaded [here](https://archive.ics.uci.edu/ml/machine-learning-databases/00381/PRSA_data_2010.1.1-2014.12.31.csv).
 Below are the variables in the dataset: